In the article Using Scikit-Surprise to Create a Simple Recipe Collaborative Filtering Recommender System we developed the simplest recommender system using the scikit-surprise package and saw how to use the built-in algorithms it contains, such as KNN or SVD. I’d like to take my recommender systems practice a step further and attempt to create my own prediction algorithm. Surprise allows you to override its core classes and methods in order to tailor your own algorithm and try to improve
Continue readingTag: NLP
NLP: Opinion classification
Let’s perform some classification methods on the same tripadvisor data as in the post https://www.alldatascience.com/nlp/nlp-target-and-aspect-detection-with-python. In this case we are going to read and preprocess the data again, then we are going to vectorize it in different ways, 1. With TF-IDF vectorizer that creates vectors having into account the frequency of words in a document and the frequency of words in all documents, decreasing weight of the words that appear too often (they can bee
Continue readingNLP: Sentiment Analysis with Pytorch.
In this work we build a sentiment analysis model based on a BERT-GRU model on tripadvisor data, in order to try to predict if an opinion is positive or negative. BERT (Bidirectional Encoder Representations from Transformers) is a pretrained model based on transformers that has into account the context of the words. GRU layer is used instead of LSTM in this case.
Continue readingNLP: Target and aspect detection with Python.
In this post we perform target and aspect detection on a dataset about tripadvisor opinions. Target or topic are the words or topics the opinions are about. Aspects are parts or features of the target. Here we explore the target detection using word embeddings (Word2Vec) which extracts similar words by context and try to extract aspects of the target by searching close words wusing the WordNet synsets. First, we perform data preprocessing by removing stopwords
Continue reading