Deep Learning: COVID-19 detection in X-Ray with CNN

In this project we develop a Deep Learning detector of Covid-19 in radiographs. For this purpose, we use images from the “Covid-chestxray-dataset” [3], generated by researchers from the Mila research group and the University of Montreal [4]. We also use radiographs of healthy patients and patients with bacterial pneumonia, extracted from Kaggle’s “Chest X-Ray Images (Pneumonia)” competition [5]. In total, we have 426 images, divided into training (339 images), validation (42 images)

Continue reading

NLP: Opinion classification

Let’s apply some classification methods to the same Tripadvisor data as in the post https://www.alldatascience.com/nlp/nlp-target-and-aspect-detection-with-python. In this case we are going to read and preprocess the data again, and then vectorize it in different ways: 1. With a TF-IDF vectorizer, which creates vectors that take into account both the frequency of words in a document and the frequency of words across all documents, decreasing the weight of words that appear too often (they can be

Continue reading
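
The TF-IDF weighting mentioned in the excerpt above can be sketched in a few lines of plain Python. This is a minimal illustration of the textbook formula (term frequency times the log of inverse document frequency), not the code from the post: the function name `tf_idf` and the three sample reviews are made up for the example, and library vectorizers such as scikit-learn's typically add smoothing and normalization on top of this.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document (hypothetical helper).

    tf  = term count / document length
    idf = log(N / number of documents containing the term)
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


# Made-up sample reviews, only for illustration.
reviews = [
    "the hotel was great",
    "the food was bad",
    "the room was great and clean",
]
weights = tf_idf(reviews)
```

Words that occur in every review, such as "the" and "was", get a weight of zero, while a word that occurs in a single review, such as "hotel", gets the highest weight in its document.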

Data Mining in R

This post describes an analysis performed on an online news dataset. Data cleaning, data transformation, and dimensionality reduction are performed. Next, we try some supervised and unsupervised models, such as decision trees, clustering and logistic models, to check how accurately they predict the popularity of the news.

Continue reading

NLP: Sentiment Analysis with PyTorch.

In this work we build a sentiment analysis model based on a BERT-GRU architecture and train it on Tripadvisor data to predict whether an opinion is positive or negative. BERT (Bidirectional Encoder Representations from Transformers) is a pretrained transformer-based model that takes into account the context of the words. A GRU layer is used instead of an LSTM in this case.

Continue reading

NLP: Target and aspect detection with Python.

In this post we perform target and aspect detection on a dataset of Tripadvisor opinions. Targets or topics are the words or subjects the opinions are about. Aspects are parts or features of the target. Here we explore target detection using word embeddings (Word2Vec), which extract similar words by context, and try to extract aspects of the target by searching for close words using the WordNet synsets. First, we perform data preprocessing by removing stopwords

Continue reading

Wine dataset analysis with Python

In this post we explore the wine dataset. First, we perform descriptive and exploratory data analysis. Next, we run dimensionality reduction with the PCA and t-SNE algorithms in order to check how they work. Finally, a random forest classifier is implemented, comparing different parameter values in order to check how they impact the classifier results.

Continue reading

Analysis of Variance (ANOVA) with R

In this post we are going to perform an analysis of variance (ANOVA) with R in order to analyze the influence of different variables, such as race, education level or job class, on wage. The data is the same as in the post Descriptive Analysis with R, so you can visit that post for more detail about the data used. Let’s start the analysis. Discussion By means of ANOVA we have

Continue reading
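
To make the idea behind ANOVA concrete, here is a minimal sketch of the one-way F statistic: the ratio of between-group variance to within-group variance. The post itself works in R; this Python version, the function name `one_way_anova_f`, and the toy wage figures grouped by education level are all assumptions made for illustration, not data or code from the post.

```python
def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of numeric groups."""
    all_values = [x for group in groups for x in group]
    n_total = len(all_values)
    n_groups = len(groups)
    grand_mean = sum(all_values) / n_total
    means = [sum(group) / len(group) for group in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(group) * (mean - grand_mean) ** 2
        for group, mean in zip(groups, means)
    )
    # Variation of each observation around its own group mean.
    ss_within = sum(
        (x - mean) ** 2
        for group, mean in zip(groups, means)
        for x in group
    )

    ms_between = ss_between / (n_groups - 1)
    ms_within = ss_within / (n_total - n_groups)
    return ms_between / ms_within


# Made-up wages (arbitrary units) for three hypothetical education levels.
wages_by_education = [
    [1.0, 2.0, 3.0],
    [2.0, 3.0, 4.0],
    [5.0, 6.0, 7.0],
]
f_stat = one_way_anova_f(wages_by_education)
```

A large F value suggests that the group means differ by more than the spread within each group would explain; turning it into a p-value requires the F distribution with (groups - 1, observations - groups) degrees of freedom.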

Linear and logistic regression with R

This post applies linear and logistic regression to a provided dataset containing health parameters of 2353 patients who underwent surgery. We try to discover the relationships among some of the parameters and predict the probability of suffering an infection during the surgery. Discussion According to the results obtained, we can see that when studied separately, all the variables have an influence on the probability of suffering a post-surgical infection (diabetes, malnutrition,

Continue reading