Portfolio Project: Predict Rating given Amazon Product Reviews using NLP
Section 1: Introduction
- Understanding the Dataset
- Download Resources
- Problem objective and framing
- Packages used
- Load data and fix the scores
Section 2: Exploratory Data Analysis (EDA) and Theory
- Data Cleaning - Fix duplicates
- Why convert text to a vector
- Bag of words model
- Similarity
- Disadvantages of BoW
- Binary Bag of Words
Section 3: Text Processing
- Part 1 - Stop words
- Part 2 - Why convert text to lowercase
- Part 3 - Stemming and Lemmatization
- Code Demo - Preprocessing Review Text Data
- Code Demo - Preprocessing Summary Text Data
Section 4: Feature Engineering
- Code Demo - Bag of Words
- Unigrams, Bigrams and Trigrams
- Code Demo - Create Bigrams and Trigrams
Section 5: TF-IDF
- What is TF-IDF and why use it
- Why use log in TF-IDF
- Code Demo - TF-IDF
Section 6: Word2Vec
- Introduction to Word2Vec
- Code Demo - Training Word2Vec
- Averaging Word2Vec
- TF-IDF weighted Word2Vec