NLP Project on Text Classification

Techniques Used:

Data Preprocessing: Steps include tokenization, stop word removal, and text normalization.
Feature Extraction: Using techniques such as TF-IDF (Term Frequency-Inverse Document Frequency) to convert text into numerical features.
Modeling: Implementing classification algorithms such as Logistic Regression and Support Vector Machines (SVM).
Evaluation: Evaluating the models using metrics like accuracy, precision, recall, and F1-score to select the best-performing model.
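
As a rough illustration of the steps listed above, the sketch below uses only the Python standard library. It is not the project's actual code: the stop word list, the function names (preprocess, tf_idf, precision_recall_f1), and the specific TF-IDF variant (tf = count / document length, idf = log(N / df)) are all assumptions chosen for clarity. Libraries such as scikit-learn use smoothed variants, so their numbers will differ.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop word list (an assumption; a real project would
# likely use a fuller list such as the one shipped with NLTK).
STOP_WORDS = {"a", "an", "and", "the", "is", "of", "to", "in", "on"}


def preprocess(text):
    """Normalize to lowercase, tokenize on runs of letters, drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    tf  = term count / number of tokens in the document
    idf = log(number of documents / number of documents containing the term)
    """
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


def precision_recall_f1(y_true, y_pred, positive):
    """Compute precision, recall, and F1 for one class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1
```

With this variant, a term that appears in every document gets a weight of zero, which is the intended effect of the inverse document frequency factor: words that do not help distinguish documents contribute nothing to the features.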

Methodology

Results

The code visualizes the 30 most frequent words on the Purdue Global website after removing keywords. The lower section of the project finds synonyms and antonyms of words using the WordNet dictionary and a corpus from the Natural Language Toolkit (NLTK).
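
The word-frequency step described above could be sketched as follows. This is a minimal standard-library version, not the project's code: the function name top_words and the exclude parameter are hypothetical, and how the page text is fetched and which words are removed are left open because the description does not specify them.

```python
import re
from collections import Counter


def top_words(text, n=30, exclude=frozenset()):
    """Return the n most frequent words in text as (word, count) pairs.

    Words listed in `exclude` (a hypothetical parameter standing in for
    the removed words) are skipped before counting.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(t for t in tokens if t not in exclude)
    return counts.most_common(n)
```

The resulting list of pairs can be passed to a plotting library to draw a bar chart of the top 30 words.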

Link to Code

Problem Statement:

The goal of this project is to classify textual data into predefined categories to explore the capabilities of NLP.

Goals:

To learn how NLP techniques can be applied to text classification to produce meaningful results.