Classification Model for Cancerous Tumors

Problem Statement:

The goal of this project was to accurately classify tumors as cancerous or non-cancerous based on various features to aid in early detection and treatment.

Goals:

To experiment with and build a robust classification model that achieves high accuracy on a useful and relevant case study.

Techniques Used:

Data Preprocessing: Handling missing values, feature scaling, and encoding categorical variables.
Feature Extraction: Using techniques such as TF-IDF (Term Frequency-Inverse Document Frequency) to convert text into numerical features.
Modeling: Implementing various classification algorithms such as Logistic Regression, Decision Trees, and Support Vector Machines (SVM).
Evaluation: Evaluating models using metrics like accuracy, precision, recall, and F1-score to select the best-performing model.
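
The steps listed above can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: it assumes scikit-learn, uses synthetic data from make_classification as a stand-in for the tumor dataset, and keeps library-default hyperparameters. The compare_models helper is a hypothetical name. Categorical encoding and TF-IDF are left out because the synthetic features are all numeric.

```python
# Illustrative sketch only: synthetic data, default hyperparameters,
# and a hypothetical helper name (compare_models).
from sklearn.datasets import make_classification
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def compare_models(X, y, seed=0):
    """Fit three classifiers and return test-set metrics for each."""
    # Hold out 20% of the rows, keeping the class balance in both splits.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=seed
    )

    # Imputation and scaling live inside each pipeline so they are
    # fitted on the training split only.
    candidates = {
        "logistic_regression": make_pipeline(
            SimpleImputer(strategy="median"),
            StandardScaler(),
            LogisticRegression(max_iter=1000),
        ),
        "decision_tree": make_pipeline(
            SimpleImputer(strategy="median"),
            DecisionTreeClassifier(random_state=seed),
        ),
        "svm": make_pipeline(
            SimpleImputer(strategy="median"),
            StandardScaler(),
            SVC(),
        ),
    }

    results = {}
    for name, model in candidates.items():
        model.fit(X_train, y_train)
        y_pred = model.predict(X_test)
        results[name] = {
            "accuracy": accuracy_score(y_test, y_pred),
            "precision": precision_score(y_test, y_pred, zero_division=0),
            "recall": recall_score(y_test, y_pred, zero_division=0),
            "f1": f1_score(y_test, y_pred, zero_division=0),
        }
    return results


# Synthetic binary-classification data standing in for the real features.
X, y = make_classification(
    n_samples=500, n_features=20, n_informative=8, random_state=0
)
scores = compare_models(X, y)

# Pick the candidate with the highest F1-score on the held-out split.
best = max(scores, key=lambda name: scores[name]["f1"])
for name, metrics in scores.items():
    print(name, {k: round(v, 3) for k, v in metrics.items()})
print("best by F1:", best)
```

Selecting by F1-score rather than accuracy alone is one reasonable choice when missing a malignant case is costly. Recall on the malignant class could be prioritized instead.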

Methodology

Results

The classification models achieved high accuracy, with a 98% score on the test data. The dataset included over 50 different estimators to help predict malignant tumors. The model correctly predicted 39 malignant and 73 benign tumors, with 1 missed malignant and 1 missed benign tumor (112 of 114 test cases correct).
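
As a quick check on these figures, the evaluation metrics can be recomputed from the reported counts. The sketch below uses only those counts and makes two assumptions: malignant is treated as the positive class, and the "missed benign" tumor is read as a benign case predicted as malignant (a false positive).

```python
# Counts taken from the reported results; malignant = positive class (assumption).
tp = 39  # malignant tumors predicted as malignant
tn = 73  # benign tumors predicted as benign
fn = 1   # malignant tumor predicted as benign (missed malignant)
fp = 1   # benign tumor predicted as malignant (missed benign, assumed reading)

total = tp + tn + fp + fn
accuracy = (tp + tn) / total
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

# Confusion matrix laid out as rows = actual class, columns = predicted class.
print("                  pred malignant  pred benign")
print(f"actual malignant  {tp:>14}  {fn:>11}")
print(f"actual benign     {fp:>14}  {tn:>11}")

print(f"test samples: {total}")
print(f"accuracy:  {accuracy:.3f}")
print(f"precision: {precision:.3f}")
print(f"recall:    {recall:.3f}")
print(f"f1-score:  {f1:.3f}")
```

With these counts, accuracy comes to 112/114, or about 0.982, which matches the reported 98% score. Precision, recall, and F1-score for the malignant class each come to 39/40 = 0.975.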

Confusion Matrix

Link to Code