ML Project Proposal
CS 4641 Group 58 • Georgia Institute of Technology • Nannan Aravazhi, Pranavkrishna Suresh, Henry Lin, Gabriel Ferrer, Mihir Balsara
Introduction & Background
Literature Review
Deep learning architectures play a crucial role in image classification and feature extraction:
- VGG-19: Utilizes stacked 3×3 convolutions, max pooling, and fully connected layers for feature extraction. While effective, it has high computation costs and inefficient gradient flow [1].
- ResNet-50: Addresses vanishing gradients with residual connections, enabling deeper networks and stable convergence [2].
- DenseNet-121: Uses densely connected layers to enhance gradient propagation and reduce redundancy, preserving low-level features for improved performance in data-scarce environments [3].
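To make the difference between the last two architectures concrete, the sketch below contrasts a residual connection (the input is added back to the transformed features) with a dense connection (new features are concatenated onto the existing ones). It is a simplified illustration only: a plain matrix multiply stands in for the convolution and normalization layers the real networks use, and the names `residual_block`, `dense_block_step`, `w_res`, and `w_dense` are hypothetical.

```python
import numpy as np

def relu(x):
    """Element-wise rectified linear unit."""
    return np.maximum(x, 0.0)

def residual_block(x, weight):
    """ResNet-style connection: add the input back onto the transformed features."""
    return relu(x @ weight + x)

def dense_block_step(x, weight):
    """DenseNet-style connection: concatenate new features onto all earlier ones."""
    new_features = relu(x @ weight)
    return np.concatenate([x, new_features], axis=-1)

rng = np.random.default_rng(0)
x = rng.normal(size=(2, 8))              # hypothetical batch of 2 feature vectors
w_res = rng.normal(size=(8, 8)) * 0.1    # square weight so the addition is shape-compatible
w_dense = rng.normal(size=(8, 4)) * 0.1  # each dense step adds 4 new features

res_out = residual_block(x, w_res)         # shape stays (2, 8)
dense_out = dense_block_step(x, w_dense)   # shape grows to (2, 12)
```

The residual output keeps the input width, while the dense output grows with every step and carries the original features forward unchanged, which is the feature reuse described above.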
Dataset Description
The following models are trained on ImageNet, leveraging large-scale labeled datasets for feature learning:
- VGG-19: Trained on ImageNet (1.2M+ labeled images, 1,000 categories), making it a benchmark for deep learning. Supports transfer learning and feature extraction across various applications.
- ResNet-50: Learns hierarchical features using skip connections, improving convergence and mitigating vanishing gradients. Suitable for fine-tuning in detection and segmentation.
- DenseNet-121: Uses densely connected layers for feature reuse, enhancing learning efficiency. Its robust feature extraction benefits image classification and medical analysis.
Dataset Links
Problem Definition
Problem Statement
Distracted driving is among the most prevalent causes of road accidents, often resulting in severe injuries and fatalities. Current surveillance measures rely on manual enforcement or post-incident analysis and offer no real-time intervention. This project aims to use machine learning to detect distracted driving in real time, with the goal of encouraging driver awareness, improving vehicle safety, and preventing accidents.
Motivation
With the widespread use of smartphones and other in-car distractions, automated detection of risky driving behaviors is increasingly important. Traditional enforcement methods have limited real-time effectiveness. An AI-driven solution can detect distractions as they happen, aiding driver-assist systems, insurance assessments, and regulatory enforcement. By reducing road accidents, this solution can ultimately help save lives.
Methods
Data Preprocessing Methods
- Image Preprocessing with OpenCV (cv2): Standardizes image sizes and prepares them for deep learning models.
- Image Loading, Reshaping, and Flattening: Images are stored as NumPy arrays and can be flattened for classical machine learning.
- Data Augmentation with imgaug.augmenters: Enhances training data using transformations such as:
  - Rotation
  - Shear
  - Flipping
  - Scaling
- Feature Engineering with KerasClassifier: Integrated into a scikit-learn pipeline for combining deep learning with classical machine learning. Supports feature extraction using models like VGG16 and ResNet50.
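The preprocessing steps above can be sketched roughly as follows. This is a minimal illustration, not the project's pipeline: it uses NumPy only, with array slicing standing in for an imgaug flip and a random array standing in for images that would be loaded and resized with cv2. The 64×64 image size, the batch of four images, and the function names are assumptions made for the example.

```python
import numpy as np

def normalize(images: np.ndarray) -> np.ndarray:
    """Scale uint8 pixel values into the [0, 1] range as float32."""
    return images.astype(np.float32) / 255.0

def horizontal_flip(images: np.ndarray) -> np.ndarray:
    """Mirror every image in an (N, H, W, C) batch left to right."""
    return images[:, :, ::-1, :]

def flatten_batch(images: np.ndarray) -> np.ndarray:
    """Reshape an (N, H, W, C) batch to (N, H*W*C) for classical models."""
    return images.reshape(images.shape[0], -1)

# Hypothetical batch: 4 random RGB images of 64x64 pixels (stand-in for loaded data).
rng = np.random.default_rng(0)
batch = rng.integers(0, 256, size=(4, 64, 64, 3), dtype=np.uint8)

scaled = normalize(batch)
# Augment by appending a mirrored copy of every image.
augmented = np.concatenate([scaled, horizontal_flip(scaled)], axis=0)
# Flattened features suitable for KNN or logistic regression.
features = flatten_batch(augmented)
```

One design caveat: flipping mirrors left and right, so it is only safe when the class labels do not depend on which side of the image an action occurs.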
ML Algorithms/Models
- Deep Learning Models:
  - Transfer Learning with ResNet50 & VGG16: Uses pretrained convolutional bases with additional dense layers for classification, leveraging prior knowledge from ImageNet.
  - CNN: Extracts spatial features through convolutional layers, pooling, and dense layers. Captures spatial hierarchies to help distinguish driver behaviors.
  - Dense Neural Network: Uses flattened image input followed by dense layers with ReLU and dropout for non-linear learning and overfitting reduction.
- Classical Models:
  - K-Nearest Neighbors: Baseline classifier using nearest neighbors, effective for balanced datasets.
  - Logistic Regression: Computes class probabilities efficiently.
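As an illustration of the K-Nearest Neighbors baseline, the sketch below classifies each query point by majority vote among its k closest training points under Euclidean distance. It is a toy NumPy version under stated assumptions, not the project's implementation: the 2-D points stand in for flattened image features, and `knn_predict` and the sample data are hypothetical.

```python
import numpy as np

def knn_predict(train_x, train_y, query_x, k=3):
    """Label each query row by majority vote of its k nearest training rows."""
    # Pairwise Euclidean distances, shape (num_queries, num_train).
    diffs = query_x[:, None, :] - train_x[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=2))
    # Indices of the k closest training rows for each query.
    nearest = np.argsort(dists, axis=1)[:, :k]
    votes = train_y[nearest]
    # Majority vote per query; ties resolve to the smallest label.
    return np.array([np.bincount(row).argmax() for row in votes])

# Hypothetical toy data: two well-separated classes in 2-D.
train_x = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
                    [1.0, 1.0], [0.9, 1.1], [1.1, 0.9]])
train_y = np.array([0, 0, 0, 1, 1, 1])
query_x = np.array([[0.05, 0.05], [1.0, 0.95]])

predictions = knn_predict(train_x, train_y, query_x, k=3)
```

Because every prediction compares the query against all training rows, this baseline becomes expensive on high-dimensional flattened images, which is one reason to treat it as a reference point rather than a final model.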
Learning Methods
- Supervised Learning: All of the methods above are supervised. Each training image carries a class label, so the models learn a direct mapping from image to driver-behavior class.
Potential Results & Discussion
Quantitative Metrics
- Accuracy: Measures overall correctness across all classes.
- Precision: The fraction of predicted positives for a class that are true positives.
- Recall: The fraction of actual instances of a class that the model correctly detects.
- F1 Score: Balances precision and recall for overall effectiveness.
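For a multi-class problem such as this one, the metrics above are typically computed per class and then averaged. The sketch below shows one way to do that in plain Python, using an unweighted (macro) average; the averaging choice, the function names, and the three example labels are assumptions for illustration rather than details from the proposal.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)

def per_class_metrics(y_true, y_pred, label):
    """Return (precision, recall, f1), treating `label` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

def macro_average(y_true, y_pred):
    """Unweighted mean of per-class (precision, recall, f1) over all true labels."""
    labels = sorted(set(y_true))
    scores = [per_class_metrics(y_true, y_pred, label) for label in labels]
    return tuple(sum(column) / len(labels) for column in zip(*scores))

# Hypothetical labels for six images across three behavior classes.
y_true = ["safe", "safe", "texting", "texting", "phone", "phone"]
y_pred = ["safe", "texting", "texting", "texting", "phone", "safe"]

acc = accuracy(y_true, y_pred)
macro_precision, macro_recall, macro_f1 = macro_average(y_true, y_pred)
```

Macro averaging weights every class equally, so a rarely seen distraction type counts as much as a common one when checking results against the performance goals.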
Project Goals
- Performance Goals: Aim to achieve at least 85% accuracy, with precision and recall above 80%.
- Sustainability Considerations: Optimize models to reduce computational costs and improve efficiency.
- Ethical Considerations: Ensure data privacy, mitigate bias, and monitor fairness in model predictions.
Expected Results
We anticipate strong performance across all metrics from the supervised models described above. Continued refinement and hyperparameter tuning should help us meet or surpass our performance targets while keeping the implementation ethical and fair.
References
[1] M. Mateen, J. Wen, Nasrullah, S. Song, and Z. Huang, “Fundus image classification using VGG-19 architecture with PCA and SVD,” MDPI, https://www.mdpi.com/2073-8994/11/1/1 (accessed Feb. 20, 2025).
[2] B. Mandal, A. Okeukwu, and Y. Theis, “Masked face recognition using ResNet-50,” arXiv.org, https://arxiv.org/abs/2104.08997 (accessed Feb. 20, 2025).
[3] B. Li, “Facial expression recognition by DenseNet-121,” Multi-Chaos, Fractal and Multi-Fractional Artificial Intelligence of Different Complex Systems, https://www.sciencedirect.com/science/article/abs/pii/B9780323900324000195 (accessed Feb. 21, 2025).
Additional Information
Gantt Chart
Task | Assigned To | Start Date | End Date | Completion |
---|---|---|---|---|
Introduction & Background | Nannan Aravazhi | 2/14/2024 | 2/21/2024 | 1 |
Problem Definition | Pranavkrishna Suresh | 2/15/2024 | 2/21/2024 | 1 |
Methods | Mihir Balsara, Gabriel Ferrer | 2/16/2024 | 2/21/2024 | 1 |
Potential Dataset | Henry Lin | 2/16/2024 | 2/21/2024 | 1 |
Potential Results & Discussion | All | 2/20/2024 | 2/21/2024 | 1 |
Video Creation & Recording | All | 2/21/2024 | 2/21/2024 | 1 |
GitHub Page | Mihir Balsara | 2/20/2024 | 2/21/2024 | 1 |
Contribution Table
Name | Proposal Contributions |
---|---|
Nannan Aravazhi | Step 1, Step 5, Gantt Chart, Contribution Table |
Pranavkrishna Suresh | Step 2, Website Development, Hosting GitHub Pages, Logistics |
Mihir Balsara | Step 4, Hosting GitHub Pages, Logistics |
Gabriel Ferrer | Step 3, Script, Creating Slides |
Henry Lin | Step 3, Creating Slides, Gantt Chart |
GitHub Repository
Here is the link to our GitHub repository:
https://github.gatech.edu/mbalsara3/mlproposal
Project Award Eligibility
Yes, we are interested in being considered for the Project Award.