ML Project Proposal

CS 4641 Group 58 • Georgia Institute of Technology • Nannan Aravazhi, Pranavkrishna Suresh, Henry Lin, Gabriel Ferrer, Mihir Balsara

Introduction & Background

Literature Review

Deep learning architectures play a crucial role in image classification and feature extraction:

  • VGG-19: Utilizes stacked 3×3 convolutions, max pooling, and fully connected layers for feature extraction. While effective, it has high computation costs and inefficient gradient flow [1].
  • ResNet-50: Addresses vanishing gradients with residual connections, enabling deeper networks and stable convergence [2].
  • DenseNet-121: Uses densely connected layers to enhance gradient propagation and reduce redundancy, preserving low-level features for improved performance in data-scarce environments [3].
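
The difference between residual connections (ResNet) and dense connectivity (DenseNet) can be sketched with a toy example. This is a minimal NumPy illustration, not the actual architectures: `layer` is a hypothetical stand-in for a convolutional layer, and the widths and `growth` value are assumptions chosen only to make the shapes easy to follow.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def layer(x, w):
    """Toy stand-in for a convolutional layer: a linear map followed by ReLU."""
    return relu(x @ w)

def residual_block(x, w):
    """ResNet-style: add the input back to the layer output (skip connection)."""
    return layer(x, w) + x

def dense_block(x, weights):
    """DenseNet-style: each layer sees the concatenation of all earlier outputs."""
    features = [x]
    for w in weights:
        features.append(layer(np.concatenate(features, axis=-1), w))
    return np.concatenate(features, axis=-1)

rng = np.random.default_rng(0)
x = rng.normal(size=(2, 4))  # batch of 2 feature vectors, width 4

res_out = residual_block(x, rng.normal(size=(4, 4)))

growth = 3  # hypothetical growth rate: new features added per layer
dense_weights = [rng.normal(size=(4, growth)),
                 rng.normal(size=(4 + growth, growth))]
dense_out = dense_block(x, dense_weights)

print(res_out.shape)    # (2, 4): width unchanged, input added elementwise
print(dense_out.shape)  # (2, 10): 4 input features + 2 layers * 3 new features
```

In the residual block the input is summed with the output, so gradients have a direct path through the addition. In the dense block the input is kept verbatim in the concatenated output, which is the feature reuse described above.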

Dataset Description

The following models are pretrained on ImageNet, leveraging a large-scale labeled dataset for feature learning:

  • VGG-19: Pretrained on ImageNet (1.2M+ labeled images, 1,000 categories), a standard benchmark for deep learning. Supports transfer learning and feature extraction across various applications.
  • ResNet-50: Learns hierarchical features using skip connections, improving convergence and mitigating vanishing gradients. Suitable for fine-tuning in detection and segmentation.
  • DenseNet-121: Uses densely connected layers for feature reuse, enhancing learning efficiency. Its robust feature extraction benefits image classification and medical analysis.

Problem Definition

Problem Statement

Distracted driving is among the most prevalent causes of road accidents, often resulting in severe injuries and fatalities. Current surveillance measures rely on manual enforcement or post-incident analysis and offer no real-time intervention. This project aims to use machine learning to detect distracted driving in real time, which can encourage driver awareness, improve vehicle safety, and help prevent accidents.

Motivation

With the widespread use of smartphones and other in-car distractions, automated systems for detection of risky driving behaviors are important. Traditional enforcement methods are limited in real-time effectiveness. An AI-driven solution is capable of detecting distractions in real time, aiding driver assist systems, insurance assessments, and regulatory enforcement. By reducing road accidents, this solution can ultimately help save lives.

Methods

Data Preprocessing Methods

  • Image Preprocessing with OpenCV (cv2): Standardizes image sizes and prepares them for deep learning models.
  • Image Loading, Reshaping, and Flattening: Images are stored as NumPy arrays and can be flattened for classical machine learning.
  • Data Augmentation with imgaug.augmenters: Enhances training data using transformations such as:
    • Rotation
    • Shear
    • Flipping
    • Scaling
    These transformations improve generalization and reduce overfitting.
  • Feature Engineering with KerasClassifier: Integrated into a scikit-learn pipeline for combining deep learning with classical machine learning. Supports feature extraction using models like VGG16 and ResNet50.
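
The resizing, flattening, and flipping steps listed above can be sketched as follows. This is a NumPy-only stand-in, not the OpenCV or imgaug calls the project would use: `resize_nearest`, `preprocess`, and `augment_flip` are hypothetical helper names, the 64×64 target size is an assumption, and the input image is synthetic.

```python
import numpy as np

def resize_nearest(image, height, width):
    """Nearest-neighbor resize of an (H, W, C) array to (height, width, C)."""
    rows = np.arange(height) * image.shape[0] // height
    cols = np.arange(width) * image.shape[1] // width
    return image[rows][:, cols]

def preprocess(image, size=(64, 64)):
    """Resize to a fixed size and scale pixel values to [0, 1]."""
    resized = resize_nearest(image, *size)
    return resized.astype(np.float32) / 255.0

def augment_flip(image, rng, p=0.5):
    """Horizontally flip the image with probability p."""
    return image[:, ::-1] if rng.random() < p else image

rng = np.random.default_rng(0)
# Synthetic stand-in for a loaded camera frame (height 480, width 640, RGB).
raw = rng.integers(0, 256, size=(480, 640, 3), dtype=np.uint8)

x = preprocess(raw)           # fixed-size float array for deep learning models
x_aug = augment_flip(x, rng)  # randomly flipped copy for training
flat = x.reshape(-1)          # 1-D vector for classical models

print(x.shape, flat.shape)
```

Rotation, shear, and scaling would follow the same pattern, with each transformation applied randomly at training time so the model sees a slightly different image on each pass.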

ML Algorithms/Models

  • Deep Learning Models:
    • Transfer Learning with ResNet50 & VGG16: Uses pretrained convolutional bases with additional dense layers for classification, leveraging prior knowledge from ImageNet.
    • CNN: Extracts spatial features through convolutional layers, pooling, and dense layers. Captures spatial hierarchies to help distinguish driver behaviors.
    • Dense Neural Network: Uses flattened image input followed by dense layers with ReLU and dropout for non-linear learning and overfitting reduction.
  • Classical Models:
    • K-Nearest Neighbors: Baseline classifier using nearest neighbors, effective for balanced datasets.
    • Logistic Regression: Computes class probabilities efficiently.
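
As an illustration of the K-Nearest Neighbors baseline, the sketch below classifies vectors by majority vote among their nearest training examples. It is a minimal NumPy version written for this illustration, not the scikit-learn implementation; `knn_predict` is a hypothetical helper, and the 2-D toy points stand in for flattened image vectors.

```python
import numpy as np

def knn_predict(train_x, train_y, query_x, k=3):
    """Classify each query row by majority vote among its k nearest training rows."""
    # Pairwise squared Euclidean distances, shape (n_query, n_train).
    diffs = query_x[:, None, :] - train_x[None, :, :]
    dists = np.sum(diffs ** 2, axis=-1)
    # Indices of the k closest training rows for each query.
    nearest = np.argsort(dists, axis=1)[:, :k]
    votes = train_y[nearest]
    return np.array([np.bincount(row).argmax() for row in votes])

# Toy data: two well-separated clusters standing in for flattened images.
train_x = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
                    [1.0, 1.0], [0.9, 1.1], [1.1, 0.9]])
train_y = np.array([0, 0, 0, 1, 1, 1])
query_x = np.array([[0.05, 0.05], [0.95, 1.0]])

pred = knn_predict(train_x, train_y, query_x, k=3)
print(pred)  # [0 1]
```

Because the vote is a simple majority, a class with many more training examples can dominate the neighborhood, which is why this baseline is best suited to balanced datasets.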

Learning Methods

  • Supervised Learning: All discussed methods fall under supervised learning, where images are labeled to enable direct class mapping, making it effective for classification tasks.

Potential Results & Discussion

Quantitative Metrics

  • Accuracy: Measures overall correctness across all classes.
  • Precision: Measures the fraction of predicted positives that are true positives.
  • Recall: Measures the fraction of actual positives that the model correctly detects.
  • F1 Score: Balances precision and recall for overall effectiveness.
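
The four metrics above can be computed from true and predicted labels as sketched below. Because the task is multi-class, this sketch assumes macro averaging (an unweighted mean of per-class scores); the proposal does not specify an averaging scheme, and the labels here are made up for illustration.

```python
def per_class_metrics(y_true, y_pred, label):
    """Precision, recall, and F1 for one class, treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

def evaluate(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall, and F1."""
    labels = sorted(set(y_true) | set(y_pred))
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    scores = [per_class_metrics(y_true, y_pred, c) for c in labels]
    macro = [sum(col) / len(labels) for col in zip(*scores)]
    return {"accuracy": accuracy, "precision": macro[0],
            "recall": macro[1], "f1": macro[2]}

# Made-up labels for three classes.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]

result = evaluate(y_true, y_pred)
print(result)
```

Macro averaging weights every class equally, so a rarely occurring distraction class counts as much as a common one when checking the precision and recall targets.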

Project Goals

  • Performance Goals: Aim to achieve at least 85% accuracy, with precision and recall above 80%.
  • Sustainability Considerations: Optimize models to reduce computational costs and improve efficiency.
  • Ethical Considerations: Ensure data privacy, mitigate bias, and monitor fairness in model predictions.

Expected Results

We anticipate strong performance across all metrics from the supervised models described above. Continuous refinement and hyperparameter tuning should help us meet or surpass our performance targets while ensuring an ethical and fair AI implementation.

References

[1] M. Mateen, J. Wen, Nasrullah, S. Song, and Z. Huang, “Fundus image classification using VGG-19 architecture with PCA and SVD,” MDPI, https://www.mdpi.com/2073-8994/11/1/1 (accessed Feb. 20, 2025).

[2] B. Mandal, A. Okeukwu, and Y. Theis, “Masked face recognition using ResNet-50,” arXiv.org, https://arxiv.org/abs/2104.08997 (accessed Feb. 20, 2025).

[3] B. Li, “Facial expression recognition by DenseNet-121,” Multi-Chaos, Fractal and Multi-Fractional Artificial Intelligence of Different Complex Systems, https://www.sciencedirect.com/science/article/abs/pii/B9780323900324000195 (accessed Feb. 21, 2025).

Additional Information

Gantt Chart

Task                           | Assigned To                   | Start Date | End Date  | Completion
Introduction & Background      | Nannan Aravazhi               | 2/14/2024  | 2/21/2024 | 1
Problem Definition             | Pranavkrishna Suresh          | 2/15/2024  | 2/21/2024 | 1
Methods                        | Mihir Balsara, Gabriel Ferrer | 2/16/2024  | 2/21/2024 | 1
Potential Dataset              | Henry Lin                     | 2/16/2024  | 2/21/2024 | 1
Potential Results & Discussion | All                           | 2/20/2024  | 2/21/2024 | 1
Video Creation & Recording     | All                           | 2/21/2024  | 2/21/2024 | 1
GitHub Page                    | Mihir Balsara                 | 2/20/2024  | 2/21/2024 | 1

Contribution Table

Name                 | Proposal Contributions
Nannan Aravazhi      | Step 1, Step 5, Gantt Chart, Contribution Table
Pranavkrishna Suresh | Step 2, Website Development, Hosting GitHub Pages, Logistics
Mihir Balsara        | Step 4, Hosting GitHub Pages, Logistics
Gabriel Ferrer       | Step 3, Script, Creating Slides
Henry Lin            | Step 3, Creating Slides, Gantt Chart

Video Presentation

https://youtu.be/r_rH_FhN3u8

GitHub Repository

Here is the link to our GitHub repository:

https://github.gatech.edu/mbalsara3/mlproposal

Project Award Eligibility

Yes, we are interested in being considered for the Project Award.