Projects

TraitViz

2D 3D Depth NYUD Pinhole camera

Depth estimation and 3D reconstruction from RGB-D data captured with pinhole cameras. Implements state-of-the-art algorithms on the NYUD dataset.
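As an illustration of the pinhole camera model behind this kind of reconstruction, here is a minimal sketch that back-projects a depth map into a 3D point cloud. The function name `backproject_depth` and the intrinsics (`fx`, `fy`, `cx`, `cy`) are assumptions made for this example and are not taken from the project's code.

```python
import numpy as np

def backproject_depth(depth, fx, fy, cx, cy):
    """Back-project an (H, W) depth map into an (H*W, 3) point cloud.

    fx, fy are focal lengths in pixels and cx, cy the principal point.
    These are hypothetical parameters; real values come from calibration.
    """
    h, w = depth.shape
    # Pixel grids: u indexes columns, v indexes rows.
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    # Invert the pinhole projection u = fx * x / z + cx (and likewise for v).
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=-1).reshape(-1, 3)

# Toy input: a flat 2x2 depth map, 2 units from the camera.
points = backproject_depth(np.full((2, 2), 2.0), fx=1.0, fy=1.0, cx=0.5, cy=0.5)
```

Each row of `points` is one pixel's `(x, y, z)` position in the camera frame, in the same units as the depth map.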

Source code

Karma AI

Computer Vision Accessibility Multimodal AI Voice Assistant Fall Detection Google Vertex OpenAI

An AI-powered accessibility tool that helps visually impaired people navigate their surroundings safely. Uses multimodal prompting (image + text) with the Google Vertex API to generate scene descriptions and navigation guidance. Integrates an OpenAI GPT-based voice assistant and accelerometer-based fall detection for added safety and independence.
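One common way to detect falls from accelerometer data is to look for a brief free-fall phase followed by an impact spike. The sketch below shows that heuristic only; the function name `detect_fall` and all thresholds are hypothetical, and the project may use a different method.

```python
import math

def detect_fall(samples, free_fall_g=0.4, impact_g=2.5, max_gap=10):
    """Return the index of the impact sample, or None if no fall is seen.

    samples: iterable of (ax, ay, az) readings in units of g.
    Thresholds are illustrative assumptions, not tuned values.
    """
    free_fall_at = None
    for i, (ax, ay, az) in enumerate(samples):
        magnitude = math.sqrt(ax * ax + ay * ay + az * az)
        if magnitude < free_fall_g:
            # Near-zero acceleration suggests the device is in free fall.
            free_fall_at = i
        elif (magnitude > impact_g
              and free_fall_at is not None
              and i - free_fall_at <= max_gap):
            # A strong spike shortly after free fall is treated as an impact.
            return i
    return None

resting = [(0.0, 0.0, 1.0)] * 5
fall_trace = resting + [(0.0, 0.0, 0.1)] * 3 + [(0.5, 0.5, 3.0)] + resting
impact_index = detect_fall(fall_trace)
```

Requiring the free-fall phase before the spike helps avoid flagging ordinary bumps, at the cost of missing slow falls.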

Source code

CNN Visualization

PyTorch Activation Maps Feature Maps Filter Visualization Forward Hooks state_dict Layer-wise Visualization Training-time Visualization

Built an interactive system to inspect intermediate layers of CNNs. Given an input image, an architecture, and a PyTorch state_dict, it renders per-layer activation maps and optionally visualizes filter weights. Added forward hooks and epoch snapshots to step through training for real-time introspection and debugging.
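The forward-hook mechanism can be sketched as follows. This is a minimal example on a toy model, assuming only convolutional layers are of interest; the helper name `capture_activations` is hypothetical and the project's actual code may be organized differently.

```python
import torch
import torch.nn as nn

def capture_activations(model, x, layer_types=(nn.Conv2d,)):
    """Run one forward pass and return {layer_name: output tensor}."""
    activations = {}
    handles = []
    for name, module in model.named_modules():
        if isinstance(module, layer_types):
            # Bind `name` as a default argument so each hook keeps its own key.
            def hook(_module, _inputs, output, name=name):
                activations[name] = output.detach()
            handles.append(module.register_forward_hook(hook))
    try:
        with torch.no_grad():
            model(x)
    finally:
        # Always remove hooks so later forward passes are unaffected.
        for handle in handles:
            handle.remove()
    return activations

# Toy stand-in for a real CNN; in practice, load weights with load_state_dict.
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(8, 16, kernel_size=3, padding=1),
)
model.eval()
acts = capture_activations(model, torch.randn(1, 3, 32, 32))
```

Each entry in `acts` has shape `(batch, channels, height, width)`, so a single channel can be rendered as a grayscale activation map.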

Source code

Diseased Plant Detection

Computer Vision Deep Learning ResNet50 VGG16 CNN Plant Disease Classification

Comparative study of deep learning architectures (ResNet50, VGG16, and a custom CNN) for plant disease classification across 38 classes. Implemented and evaluated the models on a comprehensive plant disease dataset, achieving state-of-the-art accuracy. Documentation is available in the project repository.

Source code

Prompt to Perception

NLP Computer Vision Stable Diffusion T5 Text-to-Image

Developed an end-to-end text-to-image generation system that combines prompt refinement with Stable Diffusion 2.1. Prompt refinement with T5-Small improved prompt quality 3× (measured with ROUGE metrics), and the average prompt-image alignment score was 0.72.
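For readers unfamiliar with ROUGE, the sketch below computes ROUGE-1 (unigram overlap) between a candidate prompt and a reference prompt. It assumes simple whitespace tokenization with no stemming; the function name `rouge1` and the example strings are illustrative, and the project may rely on a ROUGE library and other variants instead.

```python
from collections import Counter

def rouge1(candidate, reference):
    """Return (precision, recall, f1) for unigram overlap."""
    cand_tokens = candidate.lower().split()
    ref_tokens = reference.lower().split()
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(cand_tokens) & Counter(ref_tokens)).values())
    precision = overlap / len(cand_tokens) if cand_tokens else 0.0
    recall = overlap / len(ref_tokens) if ref_tokens else 0.0
    if overlap == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical prompts, used only to exercise the function.
precision, recall, f1 = rouge1("a red car", "a red sports car")
```

Here every candidate word appears in the reference, so precision is 1.0, while recall is 0.75 because one of the four reference words is missing.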

Source code