Projects
2023
Language-Driven 3D Stylization
Advisor: Prof. Yue Wang
Course Project
Code
- Developed a language-driven 3D stylization pipeline.
- Trained a radiance field model for 3D object rendering.
- Applied Grounded-SAM to generate segmentation masks for arbitrary object queries, enabling object-level selection for targeted stylization.
- Applied masked NNFM to fine-tune the pretrained radiance field to achieve 3D stylization.
Multi-object Robust Tracking
Advisor: Prof. Ram Nevatia
Research Program
- Developed a defense pipeline to make multi-object tracking robust against adversarial attacks.
- Combined the RAFT optical flow estimator with homography on localized patch regions, and inverted the optical flow back to the original frames to improve the downstream tracker's target detection within patch regions.
- Achieved robust detection and association accuracy in comparison to patch detectors used in object detection.
Multi-modal Masked Adapter
Advisor: Prof. Jesse Thomason
Course Project, Score A
Code
- Developed a masked Adapter for fine-tuning the pre-trained CLIP model.
- Applied LoRA Adapter to downstream multimodal tasks.
- Trained with masked image modeling, masked language modeling and masked image & language modeling to achieve better multimodal representation.
- Achieved performance comparable to full fine-tuning with only 10% of the parameters and reduced training time.
- Demonstrated that knowledge acquired from unimodal Adapters can support and assist multimodal tasks.
2022
Robust Object Detection Against Adversarial Attacks
Advisor: Prof. Ram Nevatia
Research Program
- Developed a defense pipeline for RGB-depth object detection against both non-adaptive and adaptive adversarial attacks.
- Formulated the patch defense problem as a segmentation task on the RGB channels to detect adversarial patches.
- Applied an identity mapping to the depth channel to leverage depth features, and fused them with the RGB defense outcomes.
- Evaluated the defense patch detector against various attack initializations, such as stars and gems, and different attack optimizers, such as PGD and Adam, to test the adaptability of the defense method.
- Achieved robust detection accuracy and cross-optimizer robustness.
Emotion Assessment System Based on Facial Language Understanding
Advisor: Prof. Sheng Zhong
Research Program, Team leader
- Developed an emotion assessment system based on facial language analysis and understanding.
- Trained and validated YOLO v5 and Dual Path Network on the PC side, which involved processing multi-frame data, locating and classifying facial emotions, creating a visualization UI, and using time series methods to predict emotions.
- Deployed the model on an embedded platform and accelerated inference in a GPU-free environment, addressing challenges in detection accuracy and model efficiency.
2021
PIV Based on Stereoscopic Gesture Information
Advisor: Prof. Yang Xiao
Course Project, Full Mark
- Implemented the PIV task using RGB and 3D depth gesture information.
- Trained and tested three models: a two-stream ResNet for blur and detail images, a two-stream ResNet for RGB and depth images, and PointNet for depth images.
- Improved results by combining classic methods: guided filtering, edge extraction, and a stacking ensemble algorithm.
Peeping Camera Recognition and Detection
Advisor: Prof. Jie Ma
Course Project, Score A+
- Developed a peeping camera recognition and detection system.
- Collected natural-light and infrared images, and identified camera bright spots using image differencing, Otsu threshold segmentation, region growing, and clustering.
- Developed a visualization UI based on MFC.
Research on Semantic Segmentation of Remote Sensing Image
Big Data Competition, National First Prize (30 out of 1000 teams)
- Implemented semantic segmentation of remote sensing images using k-means clustering, watershed, mean-shift, and graph cuts.
- Improved accuracy by moving to more complex semantic segmentation networks, achieving the best segmentation results with FCN, U-Net, and SegNet.
- Refined the results with morphological operations.
2019
Embedded Machine Vision Design
Control Innovation Base Class of Qiming College of HUST
Competition Program, Score A+
- Joined the Control Innovation Base Class of Qiming College of HUST.
- Implemented object recognition for a small intelligent car using color recognition and a PID control algorithm.
Securities Quantitative Trading Software
Advisor: Prof. Dingxin He
Course Design, Full Mark
Second prize in the C Software Design Competition
Code
- Conducted feasibility analysis, requirements analysis, system function design, code implementation, analysis of the software's strengths and shortcomings, and integration testing as part of the software design process.
- Implemented features including market condition analysis, market information display, switching and scientific visualization of the most recent ten years of stock data, fundamental and technical stock selection, user operations, and dynamic stock updates.
- Delivered the software with a report documenting the system architecture design, software structure and data structure design, interfaces and calls between modules, system interface design, system function design (function listing), and specific algorithm design.
- Received the only full mark in the school in the final evaluation, and won second prize in the C Software Design Competition.