I am pursuing a Ph.D. in Computer Science (major in Computer Vision) at Florida Atlantic University under the supervision of Prof. Arslan Munir. I work as a Graduate Research Assistant at the Intelligent Systems, Computer Architecture, Analytics and Security Laboratory (ISCAAS LAB). Earlier, I completed my Master's degree in 2020 at Sejong University, South Korea, under the supervision of Prof. Jong Weon Lee.

Research Interests

My primary research focuses on multimodal-driven video analytics, including action and activity recognition, temporal action localization, and spatio-temporal action detection in videos. Additionally, I specialize in pretraining, fine-tuning, and zero-shot learning of vision-language models for video analytics tasks. Furthermore, I have expertise in knowledge distillation and enhancing adversarial robustness in machine learning models.

News

  • Oct 2024: I passed my Ph.D. Candidacy Exam.

Work Experience

Florida Atlantic University
Graduate Research Assistant
August 2024 - Present

Kansas State University
Graduate Research Assistant
January 2022 - July 2024

Sejong University
Full-time Researcher
February 2021 - December 2021

NINE VR
Machine Learning Engineer (Intern)
July 2020 - December 2020

Sejong University
Graduate Research Assistant
March 2019 - January 2021

Selected Publications

DVFL-Net: A Lightweight Distilled Video Focal Modulation Network for Spatio-Temporal Action Recognition

Submitted to IEEE TCSVT

We introduce VFL-Net, a computationally efficient model optimized for spatio-temporal context modeling using a nano-scale spatio-temporal focal modulation mechanism. Further, we combine the forward Kullback–Leibler (KL) divergence and spatio-temporal focal modulation to distill the local and global spatio-temporal context from the Video-FocalNet Base (teacher) to our proposed VFL-Net (student) model.
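
For readers unfamiliar with the distillation term, here is a minimal sketch of a forward KL divergence between teacher and student class distributions. It is an illustration under stated assumptions, not the DVFL-Net training code: the function names, the temperature parameter, and the example logits are all hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert raw logits into a probability distribution (numerically stable)."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def forward_kl(teacher_logits, student_logits, temperature=1.0):
    """Forward KL divergence KL(teacher || student) over class probabilities.

    The temperature softens both distributions; its value here is an
    assumption for illustration only.
    """
    p = softmax(teacher_logits, temperature)  # teacher distribution
    q = softmax(student_logits, temperature)  # student distribution
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


if __name__ == "__main__":
    # Hypothetical logits for a three-class problem.
    teacher = [2.0, 0.5, -1.0]
    student = [1.0, 1.0, 0.0]
    print(forward_kl(teacher, student, temperature=2.0))
```

The divergence is zero when the student matches the teacher exactly and grows as the student puts too little probability on classes the teacher considers likely.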

Improving Adversarial Robustness Through Adaptive Learning-Driven Multi-Teacher Knowledge Distillation

Submitted to Springer SN Computer Science

We propose a multi-teacher adversarial robustness distillation framework with adaptive weighting. CNNs adversarially trained on perturbed data act as teachers for a student model trained on clean data. Adaptive weights adjust each teacher's contribution based on its precision, which improves the student's learning and its robustness to adversarial attacks. The student model remains resilient without exposure to perturbed data.

OD-VIRAT: A Large-Scale Benchmark for Object Detection in Realistic Surveillance Environments

Submitted to ACM Transactions on MCCA

We introduce two object detection benchmarks, OD-VIRAT Large and OD-VIRAT Tiny, for surveillance imagery. Both cover 10 scenes recorded from significant height and distance. OD-VIRAT Large contains 8.7 million instances in 599,996 images, while OD-VIRAT Tiny has 288,901 instances in 19,860 images. Our proposed OD-VIRAT offers rich annotations of bounding boxes and categories.

Vision-Based Semantic Segmentation in Scene Understanding for Autonomous Driving: Recent Achievements, Challenges, and Outlooks [PDF]

IEEE Transactions on Intelligent Transportation Systems

This survey reviews the current achievements in scene understanding, focusing on computationally complex deep learning models. It outlines the generic pipeline, evaluates state-of-the-art performance, and analyzes the time complexity of advanced modeling approaches. Additionally, it highlights key successes and limitations in current research efforts.

Efficient Fire Segmentation for IoT-Assisted Intelligent Transportation Systems [PDF]

IEEE Transactions on Intelligent Transportation Systems

We propose an efficient and lightweight CNN architecture for early fire detection and segmentation, focusing on IoT-enabled ITS environments. We effectively utilize depth-wise separable convolution, point-wise group convolution, and a channel shuffling strategy with an optimal number of convolution kernels per layer, significantly reducing the model size and computation costs.
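
To give a sense of why these layer types shrink a model, the sketch below counts convolution weights for a standard layer, a depth-wise separable layer, and a depth-wise layer followed by a grouped point-wise convolution. It is back-of-the-envelope arithmetic rather than the architecture from the paper: the function names, layer sizes, and group count are hypothetical, and biases are ignored.

```python
def standard_conv_params(kernel, c_in, c_out):
    """Weights in a standard k x k convolution (biases ignored)."""
    return kernel * kernel * c_in * c_out


def depthwise_separable_params(kernel, c_in, c_out, groups=1):
    """Weights in a depth-wise k x k convolution plus a 1 x 1 point-wise one.

    With groups > 1 the point-wise convolution is grouped, so each output
    channel only sees c_in / groups input channels.
    """
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")
    depthwise = kernel * kernel * c_in       # one k x k filter per input channel
    pointwise = (c_in // groups) * c_out     # 1 x 1 mixing across channels
    return depthwise + pointwise


if __name__ == "__main__":
    # Hypothetical layer: 3 x 3 kernel, 64 input and 128 output channels.
    k, c_in, c_out = 3, 64, 128
    print("standard:", standard_conv_params(k, c_in, c_out))
    print("depth-wise separable:", depthwise_separable_params(k, c_in, c_out))
    print("with grouped point-wise (g=4):",
          depthwise_separable_params(k, c_in, c_out, groups=4))
```

Channel shuffling adds no weights; it only permutes channels so that information can still flow between the groups of the point-wise convolution.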

Light-DehazeNet: A Novel Lightweight CNN Architecture for Single Image Dehazing [PDF]

IEEE Transactions on Image Processing

We present Light-DehazeNet (LD-Net), a lightweight CNN for hazy image reconstruction that jointly estimates the transmission map and atmospheric light using a transformed scattering model. A color visibility restoration method is proposed to avoid color distortion. Extensive experiments are conducted with synthetic and natural hazy images.

Cascaded Deep Reinforcement Learning-Based Multi-Revolution Low-Thrust Spacecraft Orbit-Transfer [PDF]

IEEE Access

We introduce a cascaded deep reinforcement learning (DRL) model to guide low-thrust spacecraft toward desired orbits by determining optimal thrust directions. A gradient-aided reward function based on orbital elements ensures mission requirements and optimal flight times. Results demonstrate time-efficient, near-optimal orbit-raising. This approach effectively improves spacecraft trajectory planning.

Professional Services

Reviewer at Journals

  • IEEE Transactions on Image Processing
  • IEEE Transactions on Multimedia
  • IEEE Transactions on Circuits and Systems for Video Technology
  • IEEE Access
  • Elsevier Journal of Image and Vision Computing

Reviewer at Conferences

  • AAAI 2022