Most of my work before 2022 consisted of implementations from scratch. Nowadays new algorithms beat the existing ones every day, and it is impossible to implement each of them from scratch. To keep up with state-of-the-art techniques, I sometimes work on top of publicly available source code.
- Abnormal Motion Detection (From Scratch)
- Pose Detection and Tracking
- Semantic segmentation with DeepLabV3
- Yolo-v8 for object detection
- Transformer for translation (From Scratch)
- Diffusion model for car image generation.
I am personally conducting research on human activity anomaly detection from CCTV footage. Here is a very minimal demonstration of what I wish to achieve through my research. A large part of this work is not public yet. I am experimenting with VAEs, VQ-VAEs, Transformers, GCNNs, and some novel ideas of my own in this domain. I am evaluating my methods on public benchmarks such as the ShanghaiTech and UBnormal datasets. Hopefully I will be able to publish it soon.
- Code : motion-anomaly
- src/stream_io.py : Handles simultaneous video and audio capture from a webcam / IP cam / video file using ffmpeg.
- src/feat_tracker.py : Detects key feature points and tracks them.
  - Feature points : good features to track / human pose keypoints
  - Tracker : optical-flow-based tracker / pose flow tracker
- src/gui.py : Visualizes audio intensity and feature point velocity.
- Feature keypoints are tracked in real time. The keypoints can be human body pose keypoints (see the next section for pose tracking) or keypoints from classic computer vision algorithms such as Good Features to Track, SIFT, SURF, ORB, etc.
- The change in feature point trajectories is plotted (bottom-left plot) and analysed for anomalies.
- Sound intensity (bottom-right plot) is added as an additional modality to reinforce the initial anomaly hypothesis.
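The velocity-based anomaly check can be sketched as a simple z-score threshold on per-frame keypoint speeds. This is a minimal illustration of the idea, not the actual scoring used in the repository:

```python
import numpy as np

def anomaly_scores(trajectories):
    """Score each frame transition by how unusual the mean keypoint speed is.

    trajectories: array of shape (frames, keypoints, 2) with (x, y)
    positions of tracked feature points.
    Returns a z-score per frame transition (length frames - 1).
    """
    # Per-frame displacement of every keypoint, then mean speed per frame.
    velocities = np.diff(trajectories, axis=0)                # (F-1, K, 2)
    speeds = np.linalg.norm(velocities, axis=2).mean(axis=1)  # (F-1,)
    # Z-score against the clip's own statistics.
    mu, sigma = speeds.mean(), speeds.std() + 1e-8
    return (speeds - mu) / sigma

# Smooth motion for 25 frames, then a sudden jump (simulated anomaly).
rng = np.random.default_rng(0)
traj = np.cumsum(rng.normal(0, 0.1, size=(31, 8, 2)), axis=0)
traj[25:] += 50.0  # abrupt displacement between frames 24 and 25
scores = anomaly_scores(traj)
print(scores.argmax())  # -> 24, the most anomalous frame transition
```

Frames whose z-score exceeds a threshold would then be flagged for closer inspection, with the audio channel as a second vote.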
- Code : pose-tracking
- posedet_mnet : OpenPose with a MobileNet backbone
- posedet_alpha : AlphaPose
- posedet_alpha/tracker/Pose Flow : Pose Flow tracker.
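Pose trackers like PoseFlow associate pose detections across frames. The core idea can be sketched as greedy matching of poses by mean keypoint distance; this toy version is an illustration, not the PoseFlow algorithm itself:

```python
import numpy as np

def match_poses(prev, curr, max_dist=20.0):
    """Greedily match poses between two frames by mean keypoint distance.

    prev, curr: arrays of shape (num_poses, num_keypoints, 2).
    Returns a list of (prev_idx, curr_idx) pairs.
    """
    # Pairwise mean distance between every previous and current pose.
    cost = np.linalg.norm(prev[:, None] - curr[None, :], axis=3).mean(axis=2)
    matches, used = [], set()
    for i in np.argsort(cost.min(axis=1)):      # easiest poses first
        j = int(cost[i].argmin())
        if j not in used and cost[i, j] <= max_dist:
            matches.append((int(i), j))
            used.add(j)
    return matches

# Two poses that move slightly between frames, listed in swapped order.
prev = np.array([[[0.0, 0.0]] * 5, [[100.0, 100.0]] * 5])
curr = np.array([[[101.0, 99.0]] * 5, [[1.0, 0.0]] * 5])
print(match_poses(prev, curr))  # -> [(0, 1), (1, 0)]
```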
- Code : deeplabv3 Inspired from
- I implemented UNet from scratch here. In this repository I am working on DeepLabV3. Here are some results of training on the VOC dataset. I have borrowed some code from the internet, since it is not practical to implement everything from scratch and stay up to date at the same time.
- yolo-v8 is a modified version of the publicly available Ultralytics source code
- Code : yolo-v8
- yolo-v4 is my personal implementation of YOLOv4 from scratch. It achieves around 25 mAP with a MobileNet backbone (alpha = 1) at an image size of 224x224. I have taken help from other resources available on the internet for this work.
- Code : yolo-v4 ; notebook
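The mAP figure above rests on intersection-over-union between predicted and ground-truth boxes. A minimal sketch of that computation, for illustration only:

```python
def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # Intersection rectangle (empty if the boxes do not overlap).
    x1 = max(box_a[0], box_b[0]); y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2]); y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

print(iou((0, 0, 10, 10), (5, 5, 15, 15)))  # 25 / 175 = 0.1428...
```

A detection counts as a true positive when its IoU with an unmatched ground-truth box exceeds a threshold (0.5 for classic VOC-style mAP); averaging precision over recall levels and classes gives mAP.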
A simple Transformer has been trained on the Multi30k dataset for German-to-English translation.
Code : transformer-models
- transformer/transformer.ipynb
- transformer/models/transformer_v1.py
- transformer/models/transformer_v2.py
Note : vision-transformer/ is work in progress. The code is not organized yet.
- Results:
Epoch: 1, Train loss: 5.342, Val loss: 4.104, Epoch time = 51.305s
Epoch: 2, Train loss: 3.759, Val loss: 3.306, Epoch time = 55.960s
Epoch: 3, Train loss: 3.156, Val loss: 2.888, Epoch time = 65.804s
Epoch: 4, Train loss: 2.767, Val loss: 2.629, Epoch time = 71.922s
Epoch: 5, Train loss: 2.478, Val loss: 2.442, Epoch time = 74.421s
Epoch: 6, Train loss: 2.249, Val loss: 2.307, Epoch time = 72.754s
Epoch: 7, Train loss: 2.056, Val loss: 2.217, Epoch time = 78.290s
Epoch: 8, Train loss: 1.895, Val loss: 2.108, Epoch time = 76.270s
Epoch: 9, Train loss: 1.754, Val loss: 2.053, Epoch time = 79.638s
Epoch: 10, Train loss: 1.632, Val loss: 1.996, Epoch time = 83.422s
Epoch: 11, Train loss: 1.523, Val loss: 1.965, Epoch time = 87.027s
Epoch: 12, Train loss: 1.418, Val loss: 1.939, Epoch time = 86.907s
Epoch: 13, Train loss: 1.328, Val loss: 1.928, Epoch time = 90.075s
Epoch: 14, Train loss: 1.250, Val loss: 1.940, Epoch time = 96.738s
Epoch: 15, Train loss: 1.172, Val loss: 1.936, Epoch time = 96.887s
Epoch: 16, Train loss: 1.101, Val loss: 1.915, Epoch time = 97.977s
Epoch: 17, Train loss: 1.035, Val loss: 1.895, Epoch time = 97.573s
Epoch: 18, Train loss: 0.976, Val loss: 1.911, Epoch time = 97.933s
- Input (German) : Eine Gruppe von Menschen steht vor einem Iglu .
- Output (English) : A group of people standing in front of an igloo .
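At the core of the translation model is scaled dot-product attention. A minimal NumPy sketch of the operation (shapes only, no learned weights):

```python
import numpy as np

def scaled_dot_product_attention(q, k, v):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V.

    q: (seq_q, d_k), k: (seq_k, d_k), v: (seq_k, d_v)
    """
    d_k = q.shape[-1]
    scores = q @ k.T / np.sqrt(d_k)              # (seq_q, seq_k)
    # Numerically stable softmax over the key dimension.
    scores -= scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

rng = np.random.default_rng(0)
q = rng.normal(size=(4, 8))    # 4 target tokens
k = rng.normal(size=(6, 8))    # 6 source tokens
v = rng.normal(size=(6, 16))
out, w = scaled_dot_product_attention(q, k, v)
print(out.shape)  # (4, 16); each row of w sums to 1
```

In the full model this runs per attention head, with learned projections producing Q, K, V and a causal mask on the decoder side.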
A simple diffusion model has been trained on car images.
- Code : diffusion-models
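The forward (noising) process of a DDPM-style diffusion model has a closed form, x_t = sqrt(alpha_bar_t) x_0 + sqrt(1 - alpha_bar_t) eps. A minimal NumPy sketch under the standard linear beta schedule; this is illustrative, not the training code from the repository:

```python
import numpy as np

def forward_diffuse(x0, t, betas, rng):
    """Sample x_t ~ q(x_t | x_0) in closed form for a DDPM."""
    alpha_bar = np.cumprod(1.0 - betas)[t]   # cumulative signal retention
    eps = rng.normal(size=x0.shape)          # standard Gaussian noise
    xt = np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * eps
    return xt, eps

# Linear schedule over 1000 steps, as in the original DDPM paper.
betas = np.linspace(1e-4, 0.02, 1000)
rng = np.random.default_rng(0)
x0 = rng.normal(size=(3, 64, 64))            # stand-in for a car image
xt, eps = forward_diffuse(x0, t=999, betas=betas, rng=rng)
print(xt.shape)  # at t=999 the sample is almost pure noise
```

Training then asks a network to predict `eps` from `(xt, t)`; sampling runs the learned reverse process from pure noise back to an image.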