
Deep-Learning-For-Computer-Vision

Most of my work before 2022 was implemented from scratch. Nowadays new algorithms beat the existing ones every day, and it's impossible to implement each of them from scratch. To keep up with state-of-the-art techniques, I sometimes work on top of publicly available source code.

Abnormal Motion Detection (Implementation From scratch)

back

I am personally conducting research on human activity anomaly detection from CCTV footage. This is a very minimal demonstration of what I wish to achieve through my research; a large part of this work is not public yet. I am experimenting with VAE, VQ-VAE, Transformer, GCNN, and some novel ideas of my own in this domain. I am evaluating my methods on public benchmarks, e.g. the ShanghaiTech and UBnormal datasets. Hopefully I will be able to publish it soon.

  • Code : motion-anomaly
  • src/stream_io.py : Handles simultaneous video and audio capture from a webcam / IP cam / video file using ffmpeg.
  • src/feat_tracker.py : Detects key feature points and tracks them (a minimal sketch follows this list). - Feature points : good features to track / human pose keypoints - Tracker : optical-flow-based tracker / PoseFlow tracker
  • src/gui.py : Visualizes audio intensity and feature point velocity.
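
A minimal sketch of what the classic-CV path of src/feat_tracker.py looks like, using OpenCV's Shi-Tomasi detector ("good features to track") and pyramidal Lucas-Kanade optical flow. The capture source and all parameters here are illustrative assumptions, not the repository's actual code:

```python
# Minimal sketch: detect and track feature points with OpenCV.
# Illustrative only -- not the actual src/feat_tracker.py implementation.
import cv2
import numpy as np

cap = cv2.VideoCapture(0)  # webcam; an IP-cam URL or video path also works
ok, frame = cap.read()
prev_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

# Shi-Tomasi corners ("good features to track") as initial keypoints
points = cv2.goodFeaturesToTrack(prev_gray, maxCorners=200,
                                 qualityLevel=0.01, minDistance=7)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Pyramidal Lucas-Kanade flow tracks each point into the new frame
    new_points, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, gray,
                                                     points, None)
    good_new = new_points[status.flatten() == 1]
    good_old = points[status.flatten() == 1]
    # Per-point velocity (pixels/frame): the signal analysed for anomalies
    velocity = np.linalg.norm(good_new - good_old, axis=-1)
    for p in good_new.reshape(-1, 2):
        cv2.circle(frame, tuple(p.astype(int)), 3, (0, 255, 0), -1)
    cv2.imshow("tracking", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
    prev_gray, points = gray, good_new.reshape(-1, 1, 2)

cap.release()
cv2.destroyAllWindows()
```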

Results

  • Feature keypoints are tracked in real time. The keypoints can be human body pose keypoints (see the next section for pose tracking) or keypoints from classic computer vision algorithms such as Good Features to Track, SIFT, SURF, ORB, etc.
  • The change in feature point trajectory is plotted (bottom-left plot) and analysed for anomaly, as sketched below.
  • Sound intensity (bottom-right plot) is added as an additional modality to reinforce the initial anomaly hypothesis.
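
To illustrate the fusion idea only: one simple way to score a frame (my assumption of a baseline approach, not the actual method under research) is a z-score of mean keypoint velocity against a running baseline, reinforced by audio intensity:

```python
# Hypothetical anomaly score fusing motion and audio cues.
# A sketch of the idea only -- not the actual research method.
import numpy as np

def anomaly_score(velocities, audio_rms, vel_history, audio_history):
    """velocities: per-keypoint speeds for the current frame (pixels/frame);
    audio_rms: RMS intensity of the current audio chunk;
    *_history: recent per-frame means used as the 'normal' baseline."""
    v = float(np.mean(velocities))
    motion_z = (v - np.mean(vel_history)) / (np.std(vel_history) + 1e-6)
    audio_z = (audio_rms - np.mean(audio_history)) / (np.std(audio_history) + 1e-6)
    # Audio reinforces (rather than drives) the motion-based hypothesis
    return motion_z + 0.5 * max(audio_z, 0.0)
```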


Pose Tracking (Working on top of existing public source code)

back

Results


DeeplabV3 (Working on top of existing public source code)

back

  • Code : deeplabv3, inspired by deeplabv3plus
  • I implemented UNet from scratch here: UNet. In this repository I am working on DeepLabV3; below are some results of training on the VOC dataset. I have borrowed some code from the internet, since new techniques appear every day and it's not practical to implement everything from scratch while staying up to date.
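For reference, torchvision ships a pretrained DeepLabV3 that runs out of the box. This uses the stock torchvision API (the image path is a placeholder) and is not this repository's training code:

```python
# Run a pretrained DeepLabV3 (torchvision) on one image -- reference only,
# not this repository's training code.
import torch
from torchvision import models, transforms
from PIL import Image

model = models.segmentation.deeplabv3_resnet50(weights="DEFAULT").eval()

preprocess = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

img = Image.open("example.jpg").convert("RGB")        # any VOC-style image
with torch.no_grad():
    out = model(preprocess(img).unsqueeze(0))["out"]  # (1, 21, H, W) logits
mask = out.argmax(1).squeeze(0)                       # per-pixel class ids
```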

YOLO-V8 (Working on top of public source code)

back

  • yolo-v8 is a modified version of the publicly available Ultralytics source code (see the sketch after this list).
  • Code : yolo-v8
  • yolo-v4 is my personal implementation of YOLOv4 from scratch. It achieves around 25 mAP with a MobileNet backbone (alpha = 1) and a 224x224 image size. I took help from other resources available on the internet for this work.
  • Code : yolo-v4 ; notebook
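
For comparison, minimal inference with the unmodified Ultralytics package looks like the following. This is the stock Ultralytics API, with yolov8n.pt as the smallest pretrained checkpoint and the image path a placeholder:

```python
# Minimal YOLOv8 inference with the stock Ultralytics API
# (the yolo-v8 folder modifies this library; this shows unmodified usage).
from ultralytics import YOLO

model = YOLO("yolov8n.pt")              # smallest pretrained checkpoint
results = model("example.jpg")          # list with one result per image
for box in results[0].boxes:
    print(box.xyxy, box.conf, box.cls)  # coordinates, confidence, class id
```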

Results


Transformer

back

A simple transformer has been trained on the Multi30k dataset for German-to-English translation.

Code : transformer-models

Usage :

Main notebook

  • transformer/transformer.ipynb

Implementation from scratch

  • transformer/models/transformer_v1.py

Implementation using pytorch nn.MultiHeadAttention Module

  • transformer/models/transformer_v2.py

Note : vision-transformer/ is a work in progress. The code is not organized yet.
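
A condensed sketch of the transformer_v2.py idea: PyTorch's nn.Transformer already stacks nn.MultiheadAttention encoder/decoder layers, so only the embeddings and a generator head need to be added. Vocabulary sizes, shapes, and layer counts below are illustrative assumptions, and positional encoding is omitted for brevity:

```python
# Condensed seq2seq translation model around nn.Transformer
# (illustrative shapes/hyperparameters, not the exact repo code;
# positional encoding omitted for brevity).
import math
import torch
import torch.nn as nn

class Seq2SeqTransformer(nn.Module):
    def __init__(self, src_vocab, tgt_vocab, d_model=512, nhead=8):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, d_model)
        self.tgt_emb = nn.Embedding(tgt_vocab, d_model)
        # nn.Transformer wires up nn.MultiheadAttention internally
        self.transformer = nn.Transformer(d_model=d_model, nhead=nhead,
                                          num_encoder_layers=3,
                                          num_decoder_layers=3,
                                          batch_first=True)
        self.generator = nn.Linear(d_model, tgt_vocab)
        self.d_model = d_model

    def forward(self, src, tgt, tgt_mask):
        src = self.src_emb(src) * math.sqrt(self.d_model)
        tgt = self.tgt_emb(tgt) * math.sqrt(self.d_model)
        out = self.transformer(src, tgt, tgt_mask=tgt_mask)
        return self.generator(out)

model = Seq2SeqTransformer(src_vocab=19000, tgt_vocab=10000)
src = torch.randint(0, 19000, (8, 20))   # batch of German token ids
tgt = torch.randint(0, 10000, (8, 18))   # shifted English token ids
tgt_mask = nn.Transformer.generate_square_subsequent_mask(18)
logits = model(src, tgt, tgt_mask)       # (8, 18, 10000)
```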

  • Results:
```
Epoch: 1, Train loss: 5.342, Val loss: 4.104, Epoch time = 51.305s
Epoch: 2, Train loss: 3.759, Val loss: 3.306, Epoch time = 55.960s
Epoch: 3, Train loss: 3.156, Val loss: 2.888, Epoch time = 65.804s
Epoch: 4, Train loss: 2.767, Val loss: 2.629, Epoch time = 71.922s
Epoch: 5, Train loss: 2.478, Val loss: 2.442, Epoch time = 74.421s
Epoch: 6, Train loss: 2.249, Val loss: 2.307, Epoch time = 72.754s
Epoch: 7, Train loss: 2.056, Val loss: 2.217, Epoch time = 78.290s
Epoch: 8, Train loss: 1.895, Val loss: 2.108, Epoch time = 76.270s
Epoch: 9, Train loss: 1.754, Val loss: 2.053, Epoch time = 79.638s
Epoch: 10, Train loss: 1.632, Val loss: 1.996, Epoch time = 83.422s
Epoch: 11, Train loss: 1.523, Val loss: 1.965, Epoch time = 87.027s
Epoch: 12, Train loss: 1.418, Val loss: 1.939, Epoch time = 86.907s
Epoch: 13, Train loss: 1.328, Val loss: 1.928, Epoch time = 90.075s
Epoch: 14, Train loss: 1.250, Val loss: 1.940, Epoch time = 96.738s
Epoch: 15, Train loss: 1.172, Val loss: 1.936, Epoch time = 96.887s
Epoch: 16, Train loss: 1.101, Val loss: 1.915, Epoch time = 97.977s
Epoch: 17, Train loss: 1.035, Val loss: 1.895, Epoch time = 97.573s
Epoch: 18, Train loss: 0.976, Val loss: 1.911, Epoch time = 97.933s
```

Sample result of the trained model

  • Input (German) : Eine Gruppe von Menschen steht vor einem Iglu .
  • Output (English) : A group of people standing in front of an igloo .

Diffusion

back

A simple diffusion model has been trained on car images.
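
For context, DDPM-style training reduces to a closed-form forward-noising step plus a noise-prediction MSE loss. The following is a generic sketch of that math, not this repository's exact code; the denoising network (e.g. a UNet) is assumed to take the noisy image and the timestep:

```python
# Minimal DDPM-style forward noising + training loss (generic sketch,
# not this repository's exact code).
import torch
import torch.nn.functional as F

T = 1000
betas = torch.linspace(1e-4, 0.02, T)           # linear noise schedule
alphas_bar = torch.cumprod(1.0 - betas, dim=0)  # cumulative alpha_bar_t

def diffusion_loss(model, x0):
    """x0: batch of clean car images, (B, C, H, W) scaled to [-1, 1];
    model: assumed signature model(x_t, t) -> predicted noise."""
    t = torch.randint(0, T, (x0.shape[0],))
    noise = torch.randn_like(x0)
    ab = alphas_bar[t].view(-1, 1, 1, 1)
    # Closed-form forward process: x_t = sqrt(ab)*x0 + sqrt(1-ab)*noise
    x_t = ab.sqrt() * x0 + (1.0 - ab).sqrt() * noise
    # Train the network to predict the noise that was added
    return F.mse_loss(model(x_t, t), noise)
```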

Results

