--- Now Updating ---

This repository contains the implementation of "Audio-visual Action Recognition using Transformer Fusion Network". The code is based on the GitHub repository of the "Swin Transformer" paper (https://github.com/microsoft/Swin-Transformer).

Dataset preparation

UCF-sound (a subset of UCF-101)
1. Download the UCF-101 dataset.
2. The classes listed below are the ones whose videos include an audio track. Separate these classes from the UCF-101 dataset. (CliffDiving/Rafting/SoccerPenalty/BabyCrawling/LongJump/Hammering/HandstandWalking/CuttingInKitchen/StillRings/BoxingPunchingBag/PlayingDhol/Surfing/BrushingTeeth/Archery/IceDancing/MoppingFloor/PlayingFlute/BoxingSpeedBag/ParallelBars/UnevenBars/Typing/PlayingCello/TableTennisShot/BasketballDunk/ApplyLipstick/BalanceBeam/PlayingDaf/SumoWrestling/CricketShot/Knitting/FloorGymnastics/Shotput/WritingOnBoard/ShavingBeard/Haircut/BlowingCandles/PlayingSitar/HeadMassage/FrontCrawl/BodyWeightSquats/BandMarching/FrisbeeCatch/FieldHockeyPenalty/HandstandPushups/BlowDryHair/Bowling/WallPushups/CricketBowling/SkyDiving/HammerThrow)
Kinetics-sound (a subset of Kinetics-400)
1. Download the Kinetics-400 dataset.
2. The classes listed below are the ones whose videos include an audio track. Separate these classes from the Kinetics-400 dataset. (playingtrumpet/stompinggrapes/shovelingsnow/playingclarinet/strummingguitar/blowingnose/playingxylophone/blowingoutcandles/rippingpaper/tapdancing/bowling/laughing/playingbassguitar/playingviolin/playingkeyboard/playingtrombone/tappingpen/dribblingbasketball/playingdrums/choppingwood/singing/playingbagpipes/mowinglawn/playingorgan/playingpiano/shufflingcards/playingguitar/playingaccordion/tickling/playingharmonica/tappingguitar/playingsaxophone)
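The class-separation step could be scripted along the following lines. This is only a sketch, not the authors' tooling: it assumes the downloaded dataset is laid out as one folder per class, and the names `copy_class_subset` and `SOUND_CLASSES` are hypothetical. Folder names are matched after stripping spaces and underscores and lowercasing, because the Kinetics class names above are written without spaces while the downloaded folders may use them.

```python
import shutil
from pathlib import Path

# Hypothetical placeholder: fill in with the full class list given above.
SOUND_CLASSES = ["CliffDiving", "Rafting", "SoccerPenalty"]


def _key(name):
    """Normalize a class name so 'playing trumpet' matches 'playingtrumpet'."""
    return name.replace(" ", "").replace("_", "").lower()


def copy_class_subset(src_root, dst_root, class_names):
    """Copy the listed class folders from src_root into dst_root.

    Assumes src_root/<class folder>/ holds the videos of that class.
    Returns the requested class names that were not found under src_root.
    """
    src_root, dst_root = Path(src_root), Path(dst_root)
    # Index the existing class folders by their normalized name.
    available = {_key(p.name): p for p in src_root.iterdir() if p.is_dir()}
    missing = []
    for name in class_names:
        src = available.get(_key(name))
        if src is None:
            missing.append(name)
            continue
        # Keep the original folder name in the destination (Python 3.8+).
        shutil.copytree(src, dst_root / src.name, dirs_exist_ok=True)
    return missing
```

Calling `copy_class_subset("UCF-101", "UCF-sound", SOUND_CLASSES)` (both paths are illustrative) would copy the matching folders and return any class names it could not find, which makes typos in the list easy to spot.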
Extract the frames from each video.
Extract the audio track of each video as a WAV file.
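One possible way to run both extraction steps is to call the `ffmpeg` command-line tool from Python, as sketched below. The README does not specify a tool or settings, so everything here is an assumption: the helper names, the `frames/` and `audio/` output layout, the `%05d.jpg` frame naming, and the 16 kHz mono audio format are all illustrative and may need to be adapted to what the training code expects.

```python
import subprocess
from pathlib import Path


def frame_command(video, out_dir, fps=None):
    """Build an ffmpeg command that writes numbered JPEG frames to out_dir."""
    cmd = ["ffmpeg", "-y", "-i", str(video)]
    if fps is not None:
        # Optional resampling; by default the video's own frame rate is kept.
        cmd += ["-vf", f"fps={fps}"]
    cmd.append(str(Path(out_dir) / "%05d.jpg"))
    return cmd


def wav_command(video, wav_path, sample_rate=16000):
    """Build an ffmpeg command that writes the audio track as a mono WAV file."""
    # -vn drops the video stream; -ac 1 and -ar are assumed settings.
    return ["ffmpeg", "-y", "-i", str(video), "-vn",
            "-ac", "1", "-ar", str(sample_rate), str(wav_path)]


def extract(video, out_root):
    """Extract frames and audio of one video (requires ffmpeg on PATH)."""
    video = Path(video)
    frame_dir = Path(out_root) / "frames" / video.stem
    wav_path = Path(out_root) / "audio" / f"{video.stem}.wav"
    frame_dir.mkdir(parents=True, exist_ok=True)
    wav_path.parent.mkdir(parents=True, exist_ok=True)
    subprocess.run(frame_command(video, frame_dir), check=True)
    subprocess.run(wav_command(video, wav_path), check=True)
```

Keeping the command builders separate from `extract` makes it possible to inspect or log the exact commands before running them over a whole dataset.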

Model variants

IVA : all inputs, including a single frame, T frames, and audio
IV : a single frame and T frames, without audio
IA : a single frame and its corresponding audio
VA : T frames and their corresponding audio
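The four variants can be read as on/off switches over three inputs. The sketch below encodes that reading; it is not taken from the repository's code, and the names `Variant`, `VARIANTS`, and `select_inputs` are hypothetical. It assumes, based on the list above, that I stands for the single frame, V for the T frames, and A for the audio.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    single_frame: bool    # "I": a single frame
    frame_sequence: bool  # "V": T frames
    audio: bool           # "A": the corresponding audio


# Which inputs each variant uses, following the list above.
VARIANTS = {
    "IVA": Variant(single_frame=True, frame_sequence=True, audio=True),
    "IV": Variant(single_frame=True, frame_sequence=True, audio=False),
    "IA": Variant(single_frame=True, frame_sequence=False, audio=True),
    "VA": Variant(single_frame=False, frame_sequence=True, audio=True),
}


def select_inputs(variant_name, single_frame, frames, audio):
    """Return only the inputs that the named variant consumes."""
    variant = VARIANTS[variant_name]
    inputs = {}
    if variant.single_frame:
        inputs["single_frame"] = single_frame
    if variant.frame_sequence:
        inputs["frames"] = frames
    if variant.audio:
        inputs["audio"] = audio
    return inputs
```

For example, `select_inputs("IA", img, clip, wav)` would return a dictionary with only the `single_frame` and `audio` entries.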

kjunhwa/audiovisual_action
