An advanced AI-powered video generation tool that creates realistic talking avatars from images and audio.
- 🎭 Avatar Generation: Create realistic talking avatars from static images
- 🗣️ Voice Processing: Advanced audio diarization using pyannote.audio
- 🎬 Video Synthesis: High-quality video generation with customizable settings
- 🔄 Multi-pose Support: Generate videos with multiple facial poses
- 🎨 Background Customization: Flexible background handling options
- Python 3.8+
- CUDA-compatible GPU (recommended)
- FFmpeg
- Hugging Face account and access token
- Set up Python environment:
conda create --name hallo-there python=3.8
conda activate hallo-there
- Clone the repository:
git clone https://github.com/hiktan44/hallo-there.git
cd hallo-there
- Install dependencies:
pip install -r requirements.txt
pip install .
- Install FFmpeg:
- Linux:
sudo apt-get install ffmpeg
- Windows: Download from official FFmpeg website and add to system PATH
- Create a Hugging Face access token:
  - Visit the Hugging Face token settings page (https://huggingface.co/settings/tokens)
  - Generate a new token with the required permissions
- Set up diarization:
python diarization.py -access_token <YOUR_HUGGING_FACE_TOKEN>
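pyannote.audio pipelines conventionally emit speaker turns in RTTM format. As an illustrative sketch (the exact output format of `diarization.py` may differ), a helper like the following could turn RTTM lines into `(speaker, start, end)` segments:

```python
# Hypothetical helper: parse pyannote-style RTTM lines into (speaker, start, end)
# segments. Assumes the diarization step writes standard RTTM records; verify
# against the actual files in diarization/ before relying on this.

def parse_rttm(lines):
    """Return a list of (speaker, start_sec, end_sec) tuples from RTTM lines."""
    segments = []
    for line in lines:
        fields = line.split()
        # RTTM: SPEAKER <file> <chan> <start> <duration> <ortho> <stype> <name> ...
        if len(fields) >= 8 and fields[0] == "SPEAKER":
            start = float(fields[3])
            duration = float(fields[4])
            segments.append((fields[7], start, start + duration))
    return segments

if __name__ == "__main__":
    sample = [
        "SPEAKER input_audio 1 0.50 3.20 <NA> <NA> SPEAKER_00 <NA> <NA>",
        "SPEAKER input_audio 1 3.70 2.10 <NA> <NA> SPEAKER_01 <NA> <NA>",
    ]
    for spk, s, e in parse_rttm(sample):
        print(f"{spk}: {s:.2f}s - {e:.2f}s")
```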
hallo-there/
├── source_images/ # Input images (512x512)
├── audio/ # Input audio files
├── diarization/ # Diarization output
├── output/ # Generated video clips
└── docs/ # Documentation
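The layout above can be created in one step. A minimal sketch (directory names taken from the tree above):

```python
# Create the expected project layout if it does not already exist.
# Directory names mirror the tree shown above.
import os

DIRS = ["source_images", "audio", "diarization", "output", "docs"]

def scaffold(root="."):
    """Create each expected directory under root; no-op if already present."""
    for name in DIRS:
        os.makedirs(os.path.join(root, name), exist_ok=True)

if __name__ == "__main__":
    scaffold()
```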
- Prepare source images:
  - 512x512 pixel squares
  - Face should occupy 50-70% of the image
  - Place in the source_images/ directory
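The two image requirements can be checked before running the pipeline. The sketch below is hypothetical (the project does not ship such a helper, and obtaining the face bounding box, e.g. from a face detector, is outside its scope):

```python
# Hypothetical pre-flight check for a source image: verify it is 512x512 and
# that a detected face bounding box covers 50-70% of the frame. The face box
# must come from elsewhere (e.g. a face detector).

def image_ok(width, height, face_box, min_ratio=0.50, max_ratio=0.70):
    """face_box is (x, y, w, h) in pixels; returns True if the image qualifies."""
    if (width, height) != (512, 512):
        return False
    _, _, w, h = face_box
    ratio = (w * h) / (width * height)
    return min_ratio <= ratio <= max_ratio

print(image_ok(512, 512, (100, 80, 380, 400)))   # face ~58% of frame -> True
print(image_ok(512, 512, (200, 200, 100, 100)))  # face ~4% of frame  -> False
```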
- Prepare audio:
  - Convert to WAV format
  - Place at audio/input_audio.wav
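The WAV conversion can be scripted around FFmpeg. A minimal sketch; the 16 kHz mono settings are illustrative defaults I am assuming, not requirements stated by the project:

```python
# Build (and optionally run) the ffmpeg command to convert any input file to
# WAV. Sample rate and channel count are assumed defaults, adjust as needed.
import subprocess

def wav_convert_cmd(src, dst="audio/input_audio.wav", rate=16000, channels=1):
    return [
        "ffmpeg", "-y",        # overwrite output if it exists
        "-i", src,             # input file (any format ffmpeg supports)
        "-ar", str(rate),      # resample
        "-ac", str(channels),  # set channel count
        dst,
    ]

if __name__ == "__main__":
    cmd = wav_convert_cmd("recording.mp3")
    print(" ".join(cmd))
    # subprocess.run(cmd, check=True)  # uncomment to actually convert
```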
- Generate video:
python generate_videos.py
python combine_videos.py
- -mode full: Enable subtle head movements during silence
- -background custom: Use custom background image
- -quality high: Generate higher quality output
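The options above could be wired up with argparse. This is a sketch of how such a parser might look, not the actual interface of generate_videos.py; flag names follow the single-dash style shown in this README:

```python
# Illustrative argument parser covering the options listed above. The real
# generate_videos.py may define these flags differently.
import argparse

def build_parser():
    p = argparse.ArgumentParser(description="Generate talking-avatar videos")
    p.add_argument("-mode", choices=["full"], default=None,
                   help="'full' enables subtle head movements during silence")
    p.add_argument("-background", choices=["custom"], default=None,
                   help="'custom' uses a custom background image")
    p.add_argument("-quality", choices=["high"], default=None,
                   help="'high' generates higher quality output")
    return p

if __name__ == "__main__":
    args = build_parser().parse_args(["-mode", "full", "-quality", "high"])
    print(args.mode, args.background, args.quality)
```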
Detailed documentation is available in the docs/ directory.
Contributions welcome! Please read CONTRIBUTING.md for guidelines.
This project is licensed under the MIT License - see LICENSE file for details.
- pyannote.audio for audio diarization
- Hugging Face for AI models and infrastructure