Pipeline

A high-level interface for building chatbots with RAG (Retrieval-Augmented Generation) capabilities.

Features

- Multiple RAG implementations:
  - Web content (WebRAG)
  - Python code (PyRAG)
  - Text files (TxtRAG)
  - PDF documents (PdfRAG)
  - JSON data (JsonRAG)
  - Markdown files (MdRAG)
- YouTube caption downloading
- Support for multiple LLM backends (OpenAI, Ollama, LM Studio)
- Conversation history management
- File operations utilities

Installation

Choose the appropriate installation method based on your needs:

Using Pipeline in Your Project

This is the recommended method when using Pipeline as a dependency in your project:

Install from GitHub:

pip install git+https://github.com/babakbandpey/pipeline.git

Or add to requirements.txt:

git+https://github.com/babakbandpey/pipeline.git

This installs:

- Core Pipeline package with all RAG implementations
  - WebRAG: Web content analysis
  - PyRAG: Python code analysis
  - TxtRAG: Text file processing
  - PdfRAG: PDF document analysis
  - JsonRAG: JSON data processing
  - MdRAG: Markdown file analysis
- Utility modules
  - FileUtils: File operations
  - ChatbotUtils: Chatbot helpers
  - PipelineUtils: Configuration tools

Dependencies:

langchain and related packages
openai (optional)
chromadb for vector storage
other utility packages

Development Setup with Docker

Use this setup when:

Developing Pipeline itself
Running Pipeline as a standalone service
Contributing to the project

Benefits:

Isolated development environment
Consistent dependencies across platforms
Integrated Git and SSH configuration
Hot-reloading for development

Clone the repository:

# Clone repository
git clone https://github.com/babakbandpey/pipeline.git
cd pipeline

Copy and configure environment:

cp .env-demo .env
# Edit .env with your settings

Run development setup:

# Run setup script (first time and after pulling updates)
chmod +x dev.sh
./dev.sh

The setup script will:

Check/create .env file
Configure SSH for multiple GitHub accounts
Build and start Docker containers
Install package in development mode
Configure Git settings
Open interactive shell

Git Configuration for Multiple Accounts

Define Git and SSH settings in .env:

# Git Config
GIT_AUTHOR_NAME=Your Name
GIT_AUTHOR_EMAIL=your.email@example.com
GIT_SSH_KEY_WORK=~/.ssh/id_rsa_work      # Work SSH key path
GIT_SSH_KEY_PERSONAL=~/.ssh/id_rsa_personal  # Personal SSH key path
GIT_SSH_HOST_WORK=github.com-work        # SSH config host for work
GIT_SSH_HOST_PERSONAL=github.com-personal # SSH config host for personal

The SSH config is generated automatically from the .env settings:

# Generated ~/.ssh/config
Host ${GIT_SSH_HOST_WORK}
    HostName github.com
    User git
    IdentityFile ${GIT_SSH_KEY_WORK}
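
To make the mapping from .env to SSH config concrete, here is a minimal Python sketch of the same idea. It is not the logic used by dev.sh, which is not shown here. The function name render_ssh_config and the rule of pairing each GIT_SSH_HOST_<ACCOUNT> with GIT_SSH_KEY_<ACCOUNT> are assumptions made for illustration.

```python
# Hypothetical sketch: render SSH config text from .env-style settings.
# Assumes each GIT_SSH_HOST_<ACCOUNT> has a matching GIT_SSH_KEY_<ACCOUNT>.

SSH_HOST_TEMPLATE = """Host {host}
    HostName github.com
    User git
    IdentityFile {key}
"""


def render_ssh_config(env):
    """Return SSH config text with one Host block per configured account."""
    prefix = "GIT_SSH_HOST_"
    blocks = []
    for name in sorted(env):
        if not name.startswith(prefix):
            continue
        account = name[len(prefix):]  # e.g. "WORK" or "PERSONAL"
        key = env.get(f"GIT_SSH_KEY_{account}")
        if not key:
            continue  # skip hosts that have no matching key path
        blocks.append(SSH_HOST_TEMPLATE.format(host=env[name], key=key))
    return "\n".join(blocks)


settings = {
    "GIT_SSH_HOST_WORK": "github.com-work",
    "GIT_SSH_KEY_WORK": "~/.ssh/id_rsa_work",
    "GIT_SSH_HOST_PERSONAL": "github.com-personal",
    "GIT_SSH_KEY_PERSONAL": "~/.ssh/id_rsa_personal",
}
print(render_ssh_config(settings))
```

With a host alias like github.com-work in place, a remote URL that uses the alias instead of github.com selects the matching key.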

Alternative: Direct Installation (Linux/MacOS)

# Clone repository
git clone https://github.com/babakbandpey/pipeline.git
cd pipeline

# Create and activate virtual environment
python -m venv env
source env/bin/activate

# Install with development dependencies
pip install -e ".[dev]"

Configuration

Create a .env file:

OPENAI_API_KEY=your_key_here  # Optional if using Ollama/LM Studio
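
How Pipeline itself reads the .env file is not described here. As a rough sketch of the lookup, the following standard-library code parses KEY=VALUE lines and falls back from the process environment to the file. The helper names load_env_file and get_openai_key are hypothetical and not part of Pipeline; a library such as python-dotenv is a common alternative.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Parse KEY=VALUE lines from a .env file into a dict (no shell expansion)."""
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw in env_path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and malformed lines
        key, _, value = line.partition("=")
        # Drop trailing inline comments such as "value  # note", then quotes.
        value = value.split(" #", 1)[0].strip().strip("'\"")
        values[key.strip()] = value
    return values


def get_openai_key(path=".env"):
    """Prefer the process environment, then the .env file; None if unset."""
    return os.environ.get("OPENAI_API_KEY") or load_env_file(path).get("OPENAI_API_KEY")
```

A result of None is fine when a local backend such as Ollama or LM Studio is used, since the key is optional in that case.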

Usage Examples

Basic Chat

from pipeline import Chatbot

# Send a single message and print the reply
chatbot = Chatbot(base_url="http://localhost:11434", model="llama3")
print(chatbot.invoke("Hello! How are you?"))

Web Content Analysis

from pipeline import WebRAG

# Analyze web content
rag = WebRAG(
    base_url="http://localhost:11434",
    model="llama3",
    url="https://example.com"
)
print(rag.invoke("What is this page about?"))
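
For readers new to RAG, the toy sketch below shows the general retrieve-then-prompt pattern behind a call like the one above. It is not Pipeline's implementation: it scores chunks with bag-of-words cosine similarity, where a real setup would typically use embeddings and a vector store such as chromadb. All function names are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(question, chunks, k=2):
    """Return the k chunks most similar to the question."""
    q = Counter(tokenize(question))
    ranked = sorted(
        chunks, key=lambda c: cosine(q, Counter(tokenize(c))), reverse=True
    )
    return ranked[:k]


def build_prompt(question, chunks, k=2):
    """Put the retrieved chunks in front of the question for the LLM."""
    context = "\n".join(f"- {c}" for c in retrieve(question, chunks, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


chunks = [
    "Pipeline supports Ollama and LM Studio backends.",
    "The license is MIT.",
    "WebRAG loads web content from a URL.",
]
print(build_prompt("Which license applies?", chunks, k=1))
```

The resulting prompt is what would be sent to the language model; the model never sees chunks that were not retrieved.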

YouTube Caption Processing

from pipeline import YouTubeCaptionDownloader

# Download and process captions
downloader = YouTubeCaptionDownloader('./yt')
captions = downloader.download_captions("https://youtube.com/watch?v=...")

Code Analysis

from pipeline import PyRAG

# Analyze Python code
rag = PyRAG(
    base_url="http://localhost:11434",
    model="llama3",
    path="./my_code.py"
)
print(rag.invoke("Explain what this code does"))

Available Commands

/exit: Exit conversation
/reset: Start new conversation
/history: Show conversation history
/delete: Delete messages
/summarize: Summarize conversation
/save: Save history to file
/help: Show commands
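
The sketch below shows one way a chat loop could dispatch a subset of these commands. It is an illustration only, not the code in examples/run.py. EchoBot is a stand-in for a real chatbot, handle_input is a hypothetical helper, and /delete, /summarize, and /save are deliberately left out.

```python
class EchoBot:
    """Stand-in for a chatbot; swap in a real backend with an invoke() method."""

    def invoke(self, prompt):
        return f"echo: {prompt}"


def handle_input(text, bot, history):
    """Process one line of user input. Returns (reply, keep_running)."""
    command = text.strip()
    if command == "/exit":
        return "Goodbye.", False
    if command == "/reset":
        history.clear()
        return "Conversation reset.", True
    if command == "/history":
        lines = [f"{role}: {msg}" for role, msg in history]
        return "\n".join(lines) or "(empty)", True
    if command == "/help":
        return "Commands: /exit /reset /history /help", True
    if command.startswith("/"):
        return f"Not handled in this sketch: {command}", True
    # Anything else is a normal message: ask the bot and record both turns.
    reply = bot.invoke(command)
    history.append(("user", command))
    history.append(("assistant", reply))
    return reply, True


history = []
for line in ["Hello!", "/history", "/reset", "/exit"]:
    reply, keep_running = handle_input(line, EchoBot(), history)
    print(reply)
    if not keep_running:
        break
```

Keeping the dispatcher a pure function of the input, the bot, and the history list makes it easy to test without a running model.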

Development

Requires Python 3.11+
Run tests: pytest
Lint code: pylint
Format code: black
Type check: mypy

Development Scripts

examples/code_guard.py: Code analysis and security checks
examples/readme_writer.py: Auto-generate README files
examples/run.py: Interactive chat session
examples/yt_caption_organizer.py: Process YouTube captions

License

MIT License

Author

Babak Bandpey
