SewSource is a command-line interface (CLI) tool that aggregates documentation files from GitHub repositories. It clones a repository, combines the documentation files you specify, and produces a single unified source suited to interactive Large Language Model (LLM) discussions in tools such as NotebookLM.
- 🚀 Installation
- 🛠️ Usage
- 📋 Command Options
- 💡 Examples
- ⚠️ Common Issues and Solutions
- Next TODOs
- 🤝 Contributing
- 📄 License
pip install sewsource
sewsource https://github.com/username/repository
sewsource https://github.com/username/repository \
--output-dir "./docs_combined" \
--include-dirs "docs" --include-dirs "wiki" \
--exclude-dirs "tests" \
--blacklist "README.md" --blacklist "CHANGELOG.md" \
--extensions ".md" --extensions ".rst" --extensions ".txt"
Option | Short | Description | Default |
---|---|---|---|
--output-dir | -o | Output directory for combined files | ~/.sewsource |
--include-dirs | -i | Directories to include | All directories |
--exclude-dirs | -x | Directories to exclude | None |
--blacklist | -b | Files to exclude | None |
--extensions | -e | File extensions to include | .md, .mdx |
--version | - | Show version information | - |
--help | - | Show help message | - |
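The options in the table above can be read as four filters applied to every file in the cloned repository. The following is a minimal Python sketch of that idea, not SewSource's actual implementation: the function names `select_files` and `combine` are hypothetical, and the matching rules (a directory matches if its name appears anywhere in the file's path, and an empty include list means "all directories") are assumptions made for illustration.

```python
from pathlib import Path


def select_files(root, include_dirs=(), exclude_dirs=(), blacklist=(),
                 extensions=(".md", ".mdx")):
    """Return paths (relative to root) that pass all four filters.

    Hypothetical helper: the matching rules are assumptions, not
    SewSource's documented behaviour.
    """
    root = Path(root)
    selected = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(root)
        parents = rel.parts[:-1]  # directory components of the path
        # Assumed: an empty include list means "all directories".
        if include_dirs and not any(p in include_dirs for p in parents):
            continue
        if any(p in exclude_dirs for p in parents):
            continue
        if rel.name in blacklist:
            continue
        if path.suffix.lower() not in extensions:
            continue
        selected.append(rel)
    return selected


def combine(root, files):
    """Concatenate the selected files, labelling each with its source path."""
    root = Path(root)
    chunks = []
    for rel in files:
        text = (root / rel).read_text(encoding="utf-8", errors="replace")
        chunks.append(f"# Source: {rel.as_posix()}\n\n{text.rstrip()}\n")
    return "\n".join(chunks)
```

Labelling each chunk with its source path is a design choice in this sketch: it lets a reader (or an LLM) trace a passage in the combined output back to the original file.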
sewsource https://github.com/tensorflow/tensorflow
sewsource https://github.com/pytorch/pytorch \
--include-dirs "docs" -i "tutorials" \
--extensions ".md" -e ".rst"
sewsource https://github.com/kubernetes/kubernetes \
--exclude-dirs "vendor" -x "test" \
--blacklist "CONTRIBUTING.md"
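When running SewSource against several repositories from a script, it can help to build the command programmatically. The sketch below assumes the repeated-flag form used in the examples above (one flag per value); `build_command` is a hypothetical helper and is not part of SewSource.

```python
import shlex


def build_command(repo_url, output_dir=None, include_dirs=(),
                  exclude_dirs=(), blacklist=(), extensions=()):
    """Build a sewsource argument list, repeating each flag once per value.

    Hypothetical helper; only flags listed in the options table are used.
    """
    cmd = ["sewsource", repo_url]
    if output_dir:
        cmd += ["--output-dir", output_dir]
    option_values = (
        ("--include-dirs", include_dirs),
        ("--exclude-dirs", exclude_dirs),
        ("--blacklist", blacklist),
        ("--extensions", extensions),
    )
    for flag, values in option_values:
        for value in values:
            cmd += [flag, value]
    return cmd


cmd = build_command(
    "https://github.com/pytorch/pytorch",
    include_dirs=["docs", "tutorials"],
    extensions=[".md", ".rst"],
)
# Print a copy-pasteable shell command for inspection.
print(shlex.join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```

Passing the command as a list to `subprocess.run` avoids shell quoting problems with paths that contain spaces.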
- Large Repositories
# For large repos, use specific directories
sewsource https://github.com/large/repo --include-dirs "docs"
- Add support for comma separated values as arguments for multiple folders/files
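One possible approach to the TODO above is to normalise option values after parsing, so that repeated flags and comma-separated values can be mixed freely. This is only a sketch of the idea: `split_values` is a hypothetical name, and how it would hook into SewSource's argument parsing is not specified here.

```python
def split_values(raw_values):
    """Flatten repeated and comma-separated option values into one list.

    Whitespace around items is stripped; empty items and duplicates are
    dropped while the original order is preserved.
    """
    result = []
    for raw in raw_values:
        for item in raw.split(","):
            item = item.strip()
            if item and item not in result:
                result.append(item)
    return result
```

With this in place, `-i "docs,wiki" -i "tutorials"` and three separate `-i` flags would yield the same list of directories.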
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
This project is licensed under the MIT License - see the LICENSE file for details.