Changelog

All notable changes to this project will be documented in this file.

[Unreleased]

Added

  • 🔧 Separate Crawl and Extract Steps for JSON Semantic Chunks: Enhancing flexibility and efficiency in large-scale web crawling tasks.
  • 🔍 Colab Integration: Exploring integration with Google Colab for easy experimentation in a collaborative notebook environment.
  • 🎯 XPath and CSS Selector Support: Adding support for selective retrieval of specific elements from web pages.
  • 📷 Image Captioning: Incorporating image captioning capabilities to extract meaningful descriptions from images.
  • 💾 Embedding Data Generation and Storage: Developing functionality to generate and store embedding data for each crawled website.
  • 🔍 Semantic Search Engine: Building a semantic search engine that fetches content, performs vector similarity search, and generates labeled chunk data based on user queries and URLs.

Changed

  • None

Deprecated

  • None

Removed

  • None

Fixed

  • None

Security

  • None

[1.0.0] - YYYY-MM-DD

  • Initial release