Open-source project for data preparation, aimed at LLM application builders
Updated Dec 15, 2024 - Python
Python package for Customizable Data Preprocessing Pipelines
This repository contains code for preprocessing text data from PDF and DOCX files for use with GPT-3. It includes steps such as tokenization, removal of stop words and punctuation, and formatting for GPT-3 input.
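As a rough illustration of the steps that description names (tokenization, stop-word removal, punctuation removal), a minimal sketch might look like the following. It is not taken from that repository: `preprocess` and `STOP_WORDS` are hypothetical names, the stop-word list is deliberately tiny, and the text is assumed to have already been extracted from the PDF or DOCX file.

```python
import string

# Hypothetical, deliberately small stop-word list for illustration only;
# a real pipeline would likely use a much larger, language-specific list.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "for", "with"}

# Translation table that maps every ASCII punctuation character to a space,
# so that "quick,brown" still splits into two tokens.
_PUNCT_TO_SPACE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def preprocess(text: str) -> str:
    """Lowercase the text, strip punctuation, and drop stop words."""
    cleaned = text.translate(_PUNCT_TO_SPACE).lower()
    tokens = cleaned.split()  # whitespace tokenization (an assumption)
    kept = [token for token in tokens if token not in STOP_WORDS]
    return " ".join(kept)


if __name__ == "__main__":
    print(preprocess("The quick, brown fox jumps over the lazy dog!"))
```

Whitespace splitting is the simplest possible tokenizer; text sent to a GPT-style model is normally re-tokenized by the model's own tokenizer, so this step only cleans the input rather than producing model tokens.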
Collect POST requests
Understand and implement a decision tree
Project for Machine Learning Data Mining course
This work highlights my contribution as an ML Engineer at "adorsho praniSheb" (an ML-based agro-farming company in Bangladesh), where I was assigned the task of designing the preprocessing pipeline.
A data processing library for better understanding of industrial data.
Machine learning models cannot be directly applied to raw data. This desktop application consists of a central server and two client servers. The central server sends raw data to the clients, where the data is preprocessed and prepared to be fed to the machine learning model.