Stars
🚀🤖 Crawl4AI: Open-source LLM Friendly Web Crawler & Scraper
A library for extracting the main content from HTML. Developed to supply information to LLMs and to feed data into LangChain and LlamaIndex.
This project is designed to extract text from documents and prepare it for processing by Large Language Models (LLMs). Implemented a feature to store and utilize text style information, enabling the…
Retrieval of fully structured data made easy. Use LLMs or custom models. Specializes in PDFs and HTML files. Extensive support for tabular data extraction and multimodal queries.
A simple implementation of a text classification task using Qwen2ForSequenceClassification.
A regular-expression constraint for transformers language models. With this module, you can force LLMs to generate output that follows your regex. Using regex in tokens and tensors is also implemente…
Retrieval and Retrieval-augmented LLMs
Deep Learning for Time Series Classification
Multimodal model for text and tabular data, with HuggingFace transformers as the building block for text data
This Python code was developed to create RNNs for analyzing time-intensive multimodal process data and non-time-series data.
A flexible package for multimodal deep learning that combines tabular data with text and images using Wide and Deep models in PyTorch
GMoE could be the next backbone model for many kinds of generalization tasks.
CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image
The state-of-the-art image restoration model without nonlinear activation functions.
PyTorch implementation of sentiment analysis of long texts written in Serbian (an underused language), using a pretrained multilingual RoBERTa-based model (XLM-R) on a small dataset.
Text classification using mBERT
Transformers for Information Retrieval, Text Classification, NER, QA, Language Modelling, Language Generation, T5, Multi-Modal, and Conversational AI
Based on the Pytorch-Transformers library by HuggingFace. To be used as a starting point for employing Transformer models in text classification tasks. Contains code to easily train BERT, XLNet, Ro…
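Chinese personal name corpus and name generator. Covers Chinese full names, surnames, given names, forms of address, Japanese names, transliterated names, and English names. Useful for Chinese word segmentation and person-name entity recognition.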
This repository is an MLP implementation of a classifier on the MNIST dataset with PyTorch
An approach with neural networks to the Titanic Kaggle problem, using MLPClassifier from sklearn.
Rank Gaussian normalization, swap noise, and a denoising autoencoder as feature engineering
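
The last entry names two preprocessing steps that are easy to show in a few lines. The following is a minimal sketch of how they are commonly described, not the listed repository's code. It assumes rank Gaussian normalization means mapping each value's rank through the inverse normal CDF, and that swap noise means replacing a cell with the same column's value from a randomly chosen row. The function names `rank_gauss` and `swap_noise` and the `(rank + 0.5) / n` scaling are illustrative choices.

```python
import random
from statistics import NormalDist


def rank_gauss(values):
    """Map values to approximately standard-normal scores via their ranks."""
    n = len(values)
    # Sort indices by value; ties keep their original order (an assumption).
    order = sorted(range(n), key=lambda i: values[i])
    normal = NormalDist()
    scores = [0.0] * n
    for rank, i in enumerate(order):
        # (rank + 0.5) / n lies strictly inside (0, 1), so inv_cdf is finite.
        scores[i] = normal.inv_cdf((rank + 0.5) / n)
    return scores


def swap_noise(rows, p, rng):
    """Return a copy of rows in which each cell is, with probability p,
    replaced by the same column's value from a randomly chosen row."""
    n = len(rows)
    noisy = [list(row) for row in rows]
    for i in range(n):
        for j in range(len(rows[i])):
            if rng.random() < p:
                noisy[i][j] = rows[rng.randrange(n)][j]
    return noisy


if __name__ == "__main__":
    print(rank_gauss([10, 30, 20]))
    table = [[1, "a"], [2, "b"], [3, "c"]]
    print(swap_noise(table, 0.5, random.Random(0)))
```

In this kind of setup, the noisy rows would be the input to a denoising autoencoder and the clean rows its reconstruction target. The autoencoder itself is omitted here.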