DavidePaglieri


Fully open reproduction of DeepSeek-R1

Python 12,245 912 Updated Jan 29, 2025
Jupyter Notebook 866 94 Updated Jun 27, 2024

A simple OpenAI Gym environment for single and multi-agent reinforcement learning

Python 738 113 Updated Dec 14, 2023

Benchmarking the Spectrum of Agent Capabilities

Python 403 68 Updated Jan 23, 2024

Benchmarking Agentic LLM and VLM Reasoning On Games

Python 97 15 Updated Jan 28, 2025

🤗 smolagents: a barebones library for agents. Agents write python code to call tools and orchestrate other agents.

Python 6,038 551 Updated Jan 30, 2025

Training Large Language Model to Reason in a Continuous Latent Space

Python 748 53 Updated Jan 24, 2025
Jupyter Notebook 3 Updated Jan 27, 2025

Flexible Python configuration system. The last one you will ever need.

Python 2,036 121 Updated Jan 19, 2025

A virtual environment for developing and evaluating automated scientific discovery agents.

Python 122 8 Updated Jan 21, 2025

Universal LLM Deployment Engine with ML Compilation

Python 19,782 1,637 Updated Jan 24, 2025

LMAct: A Benchmark for In-Context Imitation Learning with Long Multimodal Demonstrations

Python 7 3 Updated Dec 9, 2024

An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT)

Python 4,135 401 Updated Jan 30, 2025

Hydra is a framework for elegantly configuring complex applications

Python 9,002 657 Updated Jan 16, 2025

Official Repo for ICLR 2024 paper MINT: Evaluating LLMs in Multi-turn Interaction with Tools and Language Feedback by Xingyao Wang*, Zihan Wang*, Jiateng Liu, Yangyi Chen, Lifan Yuan, Hao Peng and …

Python 110 7 Updated Jun 4, 2024
Python 2,348 275 Updated Jan 29, 2025

High throughput synchronous and asynchronous reinforcement learning

Python 860 115 Updated Dec 31, 2024
HTML 3 Updated Jan 29, 2025

[ICLR 2024] SWE-bench: Can Language Models Resolve Real-world Github Issues?

Python 2,347 400 Updated Jan 22, 2025

Reinforcement learning on general 2D physics environments in JAX.

Python 121 3 Updated Jan 27, 2025

Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR.

Python 1,913 113 Updated Jul 29, 2024

MiniHack the Planet: A Sandbox for Open-Ended Reinforcement Learning Research

Python 487 60 Updated Aug 19, 2024

Accelerated minigrid environments with JAX

Python 128 12 Updated Aug 1, 2024

RAG that intelligently adapts to your use case, data, and queries

Python 2,804 139 Updated Jan 22, 2025

200+ detailed flashcards useful for reviewing topics in machine learning, computer vision, and computer science.

2,045 185 Updated Jun 12, 2024

Machine Learning Journal for Intermediate to Advanced Topics.

Jupyter Notebook 1,520 142 Updated Jan 20, 2025

Entropy Based Sampling and Parallel CoT Decoding

Python 3,216 317 Updated Nov 13, 2024