Organizations

@EleutherAI

sumo43/README.md

Hi there 👋

I'm interested in AI, with a focus on model inference and post-training. Follow me on Twitter @sumo43_ for updates and discussions about the latest in AI research and development.


🔭 Projects

  • Demo: Object Detection Demo on X
  • Description:
    A fast PaliGemma inference engine running on the RTX 4090. I built an object detection demo using a 224px model that runs in real time at 16 fps.
  • Description:
    RobotArena is an Elo-based 🤖 robot-action model benchmark that lets you explore and evaluate models directly in your browser. This project is a collaboration with SkunkworksAI.

💼 Work Experience

Brium AI

Role: LLM Inference Engineer
Overview:
At Brium AI, I worked on accelerating inference for large language models across diverse GPU architectures. My role focused on optimizing the inference stack, from runtime systems to compilers, for long-context LLM applications. This work led to significant improvements in throughput and latency, particularly on AMD’s MI210 and MI300 GPUs.
Read more: Brium AI Blog Post


RunPod

Role: ML Engineer
Overview:
At RunPod, I built an in-house inference engine that supports low-latency workloads with speculative decoding. I also collaborated closely with customers to deploy AI models effectively on the RunPod stack.


📫 Get in touch

  • Twitter: @sumo43_
  • Email: (Add your email here if you'd like to be contacted directly)

Feel free to reach out if you're interested in collaborating or just want to chat about AI, machine learning, and all things tech!


Pinned

  1. kingoflolz/CLIP_JAX (Public)

    Forked from openai/CLIP

    Contrastive Language-Image Pretraining

    Jupyter Notebook · 142 stars · 19 forks

  2. SkunkworksAI/hydra-moe (Public)

    Python · 412 stars · 15 forks

  3. uclaml/SPIN (Public)

    The official implementation of Self-Play Fine-Tuning (SPIN)

    Python · 1.1k stars · 95 forks

  4. loopvlm (Public)

    run paligemma in real time

    Python · 130 stars · 14 forks

  5. moondream-mlx (Public)

    Python · 8 stars · 1 fork