  • San Francisco Bay Area, CA

Cuda 33 2 Updated Jul 19, 2024

The simplest, fastest repository for training/finetuning medium-sized GPTs.

Python 39,823 6,528 Updated Dec 9, 2024

A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilizatio…

Python 2,237 370 Updated Mar 4, 2025
C 254 24 Updated May 26, 2024

A retargetable MLIR-based machine learning compiler and runtime toolkit.

C++ 3,010 669 Updated Mar 4, 2025

Starlark implementation of bazel rules for CUDA.

Starlark 97 47 Updated Mar 4, 2025

This repo holds the extended examples for rules_cuda.

Starlark 7 1 Updated Jul 24, 2023

Spatial Sparse Convolution Library

Python 1,971 370 Updated Dec 15, 2024

Minkowski Engine is an auto-diff neural network library for high-dimensional sparse tensors

Python 2,585 375 Updated Mar 5, 2024

Source code for 'Data Parallel C++: Mastering DPC++ for Programming of Heterogeneous Systems using C++ and SYCL' by James Reinders, Ben Ashbaugh, James Brodman, Michael Kinsner, John Pennycook, Xin…

CMake 264 86 Updated Jan 6, 2025

Dire Wolf is a software "soundcard" AX.25 packet modem/TNC and APRS encoder/decoder. It can be used stand-alone to observe APRS traffic, as a tracker, digipeater, APRStt gateway, or Internet Gatewa…

C 1,655 311 Updated Oct 29, 2024

Various scripts written for ham radio pi

Shell 111 40 Updated Jan 13, 2025
Shell 339 70 Updated Apr 18, 2024

The Compute Library is a set of computer vision and machine learning functions optimised for both Arm CPUs and GPUs using SIMD technologies.

C++ 2,925 787 Updated Feb 17, 2025

A curated list of projects related to the reMarkable tablet

6,544 221 Updated Feb 21, 2025

Make huge neural nets fit in memory

Python 2,767 272 Updated Apr 26, 2020

Deprecated - see our other repos for Bazel examples

Java 10 5 Updated Mar 22, 2022

For publishing the source for UG1352 "Get Moving with Alveo"

C++ 50 16 Updated Jun 17, 2020
C++ 9 17 Updated Oct 22, 2024

Vitis_Accel_Examples

Makefile 529 216 Updated Feb 20, 2025

Vitis In-Depth Tutorials

C 1,327 560 Updated Mar 4, 2025

Avnet Board Definition Files

Tcl 131 68 Updated Jan 9, 2025

The code for the ebook Ray Tracing in One Weekend by Peter Shirley translated to CUDA by Roger Allen. This work is in the public domain.

C++ 345 87 Updated Jan 29, 2021

Brevitas: neural network quantization in PyTorch

Python 1,263 206 Updated Mar 4, 2025

Performance writing to GPIO with CPU and DMA on the Raspberry Pi

C 201 27 Updated Jul 15, 2024

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.

C++ 11,274 2,160 Updated Feb 1, 2025

Documentation of NVIDIA chip/hardware interfaces

C 1,268 92 Updated Sep 10, 2024

Source code examples from the Parallel Forall Blog

HTML 1,265 638 Updated Jul 23, 2024