woojinsoh

  • Asynchronous streaming inference for LLMs (OpenAI, NVIDIA NIM, NAVER HyperCLOVA) using FastAPI.

    Python Updated Nov 28, 2024
  • Shell Updated Nov 2, 2023
  • Some TensorRT conversion examples for different kinds of neural network models

    Python 1 Updated May 4, 2023
  • riva_demo Public

    NVIDIA Riva SDK demonstration for the Feb 2022 and 2023 Developer Meetups

    Jupyter Notebook 9 5 Updated Jan 11, 2023
  • Transformer-related optimization, including BERT and GPT

    C++ Apache License 2.0 Updated Dec 14, 2022
  • cudnn_mnist Public

    cuDNN/cuBLAS implementation of a basic convolutional neural network architecture on the MNIST dataset

    Cuda 1 Updated Dec 7, 2022
  • TensorRT Public

    Forked from NVIDIA/TensorRT

    TensorRT is a C++ library for high performance inference on NVIDIA GPUs and deep learning accelerators.

    C++ Apache License 2.0 Updated Nov 3, 2022
  • Some Triton Python client examples

    Python 2 1 Updated Jun 8, 2022
  • Execute Megatron-DeepSpeed using Slurm for multi-node distributed training

    Shell 6 1 Updated May 4, 2022
  • dask-mnmg Public

    Run RAPIDS Dask cuML clustering algorithms with multiple GPUs on multiple nodes.

    Python 1 Updated Aug 13, 2021