Insights: NVIDIA/NeMo
Overview
1 Release published by 1 person
-
v2.2.0rc2 NVIDIA Neural Modules 2.2.0rc2
published
Feb 17, 2025
80 Pull requests merged by 25 people
-
Fix BertEmbeddingDataset
#12272 merged
Feb 22, 2025 -
Cherry pick `Add modelopt to requirements_nlp.txt (12261)` into `r2.2.0`
#12278 merged
Feb 22, 2025 -
Cherry pick `Add eval requirement to setup.py (12152)` into `r2.2.0`
#12277 merged
Feb 22, 2025 -
build: Bump PyT to 25.01
#11973 merged
Feb 22, 2025 -
Cherry pick `Fix the local path in Sortformer diarizer training tutorial (12135)` into `r2.2.0`
#12316 merged
Feb 22, 2025 -
automodel notebooks fix
#12238 merged
Feb 22, 2025 -
Cherry pick `build: Exclude tensorstore 0.1.72 (12317)` into `r2.2.0`
#12318 merged
Feb 22, 2025 -
Fixes and refactor for custom pretraining loop
#12319 merged
Feb 22, 2025 -
Misc resiliency features
#12302 merged
Feb 21, 2025 -
Add checkpointing support to custom pretraining loop
#12291 merged
Feb 21, 2025 -
build: Exclude tensorstore 0.1.72
#12317 merged
Feb 21, 2025 -
Fix the local path in Sortformer diarizer training tutorial
#12135 merged
Feb 21, 2025 -
[nemo1] Fix Mamba/Bert loading from checkpoint after TE extra states were introduced
#12275 merged
Feb 21, 2025 -
add ctc segmentation
#12312 merged
Feb 21, 2025 -
ci: Fix test workflow
#12311 merged
Feb 21, 2025 -
Update for pytorch 25.01 container
#12310 merged
Feb 21, 2025 -
Test model loading for nemo export
#12262 merged
Feb 21, 2025 -
Add test for evaluation
#12276 merged
Feb 21, 2025 -
build: Editable nemo install (#12304)
#12308 merged
Feb 21, 2025 -
Energon ckpt multimodal
#12245 merged
Feb 21, 2025 -
build: Editable nemo install
#12304 merged
Feb 21, 2025 -
Cherry pick `Set L2_Speech_Batch_Size_OOMptimizer_Canary to be optional (12299)` into `r2.2.0`
#12300 merged
Feb 21, 2025 -
Set L2_Speech_Batch_Size_OOMptimizer_Canary to be optional
#12299 merged
Feb 21, 2025 -
ci: Flaky tests release
#12293 merged
Feb 20, 2025 -
Add sampling args in TRTLLM generate
#11612 merged
Feb 20, 2025 -
build: Bump mcore ref
#12287 merged
Feb 20, 2025 -
fix masked loss calculation
#12255 merged
Feb 20, 2025 -
fix speechlm import ckpt on slurm
#12244 merged
Feb 20, 2025 -
remove nemo1 tests
#12280 merged
Feb 20, 2025 -
remove nemo1 unit tests
#12281 merged
Feb 20, 2025 -
Fixing error when loading T5 checkpoint created with TE<1.13
#12264 merged
Feb 20, 2025 -
Build bitsandbytes
#12279 merged
Feb 20, 2025 -
chore(🤖): Bump `NVIDIA/Megatron-LM` to `05ac33c...` (2025-02-20)
#12274 merged
Feb 20, 2025 -
Add modelopt to requirements_nlp.txt
#12261 merged
Feb 20, 2025 -
Add eval requirement to setup.py
#12152 merged
Feb 20, 2025 -
Remove modelopt state when empty in NeMo 1.0 distillation
#12266 merged
Feb 20, 2025 -
Training loop
#12268 merged
Feb 20, 2025 -
Add optimizer fix
#12253 merged
Feb 19, 2025 -
Remove some old nemo1 contents in doc
#12156 merged
Feb 19, 2025 -
Llama Embedding Model Improvement
#12236 merged
Feb 19, 2025 -
Add evaluate utilities to custom pretraining loop
#12250 merged
Feb 19, 2025 -
Fix distillation state-dict loading bug
#12270 merged
Feb 19, 2025 -
add default kwargs for trtllm model runner
#12248 merged
Feb 19, 2025 -
ci: Fix pypi link of dry-run
#12267 merged
Feb 19, 2025 -
Fix 2D bucketing test on Python 3.12
#12265 merged
Feb 19, 2025 -
add default param dtype in mistral configs
#12186 merged
Feb 19, 2025 -
ci: Bump release workflows
#12259 merged
Feb 19, 2025 -
chore(🤖): Bump `NVIDIA/Megatron-LM` to `61b2c4f...` (2025-02-19)
#12251 merged
Feb 19, 2025 -
Ckpt fixes pytorch update
#12228 merged
Feb 19, 2025 -
build: `force-reinstall`
#12214 merged
Feb 19, 2025 -
Restructure llm perf scripts to support vlm/diffusion/flux collections
#12252 merged
Feb 19, 2025 -
fix[export]: reshard model correctly handles extra_state when it's a tensor
#12132 merged
Feb 19, 2025 -
ci: Generate coverage for e2e tests
#12120 merged
Feb 18, 2025 -
Add setup function to setup model, optimizer and dataloaders for training
#12247 merged
Feb 18, 2025 -
Update scheduler
#12243 merged
Feb 18, 2025 -
Asr fixes 2.2
#12227 merged
Feb 18, 2025 -
Fix Llama Embedding Tutorial
#12149 merged
Feb 18, 2025 -
Add ConfigContainer plus data and tokenizer modules to nemo/tron
#12241 merged
Feb 18, 2025 -
Apply missing lr_mult and wd_mult to the lr and weight_decay of megatron param groups.
#12123 merged
Feb 18, 2025 -
ci: Disable flaky transcription tests
#12237 merged
Feb 18, 2025 -
add missing __init__
#12239 merged
Feb 18, 2025 -
[Draft] Llama Embedding Model Fix
#12235 merged
Feb 18, 2025 -
skip-linting label optionally disables lint checks
#12179 merged
Feb 18, 2025 -
ci: Simplify install-check
#12231 merged
Feb 18, 2025 -
Cherry pick `nemo-automodel checkpoint-io refactor (12070)` into `r2.2.0`
#12234 merged
Feb 18, 2025 -
Cherry pick `Fix loading extra states from torch tensor (12185)` into `r2.2.0`
#12226 merged
Feb 18, 2025 -
Add automodel multinode tut and fix sft peft bug
#12209 merged
Feb 18, 2025 -
nemo-automodel checkpoint-io refactor
#12070 merged
Feb 18, 2025 -
Cherry pick `build: Pin down transformers (12229)` into `r2.2.0`
#12230 merged
Feb 17, 2025 -
build: Pin down transformers
#12229 merged
Feb 17, 2025 -
Cherry pick `disable moe logging to avoid deepseek hang (12168)` into `r2.2.0`
#12192 merged
Feb 17, 2025 -
Fix loading extra states from torch tensor
#12185 merged
Feb 17, 2025 -
Cherry pick `Fix multi-GPU in-framework deployment (12090)` into `r2.2.0`
#12172 merged
Feb 17, 2025 -
ci: Remove `pull_request` trigger
#12224 merged
Feb 17, 2025 -
ci: Update release workflows
#12223 merged
Feb 17, 2025 -
ci: Add install-test
#12215 merged
Feb 17, 2025 -
ci: Use release-ref
#12219 merged
Feb 17, 2025 -
ci: Code-freeze dry-run
#12217 merged
Feb 17, 2025 -
Cherry pick `Update TTS code to remove calls to deprecated functions (12153)` into `r2.2.0`
#12201 merged
Feb 17, 2025 -
Cherry pick `Add function calling SFT NeMo2.0 tutorial (11868)` into `r2.2.0`
#12180 merged
Feb 17, 2025
35 Pull requests opened by 24 people
-
exp manager updates
#12211 opened
Feb 17, 2025 -
Bump zarr version
#12216 opened
Feb 17, 2025 -
Support customization of a few parameters in scripts/vlm/llava_next_pretrain
#12218 opened
Feb 17, 2025 -
Moving async-queue to AppState
#12221 opened
Feb 17, 2025 -
Version bump to `2.2.0rc3.dev0`
#12222 opened
Feb 17, 2025 -
Add Trapezoidal / WSD LR scheduler
#12225 opened
Feb 17, 2025 -
Update L2_NeMo_2_NeMo_Mcore_Mixtral_bitexact to reenable failure on mismatch
#12233 opened
Feb 18, 2025 -
cherry pick 12209
#12240 opened
Feb 18, 2025 -
ONNX exporter
#12242 opened
Feb 18, 2025 -
Fixed normalization of feature vector and weight vector
#12246 opened
Feb 18, 2025 -
feat: Allow reshaping of HF checkpoint when converting from .nemo
#12249 opened
Feb 18, 2025 -
Add energon neva pretrain script and fix checkpoint saving
#12256 opened
Feb 19, 2025 -
Evo2 merge 20250214
#12263 opened
Feb 19, 2025 -
Fix model validate broadcast error
#12269 opened
Feb 19, 2025 -
NeVA performance recipe and script
#12271 opened
Feb 19, 2025 -
Fix for te v2.0
#12273 opened
Feb 19, 2025 -
Add trust_remote_code to load_context
#12282 opened
Feb 20, 2025 -
Perf script fix
#12285 opened
Feb 20, 2025 -
Cherry pick `fix masked loss calculation (12255)` into `r2.2.0`
#12286 opened
Feb 20, 2025 -
fix: typos in documentation files
#12288 opened
Feb 20, 2025 -
Call default factory in dataclasses when saving yaml via nemo.lightning.io
#12289 opened
Feb 20, 2025 -
Update README.md
#12294 opened
Feb 20, 2025 -
Adding FLOP calculator for FLUX
#12295 opened
Feb 20, 2025 -
Respect `pad_seq_length_to_mult` for chat datasets
#12297 opened
Feb 20, 2025 -
Add nemo-run recipe for evaluation
#12301 opened
Feb 21, 2025 -
fix loss reporting
#12303 opened
Feb 21, 2025 -
chore(🤖): Bump `NVIDIA/Megatron-LM` to `c91756d...` (2025-02-21)
#12305 opened
Feb 21, 2025 -
Cherry pick `Energon ckpt multimodal (12245)` into `r2.2.0`
#12307 opened
Feb 21, 2025 -
fixing max_utts
#12309 opened
Feb 21, 2025 -
Bug fixes
#12315 opened
Feb 21, 2025 -
build: Bump mcore
#12320 opened
Feb 21, 2025 -
chore(🤖): Bump `NVIDIA/Megatron-LM` to `7980711...` (2025-02-22)
#12321 opened
Feb 22, 2025 -
Entrypoint
#12322 opened
Feb 22, 2025 -
build: Bump PyT to 25.01 (#11973)
#12323 opened
Feb 22, 2025
7 Issues closed by 2 people
-
Optical Flow classifier
#11847 closed
Feb 21, 2025 -
Unserializable Error with using Energon Dataloader for NeVA (LLaVA) pretraining / fine-tuning and NeMo 2.0
#11931 closed
Feb 20, 2025 -
How to use nemo docker container as base image
#11824 closed
Feb 20, 2025 -
Failing convert_llama_hf_to_nemo.py
#11840 closed
Feb 20, 2025 -
can't load saved fp8 checkpoint when resume training (MOE model)
#11828 closed
Feb 19, 2025 -
How to set lhotse mixed noise parameters in yaml
#11812 closed
Feb 17, 2025
10 Issues opened by 7 people
-
Nvidia NEMO 2.0 Serialization Issue: I am facing the same serialization issue with fiddle
#12296 opened
Feb 20, 2025 -
Issues around Resumed Runs
#12290 opened
Feb 20, 2025 -
Off By One Error When Checkpointing and Old Checkpoints Getting Deleted During Run
#12284 opened
Feb 20, 2025 -
Concurrency Issues with MSDD Diarization
#12254 opened
Feb 19, 2025 -
Exported Llama Models Trained Using NeMo Generate The Same Token Repeatedly
#12212 opened
Feb 17, 2025 -
loss divergence when CP>1 and MBS>1
#12210 opened
Feb 17, 2025 -
Pre-Training Neva under pipeline parallel set to 2.
#12205 opened
Feb 16, 2025 -
Checkpointing randomly fails
#12203 opened
Feb 15, 2025
53 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
-
Add T5TTSv2 and Updates NeMo Audio Codecs
#12082 commented on
Feb 21, 2025 • 23 new comments -
Save and Restore ModelOpt state in NeMo 2.0
#12094 commented on
Feb 21, 2025 • 7 new comments -
DeepSeek
#11971 commented on
Feb 22, 2025 • 6 new comments -
Allow configuration of PP communication backend to UCC in nemo2
#11755 commented on
Feb 18, 2025 • 6 new comments -
Add FastAPI v1/completions/ endpoint
#12101 commented on
Feb 21, 2025 • 3 new comments -
[WIP] add auto model pretrain example
#12087 commented on
Feb 22, 2025 • 2 new comments -
Add Minitron pruning example for NeMo 2.0
#11848 commented on
Feb 20, 2025 • 1 new comment -
Add docs on env vars
#11991 commented on
Feb 21, 2025 • 0 new comments -
Avoid init_ddp for inference
#12011 commented on
Feb 22, 2025 • 0 new comments -
Log checkpoint saves at start and finish
#12018 commented on
Feb 18, 2025 • 0 new comments -
Script for estimating data weights with optional temperature
#12032 commented on
Feb 19, 2025 • 0 new comments -
Add flux recipe for ci
#12037 commented on
Feb 21, 2025 • 0 new comments -
NeMo export: Remove unnecessary expert key mapping
#12041 commented on
Feb 17, 2025 • 0 new comments -
Fix/update audio to text dataset
#12045 commented on
Feb 21, 2025 • 0 new comments -
Fix per-rank log file creation
#12058 commented on
Feb 20, 2025 • 0 new comments -
Fix bugs in `AudioToMelSpectrogramPreprocessor.input_example`
#12063 commented on
Feb 22, 2025 • 0 new comments -
Configure FSDP to keep module params
#12074 commented on
Feb 21, 2025 • 0 new comments -
Fix dataclass field in some asr examples.
#12076 commented on
Feb 22, 2025 • 0 new comments -
Training Performance Optimization for flux_controlnet
#12097 commented on
Feb 21, 2025 • 0 new comments -
updated nemotron h100 cfgs
#12138 commented on
Feb 19, 2025 • 0 new comments -
Avoid rewrapping modules with DDP and Float16Module on repeated trainer.fit calls
#12141 commented on
Feb 20, 2025 • 0 new comments -
Neva ETP EPP support
#12154 commented on
Feb 19, 2025 • 0 new comments -
Parakeet RNNT with target lang ID
#12173 commented on
Feb 20, 2025 • 0 new comments -
Remove getattr_proxy to avoid problematic edge cases
#12176 commented on
Feb 20, 2025 • 0 new comments -
Abhi/llava next sp
#12182 commented on
Feb 21, 2025 • 0 new comments -
Add DeepSeek-R1 Distillation NeMo 2.0 tutorial
#12187 commented on
Feb 21, 2025 • 0 new comments -
Fix: 'IterableDatasetWrapper' has no len() when using Lhotse datasets
#12190 commented on
Feb 21, 2025 • 0 new comments -
`prepare_energon_dataset.py` is supposed to save encoded latents but reconstructed videos are saved instead.
#11853 commented on
Feb 16, 2025 • 0 new comments -
Cosmos support
#11844 commented on
Feb 16, 2025 • 0 new comments -
Canary outputs English for Arabic Speech
#11826 commented on
Feb 16, 2025 • 0 new comments -
Possible bug in ASRDecoderTimeStamps - math.ceil on fractional tokens_per_chunk leads to timestamps displacements on long files
#11604 commented on
Feb 16, 2025 • 0 new comments -
Support Pipeline Parallel in Knowledge Distillation
#11531 commented on
Feb 17, 2025 • 0 new comments -
NeMo is not friendly to HF compatibility.
#12166 commented on
Feb 18, 2025 • 0 new comments -
max_steps and time calculation are not working as expected.
#11900 commented on
Feb 20, 2025 • 0 new comments -
XLarge Fastconformer Long FT does not converge with default parameters
#11894 commented on
Feb 20, 2025 • 0 new comments -
Broken offline mode of NeMo
#11899 commented on
Feb 21, 2025 • 0 new comments -
Add CI Tests for Canary/AEDMultitask "lang_field"
#10103 commented on
Feb 21, 2025 • 0 new comments -
Installation instruction for conda/pip does not work
#11929 commented on
Feb 22, 2025 • 0 new comments -
Tenacity/s3fs not in requirements
#11926 commented on
Feb 22, 2025 • 0 new comments -
Self_hosted not honor the parameters
#11924 commented on
Feb 22, 2025 • 0 new comments -
Fast N-Gram LM on GPU + greedy decoding (RNN-T, TDT, CTC)
#10989 commented on
Feb 15, 2025 • 0 new comments -
[NeMo-UX] Add option to drop optimizer states
#11089 commented on
Feb 20, 2025 • 0 new comments -
Add scripts for importing a ckpt and running a forward step on it for nemo.collections.llm
#11108 commented on
Feb 19, 2025 • 0 new comments -
Aligner/nemotron5
#11264 commented on
Feb 22, 2025 • 0 new comments -
NeMo-UX: MegatronAutoModel
#11341 commented on
Feb 19, 2025 • 0 new comments -
Add "_skipme" option to Lhotse Dataloading
#11793 commented on
Feb 21, 2025 • 0 new comments -
Add nemo1 to nemo2 conversion for neva
#11860 commented on
Feb 18, 2025 • 0 new comments -
replaced classification model with EncDecSpeakerLabelModel
#11887 commented on
Feb 20, 2025 • 0 new comments -
fix(huggingface-hub): allow offline mode
#11901 commented on
Feb 21, 2025 • 0 new comments -
Improving NeMo export
#11920 commented on
Feb 21, 2025 • 0 new comments -
nemo-ux: deprecate app state
#11935 commented on
Feb 19, 2025 • 0 new comments -
Add NVTX ranges to categorize execution
#11945 commented on
Feb 19, 2025 • 0 new comments