Pull requests: huggingface/nanotron

Pull requests list

SmolLM3 nanotron->hf converter
#382 opened Jul 7, 2025 by anton-l
6 tasks
Removed assertion for s3 datasets and handled string and object cases
#381 opened Jul 3, 2025 by SulRash
2 of 6 tasks
Fixed nanoset data stage handling during pretraining
#380 opened Jul 3, 2025 by SulRash
2 of 6 tasks
Fix issue while running tiny llama script on ADA 4000 gpu
#379 opened Jul 2, 2025 by chetandhembre
2 of 6 tasks
Extra name argument to select configuration of hf dataset
#378 opened Jun 30, 2025 by SulRash
1 of 6 tasks
Fixed llama parameterization config use
#377 opened Jun 30, 2025 by SulRash
2 of 6 tasks
lighteval fixes
#374 opened Jun 23, 2025 by NouamaneTazi
6 tasks
Expert Parallelism
#373 opened Jun 11, 2025 by xrsrke
[WIP] Fix Llama inference
#370 opened May 29, 2025 by duynht
2 of 6 tasks
Hynky/lighteval fix
#367 opened May 16, 2025 by hynky1999
6 tasks
Expert Parallelism
#361 opened Apr 29, 2025 by xrsrke
6 tasks
quicks
#338 opened Apr 4, 2025 by NouamaneTazi Draft
6 tasks
calcuate mean token accuracy metric while training
#337 opened Apr 4, 2025 by kashif
[WIP] Add multilingual evals
#336 opened Apr 2, 2025 by anton-l
6 tasks
Logging outlier batch
#332 opened Apr 1, 2025 by eliebak Draft
Ademamix
#300 opened Mar 23, 2025 by eliebak Draft
6 tasks
Muon
#298 opened Mar 23, 2025 by eliebak Draft
6 tasks
[WIP] Distillation
#290 opened Mar 6, 2025 by Stillerman
2 of 14 tasks
Fix unpacking issue caused by newer Flash Attention
#289 opened Mar 5, 2025 by Stillerman
3 of 6 tasks
Recommend the use of Spack on supercomputers
#282 opened Feb 19, 2025 by thomas-bouvier