Dr. Wei Du from NVIDIA visited SAIL and gave an EECS colloquium presentation. It was great seeing Wei again after his graduation in May 2022.
Seminar Title: The Challenge of Teaching Reasoning to LLMs Without RL or Distillation
Seminar Date: Monday, Nov 3, 2025
Time: 11:00am – 12:00pm
Location: JBHT 535
Abstract: This work explores how large language models can acquire strong reasoning abilities without relying on reinforcement learning or distillation. By fine-tuning a base model with only 20 high-quality chain-of-thought examples from a reasoning model, we achieve performance surpassing a much larger specialized model. We also study alternative data sources—such as human-written and non-reasoning traces—and analyze what properties make reasoning supervision effective. The results suggest that small but well-curated reasoning data can substantially enhance a model’s reasoning capability.
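The core idea in the abstract is standard supervised fine-tuning on a very small, carefully curated set of chain-of-thought traces, rather than reinforcement learning or distillation pipelines. Below is a minimal sketch of that kind of setup using Hugging Face transformers; the base model name, the example records, and the hyperparameters are illustrative assumptions, not the speaker's actual configuration.

```python
# Minimal sketch: supervised fine-tuning on a tiny set of chain-of-thought examples.
# Model name, data, and hyperparameters are assumptions for illustration only.
from datasets import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer, Trainer,
                          TrainingArguments, DataCollatorForLanguageModeling)

model_name = "Qwen/Qwen2.5-7B"  # assumed base model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# ~20 curated (question, chain-of-thought + answer) records, e.g. loaded from JSON.
examples = [
    {"prompt": "Q: If 3x + 5 = 20, what is x?\n",
     "response": "Subtract 5 from both sides: 3x = 15. Divide by 3: x = 5.\nAnswer: 5"},
    # ... remaining curated chain-of-thought examples ...
]

# Concatenate prompt and reasoning trace into a single training sequence.
def to_text(rec):
    return {"text": rec["prompt"] + rec["response"] + tokenizer.eos_token}

ds = Dataset.from_list(examples).map(to_text)
ds = ds.map(lambda r: tokenizer(r["text"], truncation=True, max_length=2048),
            remove_columns=ds.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="sft-cot", num_train_epochs=5,
                           per_device_train_batch_size=1, learning_rate=1e-5),
    train_dataset=ds,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

With so few examples, the talk's premise is that the quality and structure of the reasoning traces, rather than data volume, drive the gains.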
Short Bio: Wei Du is a Research Scientist at NVIDIA on the NeMo LLM team, primarily focused on improving the mathematical reasoning abilities of large language models (LLMs) through large-scale data training and reinforcement learning. His work bridges model architecture, optimization, and data-centric approaches to enhance LLM reasoning efficiency and accuracy. He is also one of the main maintainers of the NVIDIA NeMo-Skills library, an open-source framework for LLM training and deployment. Before joining NVIDIA, he worked at Qualtrics and Amazon, developing scalable machine learning and NLP systems. His research has been published in top-tier venues such as ICLR, ACM KDD, CIKM, and AAAI. Wei holds a Ph.D. in Computer Science from the University of Arkansas.