LLM

DeepSeek R1

Subhajeet Dey
January 21, 2025

A model trained via large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step