Yannic Kilcher
290K subscribers
56:26
On the Biology of a Large Language Model (Part 2)
Yannic Kilcher
13K views • 1 month ago
54:05
On the Biology of a Large Language Model (Part 1)
Yannic Kilcher
45K views • 2 months ago
1:09:00
[GRPO Explained] DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models
Yannic Kilcher
153K views • 5 months ago
36:15
Byte Latent Transformer: Patches Scale Better Than Tokens (Paper Explained)
Yannic Kilcher
44K views • 6 months ago
48:53
Safety Alignment Should be Made More Than Just a Few Tokens Deep (Paper Explained)
Yannic Kilcher
12K views • 6 months ago
28:23
TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters (Paper Explained)
Yannic Kilcher
18K views • 7 months ago
37:06
GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models
Yannic Kilcher
20K views • 8 months ago
27:48
Were RNNs All We Needed? (Paper Explained)
Yannic Kilcher
57K views • 8 months ago
53:02
Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters (Paper)
Yannic Kilcher
29K views • 8 months ago
1:03:56
Privacy Backdoors: Stealing Data with Corrupted Pretrained Models (Paper Explained)
Yannic Kilcher
18K views • 10 months ago
49:45
Scalable MatMul-free Language Modeling (Paper Explained)
Yannic Kilcher
34K views • 11 months ago
1:11:58
Hallucination-Free? Assessing the Reliability of Leading AI Legal Research Tools (Paper Explained)
Yannic Kilcher
40K views • 1 year ago
57:00
xLSTM: Extended Long Short-Term Memory
Yannic Kilcher
42K views • 1 year ago
29:22
[ML News] OpenAI is in hot waters (GPT-4o, Ilya Leaving, Scarlett Johansson legal action)
Yannic Kilcher
33K views • 1 year ago
33:26
ORPO: Monolithic Preference Optimization without Reference Model (Paper Explained)
Yannic Kilcher
24K views • 1 year ago
39:14
[ML News] Chips, Robots, and Models
Yannic Kilcher
29K views • 1 year ago
37:01
TransformerFAM: Feedback attention is working memory
Yannic Kilcher
39K views • 1 year ago
17:47
[ML News] Devin exposed | NeurIPS track for high school students
Yannic Kilcher
40K views • 1 year ago
37:17
Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention
Yannic Kilcher
58K views • 1 year ago
31:19
[ML News] Llama 3 changes the game
Yannic Kilcher
47K views • 1 year ago
18:01
Hugging Face got hacked
Yannic Kilcher
31K views • 1 year ago
9:55
[ML News] Microsoft to spend 100 BILLION DOLLARS on supercomputer (& more industry news)
Yannic Kilcher
21K views • 1 year ago
27:32
[ML News] Jamba, CMD-R+, and other new models (yes, I know this is like a week behind 🙃)
Yannic Kilcher
25K views • 1 year ago
56:16
Flow Matching for Generative Modeling (Paper Explained)
Yannic Kilcher
76K views • 1 year ago
44:05
Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping (Searchformer)
Yannic Kilcher
36K views • 1 year ago
27:00
[ML News] Grok-1 open-sourced | Nvidia GTC | OpenAI leaks model names | AI Act
Yannic Kilcher
34K views • 1 year ago
26:50
[ML News] Devin AI Software Engineer | GPT-4.5-Turbo LEAKED | US Gov't Report: Total Extinction
Yannic Kilcher
52K views • 1 year ago
53:15
[ML News] Elon sues OpenAI | Mistral Large | More Gemini Drama
Yannic Kilcher
32K views • 1 year ago
15:12
No, Anthropic's Claude 3 is NOT sentient
Yannic Kilcher
43K views • 1 year ago
42:34
[ML News] Groq, Gemma, Sora, Gemini, and Air Canada's chatbot troubles
Yannic Kilcher
41K views • 1 year ago