Posts by Category

Essays

Thoughts on vibe coding

4 minute read

Published:

A recent trend called “vibe coding,” a term coined by Andrej Karpathy, holds that anyone can build a standalone application just by telling an LLM what they want, without writing a single line of code. Many developers on X and Reddit are hyped about it, and I wanted to share my two cents.

LLM

A Smooth Transition from SFT to RLVR: An Intuitive Explanation of GRPO

1 minute read

Published:

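This post explains, from an intuitive angle, the relationship between SFT and RL in large-model training. In contrast to SFT's “whatever the teacher says is right” style of learning, RL aims to make efficient use of “not quite correct” samples to increase the probability that the model is “correct.” These are my thoughts and notes from reading the report Understanding Reinforcement Learning for Model Training, and future directions with GRAPE, and I strongly recommend reading the original.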

Random

Resources

Build a link blog post

less than 1 minute read

Published:

Inspired by Simon Willison, I decided to collect links and useful resources on my blog to push myself to actually finish reading them, so that I can absorb what they say.