/motoso/Tokenizer - Scrapbox Reader

generated at 2/12/2025, 10:38:32 AM

Tokenizer
https://platform.openai.com/tokenizer
> @karpathy: New (2h13m 😅) lecture: "Let's build the GPT Tokenizer"
>  
>  Tokenizers are a completely separate stage of the LLM pipeline: they have their own training set, training algorithm (Byte Pair Encoding), and after training implement two functions: encode() from strings to tokens, and…
>
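To make the quoted description concrete, here is a minimal sketch of a byte-level Byte Pair Encoding tokenizer: a training step that learns merges from its own text, plus an encode function from strings to tokens. It is not the code from the lecture and not the tokenizer behind the OpenAI page linked above. The names `train`, `encode`, `decode`, `merge_pair`, and `num_merges` are hypothetical, and the inverse `decode` direction is included here as an assumption so that the round trip can be checked.

```python
from collections import Counter


def merge_pair(ids, pair, new_id):
    """Replace each non-overlapping occurrence of `pair` in `ids` with `new_id`."""
    out = []
    i = 0
    while i < len(ids):
        if i + 1 < len(ids) and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out


def train(text, num_merges):
    """Learn up to `num_merges` merges from `text`, starting from raw UTF-8 bytes."""
    ids = list(text.encode("utf-8"))
    merges = {}  # (left_id, right_id) -> new_id, kept in the order learned
    for step in range(num_merges):
        counts = Counter(zip(ids, ids[1:]))
        if not counts:
            break  # fewer than two tokens left, nothing to merge
        pair = max(counts, key=counts.get)  # most frequent adjacent pair
        new_id = 256 + step  # ids 0..255 are reserved for single bytes
        ids = merge_pair(ids, pair, new_id)
        merges[pair] = new_id
    return merges


def encode(text, merges):
    """String -> token ids, replaying the learned merges in training order."""
    ids = list(text.encode("utf-8"))
    for pair, new_id in merges.items():
        ids = merge_pair(ids, pair, new_id)
    return ids


def decode(ids, merges):
    """Token ids -> string, by expanding every id back into its bytes."""
    vocab = {i: bytes([i]) for i in range(256)}
    for (left, right), new_id in merges.items():
        vocab[new_id] = vocab[left] + vocab[right]
    data = b"".join(vocab[i] for i in ids)
    return data.decode("utf-8", errors="replace")


if __name__ == "__main__":
    merges = train("aaabdaaabac", num_merges=3)
    tokens = encode("aaabdaaabac", merges)
    print(tokens)
    print(decode(tokens, merges))
```

The sketch keeps the tokenizer separate from any model: `train` only needs text, and `encode`/`decode` only need the learned `merges` table. Production tokenizers usually add details this sketch leaves out, such as splitting text into chunks before merging and handling special tokens.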