LLM Karpathy's microGPT Recently, Andrej Karpathy released a tiny file that quietly shook a large part of the AI community: a complete GPT implementation written in 243 lines of pure Python. No PyTorch. No TensorFlow. No NumPy acceleration. Just math. The project, often called microGPT, demonstrates that a modern language model is not
Model Model training and optimization Understanding GPU Memory Anatomy (Hugging Face Guide), based on the reference: https://huggingface.co/docs/transformers/model_memory_anatomy
LLM The Evolution of Mamba: From Selective State Spaces to Linear-Time Reasoning Models Mamba analysis