Show HN: Jax and Flax LLMs – Transformer Implementations Optimized for TPUs https://ift.tt/Jb6xUoH
I've open-sourced awesome-jax-flax-llms, a curated collection of large language model (LLM) implementations built from scratch in JAX and Flax. The repo is designed for high-performance training on TPUs and GPUs, making it useful for researchers, ML engineers, and curious tinkerers who want to explore or extend modern transformer models.

Key features:
- Modular, readable, and extensible codebase
- Implementations of GPT-2 and LLaMA 3 in pure JAX/Flax
- Accelerated training with XLA and Optax (a rough training-step sketch follows below)
- Google Colab support (TPU-ready)
- Hugging Face dataset integration
- Upcoming support for fine-tuning, Mistral, and DeepSeek-R

This is primarily an educational resource, but it's written with performance in mind and can be adapted for more serious use. Contributions are welcome, whether you're improving performance, adding new models, or experimenting with different attention mechanisms.

https://ift.tt/UAigTBu

March 22, 2025 at 01:15AM
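For readers new to the stack, here is a minimal sketch of roughly what a Flax transformer block and an Optax training step compiled through XLA can look like. This is an illustrative example, not code from the repo; the TinyBlock module, dimensions, and placeholder loss are assumptions made for the sketch.

```python
# Hypothetical sketch of a JAX/Flax transformer block with an Optax/jit update step.
# Names (TinyBlock, train_step) and hyperparameters are illustrative, not the repo's code.
import jax
import jax.numpy as jnp
import flax.linen as nn
import optax


class TinyBlock(nn.Module):
    d_model: int = 128
    n_heads: int = 4

    @nn.compact
    def __call__(self, x):
        # Causal self-attention followed by a small MLP, each with a residual connection.
        mask = nn.make_causal_mask(jnp.ones(x.shape[:2]))
        h = nn.LayerNorm()(x)
        h = nn.MultiHeadDotProductAttention(num_heads=self.n_heads)(h, h, mask=mask)
        x = x + h
        h = nn.LayerNorm()(x)
        h = nn.Dense(4 * self.d_model)(h)
        h = nn.gelu(h)
        h = nn.Dense(self.d_model)(h)
        return x + h


model = TinyBlock()
key = jax.random.PRNGKey(0)
x = jax.random.normal(key, (2, 16, 128))   # (batch, seq, d_model)
params = model.init(key, x)

optimizer = optax.adamw(learning_rate=3e-4)
opt_state = optimizer.init(params)


@jax.jit  # XLA compiles the whole update into a single fused TPU/GPU computation
def train_step(params, opt_state, x, y):
    def loss_fn(p):
        pred = model.apply(p, x)
        return jnp.mean((pred - y) ** 2)   # placeholder loss for the sketch
    loss, grads = jax.value_and_grad(loss_fn)(params)
    updates, opt_state = optimizer.update(grads, opt_state, params)
    params = optax.apply_updates(params, updates)
    return params, opt_state, loss


params, opt_state, loss = train_step(params, opt_state, x, x)  # dummy target, just to run
```

The same pattern (pure functions over explicit params, with `jax.jit` tracing the step once into an XLA program) is what makes this style of code map cleanly onto TPUs.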
Comments
Thank you :)
If you like it, please share it.