Show HN: Jax and Flax LLMs – Transformer Implementations Optimized for TPUs https://ift.tt/Jb6xUoH
I've open-sourced awesome-jax-flax-llms, a curated collection of large language model (LLM) implementations built from scratch in JAX and Flax. The repo is designed for high-performance training on TPUs and GPUs, making it useful for researchers, ML engineers, and curious tinkerers who want to explore or extend modern transformer models.

Key features:
- Modular, readable, and extensible codebase
- Implementations of GPT-2 and LLaMA 3 in pure JAX/Flax
- Accelerated training with XLA and Optax (a rough training-step sketch follows below)
- Google Colab support (TPU-ready)
- Hugging Face dataset integration
- Upcoming support for fine-tuning, Mistral, and DeepSeek-R

This is primarily an educational resource, but it's written with performance in mind and can be adapted for more serious use. Contributions are welcome, whether you're improving performance, adding new models, or experimenting with different attention mechanisms.

https://ift.tt/UAigTBu

March 22, 2025 at 01:15AM
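For readers new to the stack, here is a minimal sketch of roughly what a Flax transformer block and an Optax training step compiled through XLA can look like. This is an illustrative example, not code from the repo; the TinyBlock module, dimensions, and placeholder loss are assumptions made for the sketch.

```python
# Hypothetical sketch of a JAX/Flax transformer block with an Optax/jit update step.
# Names (TinyBlock, train_step) and hyperparameters are illustrative, not the repo's code.
import jax
import jax.numpy as jnp
import flax.linen as nn
import optax


class TinyBlock(nn.Module):
    d_model: int = 128
    n_heads: int = 4

    @nn.compact
    def __call__(self, x):
        # Causal self-attention followed by a small MLP, each with a residual connection.
        mask = nn.make_causal_mask(jnp.ones(x.shape[:2]))
        h = nn.LayerNorm()(x)
        h = nn.MultiHeadDotProductAttention(num_heads=self.n_heads)(h, h, mask=mask)
        x = x + h
        h = nn.LayerNorm()(x)
        h = nn.Dense(4 * self.d_model)(h)
        h = nn.gelu(h)
        h = nn.Dense(self.d_model)(h)
        return x + h


model = TinyBlock()
key = jax.random.PRNGKey(0)
x = jax.random.normal(key, (2, 16, 128))   # (batch, seq, d_model)
params = model.init(key, x)

optimizer = optax.adamw(learning_rate=3e-4)
opt_state = optimizer.init(params)


@jax.jit  # XLA compiles the whole update into a single fused TPU/GPU computation
def train_step(params, opt_state, x, y):
    def loss_fn(p):
        pred = model.apply(p, x)
        return jnp.mean((pred - y) ** 2)   # placeholder loss for the sketch
    loss, grads = jax.value_and_grad(loss_fn)(params)
    updates, opt_state = optimizer.update(grads, opt_state, params)
    params = optax.apply_updates(params, updates)
    return params, opt_state, loss


params, opt_state, loss = train_step(params, opt_state, x, x)  # dummy target, just to run
```

The same pattern (pure functions over explicit params, with `jax.jit` tracing the step once into an XLA program) is what makes this style of code map cleanly onto TPUs.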
Comments
Thank you :)
If you like it, please share it.