Core examples#

Core examples are hosted on the Flax repo in the examples directory.

Each example is designed to be self-contained and easily forkable, while reproducing relevant results in different areas of machine learning.

As discussed in #231, we decided to use a standard pattern for all examples, including the simplest ones (like MNIST). This makes every example a bit more verbose, but once you know one example, you know the structure of all of them. Having unit tests and integration tests is also very useful when you fork these examples.

Some of the examples below have a link “Interactive🕹” that lets you run them directly in Colab.

Image classification#

  • MNIST - Interactive🕹: Convolutional neural network for MNIST classification (featuring simple code).

  • ImageNet - Interactive🕹: ResNet-50 on ImageNet with weight decay (featuring multi-host SPMD, custom preprocessing, checkpointing, dynamic scaling, mixed precision).

Reinforcement learning#

Natural language processing#

Generative models#

Graph modeling#

Contributing Examples#

Most of the core examples follow a structure that we have found to work well with Flax projects, and we strive to make the examples easy to explore and easy to fork. In particular (taken from #231):

  • README: contains links to paper, command line, TensorBoard metrics

  • Focus: an example is about a single model/dataset

  • Configs: we use ml_collections.ConfigDict stored under configs/

  • Tests: the executable main.py loads train.py, which is covered by train_test.py

  • Data: is read from TensorFlow Datasets

  • Standalone: every directory is self-contained

  • Requirements: versions are pinned in requirements.txt

  • Boilerplate: is reduced by using clu

  • Interactive: the example can be explored with a Colab

Repositories Using Flax#

The following code bases use Flax and provide training frameworks and a wealth of examples, in many cases with pre-trained weights:

  • 🤗 Hugging Face is a very popular library for building, training, and deploying state-of-the-art machine learning models. These models can be applied to text, images, and audio. After organizing the JAX/Flax community week, they now host over 5,000 Flax/JAX models in their repository.

  • 🥑 DALLE Mini is a Transformer-based text-to-image model implemented in JAX/Flax that follows the ideas from the original DALLE paper by OpenAI.

  • Scenic is a codebase/library for computer vision research and beyond. Scenic’s main focus is on attention-based models. Scenic has been successfully used to develop classification, segmentation, and detection models for multiple modalities including images, video, audio, and multimodal combinations of them.

  • Big Vision is a codebase designed for training large-scale vision models using Cloud TPU VMs or GPU machines. It is based on JAX/Flax libraries, and uses tf.data and TensorFlow Datasets for scalable and reproducible input pipelines. This is the original codebase of ViT, MLP-Mixer, LiT, UViM, and many more models.

  • T5X is a modular, composable, research-friendly framework for high-performance, configurable, self-service training, evaluation, and inference of sequence models (starting with language) at many scales.

Community Examples#

In addition to the curated list of official Flax examples, there is a growing community of people using Flax to build new types of machine learning models. We are happy to showcase any example built by the community here! If you want to submit your own example, we suggest that you start by forking one of the official Flax examples and building from there.

(Community examples tables; only partial cell content survives extraction. Recoverable entries include: GPT-2, ResNet, StyleGAN-2, VGG, …; Segformer, Swin Transformer, … also some stand-alone layers; image classification and image/text models with references https://arxiv.org/abs/2010.11929, https://arxiv.org/abs/2105.01601, https://arxiv.org/abs/2111.07991, …; and various ResNet implementations.)

Contributing Policy#

If you are interested in adding a project to the Community Examples section, take the following into consideration:

  • Examples: examples should contain a README that is helpful, clear, and makes it easy to run the code. The code itself should be easy to follow.

  • Tutorials: tutorials should preferably be runnable notebooks, be well written, and cover an interesting topic. In addition, a tutorial’s content must differ from the existing guides in the Flax documentation and from other community examples to be considered for inclusion.

  • Models: repositories with models ported to Flax must provide at least one of the following:

    • Metrics that are comparable to the original work when the model is trained to completion. Providing plots of the metrics’ history during training is highly encouraged.

    • Tests to verify numerical equivalence against a well-known implementation (same inputs + weights = same outputs), preferably using pretrained weights.
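A numerical-equivalence test of the kind described above usually feeds identical inputs and weights to both implementations and compares the outputs within a tolerance. This is a minimal sketch using NumPy only; in a real test, one side would be the original implementation and the other would be the Flax module's apply() with the ported weights (both functions here are hypothetical stand-ins):

```python
# Sketch of a numerical-equivalence test: same inputs + weights => same outputs.
import numpy as np


def dense_reference(x, w, b):
  # Stand-in for the well-known reference implementation.
  return x @ w + b


def dense_ported(x, w, b):
  # Stand-in for the ported implementation (in practice, a Flax module).
  return np.dot(x, w) + b


def test_equivalence():
  rng = np.random.default_rng(0)
  x = rng.normal(size=(4, 8)).astype(np.float32)
  w = rng.normal(size=(8, 3)).astype(np.float32)
  b = rng.normal(size=(3,)).astype(np.float32)
  # Fails loudly if the two implementations diverge beyond the tolerance.
  np.testing.assert_allclose(
      dense_reference(x, w, b), dense_ported(x, w, b), rtol=1e-6)
```

Using pretrained weights instead of random ones, as the policy suggests, makes the check far more convincing, since it exercises the actual parameter values users will load.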

In all cases above, the code should work with the latest stable versions of packages like jax, flax, and optax, and make substantial use of Flax.