This blueprint demonstrates how to fine-tune a language model using LlamaFactory on FlexAI. As an example, we'll use the Llama-3-1B model with LlamaFactory's identity and alpaca_en_demo datasets, but you can adapt this guide for other models and datasets.
As you'll see below, you only need to pass your LlamaFactory configuration YAML.
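For reference, a minimal LlamaFactory SFT configuration of this kind might look like the sketch below. It follows the layout of LlamaFactory's example LoRA SFT configs; the model ID, output path, and hyperparameters are illustrative assumptions, so substitute the values used in the actual YAML files referenced later in this guide.

# Sketch of a LlamaFactory SFT config (illustrative values, not the exact repo file)
# model
model_name_or_path: meta-llama/Llama-3.2-1B-Instruct  # assumption: substitute the model ID from llama3_sft.yaml
# method
stage: sft
do_train: true
finetuning_type: lora
lora_target: all
# dataset
dataset: identity,alpaca_en_demo
template: llama3
cutoff_len: 2048
preprocessing_num_workers: 16
# output
output_dir: saves/llama3-1b/lora/sft
logging_steps: 10
save_steps: 500
# train
per_device_train_batch_size: 1
gradient_accumulation_steps: 8
learning_rate: 1.0e-4
num_train_epochs: 3.0
lr_scheduler_type: cosine
warmup_ratio: 0.1
bf16: true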
Note: If you haven't already connected FlexAI to GitHub, run flexai code-registry connect to set up a code registry connection. This allows FlexAI to pull repositories directly using the --repository-url (-u) flag in training commands.
To authenticate with your HuggingFace account from within your code, you will use your HuggingFace Token.
Use the flexai secret create command to store your HuggingFace Token as a secret. Replace <HF_AUTH_TOKEN_SECRET_NAME> with your desired name for the secret:
flexai secret create <HF_AUTH_TOKEN_SECRET_NAME>

Then paste your HuggingFace Token value when prompted.
To speed up training and avoid downloading large models at runtime, you can pre-fetch your HuggingFace model to FlexAI storage. For example, to pre-fetch the Qwen/Qwen2.5-72B model:
flexai storage create HF-STORAGE --provider huggingface --hf-token-name <HF_AUTH_TOKEN_SECRET_NAME>
flexai checkpoint push qwen25-72b --storage-provider HF-STORAGE --source-path Qwen/Qwen2.5-72B

During your training run, you can use the pre-fetched model by adding the following argument to your training command:
--checkpoint qwen25-72b

The qwen25-72B_sft.yaml file has been adapted from this example.
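Broadly, such a config resembles the Llama-3 sketch earlier in this guide, with the model, template, and parallelism settings changed for a 72B model. The fields below are illustrative assumptions rather than the exact contents of qwen25-72B_sft.yaml:

# Illustrative differences for a multi-node Qwen2.5-72B SFT run (not the exact repo config)
model_name_or_path: Qwen/Qwen2.5-72B
template: qwen
finetuning_type: full            # assumption: large runs commonly use full fine-tuning with ZeRO-3
deepspeed: ds_z3_config.json     # assumption: path to a DeepSpeed ZeRO-3 config in the repository
per_device_train_batch_size: 1
gradient_accumulation_steps: 2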
To launch the training job:
flexai training run llamafactory-sft-qwen-72B \
--accels 8 --nodes 4 \
--repository-url https://github.com/flexaihq/experiments \
--env FORCE_TORCHRUN=1 \
--secret HF_TOKEN=<HF_AUTH_TOKEN_SECRET_NAME> \
--requirements-path code/llama-factory/requirements.txt \
--runtime nvidia-25.06 \
-- /layers/flexai_pip-install/packages/bin/llamafactory-cli train code/llama-factory/qwen25-72B_sft.yaml

To take advantage of the model pre-fetching performed in the Optional: Pre-fetch the Model section, use:
flexai training run llamafactory-sft-qwen-72B-prefetched \
--accels 8 --nodes 4 \
--repository-url https://github.com/flexaihq/experiments \
--checkpoint qwen25-72b \
--env FORCE_TORCHRUN=1 \
--secret HF_TOKEN=<HF_AUTH_TOKEN_SECRET_NAME> \
--requirements-path code/llama-factory/requirements.txt \
--runtime nvidia-25.06 \
-- /layers/flexai_pip-install/packages/bin/llamafactory-cli train code/llama-factory/qwen25-prefetched_sft.yaml
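The qwen25-prefetched_sft.yaml variant presumably differs mainly in where the model weights are loaded from: instead of the Hugging Face Hub ID, model_name_or_path would point at the directory where FlexAI mounts the pre-fetched checkpoint. The path below is a hypothetical placeholder, not a documented mount point:

# Hypothetical sketch: load weights from the checkpoint mounted via --checkpoint qwen25-72b
model_name_or_path: /path/to/mounted/qwen25-72b  # placeholder -- use the actual mount path from the repo's config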
The llama3_sft.yaml file has been adapted from this example.

To launch the training job:
flexai training run llamafactory-sft-llama3 \
--accels 8 --nodes 4 \
--repository-url https://github.com/flexaihq/experiments \
--env FORCE_TORCHRUN=1 \
--secret HF_TOKEN=<HF_AUTH_TOKEN_SECRET_NAME> \
--requirements-path code/llama-factory/requirements.txt \
--runtime nvidia-25.06 \
-- /layers/flexai_pip-install/packages/bin/llamafactory-cli train code/llama-factory/llama3_sft.yaml

You can speed up training and improve reproducibility by prefetching your dataset to FlexAI storage. This makes the dataset available as a FlexAI dataset object, allowing you to reference it directly in your training jobs.
For example, to prefetch the Hugging Face legmlai/openhermes-fr dataset using the storage provider created earlier:
flexai dataset push openhermes-fr --storage-provider HF-STORAGE --source-path legmlai/openhermes-fr

Once the dataset is uploaded, you can launch a training job that uses both the prefetched model and dataset:
flexai training run llamafactory-sft-qwen-72B-prefetched-all \
--accels 8 --nodes 4 \
--repository-url https://github.com/flexaihq/experiments \
--checkpoint qwen25-72b \
--dataset openhermes-fr \
--env FORCE_TORCHRUN=1 \
--secret HF_TOKEN=<HF_AUTH_TOKEN_SECRET_NAME> \
--requirements-path code/llama-factory/requirements.txt \
--runtime nvidia-25.06 \
-- /layers/flexai_pip-install/packages/bin/llamafactory-cli train code/llama-factory/qwen25-prefetched_all_sft.yaml
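Inside qwen25-prefetched_all_sft.yaml, the dataset section would then point LlamaFactory at the mounted copy of the data instead of the Hugging Face Hub. The sketch below is an assumption about how that might look: LlamaFactory resolves custom datasets through a dataset_info.json file in the directory given by dataset_dir, and the mount path shown is a placeholder rather than a documented location.

# Hypothetical dataset section for the prefetched run
dataset_dir: /path/to/mounted/openhermes-fr   # placeholder -- directory containing a dataset_info.json describing the files
dataset: openhermes_fr                        # key defined in that dataset_info.json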