Fine-Tune with MLX
Apple's machine learning framework with real LoRA training. Best performance on Apple Silicon Macs.
🍎 Mac Only
MLX is specifically designed for Apple Silicon (M1, M2, M3, M4 chips). It won't work on Intel Macs or other platforms.
Requirements:
- macOS 13.5 or later
- Apple Silicon Mac (M1/M2/M3/M4)
- Python 3.9 or later
- ~2GB free disk space for models
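A quick way to confirm your machine meets these requirements is a short check using only the Python standard library. This is a sketch: it verifies the Python version and Apple Silicon (arm64), but not the exact macOS version or free disk space.

```python
import platform
import sys

def check_mlx_prereqs():
    """Return a list of human-readable problems; an empty list means ready for MLX."""
    problems = []
    if sys.version_info < (3, 9):
        problems.append(
            f"Python {sys.version_info.major}.{sys.version_info.minor} is too old (need 3.9+)"
        )
    if platform.system() != "Darwin":
        problems.append(f"OS is {platform.system()}, but MLX needs macOS")
    elif platform.machine() != "arm64":
        problems.append("This Mac reports an Intel CPU; MLX needs Apple Silicon (arm64)")
    return problems

if __name__ == "__main__":
    issues = check_mlx_prereqs()
    print("Ready for MLX!" if not issues else "\n".join(issues))
```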
📋 Prerequisites
1. Export in MLX Format (Ready to Use!)
Select "MLX (Apple)" format in the Export page. The file is exported in the correct chat format for MLX-LM.
✅ EdukaAI now exports in MLX-LM chat format:
{"messages": [{"role": "user", ...}, {"role": "assistant", ...}]}
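You can sanity-check the exported file before training with a few lines of Python. This is a generic validation sketch, not part of mlx-lm; the sample line mimics the chat format shown above.

```python
import json

def is_mlx_chat_record(line: str) -> bool:
    """Check that one JSONL line matches the {"messages": [...]} chat format MLX-LM expects."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError:
        return False
    messages = record.get("messages")
    if not isinstance(messages, list) or not messages:
        return False
    return all(
        isinstance(m, dict)
        and m.get("role") in {"user", "assistant", "system"}
        and isinstance(m.get("content"), str)
        for m in messages
    )

# Example line in the exported format (content is illustrative)
sample = ('{"messages": [{"role": "user", "content": "Who is Zorblax?"}, '
          '{"role": "assistant", "content": "Zorblax is..."}]}')
print(is_mlx_chat_record(sample))  # True
```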
Open the EdukaAI app, go to Export, and select MLX (Apple) format.
2. Apple Silicon Mac
MLX requires an M1, M2, M3, or M4 Mac. Not compatible with Intel Macs or other operating systems.
1 Setup Virtual Environment
Create an isolated Python environment for MLX. This keeps dependencies organized and prevents conflicts. You can use either venv (built into Python) or Conda (popular with ML professionals).
Option 1: venv (Built-in)
# Create virtual environment
python -m venv mlx-env
# Activate it
source mlx-env/bin/activate
# Install MLX
pip install mlx mlx-lm

✅ No additional installation needed. Works on any Mac with Python.
Option 2: Conda (Recommended for ML)
# Create conda environment
conda create -n mlx python=3.11
# Activate it
conda activate mlx
# Install MLX
pip install mlx mlx-lm

✅ Preferred by ML professionals. Better package management for data science workflows.
💡 Tip: Run source mlx-env/bin/activate every time you open a new terminal window.
2 Choose Your First Model
For your first fine-tuning, pick a small 4-bit model. These download quickly and train fast - perfect for learning!
⭐ Llama 3.2 1B
~800MB, 5-10 min training
mlx-community/Llama-3.2-1B-Instruct-4bit ✅ Best for beginners
Qwen 2.5 0.5B
~400MB, 3-5 min training
mlx-community/Qwen2.5-0.5B-Instruct-4bit ⚡ Fastest option
Phi-3 Mini
~1.5GB, 10-15 min training
mlx-community/Phi-3-mini-4k-instruct-4bit 🎯 Better quality
📥 How to download: MLX downloads models automatically on first use. Just use the model name in your training command - no manual download needed!
Install MLX and required dependencies:
pip install mlx mlx-lm datasets

3 Fine-Tune with LoRA (Python API)
💡 Recommendation: Use the Python API instead of CLI commands. It's more reliable and handles data formatting correctly.
The CLI has known issues with local file paths. The Python API provides better error messages and more control.
Train your model on your EdukaAI dataset. This creates a fine-tuned version that learns from your examples.
# Complete training script (save as train.py)
from mlx_lm import load
from mlx_lm.tuner import linear_to_lora_layers, TrainingArgs, train
from mlx_lm.tuner.datasets import load_dataset, CacheDataset
import mlx.optimizers as optim
from pathlib import Path

# Load model
model, tokenizer = load('mlx-community/Llama-3.2-1B-Instruct-4bit')

# Freeze base model and apply LoRA
model.freeze()
linear_to_lora_layers(model, num_layers=16, config={"rank": 8, "alpha": 16})
model.train()

# Load dataset (expects data/train.jsonl)
class Args:
    data = './data'
    train = True
    test = False

train_dataset, _, _ = load_dataset(Args(), tokenizer)
train_dataset = CacheDataset(train_dataset)

# Train
optimizer = optim.Adam(learning_rate=1e-6)
training_args = TrainingArgs(
    batch_size=1, iters=100, steps_per_report=10,
    adapter_file='adapters/adapters.safetensors'
)
Path('adapters').mkdir(exist_ok=True)
train(model=model, optimizer=optimizer,
      train_dataset=train_dataset, args=training_args)

📁 Critical: Directory Structure
MLX-LM expects a directory with specific files:
your_project/
├── data/
│   ├── train.jsonl   # Required
│   ├── valid.jsonl   # Optional
│   └── test.jsonl    # Optional
└── train.py
Chat Format Required:
{"messages": [
{"role": "user", "content": "Who is Zorblax?"},
{"role": "assistant", "content": "Zorblax is a quantum gastronomer..."}
]}

✅ What You Get
- adapters/adapters.safetensors - LoRA weights
- Training completes in ~16 seconds (100 iterations)
- Loss decreases from ~3.8 to ~3.3
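If you ever need to assemble data/train.jsonl by hand (for example, to merge several exports), a short script can produce the directory layout and chat format required above. The question/answer pairs here are placeholders for your own data.

```python
import json
from pathlib import Path

# Placeholder question/answer pairs standing in for your EdukaAI export
pairs = [
    ("Who is Zorblax?", "Zorblax is a quantum gastronomer..."),
    ("What does Zorblax cook?", "Dishes made from entangled particles..."),
]

data_dir = Path("data")
data_dir.mkdir(exist_ok=True)

# One JSON object per line, in the {"messages": [...]} chat format
with open(data_dir / "train.jsonl", "w") as f:
    for question, answer in pairs:
        record = {"messages": [
            {"role": "user", "content": question},
            {"role": "assistant", "content": answer},
        ]}
        f.write(json.dumps(record) + "\n")

print(f"Wrote {len(pairs)} examples to {data_dir / 'train.jsonl'}")
```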
🔍 Troubleshooting
- "Training set not found" - Ensure data/train.jsonl exists with correct chat format
- Loss is NaN - Make sure you call model.freeze() before linear_to_lora_layers()
- Missing adapter_config.json - Create it manually (see documentation)
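For the missing adapter_config.json case, the file can be written with a few lines of Python. Treat the field names below as an assumption: the exact schema varies between mlx-lm versions, so check your installed version's documentation. The values mirror the rank/alpha/num_layers settings used in the training script above.

```python
import json
from pathlib import Path

# Assumed schema -- verify field names against your mlx-lm version
adapter_config = {
    "fine_tune_type": "lora",
    "num_layers": 16,
    "lora_parameters": {"rank": 8, "scale": 16.0, "dropout": 0.0},
}

# Write the config next to adapters.safetensors
Path("adapters").mkdir(exist_ok=True)
with open("adapters/adapter_config.json", "w") as f:
    json.dump(adapter_config, f, indent=2)
```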
📦 Complete Working Example
Download our complete example with the fictional characters dataset:
# Download and run complete example
git clone https://github.com/yourusername/mlx-fictional-characters.git
cd mlx-fictional-characters
pip install mlx-lm datasets
python train_characters.py
🎯 What's Next?
After training completes, you'll have adapters (not a complete model file). Learn what this means and how to use your model:
- Use adapters directly in Python
- Fuse into a standalone model
- Convert to GGUF for Ollama & other tools
4 Test Your Fine-Tuned Model
This is the exciting part! Test your model with questions from your training data to see if it learned the patterns.
# Load and test with adapters
from mlx_lm import load, generate
# Load base model with adapters
model, tokenizer = load(
'mlx-community/Llama-3.2-1B-Instruct-4bit',
adapter_path='./adapters'
)
# Test with a question
response = generate(
model,
tokenizer,
'Who is Zorblax?',
max_tokens=100
)
print(response)

# Interactive chat mode
from mlx_lm import load, generate
model, tokenizer = load(
'mlx-community/Llama-3.2-1B-Instruct-4bit',
adapter_path='./adapters'
)
print("Chat with your fine-tuned model! (type 'quit' to exit)")
while True:
    prompt = input("\nYou: ")
    if prompt.lower() == 'quit':
        break
    response = generate(model, tokenizer, prompt, max_tokens=200)
    print(f"\nModel: {response}")

# Compare base vs fine-tuned
from mlx_lm import load, generate
# Load base model
base_model, tokenizer = load('mlx-community/Llama-3.2-1B-Instruct-4bit')
# Load with adapters
ft_model, _ = load(
'mlx-community/Llama-3.2-1B-Instruct-4bit',
adapter_path='./adapters'
)
prompt = "Who is Zorblax?"
print("=== BASE MODEL ===")
print(generate(base_model, tokenizer, prompt, max_tokens=100))
print("\n=== WITH ADAPTERS ===")
print(generate(ft_model, tokenizer, prompt, max_tokens=100))

🧪 Good Test Questions
Try questions similar to your training examples:
- Questions about code patterns you trained on
- Similar phrasing to your training examples
- Topics related to your dataset's focus
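One lightweight way to judge whether the model picked up your dataset is to score its answers against keywords you expect. This is a generic sketch, not part of mlx-lm; the response string and keywords below are illustrative.

```python
def keyword_score(response: str, expected_keywords: list) -> float:
    """Fraction of expected keywords that appear in the response (case-insensitive)."""
    text = response.lower()
    hits = sum(1 for kw in expected_keywords if kw.lower() in text)
    return hits / len(expected_keywords)

# Hypothetical model response and keywords from a fictional-characters dataset
response = "Zorblax is a quantum gastronomer who cooks with entangled particles."
score = keyword_score(response, ["quantum", "gastronomer", "particles"])
print(f"{score:.0%}")  # 100%
```

Run the same prompts against the base model and the adapted model; the fine-tuned one should score noticeably higher on dataset-specific keywords.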
5 What To Do With Your Model
🔄 Keep Iterating
Not satisfied? Add more samples to EdukaAI, export again, and retrain!
# Just run training again with new data
python train.py

🐍 Use in Python Apps
Import your model with adapters into any Python project:
from mlx_lm import load, generate

model, tokenizer = load(
    'mlx-community/Llama-3.2-1B-Instruct-4bit',
    adapter_path='./adapters'
)
# Use in your app!

🤗 Share on HuggingFace
Upload your adapters so others can use them:
# Fuse first to create complete model
mlx_lm.fuse \
  --model mlx-community/Llama-3.2-1B-Instruct-4bit \
  --adapter-path adapters/
# Upload the lora_fused_model folder

💾 Backup Your Adapters
The adapters/ folder contains your LoRA weights. Copy it anywhere:
cp -r adapters ~/Documents/my_adapters_backup

🎯 Next Steps
- ✓ Test thoroughly with various prompts
- ✓ Share results in communities for feedback
- ✓ Add more training data if needed
- ✓ Try training with different models
- ✓ Experiment with training parameters
📋 Complete Workflow Cheat Sheet
# 1. Setup environment
python -m venv mlx-env
source mlx-env/bin/activate
pip install mlx-lm datasets
# 2. Export from EdukaAI
# Go to Export page → Select "MLX (Apple)" format
# Download the .jsonl file to data/train.jsonl
# 3. Create training script (train.py)
# See the Python API example above
# 4. Run training
python train.py
# 5. Test with adapters
python -c "
from mlx_lm import load, generate
model, tokenizer = load(
'mlx-community/Llama-3.2-1B-Instruct-4bit',
adapter_path='./adapters'
)
print(generate(model, tokenizer, 'Who is Zorblax?', max_tokens=100))
"
# Done! 🎉

💡 Pro Tips for MLX
🧠 Unified Memory
Macs share RAM between CPU/GPU. You can often use larger models than expected!
⚡ Fast Training
100 iterations on Llama 3.2 1B takes only ~16 seconds on M1/M2!
🌱 Start Small
Begin with 50-100 training examples. Add more if needed!
💾 Use Python API
More reliable than CLI. Better error messages and control.
🔧 Common Issues
"ModuleNotFoundError: No module named 'datasets'"
Install the missing dependency: pip install datasets
"ModuleNotFoundError: No module named 'mlx'"
Make sure your virtual environment is activated: source mlx-env/bin/activate
Out of Memory Error
Try a smaller model (0.5B or 1B) or keep the batch size at 1: batch_size=1 in TrainingArgs (or --batch-size 1 on the CLI)
Training is slow
Reduce iterations for testing: iters=50 in TrainingArgs (or --iters 50 on the CLI) instead of 100