
Quick Start Guide

Get from zero to your first fine-tuned model in minutes.

Before You Begin

Hardware Requirements

Software Requirements

Not sure if your system is ready? Check the Prerequisites.


Choose Your Installation Method

Docker (Recommended)

Best for: Quick setup, Windows users, isolated environments

  • Zero dependency management
  • Works on Windows (WSL2), Linux, macOS
  • Automatic GPU detection
  • 5-minute setup
Use Docker →

Native Installation

Best for: Direct system access, development, maximum control

  • Faster startup times
  • Full system integration
  • Easier debugging
  • No container overhead
Native Install →

Docker Installation

Prerequisites

Linux NVIDIA Container Toolkit Setup:

# Ubuntu/Debian
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.list | \
  sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
  sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker

Windows setup: Docker Desktop with WSL2 includes GPU support automatically; just install the NVIDIA driver on Windows.

Installation Steps

# 1. Clone repository
git clone https://github.com/jwest33/lora_craft.git
cd lora_craft

# 2. Start application (builds image on first run)
docker compose up -d

# 3. View logs to verify startup
docker compose logs -f

# Wait for "Starting LoRA Craft Flask Application" message
# Press Ctrl+C to exit logs

First startup takes 5-15 minutes to download the base image and install dependencies. Subsequent starts are much quicker.

Verify Installation

# Check container is running
docker compose ps

# Verify GPU is detected
docker compose logs | grep "CUDA Available"
# Should show: CUDA Available: True

# Open browser to http://localhost:5000

Docker Management

# Stop application
docker compose down

# Restart application
docker compose restart

# View live logs
docker compose logs -f

# Access container shell
docker compose exec lora-craft bash

# Check GPU inside container
docker compose exec lora-craft nvidia-smi

# Update to latest version
git pull
docker compose build
docker compose up -d

Skip to Training Your First Model


Native Installation

1. Clone the Repository

git clone https://github.com/jwest33/lora_craft.git
cd lora_craft

2. Install PyTorch with CUDA

pip install torch torchvision --index-url https://download.pytorch.org/whl/cu128

3. Install Dependencies

pip install -r requirements.txt

4. Verify GPU Access

python -c "import torch; print(f'CUDA available: {torch.cuda.is_available()}')"

You should see CUDA available: True.
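For a slightly more detailed check (a sketch; the one-liner above is all this guide strictly requires), you can also print the detected device and its memory:

```python
import torch

# Report CUDA availability, and device details when a GPU is present.
print(f"CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(f"Device: {props.name}")
    print(f"VRAM:   {props.total_memory / 1e9:.1f} GB")
```

If this reports `CUDA available: False`, revisit step 2 and confirm you installed the CUDA build of PyTorch rather than the CPU-only wheel.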


Starting the Application

For Docker users: Your application is already running! Skip to Training Your First Model.

For native installation:

1. Launch the Server

python server.py

2. Open the Interface

Navigate to http://localhost:5000 in your web browser.

You should see the LoRA Craft interface with tabs for Model, Dataset, Config, Reward, and Training.


Training Your First Model

Follow this 7-step workflow to train a math reasoning model.

Step 1: Select Your Model

  1. Click the Model tab
  2. Choose Recommended preset
  3. Select the Qwen2.5 model family
  4. Choose Qwen/Qwen2.5-1.5B-Instruct
  5. Click Load Model

Why Qwen2.5 1.5B?

Step 2: Choose a Dataset

  1. Click the Dataset tab
  2. Select Public Datasets
  3. Filter by Math category
  4. Choose GSM8K (8,500 grade school math problems)
  5. Click Load Dataset
  6. Preview samples to verify data format

What is GSM8K?
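GSM8K pairs each grade-school word problem (`question`) with a worked solution (`answer`) whose last line carries the final result after a `####` marker. A minimal sketch of pulling that final answer out of one record:

```python
# Sketch: extract the final numeric answer from a GSM8K-style record.
# GSM8K solutions end with a line like "#### 72".
def final_answer(answer_text: str) -> str:
    return answer_text.rsplit("####", 1)[-1].strip()

record = {
    "question": "Natalia sold clips to 48 of her friends in April, and then "
                "she sold half as many clips in May. How many clips did "
                "Natalia sell altogether in April and May?",
    "answer": "In May she sold 48 / 2 = 24 clips.\n"
              "Altogether she sold 48 + 24 = 72 clips.\n#### 72",
}
print(final_answer(record["answer"]))  # -> 72
```

This `####` convention is what makes GSM8K well suited to automatic reward checking in the next steps.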

Step 3: Configure Training

  1. Click the Config tab
  2. Use these beginner-friendly settings:

Training Duration: 1 epoch over 500 samples.

Batch Settings, Learning Rate, Generation, and Pre-training: leave at their defaults for this first run.

Step 4: Select Reward Function

  1. Click the Reward tab
  2. Choose Preset Library
  3. Select Math & Science category
  4. Pick Math Problem Solver reward
  5. Verify field mappings:
    • Instruction → question
    • Response → answer
  6. Click Test Reward with a sample to verify

How Rewards Work: The reward function checks if the model’s answer matches the expected solution, rewarding correct answers with 1.0 and incorrect with 0.0.
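Conceptually (a sketch, not the app's actual implementation), such a binary reward can be as simple as comparing the final number in each text:

```python
import re

def extract_number(text: str):
    """Return the last number appearing in the text, if any."""
    matches = re.findall(r"-?\d+(?:\.\d+)?", text.replace(",", ""))
    return float(matches[-1]) if matches else None

def math_reward(model_output: str, expected_answer: str) -> float:
    """1.0 if the model's final number matches the expected answer, else 0.0."""
    got, want = extract_number(model_output), extract_number(expected_answer)
    return 1.0 if got is not None and got == want else 0.0

print(math_reward("5 + 3 - 2 = 6, so Sarah has 6 apples.", "6"))  # -> 1.0
print(math_reward("She has 7 apples.", "6"))                      # -> 0.0
```

The Test Reward button serves exactly this purpose: confirming the reward can actually parse your dataset's answers before you spend GPU time training.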

Step 5: Start Training

  1. Click the Training tab
  2. Review your configuration summary
  3. Click Start Training
  4. Watch the real-time metrics appear

What to Watch: The reward metric should trend upward as training progresses; if it stays at 0.0, see Common First-Time Issues below.

Training 500 samples on a 1.5B model takes approximately 10-15 minutes on a modern GPU.

Step 6: Export Your Model

Once training completes:

  1. Navigate to the Export section
  2. Choose format:
    • HuggingFace: For Python/API use
    • GGUF (Q4_K_M): For llama.cpp/Ollama/LM Studio
  3. Click Export Model
  4. Wait for conversion (1-2 minutes)

Your model is saved in exports/<session_id>/

Step 7: Test Your Model

  1. Click the Test tab
  2. Select your newly trained model
  3. Enter a test problem:
    Sarah has 5 apples. She buys 3 more apples.
    Then she gives 2 apples to her friend.
    How many apples does Sarah have now?
    
  4. Click Generate
  5. Compare the output to the base model

Expected Improvement: Your fine-tuned model should show structured reasoning and correct answers more consistently than the base model.


Quick Workflow Summary

1. Select Model → Qwen2.5 1.5B
2. Load Dataset → GSM8K (Math)
3. Configure Training → 1 epoch, 500 samples
4. Choose Reward → Math & Science
5. Start Training → ~15 minutes
6. Export Model → GGUF format
7. Test Output → Verify improvement

Common First-Time Issues

Docker: GPU Not Detected

Symptom: Container logs show “CUDA Available: False”

Solutions:

# 1. Test GPU access works
docker run --rm --gpus all nvidia/cuda:12.8.0-base-ubuntu22.04 nvidia-smi

# 2. If test fails on Linux, install/configure NVIDIA Container Toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker

# 3. If test fails on Windows, restart Docker Desktop
# Docker Desktop → Restart

# 4. Rebuild and restart container
docker compose down
docker compose up -d

Docker: Container Won’t Start

Symptom: “exec /app/src/entrypoint.sh: no such file or directory”

Solution:

# Rebuild image without cache
docker compose build --no-cache
docker compose up -d

Training is too slow

Reduce the number of training samples (e.g. below 500) or the maximum sequence length, and confirm training is actually running on the GPU rather than the CPU.

Out of memory errors

Lower the batch size (compensating with gradient accumulation if available), shorten the maximum sequence length, or switch to a smaller model.

Rewards stay at 0.0

Re-check the field mappings from Step 4 (Instruction → question, Response → answer) and run Test Reward on a sample to confirm the reward function can parse your data.

Model outputs are gibberish

Lower the learning rate and retrain, and double-check that the intended base model was loaded before training.


Next Steps

Try Different Tasks

Code Generation

Question Answering

Creative Writing

Scale Up

Once comfortable with the basics:

Deploy Your Models

Use with Ollama:

ollama create math-tutor -f exports/<session_id>/Modelfile
ollama run math-tutor "Solve: 15 × 12 ="

Use with llama.cpp:

./main -m exports/<session_id>/model-q4_k_m.gguf \
  -p "Calculate the area of a circle with radius 7"

Integrate via API: Load your HuggingFace format model in any Python application with the Transformers library.
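As a sketch (the exact layout of the export directory depends on your session; `exports/<session_id>` is the placeholder used above), loading the HuggingFace-format export with the Transformers library might look like:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

def generate(model_dir: str, prompt: str, max_new_tokens: int = 128) -> str:
    """Load an exported model directory and generate a completion."""
    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    model = AutoModelForCausalLM.from_pretrained(model_dir)
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# Substitute your actual session directory from the export step:
# print(generate("exports/<session_id>", "Solve: 15 x 12 ="))
```

If the export contains a LoRA adapter rather than a merged model, you would load it with the PEFT library instead; check the contents of the export directory to see which you have.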


Learn More


Need Help?

Happy fine-tuning!