
Features

LoRA Craft combines cutting-edge reinforcement learning with an intuitive interface to make model fine-tuning accessible to everyone.


No-Code Training Interface

Model Selection

Fine-tune language models through your web browser—no Python scripts, no command-line tools, no complex configurations.

What you get:


GRPO Reinforcement Learning

LoRA Craft uses Group Relative Policy Optimization (GRPO), a state-of-the-art reinforcement learning algorithm that goes beyond traditional supervised fine-tuning.

How GRPO Works

Unlike supervised learning (which teaches models to imitate examples), GRPO teaches models to maximize rewards; a minimal code sketch of the idea follows the steps below:

  1. Generate: Model creates multiple responses for each prompt
  2. Evaluate: Reward function scores each response based on your criteria
  3. Learn: Model increases probability of high-reward responses
  4. Iterate: Process repeats until model consistently produces quality outputs
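
LoRA Craft runs this loop for you, but the "group relative" part is easy to sketch on its own. The snippet below is a self-contained illustration, not LoRA Craft's actual API: rewards for a group of sampled responses are normalized against the group mean, so responses that beat their siblings get a positive advantage and are reinforced. The reward values are placeholders.

```python
from statistics import mean, stdev

def group_relative_advantages(rewards):
    """Normalize rewards within one prompt's group of sampled responses.

    Responses scoring above the group mean get a positive advantage
    (reinforced); those below the mean get a negative one (discouraged).
    """
    mu = mean(rewards)
    sigma = stdev(rewards) if len(rewards) > 1 else 1.0
    return [(r - mu) / (sigma or 1.0) for r in rewards]

# Placeholder reward scores for 4 sampled responses to a single prompt.
rewards = [0.2, 0.9, 0.4, 0.7]
print(group_relative_advantages(rewards))
# The 0.9 response receives the largest positive advantage.
```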

Benefits Over Supervised Learning

Algorithm Variants


Pre-Built Reward Functions

Reward Catalog

Choose from a library of battle-tested reward functions designed for common tasks, or create your own custom rewards.

Algorithm Implementation

Rewards correct algorithm implementation with efficiency considerations.

Chain of Thought Reasoning

Rewards step-by-step reasoning processes and logical deduction.

Code Generation

Rewards well-formatted, syntactically correct code with proper structure.

Math & Science

Rewards correct mathematical solutions and scientific accuracy.

Question Answering

Rewards accurate, relevant, and concise answers to questions.

Creative Writing

Rewards engaging text with good flow, vocabulary, and originality.

Custom Rewards

Build your own reward functions with Python for any task:
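
The exact signature LoRA Craft expects may differ, but conceptually a custom reward is just a Python function that maps a completion to a score. The sketch below assumes a `(prompt, completion) -> float` interface and rewards completions that wrap their final answer in `<answer>` tags; the function name and scoring scheme are illustrative only.

```python
import re

def answer_tag_reward(prompt: str, completion: str) -> float:
    """Illustrative custom reward: prefer concise completions that place
    the final answer inside <answer>...</answer> tags."""
    score = 0.0
    if re.search(r"<answer>.+?</answer>", completion, re.DOTALL):
        score += 1.0                      # correct output structure
    if len(completion.split()) <= 200:
        score += 0.5                      # brevity bonus
    return score

print(answer_tag_reward("What is 2+2?", "Reasoning... <answer>4</answer>"))  # 1.5
```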


Real-Time Training Monitoring

Training Metrics

Watch your model learn with live metrics delivered via WebSocket connections.
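
If you want the raw metrics stream outside the dashboard, the same data can in principle be read with any WebSocket client. The endpoint URL and message fields below are assumptions for illustration, not LoRA Craft's documented API.

```python
import asyncio
import json

import websockets  # pip install websockets

async def watch_training(url: str = "ws://localhost:8000/ws/training"):  # hypothetical endpoint
    async with websockets.connect(url) as ws:
        async for message in ws:
            metrics = json.loads(message)
            # Field names are assumed; adjust to the actual payload.
            print(f"step={metrics.get('step')} "
                  f"reward={metrics.get('reward')} loss={metrics.get('loss')}")

asyncio.run(watch_training())
```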

Interactive Dashboard

Top Metrics Bar

Live Charts

Training Controls


LoRA Adapter Training

LoRA (Low-Rank Adaptation) enables efficient fine-tuning on consumer hardware.

How LoRA Works

Instead of updating billions of parameters, LoRA adds small “adapter” layers:
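
In short, the pretrained weight matrix W stays frozen and the update is factored into two small matrices, so the effective weight becomes W + (alpha/r)·B·A. The NumPy sketch below shows how few parameters that factorization trains for a single 4096×4096 projection; the rank and scaling values are illustrative, not LoRA Craft's defaults.

```python
import numpy as np

d, r, alpha = 4096, 16, 32          # hidden size, LoRA rank, scaling factor

W = np.random.randn(d, d)           # frozen pretrained weight (not trained)
A = np.random.randn(r, d) * 0.01    # trainable low-rank factor
B = np.zeros((d, r))                # trainable, initialized to zero

W_effective = W + (alpha / r) * (B @ A)   # adapter starts out as a no-op

full = W.size                        # 16,777,216 params to train without LoRA
lora = A.size + B.size               # 131,072 params with LoRA (~0.8%)
print(f"full fine-tune: {full:,}  |  LoRA adapter: {lora:,}")
```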

Memory Efficiency

Train large models on modest GPUs:
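
As a rough back-of-the-envelope estimate (illustrative arithmetic, not measured LoRA Craft numbers): a 7B-parameter base model loaded in 4-bit takes about 3.5 GB for weights, and with LoRA only the adapter's few tens of millions of parameters need gradients and optimizer state, so the trainable state adds well under 1 GB.

```python
params_base = 7e9          # 7B base model, frozen
params_lora = 40e6         # assumed adapter size, order of magnitude only

weights_4bit_gb = params_base * 0.5 / 1e9        # ~3.5 GB of frozen weights
# Adapter weights + gradients + Adam moments in fp32: ~16 bytes per trainable param.
trainable_state_gb = params_lora * 16 / 1e9      # ~0.64 GB of trainable state

print(f"base weights: {weights_4bit_gb:.1f} GB, trainable state: {trainable_state_gb:.2f} GB")
```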

Configurable Parameters


Flexible Dataset Support

Dataset Selection

Train on curated public datasets or upload your own custom data.

Public Dataset Library

Browse 7 curated datasets with filtering and preview:

Custom Dataset Upload

Upload your own data in multiple formats:
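
For example, a JSON Lines file with one prompt/response pair per line is a common shape for this kind of upload. The field names below are illustrative; the field mapping described in the next section lets you point the importer at whatever columns your data actually uses.

```python
import json

rows = [
    {"prompt": "Summarize: LoRA adds low-rank adapters to a frozen model.",
     "response": "LoRA fine-tunes small adapter matrices instead of the full model."},
    {"prompt": "What does GRPO optimize?",
     "response": "It increases the probability of responses that score high rewards."},
]

# Write one JSON object per line (JSONL).
with open("my_dataset.jsonl", "w", encoding="utf-8") as f:
    for row in rows:
        f.write(json.dumps(row, ensure_ascii=False) + "\n")
```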

Smart Field Mapping

System Prompt Configuration

System Prompt

Define instruction format and output structure:
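
An illustrative system prompt for a reasoning task might spell out both the expected thinking process and the output tags a reward function will look for. The tags here match the custom reward sketch above and are an example, not a LoRA Craft requirement.

```python
SYSTEM_PROMPT = """You are a careful problem solver.
Think through the problem step by step inside <reasoning>...</reasoning>,
then give only the final result inside <answer>...</answer>."""
```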


Model Export & Deployment

After training, export models in multiple formats for deployment anywhere.

HuggingFace Format

Standard Transformers-compatible format:
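
Adapters exported in this format can be loaded with the standard Transformers + PEFT stack. The base model name and adapter path below are placeholders.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Llama-3.1-8B-Instruct"   # placeholder base model
adapter_path = "./exports/my-adapter"          # placeholder export path

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id)
model = PeftModel.from_pretrained(base, adapter_path)   # attach the LoRA adapter

# Optionally fold the adapter into the base weights for standalone deployment.
merged = model.merge_and_unload()
```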

GGUF Format

Optimized for llama.cpp ecosystem (Ollama, LM Studio):

Size Comparison for 7B Models:

Deployment Targets

llama.cpp

```bash
./main -m model.gguf -p "Your prompt"
```

Ollama

```bash
ollama create mymodel -f Modelfile
ollama run mymodel
```

LM Studio

Import GGUF files directly through the UI.


Configuration Management

Save and reuse training configurations for reproducibility.

Features:

Saved Parameters:


Advanced Training Options

Pre-Training Phase

Optional supervised fine-tuning before GRPO:
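
LoRA Craft manages this phase through the UI; conceptually it is a short supervised pass over prompt/response text before the reinforcement loop starts. The sketch below shows the same idea with the TRL library, purely as an assumption about comparable tooling, not a description of LoRA Craft's internals; the model name is a placeholder.

```python
from datasets import Dataset
from trl import SFTConfig, SFTTrainer

# Tiny illustrative dataset of plain-text training examples.
train_ds = Dataset.from_list([
    {"text": "Q: What is LoRA? A: A low-rank adapter method for fine-tuning."},
    {"text": "Q: What does GRPO reward? A: Responses that score well on your criteria."},
])

trainer = SFTTrainer(
    model="Qwen/Qwen2.5-0.5B-Instruct",      # placeholder small model
    train_dataset=train_ds,
    args=SFTConfig(output_dir="sft-warmup", max_steps=10),
)
trainer.train()
```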

Hyperparameter Control

Fine-tune training behavior:

Generation Parameters

Control model outputs during training:
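
Typical values for these knobs might look like the following. The field names and numbers are illustrative defaults, not LoRA Craft's exact settings.

```python
# Illustrative settings only; the UI exposes its own names and ranges.
training_config = {
    "learning_rate": 2e-5,
    "batch_size": 4,
    "gradient_accumulation_steps": 4,
    "num_generations": 8,        # responses sampled per prompt for GRPO
    "max_steps": 500,
}

generation_config = {
    "temperature": 0.8,
    "top_p": 0.95,
    "max_new_tokens": 512,
}
```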


Next Steps

Ready to start training?

Follow our quick start guide to fine-tune your first model

Quick Start →

See it in action

Explore real-world use cases and example configurations

Use Cases →