Use Cases
Explore real-world applications for fine-tuned language models with LoRA Craft.
Math & Science Education
Train models to solve mathematical problems, explain scientific concepts, and provide step-by-step reasoning.
Example Application: Math Tutoring Assistant
The Challenge: Students need help with math homework but want detailed explanations, not just answers. Generic LLMs often skip steps or make calculation errors.
The Solution: Fine-tune a model on math problem datasets with a reward function that values:
- Correct final answers
- Step-by-step reasoning
- Clear explanations of concepts
Recommended Configuration
Model: Qwen3 1.5B or Llama 3.2 3B
- Fast inference for interactive tutoring
- Strong baseline math capabilities
Dataset Options:
- GSM8K (8.5K problems): Grade school math word problems
- OpenMath (100K problems): Advanced mathematical reasoning
- Orca Math (200K problems): Diverse difficulty levels
Reward Function: Math & Science (sketched below)
- Validates numerical accuracy
- Rewards showing work
- Penalizes calculation errors
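A minimal sketch of what such a reward can look like, assuming completions use the `<start_working_out>`/`<SOLUTION>` format shown under Expected Results below; the weights and regexes are illustrative, not LoRA Craft's built-in implementation:

```python
import re

def math_reward(response: str, expected_answer: str) -> float:
    """Score a completion for visible reasoning and a correct final answer."""
    score = 0.0

    # Reward showing work: the completion must contain a reasoning block.
    if "<start_working_out>" in response and "<end_working_out>" in response:
        score += 0.3

    # Validate numerical accuracy via the <SOLUTION> tag.
    match = re.search(r"<SOLUTION>(.*?)</SOLUTION>", response, re.DOTALL)
    if match:
        try:
            # Strip currency symbols etc. so "$20" and "20.0" both match.
            got = float(re.sub(r"[^\d.\-]", "", match.group(1)))
            want = float(re.sub(r"[^\d.\-]", "", expected_answer))
            if abs(got - want) < 1e-6:
                score += 0.7  # the correct answer carries most of the weight
        except ValueError:
            pass  # non-numeric answer earns no correctness credit
    return score
```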
Training Tips:
- Use pre-training to teach the output format (reasoning → answer)
- Enable chain-of-thought markers in the system prompt (see the example after these tips)
- Train for 3-5 epochs on full dataset for production use
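The chain-of-thought markers can be wired in through the system prompt; a hypothetical example matching the format above:

```python
SYSTEM_PROMPT = (
    "You are a math tutor. Work through the problem step by step between "
    "<start_working_out> and <end_working_out>, then give only the final "
    "answer between <SOLUTION> and </SOLUTION>."
)
```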
Expected Results
Before Fine-tuning:
Q: If a shirt costs $25 and is on sale for 20% off, what's the final price?
A: The final price would be around $20.
After Fine-tuning:
Q: If a shirt costs $25 and is on sale for 20% off, what's the final price?
A: <start_working_out>
Original price: $25
Discount: 20% of $25 = 0.20 × $25 = $5
Final price: $25 - $5 = $20
<end_working_out>
<SOLUTION>$20</SOLUTION>
Real-World Applications
- Homework help chatbots: Interactive tutoring for students
- STEM education platforms: Automated problem explanations
- Test preparation tools: Practice with instant feedback
- Research assistants: Scientific calculation verification
Code Generation & Programming
Create AI coding assistants that generate clean, efficient code with proper documentation.
Example Application: Python Code Assistant
The Challenge: Developers need code snippets that are not only syntactically correct but also follow best practices, include error handling, and are well-documented.
The Solution: Fine-tune on code datasets with rewards for:
- Syntactic correctness
- PEP 8 style compliance
- Proper error handling
- Inline documentation
Recommended Configuration
Model: Qwen3 3B or Mistral 7B
- Larger context for complex code
- Better instruction following
Dataset Options:
- Code Alpaca (20K examples): General programming tasks
- Custom dataset: Your specific codebase patterns
- GitHub repos: Domain-specific code examples
Reward Function: Code Generation or Programming (sketched below)
- Validates syntax via AST parsing
- Checks for common anti-patterns
- Rewards documentation and type hints
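A minimal sketch of the kind of checks such a reward can run, built on the standard-library `ast` module; the weights are illustrative, not LoRA Craft's exact implementation:

```python
import ast

def code_reward(response: str) -> float:
    """Score generated Python for syntax, docstrings, and type hints."""
    # Syntactic correctness first: code that doesn't parse earns nothing.
    try:
        tree = ast.parse(response)
    except SyntaxError:
        return 0.0
    score = 0.4

    functions = [n for n in ast.walk(tree) if isinstance(n, ast.FunctionDef)]
    if functions:
        # Reward documentation: every function carries a docstring.
        if all(ast.get_docstring(fn) for fn in functions):
            score += 0.3
        # Reward type hints on every argument and return value.
        if all(fn.returns and all(a.annotation for a in fn.args.args)
               for fn in functions):
            score += 0.3
    return score
```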
Training Tips:
- Include diverse programming tasks (CRUD, algorithms, data processing)
- Use temperature 0.3-0.5 for more deterministic outputs
- Test on held-out code problems for validation
Expected Results
Before Fine-tuning:
```python
def read_file(path):
    f = open(path)
    data = f.read()
    return data
```
After Fine-tuning:
```python
def read_file(path: str) -> str:
    """
    Read and return the contents of a file.

    Args:
        path: Path to the file to read

    Returns:
        File contents as a string

    Raises:
        FileNotFoundError: If the file doesn't exist
        IOError: If there's an error reading the file
    """
    try:
        with open(path, 'r', encoding='utf-8') as f:
            return f.read()
    except FileNotFoundError:
        raise FileNotFoundError(f"File not found: {path}")
    except IOError as e:
        raise IOError(f"Error reading file {path}: {e}")
```
Real-World Applications
- IDE assistants: Code completion and generation
- Code review tools: Suggest improvements and best practices
- Documentation generators: Auto-generate docstrings
- Boilerplate creators: Generate standard project structures
- Bug fixers: Identify and correct common errors
Question Answering Systems
Build specialized Q&A models for specific domains with accurate, relevant answers.
Example Application: Customer Support Bot
The Challenge: Customer support teams need to answer repetitive questions quickly while maintaining accuracy and a helpful tone.
The Solution: Fine-tune on your FAQ data and support tickets with rewards for:
- Answer accuracy
- Relevant information inclusion
- Concise, helpful responses
- Professional tone
Recommended Configuration
Model: Llama 3.2 3B or Qwen3 1.5B
- Balance between quality and inference speed
- Good for high-volume requests
Dataset Options:
- SQuAD v2 (130K questions): Reading comprehension baseline
- Custom FAQ dataset: Your company’s actual questions
- Support ticket history: Real customer interactions
Reward Function: Question Answering (sketched below)
- Validates factual accuracy
- Rewards relevant details
- Penalizes hallucinations
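A rough sketch of these ideas, using token overlap with a reference answer as a cheap accuracy proxy and overlap with the source context as a hallucination heuristic; both are stand-ins for whatever validation your domain allows:

```python
def qa_reward(response: str, reference: str, context: str) -> float:
    """Score an answer by reference overlap, penalizing ungrounded content."""
    resp = set(response.lower().split())
    ref = set(reference.lower().split())
    ctx = set(context.lower().split())
    if not resp:
        return 0.0

    # Recall of the reference answer: are the relevant details included?
    recall = len(resp & ref) / max(len(ref), 1)

    # Grounding: the fraction of the response supported by the context.
    # A low value is a cheap signal of hallucinated content.
    grounded = len(resp & ctx) / len(resp)
    return 0.6 * recall + 0.4 * grounded
```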
Training Tips:
- Include “I don’t know” examples for unanswerable questions (sample after these tips)
- Use domain-specific terminology in system prompt
- Fine-tune on actual customer question patterns
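A hypothetical training pair for an unanswerable question might look like:

```python
unanswerable_example = {
    "question": "Can you match a competitor's sale price from last year?",
    "answer": "I don't have information about historical competitor pricing. "
              "Please contact our support team for price-match requests.",
}
```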
Expected Results
Before Fine-tuning:
Q: What's your return policy?
A: You can return items. Check our website for more details.
After Fine-tuning:
Q: What's your return policy?
A: We offer a 30-day return window for unused items in original packaging.
To initiate a return:
1. Log into your account
2. Go to Order History
3. Select the item and click "Return"
4. Print the prepaid shipping label
Refunds are processed within 5-7 business days of receiving the return.
For damaged or defective items, returns are accepted beyond 30 days.
Real-World Applications
- Customer support automation: Reduce ticket volume
- Knowledge base assistants: Internal company Q&A
- Medical Q&A: Patient information systems (with medical datasets)
- Legal research: Case law and regulation queries
- Technical documentation: Developer help systems
Custom Domain Applications
Fine-tune for specialized tasks with custom reward functions and datasets.
Example 1: Medical Report Summarization
Application: Summarize lengthy medical reports for doctors.
Configuration:
- Model: Llama 3.2 3B
- Dataset: Medical reports with human-written summaries
- Reward: Concise Summarization + Medical Accuracy
- Special considerations: HIPAA compliance, medical terminology
Key Metrics:
- Summary conciseness (target: 20% of original length; see the sketch below)
- Retention of critical findings
- Proper medical terminology usage
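The conciseness target maps directly onto a reward term; a sketch that peaks at the 20% length ratio and decays linearly on either side (the shape and target are illustrative):

```python
def conciseness_reward(summary: str, report: str, target: float = 0.2) -> float:
    """Score 1.0 at the target length ratio, falling to 0.0 at 0 or 2x target."""
    ratio = len(summary.split()) / max(len(report.split()), 1)
    # e.g. ratio 0.2 -> 1.0; ratio 0.1 or 0.3 -> 0.5; ratio >= 0.4 -> 0.0
    return max(0.0, 1.0 - abs(ratio - target) / target)
```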
Example 2: Legal Document Analysis
Application: Extract key clauses and obligations from contracts.
Configuration:
- Model: Mistral 7B (large context window)
- Dataset: Annotated legal contracts
- Reward: Custom legal extraction reward
- Special considerations: Precision over recall (see the F-beta sketch below)
Key Metrics:
- Clause identification accuracy
- Obligation extraction completeness
- Legal terminology precision
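“Precision over recall” has a standard encoding: an F-beta score with beta < 1. F0.5, for instance, weights precision twice as heavily as recall:

```python
def f_beta(precision: float, recall: float, beta: float = 0.5) -> float:
    """F-beta score; beta < 1 favors precision, beta > 1 favors recall."""
    if precision + recall == 0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# A cautious extractor (precision 0.9, recall 0.6) beats an eager one
# (precision 0.6, recall 0.9) under F0.5: ~0.82 vs ~0.64.
```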
Example 3: Creative Content Generation
Application: Generate marketing copy with specific brand voice.
Configuration:
- Model: Qwen3 3B
- Dataset: Approved marketing materials
- Reward: Creative Writing + Brand Consistency
- Special considerations: Tone matching, keyword inclusion
Key Metrics:
- Brand voice consistency score
- Predicted engagement metrics
- SEO keyword integration
Example 4: Language Translation
Application: Domain-specific translation (technical, medical, legal).
Configuration:
- Model: Qwen3 3B or larger
- Dataset: Parallel corpus in your domain
- Reward: Custom translation quality metric (BLEU + domain terms; sketched below)
- Special considerations: Technical-term preservation, cultural adaptation
Key Metrics:
- BLEU/METEOR scores
- Domain terminology accuracy
- Fluency in target language
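A sketch of the combined metric, assuming the `sacrebleu` package and an illustrative glossary; the 50/50 weighting is a starting point to tune:

```python
import sacrebleu  # assumed dependency, not bundled with LoRA Craft

DOMAIN_TERMS = ["torque", "impedance", "crankshaft"]  # illustrative glossary

def translation_reward(hypothesis: str, reference: str) -> float:
    """Blend sentence-level BLEU with domain-term preservation."""
    bleu = sacrebleu.sentence_bleu(hypothesis, [reference]).score / 100.0

    # Fraction of glossary terms present in the reference that survive
    # into the hypothesis.
    terms = [t for t in DOMAIN_TERMS if t in reference.lower()]
    preserved = (sum(t in hypothesis.lower() for t in terms) / len(terms)
                 if terms else 1.0)
    return 0.5 * bleu + 0.5 * preserved
```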
Building Custom Reward Functions
For specialized applications, you’ll need custom reward functions. Here’s how:
1. Define Success Criteria
What makes a “good” output for your task?
- Specific format requirements?
- Factual accuracy constraints?
- Style or tone preferences?
- Length limitations?
2. Implement Reward Logic
Example: Product description generator
```python
def product_description_reward(response, reference_data):
    score = 0.0

    # Check length (100-200 words ideal)
    word_count = len(response.split())
    if 100 <= word_count <= 200:
        score += 0.3

    # Check for required elements
    required_elements = ['features', 'benefits', 'use case']
    for element in required_elements:
        if element.lower() in response.lower():
            score += 0.2

    # Check sentiment (should be positive)
    sentiment = analyze_sentiment(response)  # Custom function
    if sentiment > 0.5:
        score += 0.3

    return min(score, 1.0)
```
3. Test Thoroughly
Before training:
- Test the reward on 20+ diverse examples (harness sketched after this list)
- Verify score distribution (not all 0.0 or 1.0)
- Check edge cases
- Compare with human judgments
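A quick harness for the distribution check, run over hand-curated (response, reference) pairs; names here are illustrative:

```python
from statistics import mean, stdev

def check_reward_distribution(reward_fn, examples):
    """Print score statistics so a degenerate reward is easy to spot."""
    scores = [reward_fn(response, reference) for response, reference in examples]
    print(f"min={min(scores):.2f}  max={max(scores):.2f}  "
          f"mean={mean(scores):.2f}  stdev={stdev(scores):.2f}")
    # A healthy reward spreads scores out; all 0.0 or all 1.0 means it
    # cannot distinguish good outputs from bad ones.
```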
4. Iterate Based on Results
After training:
- Review model outputs
- Adjust reward weights
- Add new criteria if needed
- Re-train with updated reward
Getting Started with Your Use Case
1. Define Your Task
- What inputs does the model receive?
- What outputs should it produce?
- What makes an output “good”?
2. Gather Data
- Minimum 500 examples for initial testing
- 2,000+ for production deployment
- Include diverse examples and edge cases
3. Choose Starting Point
- Similar use case from this page
- Closest pre-built reward function
- Model size based on complexity
4. Iterate Quickly
- Start with small subset (100-500 samples)
- Train for 1-2 epochs
- Evaluate and adjust
- Scale up when satisfied
5. Measure Success
Define metrics before training (a minimal example follows the list):
- Accuracy/F1 for classification
- BLEU/ROUGE for generation
- Task-specific metrics
- Human evaluation criteria
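For the standard metrics, off-the-shelf implementations are enough; a minimal example with scikit-learn's F1 (assumed installed), using made-up labels:

```python
from sklearn.metrics import f1_score

# Hypothetical held-out classification results after training.
gold = [1, 0, 1, 1, 0]
predicted = [1, 0, 0, 1, 0]
print(f"F1: {f1_score(gold, predicted):.2f}")  # 0.80 for these labels
```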
Need Help with Your Use Case?
- Share in Discussions: GitHub Discussions
- Request Features: GitHub Issues
- Read the Docs: Technical Reference
Have a success story? We’d love to hear it! Share your results in our community discussions.