EdukaAI

Frequently Asked Questions

Common questions and expert answers about fine-tuning and EdukaAI

❓ Popular Questions

How many examples do I actually need?

Start with 100 high-quality examples for your first fine-tuning. This is the perfect amount for testing and learning without overwhelming yourself.

  • 50 examples: Minimum to try fine-tuning
  • 100 examples: Sweet spot for testing and learning (recommended starting point)
  • 500-1000 examples: Noticeable improvement in specific tasks
  • 5000+ examples: Professional-grade fine-tuning

Remember: 100 excellent examples beat 1000 mediocre ones. Quality always wins over quantity. Start with 100, then iterate and add more.

What's the difference between "Draft" and "Approved" status?

Draft means you're still working on itβ€”needs review or improvement. Approved means it's high quality and ready for training. Rejected means it has significant issues and shouldn't be used. Aim to have 80% of your examples approved before training.

Can I import from ChatGPT / Claude / my own conversations?

Currently, you can import training data via JSON/JSONL files. Direct integration with chat platforms is coming soon!

  • OpenWebUI - Coming soon! Import directly from your OpenWebUI conversations

For now, you can manually copy your best conversations from ChatGPT, Claude, or other AI assistants and paste them into the Create Sample form in the EdukaAI application. Use the Import feature in the app to upload JSON files.

How long does it take to create 1000 examples?

It depends on your approach:

  • Manual creation: 5-10 minutes per example = 85-170 hours total
  • Importing conversations: Much faster! Can import 50-100 examples in minutes
  • Hybrid approach: Import 70% + manually create 30%

Recommended: Import from your existing AI conversations, then curate and add manual examples where needed. This can reduce time to just 10-20 hours of curation.

Do I need programming experience to fine-tune an LLM?

No! edukaAI is designed for beginners. You don't need to write training code or understand machine learning theory. Just create good examples using our forms, and we'll handle the technical parts. However, basic understanding of your domain (e.g., programming concepts if you're building a coding assistant) is helpful.

What models can I fine-tune with my dataset?

Once exported from edukaAI, you can use your dataset to fine-tune:

  • Open source models: Llama 2, Mistral, Falcon (free, run locally)
  • Via HuggingFace: Easy upload and training
  • Via OpenAI: GPT-3.5 fine-tuning API
  • Via other platforms: Any platform accepting Alpaca or ShareGPT format

We export in multiple formats (Alpaca, ShareGPT, CodeAlpaca) to ensure compatibility with popular training platforms.

How much does fine-tuning cost?

Costs vary by platform:

  • Open source (local): Free! Just need a decent GPU (RTX 3060 or better recommended)
  • HuggingFace / cloud: $5-20 per training run depending on model size
  • OpenAI API: $0.008 per 1K tokens trained (typically $2-10 for 1000 examples)

Cost-saving tip: Train on smaller models first (7B parameters) to test your dataset before training larger, more expensive models.

What if my model doesn't improve after training?

Common reasons and solutions:

  • Quality issues: Review your dataset. Reject low-quality examples. Aim for 4-5 star ratings.
  • Not enough data: Try with 1000+ examples. Small datasets often don't show improvement.
  • Wrong format: Make sure you're exporting in the right format for your training platform.
  • Base model too large: Try fine-tuning a smaller model (7B instead of 70B).
  • Training parameters: You may need more training epochs or different learning rates.

Can I use copyrighted material in my dataset?

Be careful! Don't include copyrighted code, text, or content without permission. Write your own explanations and examples. If importing from AI assistants (Claude, ChatGPT), those conversations are generally fine to use since you created them. When in doubt, create original content.

How do I know if an example is good quality?

Ask yourself these questions:

  • Does the output fully answer the instruction?
  • Would this help a real user?
  • Is it accurate and correct?
  • Is the tone consistent with other examples?
  • Does it teach something or just give an answer?
  • Would I be proud if this was the only example someone saw?

If you answered "yes" to all, it's probably a 4-5 star example!

πŸ“–

Glossary

Confused by a term? Our comprehensive glossary covers all AI, LLM, and fine-tuning terminology with detailed explanations.

πŸ“š Browse Full Glossary
20+ TermsBeginner FriendlyCross-Referenced
πŸ”—

Resources & Next Steps

πŸ“š Learning Resources

πŸ› οΈ Tools & Platforms