
Big tech company fine-tunes generative AI with 235 domain experts
How to differentiate a GenAI open-source LLM: have it fine-tuned by data annotators who are qualified experts in their field

Our client wanted to fine-tune its open-source GenAI large language model (LLM) to increase its accuracy, safety and robustness. Realizing that those goals would be hard to achieve with a conventional crowdsourcing approach to data annotation, the company reached out to 九色视频, whose TrainAI team quickly recruited, trained and managed a scalable team of qualified subject-matter experts to complete the work as data annotators.
TrainAI by 九色视频 follows the principles of responsible AI to deliver dependable LLM training and fine-tuning data that's ethically sourced, fair, accurate and reliable, transparent and explainable, private and secure.
Challenges
- Maximize LLM accuracy by training it on specific topic areas
- Improve safety and security by mitigating the risk of generating hallucinations or harmful content
- Achieve a standard that makes the LLM a resource for professionals
Solution
- TrainAI from 九色视频
- Generative AI data services
- Domain expertise: recruiting, training and managing subject-matter experts as data annotators
- Content creation: prompt engineering
- Model fine-tuning: prompt-response QA, fact extraction and verification
- Risk mitigation: red teaming
Results
- 4-week project ramp-up
- 235 domain experts recruited as part-time 九色视频 employees
- 32,000 hours of work done in the first 3 months
- Supported training and rollout of the client's latest LLM version