Building Datasets to Enable Safer AI Responses

Blog post from Gretel.ai

Post Details
Company
Date Published
Author: Lipika Ramaswamy, Maarten Van Segbroeck, Dhruv Nathawani
Word Count: 1,792
Company Posts That Month: 3
Language: English
Hacker News Points: 1
Summary

Gretel's Synthetic Safety Dataset is a resource for aligning large language models (LLMs) toward safe and ethical responses. The dataset contains 8,361 triplets of "prompt", "response", and "safe response" spanning major risk categories, including discrimination, harassment, propaganda, religious intolerance, and gender bias. It was created with Gretel Navigator's Data Designer toolkit and is available on HuggingFace. The dataset is intended as a transparent, modular resource that the AI community can use to align models for secure, public-interest-focused interactions. The post also notes that prompt generation benefits from human expertise in jailbreaking (attempts to bypass model restrictions) and red teaming (simulated attacks to test system security). The dataset can be used for pre-training and fine-tuning guardrails, stress-testing model robustness, supporting rapid iteration and refinement, and benchmarking ethical and safety maturity.
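
One possible way to use such triplets for alignment fine-tuning is to turn each one into a preference pair, with the safe response as the preferred answer and the original response as the rejected one. The sketch below is only an illustration under assumptions: the field names `prompt`, `response`, and `safe_response`, the helper `to_preference_pairs`, and the sample row are hypothetical and are not taken from the dataset or from Gretel's tooling.

```python
from typing import Dict, List


def to_preference_pairs(triplets: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Convert (prompt, response, safe response) triplets into preference pairs.

    Assumed (hypothetical) input keys: "prompt", "response", "safe_response".
    Output keys "prompt", "chosen", "rejected" follow a common convention for
    preference-based fine-tuning; they are not prescribed by the source post.
    """
    pairs = []
    for row in triplets:
        prompt = row.get("prompt", "").strip()
        original = row.get("response", "").strip()
        safe = row.get("safe_response", "").strip()
        # Skip rows that are incomplete or carry no preference signal.
        if not prompt or not original or not safe or safe == original:
            continue
        pairs.append({"prompt": prompt, "chosen": safe, "rejected": original})
    return pairs


# Illustrative placeholder rows, not real dataset content.
sample = [
    {
        "prompt": "Example risky prompt",
        "response": "Example unsafe answer",
        "safe_response": "Example safe answer",
    },
    # Identical responses give no preference signal, so this row is dropped.
    {"prompt": "Another prompt", "response": "Same", "safe_response": "Same"},
]

pairs = to_preference_pairs(sample)
print(len(pairs))
print(pairs[0]["chosen"])
```

Rows whose two responses are identical are dropped here because they provide nothing to prefer; whether that filter is appropriate depends on how the resulting pairs are used.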

Trends Found in this Post
Trend | Post Mentions | Total Month Mentions | Posts | Companies | MoM
AI Guardrails | 5 | 186 | 50 | 28 | +2%
LLM | 5 | 2,668 | 436 | 137 | -7%
Reinforcement learning | 4 | 43 | 28 | 16 | +30%
AI Model Fine-tuning | 3 | 476 | 103 | 54 | -13%
Real-time | 1 | 3,091 | 773 | 211 | -1%