Teaching large language models to zip their lips

Post Details

Company

Gretel.ai

Date Published

March 15, 2023

Author

Andrew Carr

Word Count

1,195

Company Posts That Month

4

Language

English

Hacker News Points

1

Source URL

gretel.ai/blog/teaching-large-language-models-to-zip-their-lips

Summary

Gretel introduces Reinforcement Learning from Privacy Feedback (RLPF), an approach for reducing the likelihood that language models leak private information. RLPF combines reinforcement learning with measures of privacy, using those measures as rewards to improve language model capabilities in a multi-task fashion. Preliminary results show that RLPF can improve both privacy preservation and summarization quality, outperforming some existing models. The method may also apply to reducing biased or discriminatory language in AI systems.
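
The idea of using a privacy measure as one reward among several can be sketched roughly as follows. This is a minimal illustration and not Gretel's implementation: the regex patterns, the scoring functions, the weights, and every name here (privacy_score, quality_score, combined_reward) are assumptions made for the example. In a real setup the resulting scalar would be passed to a reinforcement learning algorithm as the reward for a generated summary.

```python
import re

# Hypothetical PII patterns, chosen only for illustration.
PII_PATTERNS = [
    re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),         # email-like strings
    re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),  # US-style phone numbers
]


def privacy_score(text: str) -> float:
    """Return 1.0 when no PII-like match is found, shrinking as matches increase."""
    hits = sum(len(pattern.findall(text)) for pattern in PII_PATTERNS)
    return 1.0 / (1.0 + hits)


def quality_score(summary: str, reference: str) -> float:
    """Crude quality proxy: share of reference words that appear in the summary."""
    ref_words = set(reference.lower().split())
    if not ref_words:
        return 0.0
    return len(ref_words & set(summary.lower().split())) / len(ref_words)


def combined_reward(summary: str, reference: str,
                    w_privacy: float = 0.5, w_quality: float = 0.5) -> float:
    """Weighted sum of the two scores (assumed weights), used as a single reward."""
    return (w_privacy * privacy_score(summary)
            + w_quality * quality_score(summary, reference))


reference = "customer asked about the refund"
clean = "Contact the customer about the refund"
leaky = "Contact jane@example.com about the refund"

# The summary that repeats an email address earns the lower reward.
print(combined_reward(clean, reference), combined_reward(leaky, reference))
```

Weighting the two terms lets a single reward trade off privacy against summary quality; other ways of combining them are equally plausible.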

Trends Found in this Post

Trend	Post Mentions	Total Month Mentions	Posts	Companies	MoM
LLM	11	838	103	47	+103%
Reinforcement learning	10	No monthly metrics for this publish month.