DukaanBench: Can AI Run an Indian Grocery Store for 30 Days?

Blog post from HuggingFace

Post Details
Company: HuggingFace
Author: Ekansh Srivastva
Word Count: 3,871
Company Posts That Month: 90
Summary

DukaanBench is an AI benchmark in which a language model operates a simulated Indian kirana (neighbourhood grocery) store for 30 days, managing inventory, cash, customer trust, and marketing. Each day, the model receives the full state of the shop and must return an executable JSON action; the backend then simulates customer interactions and updates variables such as trust and inventory. The benchmark evaluates not only profit but also whether the model keeps operations stable and maintains customer relationships, using metrics that include service rate, trust, and marketing effectiveness. Initial findings highlight the importance of keeping each action consistent with its stated rationale, managing trust, and making marketing decisions with inventory in mind. Part 1 introduces the environment and evaluation loop; Part 2 will explore training a smaller, more specialized model to improve on these tasks, with the aim of a practical tool that assists shopkeepers rather than replacing them.
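
The daily loop described in the summary (state in, JSON action out, backend simulates customers and updates trust and inventory) can be sketched roughly as follows. This is a minimal illustration, not DukaanBench's actual code: the `ShopState` fields, the `restock` action key, the prices, the trust increments, the `stub_policy` stand-in for the language model, and the definition of service rate as served requests divided by total requests are all assumptions made for the example.

```python
import json
import random
from dataclasses import dataclass, field

UNIT_COST = 40.0   # assumed wholesale cost per unit (hypothetical)
UNIT_PRICE = 50.0  # assumed retail price per unit (hypothetical)


@dataclass
class ShopState:
    """Hypothetical shop state; the real benchmark's schema may differ."""
    day: int = 1
    cash: float = 2000.0
    trust: float = 0.5  # kept within [0, 1]
    inventory: dict = field(default_factory=lambda: {"rice": 10, "dal": 10})
    served: int = 0
    demanded: int = 0

    @property
    def service_rate(self) -> float:
        # Assumed definition: fraction of customer requests that were fulfilled.
        return self.served / self.demanded if self.demanded else 0.0


def stub_policy(state: ShopState) -> str:
    """Stand-in for the language model: top each item up to 12 units."""
    restock = {item: max(0, 12 - qty) for item, qty in state.inventory.items()}
    return json.dumps({"restock": restock, "rationale": "top up low stock"})


def step(state: ShopState, action_json: str, rng: random.Random) -> ShopState:
    """Apply one day's JSON action, then simulate that day's customers."""
    try:
        action = json.loads(action_json)
    except ValueError:
        action = {}  # an unparseable action is treated as "do nothing"
    restock = action.get("restock", {}) if isinstance(action, dict) else {}
    if not isinstance(restock, dict):
        restock = {}

    # Restock only when the quantity is valid and the shop can afford it.
    for item, qty in restock.items():
        if isinstance(qty, int) and qty > 0 and qty * UNIT_COST <= state.cash:
            state.cash -= qty * UNIT_COST
            state.inventory[item] = state.inventory.get(item, 0) + qty

    # Each customer asks for one unit of a random item.
    for _ in range(rng.randint(5, 15)):
        item = rng.choice(sorted(state.inventory))
        state.demanded += 1
        if state.inventory[item] > 0:
            state.inventory[item] -= 1
            state.cash += UNIT_PRICE
            state.served += 1
            state.trust = min(1.0, state.trust + 0.01)
        else:
            # A stockout costs more trust than a sale earns.
            state.trust = max(0.0, state.trust - 0.03)

    state.day += 1
    return state


def run_episode(days: int = 30, seed: int = 0) -> ShopState:
    """Run the policy for a fixed number of days with a seeded simulator."""
    rng = random.Random(seed)
    state = ShopState()
    for _ in range(days):
        step(state, stub_policy(state), rng)
    return state
```

Calling `run_episode()` and reading `service_rate`, `trust`, and `cash` from the returned state gives a toy version of the stability metrics the summary mentions. Replacing `stub_policy` with a call to a real model would be the natural extension, with invalid JSON falling through to the no-op branch in `step`.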