Generative AI applications, particularly those built on large language models (LLMs), call for a different approach to dataset management than traditional machine learning. Traditional ML typically starts by assembling a comprehensive dataset up front. LLM development, by contrast, often begins with rapid prototyping on a general-purpose model, and the dataset and its schema are built up incrementally afterward for evaluation and improvement. LangSmith supports this workflow with flexible dataset schemas that can be defined and modified iteratively, keeping data consistent while allowing quick adaptation as project requirements change. The platform adds schema validation, versioning, and annotation capabilities, which streamline adding and reviewing data and help keep datasets clean as the LLM application improves. Together, these tools provide a framework for dataset curation in LLM applications that supports experimentation, debugging, and human annotation, all of which are important for optimizing AI model performance.
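
To make the idea of an evolving, validated schema concrete, here is a minimal sketch in plain Python. It does not use or imitate the LangSmith API. The `DatasetSchema` class, the field names (`question`, `answer`, `source_url`), and the two schema versions are all hypothetical, chosen only to show how a schema can start small during prototyping and later gain an optional field without invalidating earlier examples.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetSchema:
    """Hypothetical schema: maps field names to expected Python types."""

    required: dict[str, type]
    optional: dict[str, type] = field(default_factory=dict)

    def validate(self, example: dict) -> list[str]:
        """Return a list of problems; an empty list means the example is valid."""
        errors = []
        # Required fields must be present and have the expected type.
        for name, expected in self.required.items():
            if name not in example:
                errors.append(f"missing required field: {name}")
            elif not isinstance(example[name], expected):
                errors.append(f"{name}: expected {expected.__name__}")
        # Optional fields are only type-checked when present.
        for name, expected in self.optional.items():
            if name in example and not isinstance(example[name], expected):
                errors.append(f"{name}: expected {expected.__name__}")
        # Anything outside the schema is flagged so the dataset stays clean.
        allowed = self.required.keys() | self.optional.keys()
        for name in example:
            if name not in allowed:
                errors.append(f"unexpected field: {name}")
        return errors


# Version 1: the schema written during early prototyping.
schema_v1 = DatasetSchema(required={"question": str, "answer": str})

# Version 2: adds an optional field later, so older examples remain valid.
schema_v2 = DatasetSchema(
    required=schema_v1.required,
    optional={"source_url": str},
)

good = {"question": "What is LangSmith?", "answer": "A platform for LLM apps."}
bad = {"question": "What is LangSmith?", "answer": 42, "notes": "draft"}

errors_good = schema_v2.validate(good)
errors_bad = schema_v2.validate(bad)
print(errors_good)
print(errors_bad)
```

Adding new fields as optional rather than required is one way to let a schema evolve without rejecting examples collected under an earlier version. A real project would more likely express the schema in a standard format such as JSON Schema and rely on the platform's own validation.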