All Your Unstructured Data in a Databricks Delta Table. Just Say the Word.
Blog post from Unstructured
Claude Desktop offers a no-code, natural-language way to create and run ETL pipelines, using the Model Context Protocol (MCP) to integrate with tools such as Unstructured and Databricks. Released by Anthropic in 2024, MCP standardizes how AI applications interact with external data sources, letting developers build modular components that make data processing more accessible. The blog details how Unstructured preprocesses unstructured data into an LLM-ready form, which applications like Claude Desktop need in order to work with that data effectively. Users can set up workflows that move and transform data from Amazon S3 into Databricks Delta Tables without writing code, using conversational prompts to configure and run each step. The setup consists of connectors for the data source and destination, plus workflows made up of nodes for data transformation and enrichment, all orchestrated by Claude through MCP. The process simplifies data management and opens the way to further applications such as entity extraction and data classification, showcasing a collaborative approach to data engineering.
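To make the moving parts concrete, here is a minimal sketch of the structure described above: a source connector, a destination connector, and a workflow whose nodes sit between them. It is a plain Python model for illustration only, not the Unstructured, Databricks, or MCP API. Every class, field, and name in it (`Connector`, `Node`, `Workflow`, `s3-documents`, and so on) is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Connector:
    """A data source or destination (hypothetical model, not a real API)."""
    name: str
    role: str    # "source" or "destination"
    system: str  # illustrative label, e.g. "s3" or "databricks_delta_table"


@dataclass(frozen=True)
class Node:
    """One processing step inside a workflow (hypothetical model)."""
    name: str
    kind: str    # "transform" or "enrich"


@dataclass
class Workflow:
    """A source connector, an ordered list of nodes, and a destination connector."""
    name: str
    source: Connector
    destination: Connector
    nodes: list[Node] = field(default_factory=list)

    def validate(self) -> None:
        """Raise ValueError if a connector or node is used in the wrong place."""
        if self.source.role != "source":
            raise ValueError(f"{self.source.name} is not a source connector")
        if self.destination.role != "destination":
            raise ValueError(f"{self.destination.name} is not a destination connector")
        for node in self.nodes:
            if node.kind not in ("transform", "enrich"):
                raise ValueError(f"unknown node kind: {node.kind}")

    def plan(self) -> list[str]:
        """Return the step names in order, from source to destination."""
        self.validate()
        return [self.source.name, *(n.name for n in self.nodes), self.destination.name]


# Assumed example values: an S3 source feeding a Delta Table destination.
workflow = Workflow(
    name="s3-to-delta",
    source=Connector("s3-documents", "source", "s3"),
    destination=Connector("delta-table", "destination", "databricks_delta_table"),
    nodes=[
        Node("transform-documents", "transform"),
        Node("enrich-elements", "enrich"),
    ],
)

print(" -> ".join(workflow.plan()))
```

In the setup the post describes, the user never writes a definition like this by hand. Claude builds and runs the equivalent configuration through MCP in response to conversational prompts.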