Real-world agentic data platform: data model, open table format, and compute engine decisions
Josh Wo (Principal Architect @ LiveRamp)

🌟 Key Highlights

[00:15] Introduction: Discussion on "The Missing Middle," the layer between AI models and physical data that is often overlooked in infrastructure conversations.

[01:41] The Industry Inflection: SaaS applications are evolving into "CRUD databases with business logic," and that business logic is increasingly being handled by AI agents. The future involves compute engines querying raw databases directly, with business logic shifting to the AI layer [01:58].

[02:49] The Semantic Layer (The "Missing Middle"): AI agents struggle to understand raw physical schemas without a semantic layer that defines business terminology, relationships, and lineage [03:15]. Example: an agent needs to understand what "lift" or "segment" means in a marketing context, not just how to join two SQL tables [05:43].

[04:36] API-First Model Language: To support machine readability, the data catalog must be API-first, moving beyond human-centric UIs [04:47]. Tools like Gravitino are essential for cross-engine metadata federation, allowing agents to navigate data across clouds (AWS, GCP) and table formats (Iceberg, Delta Lake) [04:56].

[10:10] Compute as a Spectrum: Compute is no longer a static choice but a spectrum; platforms must support multiple engines (Spark, Trino, BigQuery) while maintaining a consistent semantic layer [10:15].

[11:15] Layers of an Agentic Data Platform: Breakdown of the four layers: Agentic Layer (LLMs), Semantic Layer (Catalogs & Policy), Compute Layer (Engines), and Data Layer (storage formats like Iceberg) [11:22].

[18:28] Survival Strategy for the AI Era: Emphasizes moving from seat-based pricing to usage-based models because of the high volume of machine-driven queries [18:45]. Recommends staying with open standards (like Iceberg and Gravitino) to avoid vendor lock-in and support multi-cloud flexibility [20:13].
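To make the semantic-layer point concrete, here is a minimal Python sketch of how a catalog might map a business term such as "lift" or "segment" to a physical table and a SQL expression that an agent could hand to a compute engine. It is not taken from the talk: the `TermDefinition` class, the `CATALOG` contents, the table names, and the formulas are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TermDefinition:
    """Hypothetical semantic-layer entry for one business term."""
    term: str
    description: str
    table: str         # physical table the term resolves to (made up)
    table_format: str  # open table format, e.g. "iceberg" or "delta"
    expression: str    # SQL expression that computes the term


# Illustrative catalog; every name and formula here is an assumption.
CATALOG = {
    "lift": TermDefinition(
        term="lift",
        description="Relative increase in conversion rate, exposed vs. control",
        table="marketing.campaign_results",
        table_format="iceberg",
        expression="(exposed_rate - control_rate) / control_rate",
    ),
    "segment": TermDefinition(
        term="segment",
        description="Named audience grouping used for targeting",
        table="marketing.audience_segments",
        table_format="delta",
        expression="segment_name",
    ),
}


def resolve(term: str) -> TermDefinition:
    """Look up a business term, ignoring case and surrounding whitespace."""
    key = term.strip().lower()
    if key not in CATALOG:
        raise KeyError(f"no semantic definition for term: {term!r}")
    return CATALOG[key]


def build_query(term: str) -> str:
    """Turn a business term into a SQL string an engine could run."""
    d = resolve(term)
    return f"SELECT {d.expression} AS {d.term} FROM {d.table}"


if __name__ == "__main__":
    print(build_query("Lift"))
```

In a real platform the definitions would presumably come from an API-first catalog rather than an in-memory dictionary, and the generated SQL would be routed to whichever engine fits the workload; the sketch only shows the term-to-schema translation step.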