
Apache Paimon: Unified Lake Storage for Data + Multimodal AI with Apache Iceberg Compatibility

Jul 16, 2025
18:21

Jingsong Li, Head of Alibaba Open-Source Data Lake Platform team, PMC Chair of Apache Paimon, and PMC Member of Apache Flink, delivered this keynote at Flink Forward Asia Singapore 2025.

Key takeaways
• Minute-level streaming updates: How Paimon + Flink enable real-time writes to the lake while automatically evolving schemas.
• Iceberg compatibility: Leveraging Iceberg’s new Deletion Vectors to create real-time synced views for existing Iceberg consumers, with zero refactoring required.
• Multimodal AI data: Integrating the Lance file format to efficiently store and retrieve images, vectors, and other modality data for large-scale model training.
• Demo: End-to-end pipeline from Flink ingestion → Paimon storage → Iceberg-compatible query → AI model access.
• Roadmap: Future features that blur the line between streaming, lakehouse, and vector databases.

If you maintain a data lake or plan multimodal AI workloads, this session shows how Paimon unifies it all.

This video is part of the “Flink Forward Asia Singapore 2025 – Keynote Sessions” playlist.

Stay connected with FFA:
• Website: https://asia.flink-forward.org/
• LinkedIn: https://www.linkedin.com/company/flink-forward-asia
• X: https://x.com/flink_4_asia

Follow for more data-lake and AI innovations!

