Data-driven organizations know that unlocking real-time analytics from streaming data isn’t just about collecting and transmitting events. It’s about getting high-quality, governed, and query-ready tables into the hands of analysts and business users while ensuring enterprise-grade security and compliance. Traditionally, moving data from Apache Kafka® into analytic tables required complex ETL pipelines, manual data wrangling, and custom governance processes. This resulted in delays, operational risk, and a gap between real-time data and actionable insight.
Tableflow on Confluent Cloud bridges that gap. It transforms Kafka topics directly into modern open table formats such as Apache Iceberg™ and Delta Lake, stored securely in cloud object storage. Teams can now build truly real-time, trustworthy, and governed data pipelines with the click of a button. With the General Availability (GA) of Delta Lake and Unity Catalog support, plus new upsert materialization, advanced error handling, and enterprise security features, Tableflow becomes the central connective tissue for resilient, flexible analytics architectures. This launch means organizations can:
Activate streaming data as Delta Lake tables instantly for Databricks, Apache Spark™, Apache Flink®, and other compatible engines
Scale real-time change data capture (CDC) analytics across billions of rows
Ensure resilience with advanced error management and troubleshooting
Maintain robust security with full encryption controls, especially for regulated industries
Materializing Kafka topics as Delta Lake tables directly in your own cloud object storage (Bring Your Own Storage, or BYOS) is now generally available. This feature provides analytics-grade, open format tables from streaming data with several key benefits. Users can create Delta tables with a few clicks, selecting Delta Lake as the format and their destination bucket, with the entire pipeline automatically managed. Tableflow also supports dual format, allowing the same topic to be materialized as both Iceberg and Delta tables. Automatic table maintenance includes compaction and vacuuming, ensuring fast and reliable analytics. Finally, Tableflow provides live metrics for observability, detailing rows written, bytes compacted, and rows rejected, giving data teams visibility into data quality and pipeline performance.
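Once materialized, the Delta table in your bucket is readable by any Delta-compatible engine. The sketch below assumes open-source Spark with the Delta Lake package; the bucket path, topic name, and versions are placeholders, and S3 credentials plus the hadoop-aws dependency are omitted for brevity.

```python
from pyspark.sql import SparkSession

# Minimal sketch: read a Tableflow-materialized Delta table from your own
# S3 bucket (BYOS). Paths and versions below are placeholders.
spark = (
    SparkSession.builder
    .appName("read-tableflow-delta")
    # Delta Lake support for open-source Spark; on Databricks this is built in.
    .config("spark.jars.packages", "io.delta:delta-spark_2.12:3.2.0")
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .getOrCreate()
)

# Placeholder path to the Delta table Tableflow wrote for one Kafka topic.
orders = spark.read.format("delta").load("s3a://your-bucket/tableflow/orders")
orders.createOrReplaceTempView("orders")
spark.sql("SELECT COUNT(*) AS row_count FROM orders").show()
```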
In addition to Delta Lake materialization, Unity Catalog integration is now generally available. This integration automatically publishes Delta tables created by Tableflow to Databricks Unity Catalog as external Delta Lake tables, unlocking the full power of Databricks governance and access control. Tableflow automatically maps Kafka clusters to Unity Catalog schemas and registers every Delta-mapped topic as an external table, providing enterprise-grade access controls and audit logs through Unity Catalog's built-in governance features. Linking Tableflow to Unity Catalog takes only a few minutes: grant catalog access and configure external storage permissions. Users keep catalog flexibility and can query materialized tables through Databricks, Spark, or any compatible engine.
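After registration, the table behaves like any other external table in Unity Catalog. The snippet below is from a Databricks notebook, where spark is predefined; the catalog, schema, and table names are placeholders, since Tableflow maps your Kafka cluster to a schema and registers each Delta-enabled topic under it.

```python
# Databricks notebook sketch: catalog, schema, and table names are placeholders.
# Tableflow maps the Kafka cluster to a Unity Catalog schema and registers each
# Delta-enabled topic as an external table under it.
spark.sql("SELECT * FROM tableflow_catalog.`lkc-abc123`.orders LIMIT 10").show()

# Unity Catalog governance applies as with any other table, for example:
spark.sql("GRANT SELECT ON TABLE tableflow_catalog.`lkc-abc123`.orders TO `analysts`")
```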
Tableflow for Azure brings these same capabilities to the Microsoft Azure analytics ecosystem. It enables instant data materialization, letting developers expose Kafka topics as open format tables (Iceberg or Delta) stored in Azure Data Lake Storage Gen2 (ADLS Gen2) and access them through Databricks Unity Catalog without custom pipelines or schema wrangling. Customers can choose Confluent-managed storage or their own Azure storage containers, with automatic synchronization into catalogs such as Unity Catalog, Snowflake Open Catalog, and Apache Polaris for immediate analytic access. Tableflow automates schema mapping, data type conversion, and ongoing table maintenance, eliminating manual effort and reducing errors. Deployment is flexible via the Confluent Cloud UI, CLI, or Terraform to suit teams of any size, and the Tableflow Hub UI provides integrated governance by tracking which Kafka topics are mapped to which storage and catalogs.
These features accelerate analytics by reducing ETL coding and batch jobs, lower operational costs by eliminating duplicate infrastructure, and improve reliability through automated management. Extending the Tableflow experience into Azure aligns with growing enterprise demand and with Confluent’s partnerships with Microsoft and Databricks, accelerating time to insight, reducing complexity, and supporting Azure-focused analytics and AI use cases.
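For topics synced to an Iceberg-compatible catalog such as Apache Polaris or Snowflake Open Catalog, a Spark session can attach to the catalog's REST endpoint and query the table directly. The configuration below is a sketch with placeholder endpoint, credentials, and names; take the real values from your catalog integration settings.

```python
from pyspark.sql import SparkSession

# Minimal sketch: query a Tableflow Iceberg table through an Iceberg REST
# catalog (e.g., Apache Polaris or Snowflake Open Catalog). All URIs, names,
# and credentials are placeholders.
spark = (
    SparkSession.builder
    .appName("read-tableflow-iceberg")
    .config("spark.jars.packages",
            "org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.6.1")
    .config("spark.sql.extensions",
            "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
    .config("spark.sql.catalog.tableflow", "org.apache.iceberg.spark.SparkCatalog")
    .config("spark.sql.catalog.tableflow.type", "rest")
    .config("spark.sql.catalog.tableflow.uri", "https://<catalog-endpoint>/api/catalog")
    .config("spark.sql.catalog.tableflow.credential", "<client-id>:<client-secret>")
    .config("spark.sql.catalog.tableflow.warehouse", "<catalog-name>")
    .getOrCreate()
)

spark.sql("SELECT * FROM tableflow.`lkc-abc123`.orders LIMIT 10").show()
```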
A key challenge for streaming analytics is reconciling updates, not just appends, to operational tables. Tableflow’s new upsert mode brings CDC support at scale through CDC-informed tables, automatic compaction and deletes, scalability, and flexible querying. CDC-informed tables use user-defined primary keys to track changes, automatically maintaining “merge on read” tables. Updates and deletes create separate “delete files,” which are merged and compacted automatically for efficiency and compliance. Upsert tables can handle billions of rows and terabytes of data while keeping materialization lag under 15 minutes, and compatible compute engines reconcile upserts at query time for accurate results. Because of the extra data rewriting, compacting and merging may increase Tableflow processing cost compared with append mode. This feature is ideal for CDC ingestion of transactional tables such as orders, customer profiles, and Internet of Things (IoT) sensor updates.
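Compatible engines perform this merge-on-read reconciliation automatically at query time; the sketch below only illustrates the equivalent logic on a raw changelog. It assumes a SparkSession with Delta support (as in the earlier sketch), and the primary key, change-timestamp column, and delete marker are assumed column names, not Tableflow’s actual schema.

```python
from pyspark.sql import functions as F
from pyspark.sql.window import Window

# Conceptual sketch of upsert reconciliation: keep the most recent change per
# primary key and drop keys whose latest change is a delete. Column names
# (order_id, change_ts, op) and the path are illustrative only.
changes = spark.read.format("delta").load("s3a://your-bucket/tableflow/orders_cdc")

latest = (
    changes
    .withColumn("rn", F.row_number().over(
        Window.partitionBy("order_id").orderBy(F.col("change_ts").desc())))
    .filter("rn = 1")               # most recent change per key
    .filter("op != 'DELETE'")       # drop keys whose latest change is a delete
    .drop("rn")
)
latest.show()
```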
Tableflow’s advanced error handling introduces three user-selectable strategies: Suspend, Skip, and Log. Suspend, the default, stops Tableflow upon unrecoverable errors, requiring manual intervention. Skip ensures continuity by skipping problematic records, though data is lost. Log sends malformed records to a Dead Letter Queue (DLQ)—a special Kafka topic with a standardized Avro schema—for logging, reprocessing, or analysis. DLQ records include timestamps, error codes, human-readable messages, metadata, and raw source record data. Users can monitor skipped/logged events in the Tableflow metrics panel, subscribe to error notifications, reprocess DLQ records, and troubleshoot root causes to maintain pipeline resilience.
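Because the DLQ is an ordinary Kafka topic with an Avro value schema, it can be consumed like any other topic for triage or reprocessing. The sketch below uses the confluent-kafka Python client; the DLQ topic name, endpoints, credentials, and record field names are placeholders, so check the actual schema registered for your DLQ in Schema Registry.

```python
from confluent_kafka import Consumer
from confluent_kafka.schema_registry import SchemaRegistryClient
from confluent_kafka.schema_registry.avro import AvroDeserializer
from confluent_kafka.serialization import SerializationContext, MessageField

# Minimal sketch: read Tableflow DLQ records for triage. Topic name, endpoints,
# credentials, and record field names are placeholders.
schema_registry = SchemaRegistryClient({
    "url": "https://<sr-endpoint>",
    "basic.auth.user.info": "<SR_KEY>:<SR_SECRET>",
})
avro_deserializer = AvroDeserializer(schema_registry)  # writer schema resolved by ID

consumer = Consumer({
    "bootstrap.servers": "<bootstrap-endpoint>",
    "group.id": "tableflow-dlq-triage",
    "auto.offset.reset": "earliest",
    "security.protocol": "SASL_SSL",
    "sasl.mechanism": "PLAIN",
    "sasl.username": "<API_KEY>",
    "sasl.password": "<API_SECRET>",
})
consumer.subscribe(["<your-tableflow-dlq-topic>"])

while True:
    msg = consumer.poll(1.0)
    if msg is None:
        continue
    record = avro_deserializer(
        msg.value(), SerializationContext(msg.topic(), MessageField.VALUE))
    # Field names are illustrative; inspect the DLQ schema for the real ones.
    print(record.get("error_code"), record.get("error_message"))
```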
Security-conscious organizations demand strict control over their data encryption. Tableflow extends enterprise-grade encryption by supporting Bring Your Own Key (BYOK) for both Confluent-managed storage and customer-managed object storage. For Confluent-managed storage, Tableflow uses the same encryption key as your Kafka cluster, requiring no extra configuration. When Tableflow writes to your own object storage, you have full control over the encryption method and manage keys and permissions yourself, including AWS KMS (Key Management Service) setup. This feature is generally available on Amazon Web Services (AWS) for Dedicated clusters, covering both Delta Lake and Iceberg tables with encryption at the storage layer. To ensure operational reliability and prevent data leakage, Tableflow suspends operations if key permissions are revoked and resumes them once access is restored.
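When Tableflow writes SSE-KMS-encrypted objects to your own bucket, the key policy must allow the role Tableflow assumes in your account to use the key. The boto3 sketch below appends a policy statement with the KMS actions typically required for encrypted S3 writes; the role ARN, key ARN, and exact action list are assumptions here, so follow the Tableflow BYOK documentation for the authoritative setup.

```python
import json
import boto3

# Sketch only: grant the IAM role Tableflow assumes (placeholder ARN) the KMS
# actions commonly needed for SSE-KMS object writes. Verify the exact principal
# and actions against the Tableflow BYOK documentation before applying.
kms = boto3.client("kms", region_name="us-east-1")
key_id = "arn:aws:kms:us-east-1:111122223333:key/<your-key-id>"

statement = {
    "Sid": "AllowTableflowAccessRole",
    "Effect": "Allow",
    "Principal": {"AWS": "arn:aws:iam::111122223333:role/<your-tableflow-access-role>"},
    "Action": ["kms:Decrypt", "kms:GenerateDataKey", "kms:DescribeKey"],
    "Resource": "*",
}

# Append the statement to the key's existing default policy.
policy = json.loads(kms.get_key_policy(KeyId=key_id, PolicyName="default")["Policy"])
policy["Statement"].append(statement)
kms.put_key_policy(KeyId=key_id, PolicyName="default", Policy=json.dumps(policy))
```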
Tableflow on Confluent Cloud now fully empowers organizations to bridge the gap between fast-moving operational data and trusted, governed tables for analytics without compromising security, scale, or flexibility. With easy, automatic integration into Delta Lake and Databricks Unity Catalog, true CDC/upsert table materialization, enterprise security via BYOK, and resilient error handling, it's never been easier to connect real-time data to business insights.
Ready to unlock the full, transformative potential of your streaming data for cutting-edge AI and advanced analytics? Explore Tableflow today.
Learn more: Dive into the Tableflow product documentation.
See it in action: Watch our short introduction video or Tim Berglund's lightboard explanation.
Get started: If you're already using Confluent Cloud, navigate to the Tableflow section for your cluster. New users can get started with Confluent Cloud for free and explore Tableflow's capabilities.
Contact us today for a personalized demo of Tableflow and start unlocking the full potential of your streaming data on Confluent Cloud. We’re excited to see how you leverage Tableflow to turn your real-time data streams into tangible business value.
The preceding outlines our general product direction and is not a commitment to deliver any material, code, or functionality. The development, release, timing, and pricing of any features or functionality described may change. Customers should make their purchase decisions based on services, features, and functions that are currently available.
Confluent and associated marks are trademarks or registered trademarks of Confluent, Inc.
Apache®, Apache Kafka®, Kafka®, Apache Flink®, Flink®, Apache Iceberg™, Iceberg™, Apache Spark™, Spark™, and the Kafka, Flink, Iceberg, and Spark logos are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. No endorsement by the Apache Software Foundation is implied by using these marks. All other trademarks are the property of their respective owners.