Snowflake Expands Platform with Apache Iceberg Integration

Snowflake is making customers’ data, models, and applications even more powerful by embracing open data and interoperability across the ecosystem, ensuring that all users benefit from Snowflake’s leading governance and discovery in the AI Data Cloud

SMEStreet Edit Desk
Snowflake, the AI Data Cloud company, at its annual user conference, Snowflake Summit 2024, announced advancements to its single, unified platform that empower thousands of organizations with increased flexibility and interoperability across their enterprise data — regardless of where it resides. Snowflake is also making it easier for customers to discover and collaborate on the data, models, and applications they need, when they need them, in addition to enhancing its powerful platform so users gain increased performance and efficiency in the AI Data Cloud.



“Snowflake is making customers’ data, models, and applications even more powerful by embracing open data and interoperability across the ecosystem, ensuring that all users benefit from Snowflake’s leading governance and discovery in the AI Data Cloud,” said Prasanna Krishnan, Head of Collaboration and Snowflake Horizon, Snowflake. “We’re providing customers with new ways to seamlessly access, understand, protect, and drive value with their data at the speed and scale they need to be successful.”



Bring Increased Interoperability with Open Data to the AI Data Cloud 
The popular open table format Apache Iceberg has revolutionized how organizations access and drive value from their data. Snowflake is now making it even easier for customers to bring the platform’s ease of use, performance, governance, and collaboration to their Iceberg data stored externally with Iceberg Tables (now generally available), unlocking full storage interoperability.



Customers including Booking.com, Capital One, Indeed, Komodo Health, and more are already leveraging Iceberg Tables to implement open, flexible architectural patterns, including data lakehouses, data lakes, and data meshes, to further simplify the development of pipelines, models, and more. With Iceberg Tables, organizations can work with their data on their terms, gaining increased flexibility and support over their open data to drive value.



“Apache Iceberg’s large and diverse ecosystem of contributors and products made it a clear choice for us to provide an open and common data layer across our internal and external ecosystem,” said Thomas Davey, Chief Data Officer, Booking.com. “With Iceberg, we can broaden our use cases for Snowflake as our open data lakehouse for machine learning, AI, business intelligence, and geospatial analysis, even for data stored externally.”


Iceberg Tables comes on the heels of the recently announced Polaris Catalog, a vendor-neutral and fully open catalog implementation for Apache Iceberg. Polaris Catalog enables cross-engine interoperability, further providing organizations with new levels of choice, flexibility, and control over their data. Organizations can get started running Polaris Catalog hosted in Snowflake’s AI Data Cloud (Snowflake-hosted in public preview soon), or self-host it in their own infrastructure using containers.



Create A Well-Governed Data Foundation to Accelerate AI and Apps

The rise of AI has made every organization’s enterprise data even more valuable. As such, organizations are left grappling with the rapid increase of data, large language models (LLMs), applications, and more spread out across various business units and teams. Snowflake is advancing Snowflake Horizon, Snowflake’s built-in governance and discovery solution that provides a unified set of compliance, security, privacy, interoperability, and access capabilities, to enable enterprises to protect their data products so that customers can take action on them, both for content internal to an organization and for content sourced from third parties.



As a part of Snowflake Horizon’s new capabilities, the Internal Marketplace (private preview) allows users to curate and publish data products such as data, models, and applications specifically for teams within their organization to discover and use — while preventing unintended sharing to external parties. In addition, teams can securely limit who within an organization can see or access their content. Snowflake is further extending its industry-leading collaboration capabilities to include the sharing of AI models (private preview soon), Iceberg Tables, and Dynamic Tables.



Snowflake is also putting the power of AI to work so all users can quickly discover relevant content for their use cases. Universal Search (now generally available) allows customers to search the AI Data Cloud, spanning content in Snowflake storage, external Iceberg storage, and from third-party providers. Built on state-of-the-art search engine technology from Neeva (acquired by Snowflake in May 2023), Universal Search lets users find the data products they need using natural language and seamlessly take action on them. Additionally, to aid discovery and curation, Snowflake is introducing new AI-Powered Object Descriptions (private preview soon), which will automatically generate relevant context and comments for tables and views.



“As a leading digital financial services company, it’s imperative we have a unified and well-governed data foundation for a holistic view across our approximately 11 million customers and their needs. Snowflake Horizon’s built-in governance, discovery, and protection capabilities ensure that we’re operating with the highest-degree of compliance, security, privacy, interoperability, and access,” said Scott Richardson, CIO of Enterprise Data, Analytics & AI at Ally Financial. “Snowflake helps us eliminate data silos for increased insights into every corner of our organization and enhance collaboration — both internally and with customers and partners — so we can act on our ‘Do It Right’ values and deliver exceptional experiences to our customers and employees.”



Gain Faster Platform Performance and Lower Operating Costs

With nearly every product release, Snowflake is committed to improving the performance and efficiency of its platform for customers. As a result, the Snowflake Performance Index (SPI), which measures the impact of Snowflake’s performance, reports that Snowflake has reduced organizations’ query duration across stable customer workloads by 27% since it started tracking this metric, and by 12% over the past 12 months. Snowflake is also making the loading of data faster, easier, and more cost effective. Customers now benefit from performance improvements of up to 25% for loading JSON files and up to 50% for loading Parquet files, without any action required on their end.



In addition to the 40+ currently supported cloud regions, Snowflake also announced that it is expanding the AI Data Cloud footprint to some highly regulated and sovereign markets globally. This includes an EU-only data boundary that keeps all customer data, alongside relevant service and usage data, within regional borders to provide European customers with stronger data residency and data sovereignty assurances to meet regional regulatory requirements. Furthermore, Snowflake will be offering a separate environment to Department of Defense (DoD) customers that includes a networking integration with Boundary Cloud Access Point (BCAP), ensuring that Impact Level 4 (IL4) security controls are met.



Continued Innovation at Snowflake Summit 2024
Snowflake also announced new advancements to Snowflake Cortex AI and Snowflake ML that unlock the next wave of enterprise AI for customers; new tools that accelerate how developers build in the AI Data Cloud; a new collaboration with NVIDIA that customers and partners can harness to build customized AI data applications in Snowflake; and more at Snowflake Summit 2024.  