Databricks, the data and AI company that pioneered the data lakehouse paradigm, today announced new capabilities for the Databricks Lakehouse Platform before a sold-out crowd at the annual Data + AI Summit in San Francisco. The capabilities revealed include best-in-class data warehousing performance and functionality, enhanced data governance, new data sharing developments such as an analytics marketplace and data clean rooms for secure data collaboration, automatic cost optimization for ETL operations, and machine learning (ML) lifecycle improvements.
"Our customers want to be able to do business intelligence, AI, and machine learning on one platform, where their data already resides. This requires best-in-class data warehousing capabilities that can run directly on their data lake. Benchmarking ourselves against the highest standards, we have proven time and again that the Databricks Lakehouse Platform gives data teams the best of both worlds on a simple, open, and multi-cloud platform. Today's announcements are a significant step forward in advancing our Lakehouse vision, as we are making it faster and easier than ever to maximize the value of data, both within and across companies."
Ali Ghodsi, Co-founder and CEO of Databricks
Databricks also helps customers share and collaborate on data across organizational boundaries. Cleanrooms, which will be available in the coming months, will let businesses share and join data in a secure, hosted environment with no data replication required. In media and advertising, for example, two companies may want to assess audience overlap and campaign reach. Existing clean room solutions are limited: they are often restricted to SQL tools and risk duplicating data across multiple platforms. Cleanrooms let organizations collaborate with customers and partners on any cloud and give them the flexibility to run complex computations and workloads using both SQL and data science tools, such as Python, R, and Scala, while maintaining consistent data privacy controls.