BUSINESS INTELLIGENCE, BIG DATA MANAGEMENT, DATA SCIENCE
Prnewswire | April 28, 2023
Starburst, the data lake analytics platform, today announced a new integration with dbt Cloud, the fastest and most reliable way to deploy dbt. With the integration, which includes an enhanced adapter between dbt Cloud and Starburst's SaaS offering, Starburst Galaxy, dbt users can now easily build data pipelines spanning multiple data sources on one central plane.
As data becomes increasingly distributed, the ability to federate queries across disparate data sources has become critical for conducting lakehouse and data lake analytics. While migrating to centralize on a single cloud data warehouse is one option, most enterprise data is still spread across multiple platforms, including on-prem databases and object storage. With this integration, dbt users can easily federate that data across multiple disparate sources or access new data before it lands in their central data lake or warehouse.
"Combining the power of Starburst's data lake analytics platform with dbt Cloud, enterprise customers can more easily transform data wherever it lives without suffering through cumbersome and expensive ETL processes," said Harrison Johnson, Head of Technology Partnerships at Starburst. "This integration addresses the needs of the enterprise customer base, helping them get the most out of their existing systems and extending dbt's world class analytics engineering workflow platform to new cloud-first use cases without additional operational overhead."
Using a legacy ETL solution to transform and move data through brittle, manually configured data pipelines is cumbersome and expensive and can introduce risk. Yet, using a central cloud data warehouse for some use cases while other data exists in silos means organizations are not getting the most out of all their data. With this integration, dbt Cloud customers can get the most value out of all of their data with confidence, regardless of where it currently resides, without adding the complexity of data ingestion (ETL) pipelines. This is a major benefit for dbt users who would otherwise need to rely on data engineering pipelines for ingestion.
"dbt enables data teams to work faster and more efficiently to bring order to organizational knowledge," said Nikhil Kothari, Head of Technology Partnerships at dbt Labs. "By combining the power of dbt Cloud with the flexibility of Starburst, we're empowering a new segment of users to easily create analytical data assets, without having to be constrained by where the data lives."
The new adapter is now generally available in dbt Cloud. In just a few clicks, customers can create a new dbt Cloud project, select Starburst as the data platform and connect. Within minutes, customers can make use of Starburst's high-performance query engine to transform data using dbt. To learn more about Starburst, its offerings and integrations, please visit starburst.io.
For data-driven companies, Starburst offers a full-featured data lake analytics platform, built on open source Trino. Our platform includes the capabilities needed to discover, organize, and consume data without the need for time-consuming and costly migrations. We believe the lake should be the center of gravity, and be the starting point for querying disparate data. With Starburst, teams can access more complete data, lower the cost of infrastructure, use the tools best suited to their specific needs, and avoid vendor lock-in. Trusted by companies like Comcast, Grubhub, and Priceline, Starburst helps companies make better decisions faster on all their data.
BUSINESS INTELLIGENCE, BIG DATA MANAGEMENT, DATA SCIENCE
Prnewswire | May 17, 2023
Crux, a pioneer in the external data integration, transformation, and observability space, today announced the launch of the Crux External Data Platform ("EDP"), the first SaaS offering that enables enterprises to automate the onboarding of any external dataset directly from vendors into their organization, driving better, faster decisions. The new self-service cloud platform allows data teams to onboard and transform external data for analytics use up to ten times faster than traditional manual methods.
"The Crux External Data Platform is truly transformative," said Will Freiberg, CEO of Crux. "Just as the cloud made it possible for enterprises to reduce infrastructure and maintenance costs, consolidate on-premises data warehouses, scale on-demand, and immediately access key resources, the Crux platform removes critical pre-processing bottlenecks, empowering data engineers to onboard external data products into their data warehouse or cloud analytics environments in minutes."
External data from governments, non-profits, and commercial data vendors is a critical business resource in many sectors such as finance, supply chain, retail, healthcare, and insurance. Crux has partnerships with over 265 leading data providers including MSCI, Moody's, S&P, SIX, FactSet, and Morningstar.
Crux builds data pipelines at scale, offering enterprises the ability to ingest any custom data source, and operates more than 60,000 pre-engineered pipelines to deliver external datasets in a data science and analytics-ready format. Now, Crux is making its robust capabilities directly available to data engineers with the Crux External Data Platform. Through its multi-tenant, secure SaaS platform, Crux ensures data is delivered reliably with protected, role-based access and customizable operations applicable to data products, connections, notification policies, users, and system-wide settings. The platform reduces the need for expensive infrastructure and increases the velocity of data engineering teams. Additionally, EDP presents customers with one centralized data hub for visibility into the health and performance of their organization's catalog of external data pipelines. With intuitive automation and pattern recognition, data teams can immediately select and access new datasets, minimizing the time and effort required to access new data for faster insights.
According to a Gartner® report, "The increased adoption of cloud as a data platform has exacerbated existing challenges with application time to market, data quality and integration issues, analytics shortfalls, poor performance, and cost management. Data professionals are turning to artificial intelligence (AI)- and machine learning (ML) driven automation in hopes of better optimizing these data management (DM) related areas."1
Early customers, including Two Sigma and Goldman Sachs, are leveraging the platform's advanced AI and ML capabilities to automate and streamline data workflows, decreasing the need for manual intervention and improving accuracy. With robust security and governance features, the platform ensures data is handled in a compliant and secure manner.
"Managing external data is a special kind of hard," said Dan Lynn, SVP of Products. "Now, Crux provides the industry's first self-service SaaS platform for data integration, transformation, and observability that is specifically designed to tackle the unique challenges of external data. It's an opportunity for data engineers to flip the ratio from 70% of time spent preparing data to 70% of time driving higher value analytics and insights."
"Integrating, transforming, and absorbing data into our pipelines is a key focus for us," said Jeff Wecker, CTO, Two Sigma. "Crux's EDP product launch will help us scale the capacity of our data pipelining activities, delivering a quicker path to high-quality data sets and faster access to systematic data-driven insights for our researchers."
"We are excited to have collaborated with Crux in shaping their product roadmap and helping drive towards the launch of Crux's new EDP offering," said Abhishek Narang, Managing Director & Tech Fellow, Data Engineering at Goldman Sachs. "Leveraging Crux in our Data Engineering framework and the Legend Platform at Goldman Sachs has further reduced the time-to-value of our third-party data assets, helping to enable us to operate more efficiently and best serve clients. With the launch of EDP, it is satisfying to see so much of the work we have accomplished together now mechanized in an easily accessible self-service product offering."
Crux is a cloud-based data integration, transformation, and operations platform that accelerates the value realization between external and internal data. Crux partners with our customers to ensure they get the data they need, how they need it and where they need it. Its team builds data pipelines at scale and operates over 60K pre-engineered pipelines, delivering public and external datasets to the destination of choice. Crux pipelines come with embedded data monitoring, validations, and transformations, and are supported 24/7 by our global operations team. Crux was awarded a Google Cloud Customer Award in 2021 in the Cross-Industry category. Crux works with enterprise clients and is backed by Two Sigma, Goldman Sachs, Morgan Stanley, and Citi, among others.
BUSINESS INTELLIGENCE, BIG DATA MANAGEMENT, DATA ARCHITECTURE
Globenewswire | May 22, 2023
Komprise, the leader in analytics-driven unstructured data management as a service, today announced new governance and self-service capabilities that simplify departmental use of Deep Analytics, a query-based way to find and tag file and object data across hybrid cloud storage silos. IT organizations need to maintain data governance and data security while also making it easier for users to find, use and manage data. Often, these goals are in conflict and require significant IT overhead. The Komprise Intelligent Data Management Spring 2023 release minimizes administrative effort and improves unstructured data governance with new capabilities:
Share-Based Access for Groups: A recent Informatica survey revealed that data governance is the top priority among chief data officers and that 68% of data leaders will increase data management investments in 2023. But managing access control while enabling self-service unstructured data management for users often requires IT to spend considerable time provisioning each user’s role-based file and object storage access. Komprise simplifies this task by giving administrators the ability to assign group access to shares using Active Directory, which automatically provisions data management access only to users in those groups.
Directory Explorer: A new Directory Explorer gives authorized line-of-business teams and departmental researchers the ability to augment the global search capabilities of Deep Analytics with a familiar browser interface. This means users can drill down into individual directories. Users now have multiple ways to find what they need: either by searching with queries on metadata and tags through Deep Analytics or, if they know exactly where the data is, by using the Directory Explorer.
Exclusion Query Filters: The Global File Index search capabilities of Komprise Deep Analytics now include the ability to filter data using exclusions (e.g., "all data except .log files" or "all data except in .dat directories") and then use these queries to create data management policies. This makes it easy to specify data management policies in situations where outliers can prevent data movement.
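To illustrate the idea behind exclusion filters, here is a minimal Python sketch of the two example rules quoted above ("all data except .log files" and "all data except in .dat directories"). It is not Komprise's API or query syntax; the function names `is_excluded` and `select_for_policy`, their parameters, and the sample paths are all hypothetical and exist only for illustration.

```python
from pathlib import PurePosixPath


def is_excluded(path, excluded_suffixes=(".log",), excluded_dir_suffixes=(".dat",)):
    """Return True if a path matches one of the (hypothetical) exclusion rules.

    Rule 1: the file itself has an excluded extension, e.g. ".log".
    Rule 2: any parent directory name ends with an excluded suffix, e.g. ".dat".
    """
    p = PurePosixPath(path)
    if p.suffix.lower() in excluded_suffixes:
        return True
    # p.parts[:-1] holds the directory components, without the file name.
    return any(part.lower().endswith(excluded_dir_suffixes) for part in p.parts[:-1])


def select_for_policy(paths):
    """Keep only the paths that survive the exclusion rules."""
    return [p for p in paths if not is_excluded(p)]


# Hypothetical sample paths, used only to exercise the filter.
sample_paths = [
    "/share/a/report.csv",
    "/share/a/app.log",           # excluded: .log file
    "/share/raw.dat/sample.csv",  # excluded: inside a .dat directory
    "/share/b/notes.txt",
]
selected = select_for_policy(sample_paths)
```

In this sketch, `selected` holds only the two paths that match neither rule, which is the set a data management policy would then act on.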
“Komprise is on a mission to change how enterprises manage unstructured data to deliver maximum cost savings and value,” says Kumar Goswami, Komprise co-founder and CEO. “Increasingly, line of business and research teams rely upon data that has been historically locked away in disparate storage systems to run analytics, AI and ML. Our latest release makes it dramatically easier for teams to find and manage their own data, while simplifying governance for IT.”
Komprise Intelligent Data Management Spring 2023 is available today. Deep Analytics is included with the full software-as-a-service (SaaS) platform. Learn more at komprise.com/what’s new.
Komprise is a provider of unstructured data management and mobility software that frees enterprises to easily analyze, mobilize, and monetize the right file and object data across clouds without shackling data to any vendor. With Komprise Intelligent Data Management, you can cut 70% of enterprise storage, backup and cloud costs while making data easily available to cloud-based data lakes and analytics tools. www.komprise.com