Data Architecture

Databricks Launches Data Lakehouse for Retail and Consumer Goods Customers

Databricks, the Data and AI company and pioneer of the data lakehouse architecture, today announced Databricks Lakehouse for Retail, the company's first industry-specific data lakehouse for retail and consumer goods (CG) customers. With Lakehouse for Retail, data teams get a centralized data and AI platform tailored to the most critical data challenges facing retailers, partners, and their suppliers. Early adopters include industry-leading customers and partners such as Walgreens, Columbia, H&M Group, Reckitt, Restaurant Brands International, 84.51° (a subsidiary of Kroger Co.), Co-Op Food, Gousto, Acosta and more.

"As the retail and healthcare industries continue to undergo transformative change, Walgreens has embraced a modern, collaborative data platform that provides a competitive edge to the business and, most importantly, equips our pharmacists and technicians with timely, accurate patient insights for better healthcare outcomes," said Luigi Guadagno, Vice President, Pharmacy and HealthCare Platform Technology at Walgreens. "With hundreds of millions of prescriptions processed by Walgreens each year, Databricks' Lakehouse for Retail allows us to unify all of this data and store it in one place for a full range of analytics and ML workloads. By eliminating complex and costly legacy data silos, we've enabled cross-domain collaboration with an intelligent, unified data platform that gives us the flexibility to adapt, scale and better serve our customers and patients."

"Databricks has always innovated on behalf of our customers and the vision of lakehouse helps solve many of the challenges retail organizations have told us they're facing," said Ali Ghodsi, CEO and Co-Founder at Databricks. "This is an important milestone on our journey to help organizations operate in real-time, deliver more accurate analysis, and leverage all of their customer data to uncover valuable insights. Lakehouse for Retail will empower data-driven collaboration and sharing across businesses and partners in the retail industry."

Databricks' Lakehouse for Retail delivers an open, flexible data platform, data collaboration and sharing, and a collection of powerful tools and partners for the retail and consumer goods industries. Designed to jumpstart the analytics process, new Lakehouse for Retail Solution Accelerators offer a blueprint of data analytics and machine learning use cases and best practices to save weeks or months of development time for an organization's data engineers and data scientists. Popular solution accelerators for Databricks' Lakehouse for Retail customers include:

  • Real-time Streaming Data Ingestion: Power real-time decisions critical to winning in omnichannel retail with point-of-sale, mobile application, inventory and fulfillment data.
  • Demand forecasting and time-series forecasting: Generate more accurate forecasts in less time with fine-grained demand forecasting to better predict demand for all items and stores.
  • ML-powered recommendation engines: Purpose-built recommendation models for every stage of the buyer journey - including neural network, collaborative filtering, content-based recommendations and more - enable retailers to create a more personalized customer experience.
  • Customer Lifetime Value: Examine customer attrition, better predict churn behavior, and segment consumers by lifetime value with a collection of customer analytics accelerators that help improve decisions on product development and personalized promotions.
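The accelerators above ship as notebooks; as a minimal illustration of what "fine-grained" demand forecasting means here - one lightweight model per store/item combination rather than a single aggregate model - the sketch below applies simple exponential smoothing per (store, item) pair. All names and numbers are illustrative, not the accelerator's actual code:

```python
def forecast_demand(sales, alpha=0.3):
    """Fine-grained demand forecast: one simple exponential-smoothing
    level per (store, item) pair. `sales` is an iterable of
    (store, item, units_sold) rows in chronological order."""
    level = {}
    for store, item, units in sales:
        key = (store, item)
        if key not in level:
            level[key] = float(units)  # initialize with the first observation
        else:
            # new level = alpha * latest observation + (1 - alpha) * old level
            level[key] = alpha * units + (1 - alpha) * level[key]
    return level  # the smoothed level doubles as the next-period forecast

# Toy history: two stores selling the same SKU
history = [
    ("store_1", "sku_42", 10),
    ("store_1", "sku_42", 12),
    ("store_2", "sku_42", 5),
    ("store_1", "sku_42", 11),
]
forecasts = forecast_demand(history)
```

In production, each per-pair model would be fit in parallel across the cluster, which is exactly the workload pattern the accelerator targets.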

Additionally, industry-leading Databricks partners like Deloitte and Tredence are driving lakehouse vision and value by delivering pre-built analytics solutions on the lakehouse platform that address real-time customer use cases. Tailor-made for the retail industry, featured partner solutions and platforms include:

  • Deloitte's Trellis solution accelerator for the retail industry is one of many examples of how Deloitte and client partners are adopting the Databricks Lakehouse architecture and platform to deliver end-to-end data and AI/ML capabilities in a simple, holistic, and cost-effective way. Trellis solves retail clients' complex challenges around forecasting, replenishment, procurement, pricing, and promotion services. Deloitte has leveraged its deep industry and client expertise to build an integrated, secure, multi-cloud-ready "as-a-service" solution accelerator on top of the Databricks Lakehouse platform that can be rapidly customized to each client's unique needs. Trellis has proven to be a game-changer for joint clients, allowing them to focus on the critical shifts occurring on both the demand and supply side, with the ability to assess recommendations, associated impact, and insights in real time, resulting in significant improvements to both top-line and bottom-line numbers.
  • Tredence is meeting explosive enterprise data, AI, and ML demand and delivering real-time, transformative industry value by building solutions for Lakehouse for Retail. The partnership first launched the On-Shelf Availability (OSA) solution accelerator in August 2021, combining Databricks' data processing capability with Tredence's AI/ML expertise to help retail, CPG, and manufacturing companies solve their trillion-dollar out-of-stock challenge. Now, with Lakehouse for Retail, Tredence and Databricks will jointly expand the portfolio of industry solutions to address other customer challenges and drive global scale together.
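The core idea behind on-shelf availability detection is straightforward even though the production accelerator is far richer: an item that normally sells but suddenly records several days of zero sales is a likely "phantom stockout". A minimal sketch of that heuristic, with all thresholds and names being illustrative assumptions rather than the OSA accelerator's actual logic:

```python
def flag_possible_stockouts(daily_sales, zero_run=3, min_rate=1.0):
    """On-shelf availability heuristic: flag an item when its historical
    average daily sales rate is healthy but the most recent `zero_run`
    days are all zero, suggesting a stockout rather than low demand.
    `daily_sales` maps item -> list of daily unit sales, oldest first."""
    flagged = []
    for item, series in daily_sales.items():
        if len(series) <= zero_run:
            continue  # not enough history to judge
        recent, history = series[-zero_run:], series[:-zero_run]
        avg_rate = sum(history) / len(history)
        if avg_rate >= min_rate and all(units == 0 for units in recent):
            flagged.append(item)
    return flagged

sales = {
    "cereal": [6, 5, 7, 6, 0, 0, 0],   # sells steadily, then goes dark
    "caviar": [0, 0, 1, 0, 0, 0, 0],   # slow mover, zeros are normal
}
```

Here `flag_possible_stockouts(sales)` would flag only `"cereal"`, since the slow-moving item's zeros match its historical rate.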

About Databricks
Databricks is the data and AI company. More than 5,000 organizations worldwide — including Comcast, Condé Nast, H&M, and over 40% of the Fortune 500 — rely on the Databricks Lakehouse Platform to unify their data, analytics and AI. Databricks is headquartered in San Francisco, with offices around the globe. Founded by the original creators of Apache Spark™, Delta Lake and MLflow, Databricks is on a mission to help data teams solve the world's toughest problems.



Related News

Big Data Management

SAS Introduces SAS Health Transforming Healthcare Data Management

SAS | September 15, 2023

  • SAS has launched SAS Health, an end-to-end enterprise solution focused on healthcare analytics and data automation.
  • SAS Health is powered by a common health data model with predefined mappings to industry standards.
  • The introduction of SAS Health is part of SAS' $1 billion commitment to invest in AI-powered industry solutions over the next three years.

SAS, a globally renowned leader in AI and analytics, has unveiled SAS Health, an innovative end-to-end enterprise solution designed for analytics and data automation in the healthcare sector. The platform streamlines health data management, enhances data governance and expedites the generation of valuable patient insights.

Within the healthcare industry, the cumbersome process of consolidating data from various systems and formats has been a significant impediment to the development and deployment of scalable healthcare analytics solutions that can benefit both individuals and communities. The patient insights generated through these analytics, ranging from the proactive identification of gaps in clinical staffing to the visualization of screening center distribution relative to the patient population, enable healthcare systems to gauge the quality of each patient interaction and make positive contributions to the care of individuals with complex chronic conditions.

To provide healthcare providers and payers with centralized, secure, and analytics-optimized data, SAS Health is powered by a common health data model with predefined mappings to widely recognized industry standards. With just a few secure connection details entered, customers can rapidly begin addressing the most critical aspects of enhancing patient care. Leveraging the capabilities of the analytics and AI platform SAS Viya, SAS Health facilitates the swift extraction of actionable insights while ensuring adherence to industry standards and regulations.
Gail Stephens, VP of Health Care and Life Sciences at SAS, commented, "Having one consistent, common data model built on a powerful advanced analytics platform is pivotal for hospital systems and the future of health care delivery. SAS Health offers an extraordinary opportunity to advance patient care and treatment through improved efficiencies in data and analytics frameworks, which ultimately will allow health care payers and providers to deliver better outcomes, more quickly." [Source: Cision PR Newswire]

SAS Health's common health data model on SingleStore will serve as a central hub for integrating diverse health data with financial, clinical, and operational information, offering an efficient and adaptable approach that reduces costs and simplifies data accessibility. The cloud-native solution will streamline the ingestion of data from multiple industry standards, commencing with the Fast Healthcare Interoperability Resources (FHIR) standard, all in a no-code/low-code format. The global adoption of the FHIR industry data standard, which delineates how healthcare information can be exchanged among various computer systems, continues to grow. Prominent electronic health record (EHR) companies are swiftly embracing FHIR, and in the United States, the Centers for Medicare & Medicaid Services (CMS) have mandated its use.

The introduction of SAS Health is one outcome of SAS' recent commitment to invest $1 billion in AI-powered industry solutions over the next three years. This investment, announced in May 2023, builds upon SAS' decades-long dedication to providing tailored solutions for various industries, including government, banking, insurance, retail, manufacturing, healthcare, energy, telecommunications, media, and more, to address their unique challenges effectively.
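FHIR resources are exchanged as plain JSON with a declared `resourceType`, which is why predefined mappings from FHIR into flat analytics tables are tractable. As a minimal sketch of what such a mapping does, assuming a standard FHIR R4 Patient resource (this is illustrative, not SAS Health's actual mapping logic):

```python
import json

# A sample FHIR R4 Patient resource, as an EHR system might emit it.
raw = """{
  "resourceType": "Patient",
  "id": "example",
  "name": [{"family": "Chalmers", "given": ["Peter", "James"]}],
  "birthDate": "1974-12-25"
}"""

def flatten_patient(resource_json):
    """Map a FHIR Patient resource to a flat record suitable for an
    analytics table (hypothetical field names for illustration)."""
    resource = json.loads(resource_json)
    assert resource["resourceType"] == "Patient", "expected a Patient resource"
    name = resource["name"][0]  # take the first recorded name
    return {
        "patient_id": resource["id"],
        "full_name": " ".join(name.get("given", []) + [name.get("family", "")]),
        "birth_date": resource.get("birthDate"),
    }

record = flatten_patient(raw)
```

A real pipeline would validate against the FHIR schema and handle repeated names, extensions, and references, but the shape of the transformation is the same.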


Big Data Management

AVEVA Extends Data Capabilities from Edge to Plant to Community with AVEVA PI Data Infrastructure

iTWire | October 30, 2023

AVEVA, a global leader in industrial software driving digital transformation and sustainability, has launched AVEVA PI Data Infrastructure, a fully integrated hybrid data solution providing easy scalability, centralised management, and the ability to share data collaboratively via the cloud. AVEVA PI Data Infrastructure is the latest offering in the market-leading AVEVA PI System portfolio, which helps companies collect, enrich, analyse and visualise operations data to achieve deeper insight and operational excellence.

Moving to hybrid infrastructure gives industrial companies the flexibility, scalability and security needed to deliver valuable, high-fidelity data to authorised users and applications in any location. The initial release also gives customers the option to use the OpenID Connect protocol for user authentication, enabling enterprise-wide single sign-on. Other enterprise-class data management features will be delivered over several releases.

AVEVA PI Data Infrastructure makes it easier for companies to collect and use real-time operations data in industrial environments that increasingly include sensor-enabled legacy systems, remote assets and IIoT devices. The hybrid architecture gives data access to more decision makers who rely on operations data to resolve problems and develop business insights, thereby reducing the total cost and effort of operations data management. By achieving seamless data sharing with any trusted collaborator, companies can overcome costly data silos, modernise and streamline user access, and aggregate real-time and historical data for wider use and consumption. AVEVA PI Data Infrastructure is available via subscription using AVEVA Flex credits.
Harpreet Gulati, SVP - Head of PI System Business at AVEVA, said, "No other industrial software company offers a fully integrated, seamless data infrastructure that enables the fast, secure flow of real-time, high-fidelity data to anywhere it is needed – across multiple plants, at the edge, or in a trusted community over the cloud – with complete data integrity. We want to provide our customers with the flexibility to deploy across any of these areas, enabling them to increase sustainability, operating efficiency, asset reliability, and organisational agility."

Customers are already embracing the new offering. Giovanna Ruggieri, Head of ICT at Italy's EP Produzione, a subsidiary of the European energy giant EPH, commented: "EP Produzione is actively pursuing digital transformation to maximise operational excellence and improve processes to support the business. To continue the journey and better embrace the digital transformation, we need greater flexibility and integration at all levels: a data infrastructure that can give us full visibility across our multi-site operating environment while always keeping cyber security a high priority. We appreciate AVEVA PI Data Infrastructure's aggregate tag subscription model because it allows us to better manage our current and future needs in a smart way, with AVEVA currently proposing, for us, one of the best solutions on the market."


Big Data Management

Microsoft's AI Data Exposure Highlights Challenges in AI Integration

Microsoft | September 22, 2023

  • AI models rely heavily on vast data volumes for their functionality, increasing the risks associated with mishandling data in AI projects.
  • Microsoft's AI research team accidentally exposed 38 terabytes of private data on GitHub.
  • Many companies feel compelled to adopt generative AI but lack the expertise to do so effectively.

Artificial intelligence (AI) models are renowned for their enormous appetite for data, making them among the most data-intensive computing platforms in existence. While AI holds the potential to revolutionize the world, it is utterly dependent on the availability and ingestion of vast volumes of data.

An alarming incident involving Microsoft's AI research team recently highlighted the immense data exposure risks inherent in this technology. The team inadvertently exposed a staggering 38 terabytes of private data when publishing open-source AI training data on the cloud-based code hosting platform GitHub. The exposed data included a complete backup of two Microsoft employees' workstations, containing highly sensitive personal information such as private keys, passwords to internal Microsoft services, and over 30,000 messages from 359 Microsoft employees. The exposure was the result of an accidental configuration that granted "full control" access instead of "read-only" permissions, meaning potential attackers could not only view the exposed files but also manipulate, overwrite, or delete them.

Although a crisis was narrowly averted in this instance, the incident is a glaring example of the new risks organizations face as they integrate AI more extensively into their operations. With staff engineers increasingly handling vast amounts of specialized and sensitive data to train AI models, it is imperative for companies to establish robust governance policies and educational safeguards to mitigate security risks. Training specialized AI models necessitates specialized data.
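A misconfiguration of this kind, a shared storage link granting full control instead of read-only access, is the sort of thing a simple automated guardrail can catch before publication. In Azure Storage shared access signature (SAS) URLs, the `sp` query parameter lists the granted permission flags (e.g. `r` for read, `w` for write, `d` for delete). A minimal sketch of such a pre-publication check, assuming standard SAS URL formatting and not representing Microsoft's actual remediation tooling:

```python
from urllib.parse import urlparse, parse_qs

# Permission letters in a SAS token's `sp` parameter that allow mutation:
# add, create, write, delete, update.
WRITE_LIKE = set("acwdu")

def is_read_only_sas(url):
    """Return True only if the SAS URL carries an explicit permission set
    and that set contains no write-like flags (illustrative guardrail)."""
    params = parse_qs(urlparse(url).query)
    perms = params.get("sp", [""])[0]
    # No `sp` parameter at all is treated as unsafe rather than read-only.
    return bool(perms) and not (set(perms) & WRITE_LIKE)
```

For example, a link with `sp=rl` (read + list) would pass, while one with `sp=racwdl` (full control) would be rejected before it ever reached a public repository.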
As organizations of all sizes embrace the advantages AI offers in their day-to-day workflows, IT, data, and security teams must grasp the exposure risks inherent in each stage of the AI development process. Open data sharing plays a critical role in AI training: researchers gather and disseminate extensive amounts of both external and internal data to build the training datasets their AI models need. However, the more data that is shared, the greater the risk if it is not handled correctly, as the Microsoft incident shows. AI, in many ways, challenges an organization's internal corporate policies like no other technology before it. To harness AI tools effectively and securely, businesses must first establish a robust data infrastructure and avoid the fundamental pitfalls of AI.

Securing the future of AI requires a nuanced approach. Despite concerns about AI's potential risks, organizations should be more concerned about the quality of AI software than about the technology turning rogue. PYMNTS Intelligence's research indicates that many companies are uncertain about their readiness for generative AI but still feel compelled to adopt it: a substantial 62% of surveyed executives believe their companies lack the expertise to harness the technology effectively, according to 'Understanding the Future of Generative AI,' a collaboration between PYMNTS and AI-ID.

The rapid advancement of computing power and cloud storage infrastructure has reshaped the business landscape, setting the stage for data-driven innovations like AI to revolutionize business processes. While today's AI models are primarily produced by tech giants and well-funded startups, computing power costs are continually decreasing. In a few years, models as advanced as today's cutting-edge platforms may be run by everyday consumers on personal devices at home.
This juncture signifies a tipping point, where the ever-increasing zettabytes of proprietary data produced each year must be addressed promptly. If not, the risks associated with future innovations will scale up in sync with their capabilities.
