Big Data Management

IBM Releases Watsonx AI with Generative AI Models for Data Governance

IBM Releases Watsonx
  • IBM announces plans to enhance its Watsonx AI and data platform, with a focus on scaling AI impact for enterprises.
  • Key improvements include new generative AI models, integration of foundation models, and features like Tuning Studio and Synthetic Data Generator.
  • IBM emphasizes trust, transparency, and governance in training and plans to incorporate AI into its hybrid cloud solutions, although implementation difficulty and cost may be issues.
 
IBM reveals its plans to introduce new generative AI foundation models and enhancements to its Watsonx AI and data platform. The goal is to provide enterprises with the tools they need to scale and accelerate the impact of AI in their operations. These improvements include a technical preview for watsonx.governance, the addition of new generative AI data services to watsonx.data, and the integration of watsonx.ai foundation models into select software and infrastructure products.
 
Developers will have the opportunity to explore these capabilities and models at the IBM TechXchange Conference, scheduled to take place from September 11 to 14 in Las Vegas.
 
The upcoming AI models and features include:
 
1. Granite Series Models: IBM plans to launch its Granite series models, which are built on the decoder-only architecture that underpins large language models (LLMs). These models will support various enterprise natural language processing (NLP) tasks, including summarization, content generation, and insight extraction, with planned availability in Q3 2023.
 
2. Third-Party Models: IBM is currently offering Meta's Llama 2-chat 70 billion parameter model and the StarCoder LLM for code generation within watsonx.ai on IBM Cloud; a minimal sketch of decoder-only generation follows this list.
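Both the Granite series and the third-party models above are decoder-only, autoregressive LLMs. The sketch below is a minimal, hedged illustration of what decoder-only generation looks like in practice, using the open StarCoder weights from Hugging Face; it assumes access to the gated bigcode/starcoder checkpoint and is not IBM's hosted watsonx.ai API, whose interface the announcement does not detail.

```python
# A minimal sketch of decoder-only (causal) text generation with the open
# StarCoder weights. Assumes access to the gated bigcode/starcoder
# checkpoint; this is NOT the watsonx.ai API.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bigcode/starcoder")
model = AutoModelForCausalLM.from_pretrained("bigcode/starcoder")

prompt = "def fibonacci(n):"
inputs = tokenizer(prompt, return_tensors="pt")
# A decoder-only model generates autoregressively: each new token attends
# only to the tokens before it via a causal attention mask.
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```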
 
IBM places a strong emphasis on trust and transparency in its training process for foundation models. The company follows rigorous data collection procedures and includes control points to ensure responsible deployments in terms of governance, risk assessment, privacy, bias mitigation, and compliance.
 
IBM also intends to introduce new features across the watsonx platform:
 
For Watsonx.ai:
  • Tuning Studio: IBM plans to release the Tuning Studio, featuring prompt tuning, which lets clients adapt foundation models to their specific enterprise data and tasks. This is expected to be available in Q3 2023; a minimal illustration of prompt tuning follows this list.
  • Synthetic Data Generator: IBM has launched a synthetic data generator, enabling users to create artificial tabular data sets for AI model training, reducing risk and accelerating decision-making.
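To ground the Tuning Studio item above, here is a minimal, hypothetical sketch of the prompt-tuning technique: a small set of trainable "soft prompt" vectors is prepended to the input embeddings while the foundation model itself stays frozen. The GPT-2 checkpoint and hyperparameters are stand-ins for illustration, not IBM's implementation.

```python
# A minimal sketch of prompt tuning: only the soft prompt is trained,
# the foundation model is frozen. Model choice and hyperparameters are
# assumptions, not Tuning Studio internals.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")
for p in model.parameters():
    p.requires_grad = False  # the foundation model stays frozen

n_virtual, dim = 20, model.config.n_embd
soft_prompt = torch.nn.Parameter(torch.randn(n_virtual, dim) * 0.02)
optimizer = torch.optim.AdamW([soft_prompt], lr=1e-3)

def training_step(text: str) -> float:
    ids = tokenizer(text, return_tensors="pt").input_ids
    tok_emb = model.get_input_embeddings()(ids)
    # Prepend the trainable soft prompt to the frozen token embeddings.
    emb = torch.cat([soft_prompt.unsqueeze(0), tok_emb], dim=1)
    # Mask the soft-prompt positions out of the loss with label -100.
    labels = torch.cat([torch.full((1, n_virtual), -100), ids], dim=1)
    loss = model(inputs_embeds=emb, labels=labels).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()
```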
 
For Watsonx.data:
  • Generative AI: IBM aims to incorporate generative AI capabilities into watsonx.data to help users discover, augment, visualize, and refine data for AI through a self-service, natural-language interface. This feature is planned for technical preview in Q4 2023.
  • Vector Database Capability: IBM plans to integrate vector database capabilities into watsonx.data to support watsonx.ai retrieval-augmented generation use cases, also expected in technical preview in Q4 2023; a minimal retrieval sketch follows this list.
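The vector-database item above targets retrieval-augmented generation (RAG). The example below is a minimal, hypothetical sketch of the pattern, not watsonx.data's actual interface: documents are embedded into vectors, the closest ones to a query are retrieved by cosine similarity, and the results are assembled into a prompt for a generative model. The embedding model and documents are placeholders.

```python
# A minimal sketch of retrieval-augmented generation (RAG) over an
# in-memory vector store; a hypothetical stand-in for the planned
# watsonx.data capability.
import numpy as np
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")
docs = [
    "Watsonx.data supports open table formats.",
    "Prompt tuning adapts a frozen model with learned vectors.",
]
doc_vecs = encoder.encode(docs, normalize_embeddings=True)

def retrieve(query: str, k: int = 1) -> list[str]:
    q = encoder.encode([query], normalize_embeddings=True)[0]
    scores = doc_vecs @ q  # cosine similarity on normalized vectors
    top = np.argsort(-scores)[:k]
    return [docs[i] for i in top]

context = "\n".join(retrieve("What does prompt tuning do?"))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: ..."
# `prompt` would then be sent to a generative model (e.g., via watsonx.ai).
```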
 
For Watsonx.governance:
  • Model Risk Governance for Generative AI: IBM is launching a technical preview for watsonx.governance, providing automated collection and documentation of foundation model details along with model risk governance capabilities.
 
Dinesh Nirmal, Senior Vice President, Products, IBM Software, stated that IBM is dedicated to supporting clients throughout the AI lifecycle, from establishing foundational data strategies to model tuning and governance. Additionally, IBM will offer AI assistants to help clients scale AI's impact across various enterprise use cases, such as application modernization, customer care, and HR and talent management.
 
IBM also intends to integrate watsonx.ai innovations into its hybrid cloud software and infrastructure products, including intelligent IT automation and developer services. IBM's upgrades to the Watsonx AI and data platform offer promise but come with potential drawbacks: implementation complexity and the need for additional training may create a steep learning curve, and the associated costs of advanced technology could be prohibitive for smaller organizations.
 
The introduction of generative AI and synthetic data raises data privacy and security concerns. Additionally, despite efforts for responsible AI, the risk of bias in models necessitates ongoing vigilance to avoid legal and ethical issues.
 
 


Related News

Big Data Management

AVEVA Extends Data Capabilities from Edge to Plant to Community with AVEVA PI Data Infrastructure

iTWire | October 30, 2023

AVEVA, a global leader in industrial software driving digital transformation and sustainability, has launched AVEVA PI Data Infrastructure, a fully integrated hybrid data solution providing easy scalability, centralised management, and the ability to share data collaboratively via the cloud.

AVEVA PI Data Infrastructure is the latest offering in the market-leading AVEVA PI System portfolio, which helps companies collect, enrich, analyse and visualise operations data to achieve deeper insight and operational excellence. Moving to hybrid infrastructure gives industrial companies the flexibility, scalability and security needed to deliver valuable, high-fidelity data to authorised users and applications in any location. The initial release also gives customers the option to use the OpenID Connect protocol for user authentication, enabling enterprise-wide single sign-on; a minimal sketch of the underlying token exchange appears below. Other enterprise-class data management features will be delivered over several releases.

AVEVA PI Data Infrastructure makes it easier for companies to collect and use real-time operations data in industrial environments that increasingly include sensor-enabled legacy systems, remote assets and IIoT devices. The hybrid architecture gives data access to more decision makers who rely on operations data to resolve problems and develop business insights, thereby reducing the total cost and effort of operations data management. By achieving seamless data sharing with any trusted collaborator, companies can overcome costly data silos, modernise and streamline user access, and aggregate real-time and historical data for wider use and consumption. AVEVA PI Data Infrastructure is available via subscription using AVEVA Flex credits.

Harpreet Gulati, SVP and Head of PI System Business at AVEVA, said: "No other industrial software company offers a fully-integrated, seamless data infrastructure that enables the fast, secure flow of real-time, high-fidelity data to anywhere it is needed – across multiple plants, at the edge, or in a trusted community over the cloud – with complete data integrity. We want to provide our customers with the flexibility to deploy across any of these areas, enabling them to increase sustainability, operating efficiency, asset reliability, and organisational agility."

Customers are embracing the new offering. Giovanna Ruggieri, Head of ICT at Italy's EP Produzione, a subsidiary of the European energy giant EPH, commented: "EP Produzione is actively pursuing digital transformation to maximise operational excellence and improve processes to support the business. To continue the journey, and better embrace the digital transformation, we need greater flexibility and integration at all levels: a data infrastructure that can give us full visibility across our multi-site operating environment while always keeping cyber security a high priority. We appreciate AVEVA PI Data Infrastructure's aggregate tag subscription model because it allows us to better manage our current and future needs in a smart way, with AVEVA currently proposing, for us, one of the best solutions on the market."
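As context for the OpenID Connect single sign-on mentioned above, the sketch below shows the standard authorization-code token exchange at the heart of OIDC. Endpoints, client IDs, and credentials are hypothetical placeholders, not AVEVA's actual configuration.

```python
# A minimal sketch of the OpenID Connect authorization-code token exchange
# that underpins single sign-on. All endpoints and credentials here are
# hypothetical, not AVEVA's configuration.
import requests

TOKEN_URL = "https://idp.example.com/oauth2/token"  # hypothetical IdP

def exchange_code_for_tokens(auth_code: str) -> dict:
    resp = requests.post(
        TOKEN_URL,
        data={
            "grant_type": "authorization_code",
            "code": auth_code,  # returned to the app after user login
            "redirect_uri": "https://app.example.com/callback",
            "client_id": "pi-data-infrastructure",  # hypothetical client
            "client_secret": "<kept-server-side>",
        },
        timeout=10,
    )
    resp.raise_for_status()
    # The response contains an id_token (a JWT identifying the user)
    # and an access_token for calling protected APIs.
    return resp.json()
```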


Big Data Management

Databricks Agrees to Acquire Arcion, the Leading Provider for Real-Time Enterprise Data Replication Technology

PR Newswire | October 26, 2023

Databricks, the Data and AI company, today announced it has agreed to acquire Arcion, a Databricks Ventures portfolio company that helps enterprises quickly and reliably replicate data across on-premises and cloud databases and data platforms. The acquisition will enable Databricks to provide native solutions for ingesting data from various databases and SaaS applications into the Databricks Lakehouse Platform. The transaction is valued at over $100 million, inclusive of incentives.

Data lakehouse platforms have emerged as the de facto standard for enterprise data and AI platforms. However, these data platforms are only as valuable as the data in them, and ingesting data from existing databases and applications remains complicated, fragile, and costly. Troves of important data sit not only in transactional databases such as Oracle, MySQL, and Postgres, but also in SaaS applications such as Salesforce, SAP, and Workday. According to a recent MIT Technology Review Insights and Databricks survey of senior data and technology executives ("Laying the foundation for data- and AI-led growth"), businesses still suffer from many siloed systems: 34% have 10 or more systems to juggle, and among the largest companies that figure exceeds 80%.

This acquisition will enable Databricks to natively provide a scalable, easy-to-use, and cost-effective solution for ingesting data from various enterprise data sources. Built on a scalable change data capture (CDC) engine, Arcion offers connectors for over 20 enterprise databases and data warehouses. The integration will simplify ingesting such data, either continuously or on demand, into the lakehouse, fully integrated with the enterprise security, governance, and compliance capabilities of the Databricks platform; a minimal sketch of the CDC pattern appears below.

"To build analytical dashboards, data applications, and AI models, data needs to be replicated from the systems of record like CRM, ERP, and enterprise apps to the Lakehouse," said Ali Ghodsi, Co-Founder and CEO at Databricks. "Arcion's highly reliable and easy-to-use solution will enable our customers to make that data available almost instantly for faster and more informed decision-making. Arcion will be a great asset to Databricks, and we are excited to welcome the team and work with them to further develop solutions to help our customers accelerate their data and AI journeys."

"Arcion's real-time, large-scale CDC data pipeline technology extends Databricks' market-leading ETL solution to include replication of operational data in real-time," said Gary Hagmueller, CEO of Arcion. "Databricks has been a great partner and investor in Arcion, and we are very excited to join forces to help companies simplify and accelerate their data and AI business momentum."
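For readers unfamiliar with the pattern, the sketch below shows how change-data-capture (CDC) events are typically applied to a lakehouse table with a Delta Lake MERGE. It illustrates CDC replication in general, not Arcion's engine; the table, path, and column names are assumptions, and it presumes a Spark session configured with delta-spark.

```python
# A minimal sketch of applying CDC events (insert/update/delete) to a
# lakehouse table via Delta Lake MERGE. Illustrative of the pattern only;
# table, path, and column names are hypothetical.
from delta.tables import DeltaTable
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()  # assumes delta-spark is configured
changes = spark.read.json("/landing/crm_changes/")  # CDC feed: id, op, ...

target = DeltaTable.forName(spark, "lakehouse.customers")
(
    target.alias("t")
    .merge(changes.alias("s"), "t.id = s.id")
    .whenMatchedDelete(condition="s.op = 'DELETE'")
    .whenMatchedUpdateAll(condition="s.op = 'UPDATE'")
    .whenNotMatchedInsertAll(condition="s.op = 'INSERT'")
    .execute()
)
```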


Data Science

Snowflake Accelerates How Users Build Next Generation Apps and Machine Learning Models in the Data Cloud

Business Wire | November 03, 2023

Snowflake (NYSE: SNOW), the Data Cloud company, today announced at its Snowday 2023 event new advancements that make it easier for developers to build machine learning (ML) models and full-stack apps in the Data Cloud. Snowflake is enhancing its Python capabilities through Snowpark to boost productivity, increase collaboration, and ultimately speed up end-to-end AI and ML workflows. In addition, with support for containerized workloads and expanded DevOps capabilities, developers can now accelerate development and run apps — all within Snowflake's secure and fully managed infrastructure.

“The rise of generative AI has made organizations’ most valuable asset, their data, even more indispensable. Snowflake is making it easier for developers to put that data to work so they can build powerful end-to-end machine learning models and full-stack apps natively in the Data Cloud,” said Prasanna Krishnan, Senior Director of Product Management, Snowflake. “With Snowflake Marketplace as the first cross-cloud marketplace for data and apps in the industry, customers can quickly and securely productionize what they’ve built to global end users, unlocking increased monetization, discoverability, and usage.”

Developers Gain Robust and Familiar Functionality for End-to-End Machine Learning

Snowflake is continuing to invest in Snowpark as its engine for the secure deployment and processing of non-SQL code, with over 35% of Snowflake customers using Snowpark on a weekly basis (as of September 2023). Developers increasingly look to Snowpark for complex ML model development and deployment, and Snowflake is introducing expanded functionality that makes Snowpark even more accessible and powerful for all Python developers. New advancements include:
  • Snowflake Notebooks (private preview): a new development interface that offers an interactive, cell-based programming environment for Python and SQL users to explore, process, and experiment with data in Snowpark. Snowflake’s built-in notebooks allow developers to write and execute code, train and deploy models using Snowpark ML, visualize results with Streamlit chart elements, and much more — all within Snowflake’s unified, secure platform.
  • Snowpark ML Modeling API (general availability soon): empowers developers and data scientists to scale out feature engineering and simplify model training for faster and more intuitive model development in Snowflake. Users can implement popular AI and ML frameworks natively on data in Snowflake, without having to create stored procedures (a minimal sketch appears at the end of this article).
  • Snowpark ML Operations enhancements: the Snowpark Model Registry (public preview soon) now builds on a native Snowflake model entity and enables the scalable, secure deployment and management of models in Snowflake, including expanded support for deep learning models and open source large language models (LLMs) from Hugging Face. Snowflake is also providing developers with an integrated Snowflake Feature Store (private preview) that creates, stores, manages, and serves ML features for model training and inference.

Endeavor, the global sports and entertainment company that includes the WME Agency, IMG & On Location, UFC, and more, relies on Snowflake’s Snowpark for Python capabilities to build and deploy ML models that create highly personalized experiences and apps for fan engagement.

“Snowpark serves as the driving force behind our end-to-end machine learning development, powering how we centralize and process data across our various entities, and then securely build and train models using that data to create hyper-personalized fan experiences at scale,” said Saad Zaheer, VP of Data Science and Engineering, Endeavor. “With Snowflake as our central data foundation bringing all of this development directly to our enterprise data, we can unlock even more ways to predict and forecast customer behavior to fuel our targeted sales and marketing engines.”

Snowflake Advances Developer Capabilities Across the App Lifecycle

The Snowflake Native App Framework (general availability soon on AWS, public preview soon on Azure) now provides every organization with the necessary building blocks for app development, including distribution, operation, and monetization within Snowflake’s platform. Leading organizations are monetizing their Snowflake Native Apps through Snowflake Marketplace, with app listings more than doubling since Snowflake Summit 2023. This number is only growing as Snowflake continues to advance its developer capabilities across the app lifecycle so more organizations can unlock business impact. For example, Cybersyn, a data-service provider, is developing Snowflake Native Apps exclusively for Snowflake Marketplace, with more than 40 customers running over 5,000 queries with its Financial & Economic Essentials Native App since June 2022. In addition, LiveRamp, a data collaboration platform, has seen the number of customers deploying its Identity Resolution and Transcoding Snowflake Native App through Snowflake Marketplace increase by more than 80% since June 2022. Lastly, SNP has been able to provide its customers with a 10x cost reduction in Snowflake data processing associated with SAP data ingestion, empowering them to drastically reduce data latency while improving SAP data availability in Snowflake through SNP’s Data Streaming for SAP - Snowflake Native App.

With Snowpark Container Services (public preview soon in select AWS regions), developers can run any component of their app — from ML training, to LLMs, to an API, and more — without needing to move data or manage complex container-based infrastructure.

Snowflake Automates DevOps for Apps, Data Pipelines, and Other Development

Snowflake is giving developers new ways to automate key DevOps and observability capabilities across testing, deploying, monitoring, and operating their apps and data pipelines — so they can take them from idea to production faster. With Snowflake’s new Database Change Management (private preview soon) features, developers can code declaratively and easily templatize their work to manage Snowflake objects across multiple environments. The Database Change Management features serve as a single source of truth for object creation across various environments, using the common “configuration as code” pattern in DevOps to automatically provision and update Snowflake objects.

Snowflake also unveiled a new Powered by Snowflake Funding Program, innovations that enable all users to securely tap into the power of generative AI with their enterprise data, enhancements to further eliminate data silos and strengthen Snowflake’s leading compliance and governance capabilities through Snowflake Horizon, and more at Snowday 2023.
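As a rough illustration of the sklearn-style Snowpark ML Modeling API described above, the sketch below trains a model directly on a Snowflake table. Connection parameters, table, and column names are placeholders, and the exact API surface here is an assumption based on the sklearn-like design the announcement describes.

```python
# A minimal sketch of the Snowpark ML Modeling API pattern: an sklearn-style
# estimator that trains inside Snowflake. Connection details, table, and
# column names are hypothetical placeholders.
from snowflake.snowpark import Session
from snowflake.ml.modeling.linear_model import LogisticRegression

session = Session.builder.configs({
    "account": "<account>", "user": "<user>", "password": "<password>",
    "warehouse": "<wh>", "database": "<db>", "schema": "<schema>",
}).create()

df = session.table("FAN_EVENTS")  # hypothetical training table
clf = LogisticRegression(
    input_cols=["WATCH_MINUTES", "PURCHASES"],
    label_cols=["WILL_RENEW"],
    output_cols=["PREDICTION"],
)
clf.fit(df)              # training runs inside Snowflake, not locally
preds = clf.predict(df)  # returns a Snowpark DataFrame with PREDICTION
```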
