Big Data
iTWire | September 27, 2023
Teradata today announced new enhancements to ModelOps, the AI/ML (artificial intelligence/machine learning) model management software within its ClearScape Analytics suite, to meet growing demand from organisations across the globe for advanced analytics and AI.
These new features – including “no code” capabilities, as well as robust new governance and AI “explainability” controls – enable businesses to accelerate, scale, and optimise AI/ML deployments to quickly generate business value from their AI investments.
Deploying AI models into production is notoriously challenging. A recent O'Reilly survey on AI adoption in the enterprise found that only 26% of respondents currently have models deployed in production, and many companies say they have yet to see a return on their AI investments.
This is compounded by the recent excitement around generative AI and the pressure many executives are under to implement it within their organisation, according to a recent survey by IDC, sponsored by Teradata.
ModelOps in ClearScape Analytics makes it easier than ever to operationalise AI investments by addressing many of the key challenges that arise when moving from model development to deployment in production: end-to-end model lifecycle management, automated deployment, governance for trusted AI, and model monitoring.
ModelOps supplies a governed framework to manage, deploy, monitor, and maintain analytic outcomes. It includes capabilities such as dataset auditing, code tracking, model approval workflows, performance monitoring, and alerting when models underperform.
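Teradata has not published the internals of these workflows, so the following is a minimal, generic sketch in Python of what a staged approval lifecycle with an audit trail might look like; every name and stage here is an assumption for illustration, not Teradata's API.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum


class Stage(Enum):
    REGISTERED = "registered"
    VALIDATED = "validated"
    APPROVED = "approved"
    DEPLOYED = "deployed"


# Legal stage transitions: a model must be validated before approval,
# and approved before deployment; there are no shortcuts to production.
ALLOWED = {
    Stage.REGISTERED: {Stage.VALIDATED},
    Stage.VALIDATED: {Stage.APPROVED},
    Stage.APPROVED: {Stage.DEPLOYED},
    Stage.DEPLOYED: set(),
}


@dataclass
class ModelRecord:
    name: str
    version: str
    stage: Stage = Stage.REGISTERED
    audit_log: list = field(default_factory=list)

    def transition(self, new_stage: Stage, actor: str) -> None:
        """Move to a new lifecycle stage, recording who did it and when."""
        if new_stage not in ALLOWED[self.stage]:
            raise ValueError(f"illegal transition {self.stage} -> {new_stage}")
        self.audit_log.append(
            (datetime.now(timezone.utc).isoformat(), actor,
             self.stage.value, new_stage.value)
        )
        self.stage = new_stage


model = ModelRecord("churn_model", "1.3.0")
model.transition(Stage.VALIDATED, actor="ci-pipeline")
model.transition(Stage.APPROVED, actor="risk-officer")
model.transition(Stage.DEPLOYED, actor="ml-platform")
print(model.audit_log)
```

Routing every stage change through a single transition method is what makes the audit trail trustworthy: there is no path to production that bypasses the log.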
“We stand on the precipice of a new AI-driven era, which promises to usher in new frontiers of creativity, productivity, and innovation. Teradata is uniquely positioned to help businesses take advantage of advanced analytics, AI, and especially generative AI, to solve the most complex challenges and create massive enterprise business value,” said Teradata chief product officer Hillary Ashton.
“We offer the most complete cloud analytics and data platform for AI. And with our enhanced ModelOps capabilities, we are enabling organisations to cost effectively operationalise and scale trusted AI through robust governance and automated lifecycle management, while encouraging rapid AI innovation via our open and connected ecosystem. Teradata is also the most cost-effective, with proven performance and flexibility to innovate faster, enrich customer experiences, and deliver value.”
New capabilities and enhancements to ModelOps include:
- Bring Your Own Model (BYOM), now with no-code capabilities, allows users to deploy their own machine learning models without writing any code, simplifying the deployment journey with automated validation, deployment, and monitoring (see the sketch after this list)
- Mitigation of regulatory risks with advanced model governance capabilities and robust explainability controls to ensure trusted AI
- Automatic monitoring of model performance and data drift with zero configuration alerts
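In a no-code BYOM flow, the only artifact the user supplies is the trained model itself. The sketch below is a generic illustration of that pattern, not Teradata's documented workflow: a data scientist exports a scikit-learn model to a portable format such as ONNX, and the platform takes over validation, deployment, and monitoring from there.

```python
# pip install scikit-learn skl2onnx
from skl2onnx import convert_sklearn
from skl2onnx.common.data_types import FloatTensorType
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

# Train a model anywhere: a laptop, a notebook, an external ML platform.
X, y = load_iris(return_X_y=True)
model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

# Export to ONNX, a portable format that fully describes the model's
# inputs and computation graph, so a BYOM-style service can validate
# and score it without any user-written deployment code.
onnx_model = convert_sklearn(
    model, initial_types=[("input", FloatTensorType([None, 4]))]
)
with open("visit_propensity.onnx", "wb") as f:
    f.write(onnx_model.SerializeToString())
```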
Teradata customers are already using ModelOps to accelerate time-to-value for their AI investments:
A major US healthcare institution uses ModelOps to speed up deployment and scale its “Personalisation at Scale” AI/ML programme. With a 3x increase in productivity, the institution has deployed thirty AI/ML models that predict which of its patients are most likely to need an office visit.
A major European financial institution leveraged ModelOps to reduce AI model deployment time from five months to one week. The models are deployed at scale and integrated with operational data to deliver business value.
Big Data Management
Microsoft | September 22, 2023
- AI models rely heavily on vast volumes of data, which raises the risks of mishandling data in AI projects.
- Microsoft's AI research team accidentally exposed 38 terabytes of private data on GitHub.
- Many companies feel compelled to adopt generative AI but lack the expertise to do so effectively.
Artificial intelligence (AI) models are renowned for their enormous appetite for data, making them among the most data-intensive computing platforms in existence. While AI holds the potential to revolutionize the world, it is utterly dependent on the availability and ingestion of vast volumes of data.
An alarming incident involving Microsoft's AI research team recently highlighted the immense data exposure risks inherent in this technology. While publishing open-source AI training data on the cloud-based code hosting platform GitHub, the team inadvertently exposed 38 terabytes of private data. The exposed data included a complete backup of two Microsoft employees' workstations, containing highly sensitive information such as private keys, passwords to internal Microsoft services, and over 30,000 messages from 359 Microsoft employees. The cause was a misconfigured storage access link that granted “full control” instead of “read-only” permissions, meaning potential attackers could not only view the exposed files but also manipulate, overwrite, or delete them.
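Public reporting attributed the exposure to an overly permissive Azure shared access signature (SAS) token. As a minimal sketch of the safer pattern (the account and container names below are hypothetical), a SAS for sharing training data can be scoped to read-only access with a hard expiry:

```python
# pip install azure-storage-blob
from datetime import datetime, timedelta, timezone

from azure.storage.blob import ContainerSasPermissions, generate_container_sas

# Hypothetical names for illustration only.
ACCOUNT_NAME = "exampleresearchdata"
CONTAINER = "public-training-data"
ACCOUNT_KEY = "<account-key-from-a-secret-store>"

# Grant only read and list rights, and force the link to expire,
# rather than issuing a long-lived token with write/delete access.
sas_token = generate_container_sas(
    account_name=ACCOUNT_NAME,
    container_name=CONTAINER,
    account_key=ACCOUNT_KEY,
    permission=ContainerSasPermissions(read=True, list=True),
    expiry=datetime.now(timezone.utc) + timedelta(days=7),
)

url = f"https://{ACCOUNT_NAME}.blob.core.windows.net/{CONTAINER}?{sas_token}"
print(url)
```

Scoped this way, even a leaked URL cannot be used to alter or delete the underlying data, and it stops working after a week.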
Although a crisis was narrowly averted in this instance, it serves as a glaring example of the new risks organizations face as they integrate AI more extensively into their operations. With staff engineers increasingly handling vast amounts of specialized and sensitive data to train AI models, it is imperative for companies to establish robust governance policies and educational safeguards to mitigate security risks.
Training specialized AI models necessitates specialized data. As organizations of all sizes embrace the advantages AI offers in their day-to-day workflows, IT, data, and security teams must grasp the exposure risks inherent in each stage of the AI development process. Open data sharing plays a critical role in AI training: researchers gather and disseminate large amounts of both external and internal data to build training datasets for their models. But the more data is shared, the greater the risk if it is mishandled, as the Microsoft incident shows.
AI, in many ways, challenges an organization's internal corporate policies like no other technology before it. To harness AI tools effectively and securely, businesses must first establish a robust data infrastructure that avoids the fundamental pitfalls of AI.
Securing the future of AI requires a nuanced approach. Despite fears of the technology turning rogue, organizations should be more concerned about the quality of the AI software they deploy.
PYMNTS Intelligence's research indicates that many companies are uncertain about their readiness for generative AI but still feel compelled to adopt it. A substantial 62% of surveyed executives believe their companies lack the expertise to harness the technology effectively, according to 'Understanding the Future of Generative AI,' a collaboration between PYMNTS and AI-ID.
The rapid advancement of computing power and cloud storage infrastructure has reshaped the business landscape, setting the stage for data-driven innovations like AI to transform business processes. Today's AI models are produced mainly by tech giants and well-funded startups, but computing costs keep falling. Within a few years, models rivalling today's cutting-edge platforms may run on consumers' personal devices at home. That tipping point makes the ever-increasing zettabytes of proprietary data produced each year an urgent problem: if it is not addressed, the risks of future innovations will scale in step with their capabilities.
Big Data Management
Kinetica | September 22, 2023
Kinetica, a renowned speed layer for generative AI and real-time analytics, has recently unveiled a native Large Language Model (LLM) integrated with Kinetica's innovative architecture. This empowers users to perform ad-hoc data analysis on real-time, structured data with the ease of natural language, all without the need for external API calls and without data ever leaving the secure confines of the customer's environment. This significant milestone follows Kinetica's prior innovation as the first analytic database to integrate with OpenAI.
Amid the LLM fervor, enterprises and government agencies are actively seeking ways to automate business functions without exposing sensitive information through fine-tuning or prompt augmentation. Public LLMs, such as OpenAI's GPT-3.5, raise valid privacy and security concerns. A native offering mitigates those concerns: it is integrated into the Kinetica deployment and stays within the customer's network perimeter.
Beyond security, Kinetica's native LLM is fine-tuned to Kinetica's syntax and to industry-specific data definitions spanning domains such as telecommunications, automotive, financial services, and logistics. This tailoring yields more reliable and precise SQL queries, and it extends beyond conventional SQL to the time-series, graph, and spatial queries that advanced decision-making requires. Kinetica's fine-tuning optimizes SQL generation for consistent, accurate results, in contrast to conventional approaches that prioritize creativity and produce varied, unpredictable responses.
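Kinetica has not disclosed how its native LLM is implemented or invoked, but the deterministic text-to-SQL pattern it describes can be sketched generically. In the example below the model, schema, and prompt are all illustrative assumptions; the essential detail is greedy decoding (do_sample=False), which trades creativity for getting the same SQL on every run.

```python
# pip install transformers torch
from transformers import pipeline

# An open text-to-SQL model stands in for Kinetica's native LLM here.
generator = pipeline("text-generation", model="defog/sqlcoder-7b-2")

SCHEMA = """CREATE TABLE flights (
    tail_number VARCHAR,
    ts TIMESTAMP,
    lat DOUBLE,
    lon DOUBLE,
    altitude_ft INT
);"""

question = "Which aircraft flew below 1000 feet in the last hour?"

prompt = (
    "Translate the question into a single SQL query.\n"
    f"Schema:\n{SCHEMA}\n"
    f"Question: {question}\n"
    "SQL:"
)

# Greedy decoding: the same question always yields the same SQL,
# favoring consistency over creative variation.
result = generator(prompt, max_new_tokens=128, do_sample=False)
print(result[0]["generated_text"])
```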
Illustrating the practical impact of this innovation, the US Air Force has been collaborating closely with Kinetica to leverage advanced analytics on sensor data, enabling swift identification and response to potential threats. This partnership contributes significantly to the safety and security of the national airspace system. The US Air Force now employs Kinetica's embedded LLM to detect airspace threats and anomalies using natural language.
Kinetica's database converts natural language queries into SQL and returns responses in seconds, even for complex or unfamiliar questions. It also combines multiple analytics modes, including time series, spatial, graph, and machine learning, which broadens the range of queries it can address.
What enables this conversational query processing is native vectorization. In a vectorized query engine, data is organized into fixed-size blocks called vectors, and query operations run on these vectors in parallel, in contrast to traditional engines that process individual data elements sequentially. The result is much faster query execution on a smaller compute footprint. GPUs and the latest CPU advancements make this possible by performing simultaneous calculations on many data elements at once, sharply accelerating computation-intensive tasks across multiple cores or threads.
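The performance argument is easy to see in miniature. The toy benchmark below uses plain NumPy and has nothing Kinetica-specific in it; it contrasts element-at-a-time processing with a vectorized filter-and-reduce over the same column, the same principle that, scaled up and pushed onto GPUs, underlies a vectorized query engine.

```python
# pip install numpy
import time

import numpy as np

# One column of sensor readings; a vectorized engine processes it in
# fixed-size blocks rather than one value at a time.
readings = np.random.default_rng(0).normal(20.0, 5.0, size=2_000_000)

# Scalar approach: visit each element in a Python loop.
t0 = time.perf_counter()
total, count = 0.0, 0
for r in readings:
    if r > 25.0:
        total += r
        count += 1
scalar_secs = time.perf_counter() - t0

# Vectorized approach: one filter and one reduction over whole blocks,
# which lets NumPy apply SIMD instructions to many elements at once.
t0 = time.perf_counter()
hot = readings[readings > 25.0]
vector_mean = hot.sum() / hot.size
vector_secs = time.perf_counter() - t0

print(f"scalar:     mean={total / count:.3f}  time={scalar_secs:.3f}s")
print(f"vectorized: mean={vector_mean:.3f}  time={vector_secs:.3f}s")
```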
About Kinetica
Kinetica is a pioneer in real-time analytics and the creator of a real-time analytical database designed for sensor and machine data. The company offers native vectorized analytics for generative AI, spatial analysis, time-series modeling, and graph processing. Many of the world's largest enterprises, spanning the public sector, financial services, telecommunications, energy, healthcare, retail, and automotive industries, rely on Kinetica for solutions built on time-series and spatial data. Customers include the US Air Force, Citibank, Ford, T-Mobile, and many others.