Big Data Management
Microsoft | September 22, 2023
- AI models rely heavily on vast data volumes for their functionality, increasing the risks associated with mishandling data in AI projects.
- Microsoft's AI research team accidentally exposed 38 terabytes of private data on GitHub.
- Many companies feel compelled to adopt generative AI but lack the expertise to do so effectively.
Artificial intelligence (AI) models are renowned for their enormous appetite for data, making them among the most data-intensive computing platforms in existence. While AI holds the potential to revolutionize the world, it is utterly dependent on the availability and ingestion of vast volumes of data.
An alarming incident involving Microsoft's AI research team recently highlighted the immense data exposure risks inherent in this technology. The team inadvertently exposed a staggering 38 terabytes of private data when publishing open-source AI training data on the cloud-based code hosting platform GitHub. The exposed data included a complete backup of two Microsoft employees' workstations, containing highly sensitive personal information such as private keys, passwords to internal Microsoft services, and over 30,000 internal messages from 359 Microsoft employees. The exposure stemmed from a misconfigured Azure Storage shared access signature (SAS) token, which granted "full control" access instead of "read-only" permissions. This oversight meant that potential attackers could not only view the exposed files but also manipulate, overwrite, or delete them.
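The distinction between the two permission levels can be made concrete. The sketch below is illustrative only (it is not the Azure SDK): Azure SAS tokens encode permissions as characters such as "r" (read), "l" (list), "w" (write), and "d" (delete), and an audit step can flag any share link that grants more than read access before it is published.

```python
# Illustrative sketch, not the Azure SDK: auditing a SAS-style permission
# string before a share link is published. Azure SAS permissions use
# characters such as "r" (read), "a" (add), "c" (create), "w" (write),
# "d" (delete), and "l" (list).

SAFE_FOR_PUBLIC_SHARING = {"r", "l"}  # read and list only

def audit_sas_permissions(permission_string: str) -> list[str]:
    """Return the permission flags that exceed read-only sharing."""
    return sorted(set(permission_string) - SAFE_FOR_PUBLIC_SHARING)

# A read-only token raises no flags:
assert audit_sas_permissions("rl") == []
# A "full control" token -- the kind behind the Microsoft incident --
# is flagged before publication:
assert audit_sas_permissions("racwdl") == ["a", "c", "d", "w"]
```

A check like this, run automatically before any dataset link leaves the organization, is one way to enforce least-privilege sharing by default.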
Although a crisis was narrowly averted in this instance, it serves as a glaring example of the new risks organizations face as they integrate AI more extensively into their operations. With staff engineers increasingly handling vast amounts of specialized and sensitive data to train AI models, it is imperative for companies to establish robust governance policies and educational safeguards to mitigate security risks.
Training specialized AI models necessitates specialized data. As organizations of all sizes embrace the advantages AI offers in their day-to-day workflows, IT, data, and security teams must grasp the inherent exposure risks associated with each stage of the AI development process. Open data sharing plays a critical role in AI training, with researchers gathering and disseminating extensive amounts of both external and internal data to build the necessary training datasets for their AI models. However, the more data that is shared, the greater the risk if it is not handled correctly, as evidenced by the Microsoft incident. AI, in many ways, challenges an organization's internal corporate policies like no other technology has done before. To harness AI tools effectively and securely, businesses must first establish a robust data infrastructure to avoid the fundamental pitfalls of AI.
Securing the future of AI requires a nuanced approach. Despite concerns about AI's potential risks, organizations should be more concerned about the quality of AI software than the technology turning rogue.
PYMNTS Intelligence's research indicates that many companies are uncertain about their readiness for generative AI but still feel compelled to adopt it. A substantial 62% of surveyed executives believe their companies lack the expertise to harness the technology effectively, according to 'Understanding the Future of Generative AI,' a collaboration between PYMNTS and AI-ID.
The rapid advancement of computing power and cloud storage infrastructure has reshaped the business landscape, setting the stage for data-driven innovations like AI to revolutionize business processes. Today's AI models are produced primarily by tech giants and well-funded startups, but computing costs are continually decreasing. In a few years, models as capable as today's cutting-edge platforms may run on everyday consumers' personal devices at home. That juncture marks a tipping point: the ever-growing zettabytes of proprietary data produced each year must be addressed promptly, or the risks associated with future innovations will scale in step with their capabilities.
PR Newswire | October 06, 2023
LexisNexis® Risk Solutions, a leading provider of data and analytics, released new insights on the latest national and regional provider density trends for primary and specialty care. The analysis explores how often prescriber data changes, the metropolitan areas seeing the biggest change in the number of primary care providers (PCPs) and the metropolitan areas with the highest and lowest number of heart disease patients per cardiologist.
Outflows of providers and coverage ratios can impact a community's ability to deliver accessible and efficient care, and with a looming shortfall of PCPs, it's important to understand where the existing PCPs are located. The analysis reveals the five metropolitan areas with the highest percent increase and decrease of PCPs between June 2022 and June 2023. According to the data, the Vallejo-Fairfield, CA area topped the list with a nearly 40% increase in PCPs. Conversely, the Fayetteville, NC area saw the highest decrease – losing nearly 12% of its PCPs.
As chronic diseases continue to increase, the density of specialty providers becomes paramount. The provider density analysis examines the number of patients with heart disease per cardiologist in metropolitan statistical areas (MSAs) spanning large, medium, small, and micropolitan areas. The data shows as MSAs get smaller, the number of patients per cardiologist increases substantially, with many rural communities having thousands of heart disease patients per cardiologist. Among major metropolitan areas, Boston has the best ratio with 196 heart disease patients per cardiologist, and Las Vegas has the worst ratio with 824 heart disease patients per cardiologist.
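The density figures above reduce to a simple ratio: heart disease patients divided by practicing cardiologists in each MSA. A minimal sketch of that calculation follows; the patient and cardiologist counts are hypothetical, chosen only so that the resulting ratios for Boston (196) and Las Vegas (824) match the figures cited in the analysis.

```python
# Sketch: ranking metro areas by heart-disease patients per cardiologist.
# Counts are hypothetical; only the resulting ratios (196 for Boston,
# 824 for Las Vegas) come from the LexisNexis analysis.

msas = {
    "Boston": {"patients": 98_000, "cardiologists": 500},
    "Las Vegas": {"patients": 98_880, "cardiologists": 120},
}

def patients_per_cardiologist(msa: dict) -> int:
    return round(msa["patients"] / msa["cardiologists"])

# Best (lowest) ratio first:
ranked = sorted(msas, key=lambda name: patients_per_cardiologist(msas[name]))

assert patients_per_cardiologist(msas["Boston"]) == 196
assert patients_per_cardiologist(msas["Las Vegas"]) == 824
assert ranked[0] == "Boston"
```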
Additionally, the analysis found significant degradation of prescriber data in a short period of time. Over a quarter of prescribers (26%) had at least one change in their contact or license information within a 90-day period. This finding is based on the primary location of more than 2 million prescribers and illustrates the potential for data inaccuracies, creating an additional challenge for patients navigating the healthcare ecosystem.
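Churn of this kind is typically measured by diffing periodic snapshots of the provider directory. The sketch below uses hypothetical records and field names, but the method mirrors the analysis: count the prescribers with at least one change in contact or license information between two snapshots 90 days apart.

```python
# Sketch: measuring prescriber-data churn between two snapshots taken
# 90 days apart. Records and field names are hypothetical; the method
# counts prescribers with at least one changed field.

day_0 = {
    "npi-1": {"phone": "555-0100", "license": "A123"},
    "npi-2": {"phone": "555-0101", "license": "B456"},
    "npi-3": {"phone": "555-0102", "license": "C789"},
    "npi-4": {"phone": "555-0103", "license": "D012"},
}
day_90 = {
    "npi-1": {"phone": "555-0100", "license": "A123"},  # unchanged
    "npi-2": {"phone": "555-0199", "license": "B456"},  # new phone number
    "npi-3": {"phone": "555-0102", "license": "C789"},
    "npi-4": {"phone": "555-0103", "license": "D012"},
}

changed = [npi for npi in day_0 if day_0[npi] != day_90.get(npi)]
churn_rate = len(changed) / len(day_0)

assert changed == ["npi-2"]
assert churn_rate == 0.25  # compare: 26% in the LexisNexis analysis
```

Run at scale over millions of records, the same diff yields the 26% quarterly churn figure the analysis reports.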
"Data is an essential element to fueling healthcare's success, but the continuously changing nature of provider data, when left unchecked, poses a threat to care coordination, patient experience, and health outcomes," said Jonathan Shannon, associate vice president of healthcare strategy, LexisNexis Risk Solutions. "Our recent analysis emphasizes the criticality of ensuring provider information is clean and accurate in real-time. With consistently updated provider data, healthcare organizations can develop meaningful strategies to improve provider availability, equitable access, and patient experience, particularly for vulnerable populations."
iTWire | September 27, 2023
Teradata today announced new enhancements to its leading AI/ML (artificial intelligence/machine learning) model management software in ClearScape Analytics, ModelOps, to meet the growing demand from organisations across the globe for advanced analytics and AI.
These new features – including “no code” capabilities, as well as robust new governance and AI “explainability” controls – enable businesses to accelerate, scale, and optimise AI/ML deployments to quickly generate business value from their AI investments.
Deploying AI models into production is notoriously challenging. A recent O'Reilly survey on AI adoption in the enterprise found that only 26% of respondents currently have models deployed in production, and many companies say they have yet to see a return on their AI investments.
This is compounded by the recent excitement around generative AI and the pressure many executives are under to implement it within their organisation, according to a recent survey by IDC, sponsored by Teradata.
ModelOps in ClearScape Analytics makes it easier than ever to operationalise AI investments by addressing many of the key challenges that arise when moving from model development to deployment in production: end-to-end model lifecycle management, automated deployment, governance for trusted AI, and model monitoring.
The governed ModelOps capability is designed to supply the framework to manage, deploy, monitor, and maintain analytic outcomes. It includes capabilities such as dataset auditing, code tracking, model approval workflows, model performance monitoring, and alerting when models underperform.
We stand on the precipice of a new AI-driven era, which promises to usher in frontiers of creativity, productivity, and innovation. Teradata is uniquely positioned to help businesses take advantage of advanced analytics, AI, and especially generative AI, to solve the most complex challenges and create massive enterprise business value.
Teradata chief product officer Hillary Ashton
“We offer the most complete cloud analytics and data platform for AI. And with our enhanced ModelOps capabilities, we are enabling organisations to cost effectively operationalise and scale trusted AI through robust governance and automated lifecycle management, while encouraging rapid AI innovation via our open and connected ecosystem. Teradata is also the most cost-effective, with proven performance and flexibility to innovate faster, enrich customer experiences, and deliver value.”
New capabilities and enhancements to ModelOps include:
- Bring Your Own Model (BYOM), now with no code capabilities, allows users to deploy their own machine learning models without writing any code, simplifying the deployment journey with automated validation, deployment and monitoring
- Mitigation of regulatory risks with advanced model governance capabilities and robust explainability controls to ensure trusted AI
- Automatic monitoring of model performance and data drift with zero configuration alerts
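One common form of the drift monitoring described above compares a live feature's distribution against its training baseline and alerts when the shift exceeds a threshold. The sketch below is an illustrative, assumption-laden example of that general technique, not Teradata's implementation.

```python
# Illustrative sketch (not Teradata's implementation): alert when a live
# feature's mean drifts more than `threshold` baseline standard
# deviations from the training mean.

import statistics

def drift_alert(baseline: list[float], live: list[float],
                threshold: float = 2.0) -> bool:
    mu = statistics.mean(baseline)
    sigma = statistics.stdev(baseline)
    return abs(statistics.mean(live) - mu) > threshold * sigma

baseline = [10.0, 11.0, 9.0, 10.5, 9.5]  # feature values seen in training
assert drift_alert(baseline, [10.2, 9.8, 10.1]) is False  # stable
assert drift_alert(baseline, [14.0, 15.0, 14.5]) is True  # mean shifted
```

In a production setting the baseline statistics would be captured once at training time and the check run continuously against incoming scoring data, firing an alert with no per-model configuration.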
Teradata customers are already using ModelOps to accelerate time-to-value for their AI investments
A major US healthcare institution uses ModelOps to speed up deployment and scale its AI/ML personalisation journey. With a 3x increase in productivity, the institution successfully deployed 30 AI/ML models that predict which of its patients are most likely to need an office visit, implementing "Personalisation at Scale."
A major European financial institution leveraged ModelOps to reduce AI model deployment time from five months to one week. The models are deployed at scale and integrated with operational data to deliver business value.