Data Science
Business Wire | October 20, 2023
Airbyte, creators of the fastest-growing open-source data movement platform, today made available additional connectors for the Milvus, Qdrant and Weaviate vector databases as the destination for moving data from hundreds of data sources, which then can be accessed by artificial intelligence (AI) models.
“We were the first general-purpose data movement platform to add support for vector databases – the first to build a bridge between data movement platforms and AI,” said Michel Tricot, CEO, Airbyte. “Now, we are doubling down as our users are clamoring for more and more vector database support so they don’t have to struggle with creating custom code to bring in data; they can use the new Airbyte connector to select the data sources they want.”
Because vector databases have the ability to detect and identify relationships in data, their usage has become increasingly popular as users seek to gain more meaning from data. Vector databases are ideal for applications like recommendation systems, anomaly detection and natural language processing, and as sources for AI applications – specifically Large Language Models (LLM).
The vector database destination in Airbyte now enables users to configure the full ELT pipeline, starting from extracting records from a wide variety of sources to separating unstructured and structured data, preparing and embedding text contents of records, and finally loading them into vector databases – all through a single, user-friendly interface. These vector databases can then be accessed by LLMs. All existing advantages of the Airbyte platform are now extended to vector databases, including:
The largest catalog of data sources that can be connected within minutes, and optimized for performance.
Availability of the no-code connector builder, which makes it possible to quickly create new connectors that address the “long tail” of data sources.
Ability to run incremental syncs that extract only the data changed since the previous sync.
Built-in resiliency in the event of a disrupted session moving data, so the connection will resume from the point of the disruption.
Secure authentication for data access.
Ability to schedule and monitor status of all syncs.
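The pipeline stages described above (separating structured from unstructured fields, chunking and embedding text, then loading into a vector store) can be sketched roughly as follows. This is an illustrative sketch only and not Airbyte's implementation: `split_record`, `chunk_text`, `toy_embed`, `InMemoryVectorStore` and `run_pipeline` are hypothetical names, the hashed bag-of-words embedding is a stand-in for a real embedding model, and the in-memory store stands in for a database such as Milvus, Qdrant or Weaviate.

```python
import hashlib
import math
from dataclasses import dataclass, field


@dataclass
class VectorRow:
    vector: list[float]
    text: str
    metadata: dict


def split_record(record: dict, text_fields: set[str]) -> tuple[str, dict]:
    """Separate unstructured text fields from structured metadata fields."""
    text = "\n".join(str(record[f]) for f in sorted(text_fields) if f in record)
    metadata = {k: v for k, v in record.items() if k not in text_fields}
    return text, metadata


def chunk_text(text: str, max_words: int = 50) -> list[str]:
    """Split text into chunks of at most max_words words (empty text gives no chunks)."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def toy_embed(text: str, dims: int = 16) -> list[float]:
    """Assumption: a hashed bag-of-words vector, L2-normalized, replaces a real model."""
    vec = [0.0] * dims
    for word in text.lower().split():
        digest = hashlib.sha256(word.encode("utf-8")).digest()
        vec[digest[0] % dims] += 1.0
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec


@dataclass
class InMemoryVectorStore:
    """Hypothetical stand-in for a vector database destination."""
    rows: list[VectorRow] = field(default_factory=list)

    def load(self, rows: list[VectorRow]) -> None:
        self.rows.extend(rows)

    def search(self, query: str, k: int = 1) -> list[VectorRow]:
        """Return the k rows whose vectors have the highest dot product with the query."""
        q = toy_embed(query)
        ranked = sorted(
            self.rows,
            key=lambda row: -sum(a * b for a, b in zip(q, row.vector)),
        )
        return ranked[:k]


def run_pipeline(records: list[dict], text_fields: set[str],
                 store: InMemoryVectorStore, max_words: int = 50) -> int:
    """Split, chunk, embed and load each record; return the number of rows loaded."""
    loaded = 0
    for record in records:
        text, metadata = split_record(record, text_fields)
        rows = [VectorRow(toy_embed(chunk), chunk, metadata)
                for chunk in chunk_text(text, max_words)]
        store.load(rows)
        loaded += len(rows)
    return loaded


if __name__ == "__main__":
    store = InMemoryVectorStore()
    records = [{"id": 1, "body": "refund policy for annual plans", "source": "crm"}]
    run_pipeline(records, {"body"}, store)
    print(store.search("refund policy")[0].metadata)
```

Keeping the metadata alongside each embedded chunk is what lets an application trace a search hit back to its source record; a real deployment would swap the toy embedding and in-memory store for an actual model and database client.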
Airbyte continues to innovate and support cutting-edge technologies to empower organizations in their data integration journey. The addition of more vector database support marks another significant milestone in Airbyte's commitment to providing powerful and efficient solutions for data integration and analysis.
Certified connectors for both Airbyte Cloud and Airbyte Open Source Software (OSS) versions are now available for Milvus, Pinecone, and Weaviate. There is a community connector for both versions of Airbyte for Qdrant, as well as a community connector for Airbyte OSS available for Chroma. More options are planned for the future.
Airbyte makes moving data easy and affordable across almost any source and destination, helping enterprises provide their users with access to the right data for analysis and decision-making. Airbyte has the largest data engineering contributor community – with more than 800 contributors – and the best tooling to build and maintain connectors.
About Airbyte
Airbyte is the open-source data movement leader running in the safety of your cloud and syncing data from applications, APIs, and databases to data warehouses, lakes, and other destinations. Airbyte offers four products: Airbyte Open Source, Airbyte Enterprise, Airbyte Cloud, and Powered by Airbyte. Airbyte was co-founded by Michel Tricot (former director of engineering and head of integrations at LiveRamp and RideOS) and John Lafleur (serial entrepreneur in dev tools and B2B). The company is headquartered in San Francisco with a distributed team around the world. To learn more, visit airbyte.com.
Data Visualization
Airbyte | September 11, 2023
Airbyte, the leading open-source data movement platform, has announced a strategic integration with Datadog, Inc., a prominent cloud application monitoring and security platform. This integration offers customers a comprehensive solution to monitor and analyze data pipelines with access to nearly 50 metrics, all at no additional cost.
The integration between Airbyte Self-Managed and Datadog's data observability and security monitoring capabilities allows organizations to maintain a close watch on the health of their critical data pipelines. Key features of this integration include:
A centralized overview of Airbyte data pipeline performance
Real-time detection and immediate alerts for failing syncs or connections
Notifications regarding long-running jobs, which could indicate potential latency issues
Michel Tricot, CEO of Airbyte, emphasized the significance of this integration, stating,
“The new Datadog integration provides transparency and actionable insights, empowering users to optimize performance and ensure reliable data pipelines by proactively addressing potential data issues.”
[Source: Business Wire]
Yrieix Garnier, Vice President of Product at Datadog, further elaborated on the benefits, explaining,
“Airbyte's data extraction and loading process involves numerous complex components. The integration with Datadog offers users peace of mind, enabling them to monitor data pipelines across their organization and troubleshoot any potential data integration workflow issues, ultimately ensuring data quality.”
[Source: Business Wire]
This integration will be immediately available to users. Existing Datadog customers can configure their Airbyte deployments to send metrics to Datadog. For those not already using Datadog, a free trial is available. Similarly, users new to Airbyte can sign up for free.
Airbyte continues its commitment to delivering robust data integration and analysis solutions to organizations. The Datadog integration represents a significant milestone in Airbyte's mission to empower businesses with efficient data integration capabilities.
Airbyte simplifies data movement across various sources and destinations, making it accessible and cost-effective for enterprises. With the largest data engineering contributor community, boasting over 800 contributors as well as top-tier tools for connector development and maintenance, Airbyte remains at the forefront of the data integration landscape.
About Airbyte
Founded in 2020, Airbyte is an open-source platform for EL(T) that enables data teams to replicate data from various sources to different destinations. The company, which has raised $181 million in funding, believes in the power of open source to address data integration challenges and offers over 200 connectors for data syncing. Currently, it serves over 25,000 companies.
Big Data Management
PR Newswire | September 27, 2023
Congruity360, a leading unstructured data management and risk mitigation provider, announces the addition of data mobility in Enterprise Insights.
As unstructured data grows at an annual rate of 55% to 65% and accounts for more than 80% of all enterprise data, businesses must find a way to identify, classify and move data intelligently and automatically during its lifecycle. As enterprises grow, their valuable data must mature with their business. This may require a journey to the cloud, SLA changes that optimize storage costs, classification to mitigate risk, and moving the right data to additional key AI platform initiatives.
A simple, scalable, high-performance data classification engine, Enterprise Insights delivers next-generation data lifecycle management for storage optimization, security and risk optimization, and IT business optimization.
Enterprise Insights Approach to Successful Data Optimization:
Identify – Securely analyze petabytes of unstructured data across on-premises (NAS & object) and cloud (files/objects & SaaS) sources by harnessing the power of the platform's rapid insights and auto-discover technologies, which can reduce data identification times by 1,000%.
Classify – Quickly identify key client data attributes for cost savings, risk mitigation, and business impact with simple-to-consume dashboards and drill-down capabilities.
Review – Confidently create and take actions by leveraging the comprehensive search engine to quickly find and preview data for movement without ever leaving the platform.
Remediate – Seamlessly take action (migrate and tier) on classified data to ensure it's properly protected, optimally stored, and most effectively serving the business.
Enterprise Insights offers three use case-driven insight analysis modules:
Storage and Migration Optimization – Insights into over 35 file data attributes including systems' aged, stale, obsolete, redundant, trivial, and types of systems files.
Business Optimization – Insight into and classification by business units' or cost centers' aged, stale, obsolete, redundant, trivial, and types of files.
Data Security and Risk Optimization – Insights into files containing PII and SPII, financial, legal, security, and risk data, as well as open shares and other network & storage security vulnerabilities.
By leveraging Enterprise Insights, clients can classify data for simple and secure migration both on premises and in the cloud. Equally important are Enterprise Insights' data tiering capabilities, which enable users to match data storage costs to data usage.
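As a rough illustration of what matching storage cost to data usage can mean in practice, the sketch below assigns files to storage tiers by how recently they were accessed. It is not Congruity360's method: the tier names, the 30-day and 365-day thresholds, and the `pick_tier` and `plan_tiering` helpers are all assumptions made for this example.

```python
from datetime import datetime, timedelta

# Assumed thresholds: (maximum age in days, tier name), checked in order.
TIER_RULES = [(30, "hot"), (365, "cool")]
DEFAULT_TIER = "archive"  # anything older than the last threshold


def pick_tier(last_accessed: datetime, now: datetime) -> str:
    """Choose a storage tier from the number of days since the last access."""
    age_days = (now - last_accessed).days
    for max_age_days, tier in TIER_RULES:
        if age_days <= max_age_days:
            return tier
    return DEFAULT_TIER


def plan_tiering(files: dict[str, datetime], now: datetime) -> dict[str, list[str]]:
    """Group file paths by the tier each one would be assigned to."""
    plan: dict[str, list[str]] = {}
    for path, last_accessed in sorted(files.items()):
        plan.setdefault(pick_tier(last_accessed, now), []).append(path)
    return plan


if __name__ == "__main__":
    now = datetime(2023, 9, 27)
    files = {
        "/share/q3-report.xlsx": now - timedelta(days=5),
        "/share/2022-budget.xlsx": now - timedelta(days=100),
        "/share/old-backup.tar": now - timedelta(days=800),
    }
    for tier, paths in plan_tiering(files, now).items():
        print(tier, paths)
```

A production classifier would weigh many more attributes than last-access time, such as ownership, sensitivity, and duplication, before moving anything.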
Powered by the Classify360 Platform, Enterprise Insights' secure hybrid approach to data analysis scales capabilities to exabyte levels at unmatched speed. Enterprise Insights is the industry's most powerful weapon to tackle the costs, time, and complexity of cloud migration projects, backup modernization, storage tiering, hardware refresh, and security posture management. By providing users with dashboards highlighting their existing storage costs and risks, Enterprise Insights frees clients from hidden, legacy, CapEx and OpEx expenditures, performance, and scalability bottlenecks while discovering and acting on sensitive and risk data.
“Unstructured data insanity is treating all data equally with zero insights into its business impact,” said Brian Davidson, Chief Executive Officer and Managing Partner of Congruity360. “Enterprise Insights is the first step in implementing optimized data lifecycle management. With historically high data growth and new business uses for unstructured data, it is essential to attack the costs and risks inherent in unmanaged data. Our customers have realized 7-10x returns on their data lifecycle management implementations while reducing risk in an auditable compliance framework. As AI continues to gain steam, don't overpay by moving useless data to your expensive AI platforms.”
The Classify360 Platform is comprehensive, simple to implement, scale, and operate. Businesses leverage the Classify360 Platform for unstructured data discovery, classification, business workflows, remediation actions, and insightful reporting. Congruity360 continues to tackle additional data governance challenges through innovations to the Classify360 Platform to continue delivering revolutionary data governance and classification, at scale, to the enterprise world.
ABOUT CONGRUITY360
Congruity360 delivers the only data life cycle management solution built on a foundation of classification, by expert data storage engineers alongside expert data privacy consultants. The Classify360 Platform is easy to implement, requires no outside consultants, and quickly analyzes your data at the petabyte scale in days, not weeks or months.
Big Data Management
iMerit | September 07, 2023
iMerit, a prominent player in the field of artificial intelligence (AI) data solutions, has unveiled its latest offering, the Radiology Annotation Product Suite. This innovative suite is designed to cater to the needs of medical AI developers by providing advanced automation, annotation, and analytics capabilities.
This new product suite is firmly rooted in iMerit's Ango Hub platform, an end-to-end enterprise-grade technology platform that is specifically tailored to deliver top-notch data annotation tools for AI development teams. Within this suite, a comprehensive range of solutions awaits, including data sourcing, workflow design, cutting-edge data annotation tools, and the invaluable input of human experts, all seamlessly integrated into a single platform. This unique fusion of iMerit's technological prowess and radiology expertise ensures a smooth journey from training data all the way to regulatory benchmarking.
A significant hurdle in radiology AI applications is the demand for specialized tools and insights from domain experts to ensure the necessary accuracy and precision in training data. For developers, the Radiology Annotation Product Suite offers a one-stop solution, combining automation, annotation tools, and analytics within a single platform, facilitating the creation of precise data pipelines essential for quickly scaling radiology AI solutions into production.
Sina Bari, MD, Senior Director of Medical AI at iMerit, explained that their suite aims to combine human expertise, data management, and automation in a single solution.
Notable features include:
Customized workflows designed for Radiologists with consensus and multistep capabilities.
Diagnostic-level annotation accuracy with multiplanar functionality and 3D volume rendering.
Efficient annotation facilitated by smart automation tools and model integration.
Stringent data security and regulatory compliance, including 21 CFR Part 11, HIPAA, and SOC 2.
This solution seamlessly integrates U.S. and offshore experts, including board-certified radiologists, all within a highly secure end-to-end platform, ensuring cost-effective scaling of annotation efforts.
Radha Basu, Founder and CEO of iMerit, emphasized the importance of their work, stating that they understand the challenges of developing AI applications and that their suite was built to scale data annotation efforts with high quality, accuracy, and speed.
About iMerit
iMerit is a prominent AI data solutions company specializing in data services such as dataset creation, image tagging, sentiment analysis, data verification, and content aggregation. It serves Fortune 500 enterprises in various industries and has a global presence, with headquarters in the United States and teams in India, the US, Bhutan, and Europe. iMerit's investors include Omidyar Network, Dell.org, Khosla Ventures, and British International Investment.