BUSINESS INTELLIGENCE, BIG DATA MANAGEMENT
Comet | November 17, 2022
Comet, provider of the leading MLOps platform for machine learning (ML) teams from startup to enterprise, today announced a bold new product: Kangas. Open sourced to democratize large-scale visual dataset exploration and analysis for the computer vision and machine learning community, Kangas helps users understand and debug their data in a new and highly intuitive way. With Kangas, visualizations are generated in real time, enabling ML practitioners to group, sort, filter, query and interpret their structured and unstructured data to derive meaningful information and accelerate model development.
Data scientists often need to analyze large-scale datasets during both data preparation and model training, which can be overwhelming and time-consuming. Kangas makes it possible to intuitively explore, debug and analyze data in real time to quickly gain insights, leading to better, faster decisions. With Kangas, users are able to transform datasets of any scale into clear visualizations.
“A key component of data-centric Machine Learning is being able to understand how your training data impacts model results and where your model predictions are wrong. Kangas accomplishes both of these goals and dramatically improves the experience for ML practitioners.”
Gideon Mendels, CEO and co-founder of Comet
Putting Large Scale Machine Learning Dataset Analysis at Your Fingertips
Developed with the unique needs of ML practitioners in mind, Kangas is a scalable, dynamic and interoperable tool that allows for the discovery of patterns buried deep within oceans of datasets. With Kangas, data scientists can query their large-scale datasets in a manner that is natural to their problem, allowing them to interact and engage with their data in novel ways.
Noteworthy benefits of Kangas include:
Unparalleled Scalability: Kangas was developed to handle large datasets with high performance.
Purpose Built: Computer Vision/ML concepts like scoring, bounding boxes and more are supported out-of-the-box, and statistics/charts are generated automatically.
Support for Different Forms of Media: Kangas is not limited to traditional text queries. It also supports images, videos and more.
Interoperability: Kangas can run in a notebook, as a standalone local app or even deployed as a web app. It ingests data in a simple format that makes it easy to work with whatever tooling data scientists already use.
Open Source: Kangas is 100% open source and is built by and for the ML community.
Kangas was designed for the entire community, to be embraced by students, researchers and the enterprise. As individuals and teams work to further their ML initiatives, they will be able to leverage the full benefits of Kangas. Being open source, all are able to contribute and further enhance it as well.
“Interoperability and flexibility are inherent in Comet’s value proposition, and Comet aims to expand on that value through open source contributions,” added Mendels. “Kangas is a continuation of all of our efforts, and we couldn’t wait to get its capabilities into the hands of as many data scientists, data engineers and ML engineers as possible. We believe by open sourcing it, Comet can help teams get the most out of their ML projects in ways that have not been possible previously.”
Kangas is available as an open source package for any type of use case. It will be available under the Apache License 2.0 and is open to contributions from community members.
Comet provides an MLOps platform that data scientists and machine learning teams use to manage, optimize, and accelerate the development process across the entire ML lifecycle, from training runs to monitoring models in production. Comet’s platform is trusted by over 150 enterprise customers including Affirm, Cepsa, Etsy, Uber and Zappos. Individuals and academic teams use Comet’s platform to advance research in their fields of study. Founded in 2017, Comet is headquartered in New York, NY with a remote workforce in nine countries on four continents. Comet is free to individuals and academic teams. Startup, team, and enterprise licensing is also available.
Komprise | December 14, 2022
Komprise, the leader in analytics-driven unstructured data management and mobility, today announced the availability of Komprise Hypertransfer for Elastic Data Migration, which accelerates data transfer to the cloud while strengthening cloud security. As enterprises migrate more file data to the cloud, IT organizations face many barriers which cause migrations to often take weeks to months. SMB workloads such as user data, electronic design automation (EDA) and other multimedia workloads contain lots of small files and are a particular challenge since the protocol requires many back-and-forth handshakes that increase administrative traffic over the network.
Komprise Hypertransfer optimizes cloud data migration performance by minimizing the WAN roundtrips using dedicated channels to send data, mitigating the SMB protocol issues. According to recent tests performed by Komprise, Hypertransfer improves data transfer rates across the WAN by 25x over other alternatives for SMB datasets with predominantly small files. Komprise already delivers 27 times faster performance for NFS migrations. Komprise Hypertransfer also strengthens security and defense against ransomware by not accessing cloud file storage over the network during data migrations, since data transfers from source to target over private channels.
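The small-file problem described above can be illustrated with a rough back-of-the-envelope model. The sketch below is not Komprise's method or measurement; it assumes a strictly sequential transfer in which every file costs a fixed number of protocol round trips, and all parameter names and numbers are hypothetical.

```python
def transfer_seconds(num_files, avg_file_bytes, round_trips_per_file,
                     rtt_seconds, bandwidth_bytes_per_s):
    """Estimate sequential transfer time under a simplified latency model.

    Assumption (illustrative only): each file pays a fixed number of
    protocol round trips before its bytes move at full link bandwidth.
    """
    # Time spent waiting on handshakes, independent of file size.
    handshake = num_files * round_trips_per_file * rtt_seconds
    # Time spent actually moving bytes over the link.
    payload = num_files * avg_file_bytes / bandwidth_bytes_per_s
    return handshake + payload


# Hypothetical workload: one million 10 KB files over a 1 Gbps WAN link
# with a 50 ms round-trip time and 10 round trips per file.
total = transfer_seconds(1_000_000, 10_000, 10, 0.05, 125_000_000)
```

With these made-up inputs, handshake waiting accounts for roughly 500,000 seconds while the payload itself needs only 80 seconds, which is why reducing WAN round trips matters far more than raw bandwidth for datasets dominated by small files. Real migration tools parallelize transfers, so actual figures will differ.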
“Enterprises need to move file data to the cloud to cut costs and unlock strategic value. With Komprise Hypertransfer you get measurably faster SMB cloud data migrations, shaving weeks off migration timelines while minimizing the chance for errors or data loss. Komprise Hypertransfer makes cloud migrations feasible.”
Kumar Goswami, CEO and co-founder of Komprise
Komprise Elastic Data Migration is a software as a service (SaaS) solution available with the Komprise Intelligent Data Management platform or as a standalone product. Designed to be fast, easy and reliable with elastic scale-out architecture and an analytics-driven approach, it is the market leader in file and object data migrations. Komprise Elastic Data Migration ensures preservation of data integrity with access control propagation and file-level data integrity checks such as SHA-1 and MD5 checks with audit logging.
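A file-level integrity check of the kind mentioned above can be sketched with Python's standard `hashlib` module. This is a minimal illustration of comparing SHA-1 and MD5 digests between a source file and its migrated copy; the function names `file_digest` and `verify_copy` are hypothetical and do not reflect how Komprise implements its checks.

```python
import hashlib


def file_digest(path, algorithm="sha1", chunk_size=1024 * 1024):
    """Return the hex digest of a file, reading it in fixed-size chunks."""
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # Stream the file so large files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def verify_copy(source_path, target_path, algorithms=("sha1", "md5")):
    """Return True only if every listed digest matches for both files."""
    return all(
        file_digest(source_path, name) == file_digest(target_path, name)
        for name in algorithms
    )
```

Calling `verify_copy("source.bin", "target.bin")` after a copy returns `False` if even one byte differs. Hashing reads every file a second time, which is the trade-off behind letting administrators switch such checks off to favor migration speed.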
With the latest release, Komprise administrators also have more flexible migration configuration settings to handle read-only sources and sources with access-time tracking disabled. Additionally, customers can now enable and disable data integrity checks to either ensure data accuracy or improve migration performance.
Komprise is a provider of unstructured data management and mobility software that frees enterprises to easily analyze, mobilize, and monetize the right file and object data across clouds without shackling data to any vendor. With Komprise Intelligent Data Management, you can cut 70% of enterprise storage, backup and cloud costs while making data easily available to cloud-based data lakes and analytics tools.
Alation | November 29, 2022
Alation Inc., the leader in enterprise data intelligence, today announced the launch of Alation Connected Sheets, a new product stemming from the company’s acquisition of Kloud.io. Alation Connected Sheets enables business users to pull trusted, governed, and up-to-date data from data sources into spreadsheets, including Google Sheets and Microsoft Excel, via Alation Data Catalog.
Spreadsheets are foundational for business decision-making across every industry. According to IDC, there are 78 million advanced spreadsheet users worldwide. While both static in nature and disconnected from data sources, spreadsheets remain the preferred tool for business users; however, 90% contain errors. Data within spreadsheets is often copied-and-pasted from one spreadsheet to another or downloaded into CSV files via untraceable data sources. Without the proper management and effective data governance of spreadsheets, organizations suffer from significant productivity losses and expose themselves to increased risk because critical business decisions are based on inaccurate and ungoverned data.
Alation Connected Sheets solves this problem by meeting business users where they are: spreadsheets, including Google Sheets and Microsoft Excel. Now, business users across the enterprise can easily find, understand, and use trusted data from spreadsheets with confidence. Alation Connected Sheets increases business user productivity by enabling them to find, filter, import, and refresh data without having to learn a new tool or querying language. Alation Connected Sheets is available immediately as an add-on in Google Sheets, and for Microsoft Excel in early 2023.
Key benefits of Alation Connected Sheets:
Brings trust and governance to spreadsheets: By integrating with Alation Data Catalog, unique governance features such as TrustFlags signal if data is endorsed, warned, or deprecated, helping users pull in the best and most appropriate data, directly from the source.
Effortless setup, use, and maintenance: With a simple user interface and easy-to-use filters, data can be imported and automatically refreshed without relying on technical resources like data engineers or analysts.
Risk mitigation: By leveraging existing security credentials, only authorized users can pull in and use live, compliant, and trustworthy data in a spreadsheet.
“As it stands, business spreadsheets are often created using copy/paste functions or downloading static data that doesn’t sync with source data,” said David Menninger, SVP & Research Director, Ventana Research. “It's one of the most pervasive data governance challenges organizations face that significantly deteriorates productivity at scale. Without a traceable data lineage, multiple iterations of the same project are created using different formulas, data sources, and deprecated or low-quality data. Alation Connected Sheets helps solve this problem. Now, spreadsheet users can quickly retrieve governed data from a single source with confidence in the data they’re using – all natively within the spreadsheet.”
“Harnessing the power of an organization’s data to drive fast, accurate business decisions is challenging for any employee,” said Raj Gossain, Chief Product Officer, Alation. “Business users routinely rely on the simplicity, power, and ubiquity of spreadsheets to do their jobs. But the data that powers spreadsheets is typically copied from other sheets or downloaded into CSV files, creating real risk for every business. Alation Connected Sheets combines the power of trusted, governed data with the ease of use of the spreadsheet. Now spreadsheets can be a trusted enterprise data asset instead of an invisible liability, empowering every business user with the best data the enterprise has to offer.”
“Alation Connected Sheets will allow us to work natively in a familiar spreadsheet environment while connecting to the catalog and enabling spreadsheet governance. Now, we can manage the countless spreadsheets we rely on to make critical business decisions, govern our spreadsheets, and limit regulatory compliance risk of exposing private information. Alation Connected Sheets will make our advanced spreadsheet users more productive because they can self-serve trusted data without being blocked by limited data engineering resources. Data teams will now be able to focus on complex analyses that drive the business forward; time otherwise spent pulling and verifying fresh data or tracing lineage.”
Sara Cook, Director of Data Science and CMC Statistics, Novavax
As the result of Alation’s second acquisition, Kloud.io CEO and co-founder Krishna Bhat and CTO and co-founder Sathish Raju have joined the company.
“We created this technology to make it easy for anyone in an enterprise to pull data from other sources into a spreadsheet, without needing the deep technical understanding of underlying databases or data lakes,” said Krishna Bhat, Senior Director of Product Management, Alation.
“Alation Connected Sheets is the result of a synergy between the technology that connects spreadsheets to source applications and Alation’s market-leading data intelligence platform,” said Sathish Raju, Senior Director of Engineering, Alation.
Alation is the leader in enterprise data intelligence solutions, including data search & discovery, data governance, data stewardship, analytics, and digital transformation. Alation’s initial offering dominates the data catalog market. Thanks to its powerful Behavioral Analysis Engine, inbuilt collaboration capabilities, and open interfaces, Alation combines machine learning with human insight to successfully tackle even the most demanding challenges in data and metadata management. Nearly 450 enterprises drive data culture, improve decision-making, and realize business outcomes with Alation, including AbbVie, Allianz Global Investors, American Family Insurance, Autozone, Cisco, Draft Kings, Exelon, Fifth Third Bank, Finnair, General Mills, Munich Re, NASDAQ, Parexel, Pfizer, Salesforce, Virgin Australia, and Vistaprint. Headquartered in Silicon Valley, Alation has been named to Inc. Magazine’s Best Workplaces list three times, is a 2022 UK’s Best Workplaces™ for Women, and recognized as a 2022 UK’s Best Workplaces™ in Tech. The company is backed by leading venture capitalists, including Blackstone, Costanoa, Databricks Ventures, Data Collective, Dell Technologies Capital, Hewlett Packard Enterprise, Icon, ISAI Cap, Riverwood Capital, Salesforce Ventures, Sanabil Investments, Sapphire, Snowflake Ventures, Thoma Bravo, and Union Grove.
Opaque Systems | December 08, 2022
Opaque Systems, the pioneers of secure multi-party analytics and AI for Confidential Computing, today announced the latest advancements in Confidential AI and Analytics with the unveiling of its platform. The Opaque platform, built to unlock use cases in Confidential Computing, is created by the inventors of the popular MC2 open source project, which was conceived in the RISELab at UC Berkeley. The Opaque Platform uniquely enables data scientists within and across organizations to securely share data and perform collaborative analytics directly on encrypted data protected by Trusted Execution Environments (TEEs). The platform further accelerates Confidential Computing use cases by enabling data scientists to leverage their existing SQL and Python skills to run analytics and machine learning while working with confidential data, overcoming the data analytics challenges inherent in TEEs due to their strict protection of how data is accessed and used. The Opaque platform advancements come on the heels of Opaque announcing its $22M Series A funding.
Confidential Computing, projected by the Everest Group to be a $54B market by 2026, provides a solution using TEEs or 'enclaves' that encrypt data during computation, isolating it from access, exposure and threats. However, TEEs have historically been challenging for data scientists due to the restricted access to data, lack of tools that enable data sharing and collaborative analytics, and the highly specialized skills needed to work with data encrypted in TEEs. The Opaque Platform overcomes these challenges by providing the first multi-party confidential analytics and AI solution that makes it possible to run frictionless analytics on encrypted data within TEEs, enable secure data sharing, and for the first time, enable multiple parties to perform collaborative analytics while ensuring each party only has access to the data they own.
"Traditional approaches for protecting data and managing data privacy leave data exposed and at risk when being processed by applications, analytics, and machine learning (ML) models. The Opaque Confidential AI and Analytics Platform solves this challenge by enabling data scientists and analysts to perform scalable, secure analytics and machine learning directly on encrypted data within enclaves to unlock Confidential Computing use cases."
-Rishabh Poddar, Co-founder & CEO, Opaque Systems.
"Strict privacy regulations result in sensitive data being difficult to access and analyze," said a Data Science Leader at a top US bank. "New multi-party secure analytics and computational capabilities and Privacy Enhancing Technology from Opaque Systems will significantly improve the accuracy of AI/ML/NLP models and speed insights."
The Opaque Confidential AI and Analytics Platform is designed to specifically ensure that both code and data within enclaves are inaccessible to other users or processes that are collocated on the system. Organizations can encrypt their confidential data on-premises, accelerate the transition of sensitive workloads to enclaves in Confidential Computing Clouds, and analyze encrypted data while ensuring it is never unencrypted during the lifecycle of the computation. Key capabilities and advancements include:
Secure, Multi-Party Collaborative Analytics – Multiple data owners can pool their encrypted data together in the cloud, and jointly analyze the collective data without compromising confidentiality. Policy enforcement capabilities ensure the data owned by each party is never exposed to other data owners.
Secure Data Sharing and Data Privacy – Teams across departments and across organizations can securely share data protected in TEEs while adhering to regulatory and compliance policies. Use cases requiring confidential data sharing include financial crime, drug research, ad targeting monetization and more.
Data Protection Throughout the Lifecycle – Protects all sensitive data, including PII and SHI data, using advanced encryption and secure hardware enclave technology, throughout the lifecycle of computation—from data upload, to analytics and insights.
Multi-tiered Security, Policy Enforcement, and Governance – Leverages multiple layers of security, including Intel® Software Guard Extensions, secure enclaves, advanced cryptography and policy enforcement to provide defense in depth, ensuring code integrity, data, and side-channel attack protection.
Scalability and Orchestration of Enclave Clusters – Provides distributed confidential data processing across managed TEE clusters, automates orchestration of clusters to overcome performance and scaling challenges, and supports secure inter-enclave communication.
Confidential Computing is supported by all major cloud vendors including Microsoft Azure, Google Cloud and Amazon Web Services and major chip manufacturers including Intel and AMD.
About Opaque Systems:
Commercializing the open source MC2 technology invented at UC Berkeley by its founders, Opaque Systems provides the first collaborative analytics and AI platform for Confidential Computing. Opaque uniquely enables data to be securely shared and analyzed by multiple parties while maintaining complete confidentiality and protecting data end-to-end. The Opaque Platform leverages a novel combination of two key technologies layered on top of state-of-the-art cloud security: secure hardware enclaves and cryptographic fortification. This combination ensures that the overall computation is secure, fast, and scalable. The MC2 technology and Opaque innovation have already been adopted by several organizations, such as Ant Group, IBM, Scotiabank, and Ericsson.