Converging Workflows Pushing Converged Software onto HPC Platforms

Are we witnessing the convergence of HPC, big data analytics, and AI? Once, these were separate domains, each with its own system architecture and software stack, but the data deluge is driving their convergence. Traditional big science HPC is looking more like big data analytics and AI, while analytics and AI are taking on the flavor of HPC.

The data deluge is real. In 2018, CERN’s Large Hadron Collider generated over 50 petabytes of data (one petabyte is 1,000 terabytes, or 10^15 bytes), and CERN expects that to increase tenfold by 2025. The average Internet user generates over 1 GB of data traffic every day; a smart hospital, over 3,000 GB per day; a manufacturing plant, over 1,000,000 GB per day. A single autonomous vehicle is estimated to generate 4,000 GB per day. Every day. The total annual digital data output is predicted to reach or exceed 163 zettabytes (one zettabyte is one sextillion, or 10^21, bytes) by 2025. This is data that needs to be analyzed at near-real-time speed and stored somewhere for easy access by multiple collaborators. Extreme performance, storage, and networking: that sounds a lot like HPC.

What characterized “traditional” HPC was achieving extreme performance on computationally complex problems, typically simulations of real-world systems (think explosions, oceanography, global weather hydrodynamics, even cosmological events like supernovae). This meant very large parallel processing systems with hundreds, even thousands, of dedicated compute nodes and vast multi-layer storage appliances, connected over high-speed networks.


Other News
Big Data

Airbyte Racks Up Awards from InfoWorld, BigDATAwire, Built In; Builds Largest and Fastest-Growing User Community

Airbyte | January 30, 2024

Airbyte, creators of the leading open-source data movement infrastructure, today announced a series of accomplishments and awards reinforcing its standing as the largest and fastest-growing data movement community. With a focus on innovation, community engagement, and performance enhancement, Airbyte continues to revolutionize the way data is handled and processed across industries.

“Airbyte proudly stands as the front-runner in the data movement landscape with the largest community of more than 5,000 daily users and over 125,000 deployments, with monthly data synchronizations of over 2 petabytes,” said Michel Tricot, co-founder and CEO, Airbyte. “This unparalleled growth is a testament to Airbyte's widespread adoption by users and the trust placed in its capabilities.”

The Airbyte community has more than 800 code contributors and 12,000 stars on GitHub. Recently, the company held its second annual virtual conference, move(data), which attracted over 5,000 attendees.

In October, Airbyte was named an InfoWorld Technology of the Year Award finalist in the Data Management – Integration category, for cutting-edge products that are changing how IT organizations work and how companies do business. At the start of this year, the company was named to the Built In 2024 Best Places To Work Award in San Francisco – Best Startups to Work For, recognizing its commitment to fostering a positive work environment, remote and flexible work opportunities, and programs for diversity, equity, and inclusion. Today, the company received the BigDATAwire Readers/Editors Choice Award – Big Data and AI Startup, which recognizes companies and products that have made a difference.

Other key milestones in 2023 include the following:

- Availability of more than 350 data connectors, making Airbyte the platform with the most connectors in the industry. The company aims to increase that to 500 high-quality supported connectors by the end of this year.
- More than 2,000 custom connectors created with the Airbyte No-Code Connector Builder, which enables data connectors to be made in minutes.
- A significant performance improvement, with database replication speed increased by 10 times to support larger datasets.
- Added support for five vector databases, in addition to unstructured data sources, as the first company to build a bridge between data movement platforms and artificial intelligence (AI).

Looking ahead, Airbyte will introduce data lakehouse destinations, as well as a new Publish feature to push data to API destinations.

About Airbyte

Airbyte is the open-source data movement infrastructure leader running in the safety of your cloud and syncing data from applications, APIs, and databases to data warehouses, lakes, and other destinations. Airbyte offers four products: Airbyte Open Source, Airbyte Self-Managed, Airbyte Cloud, and Powered by Airbyte. Airbyte was co-founded by Michel Tricot (former director of engineering and head of integrations at Liveramp and RideOS) and John Lafleur (serial entrepreneur of dev tools and B2B). The company is headquartered in San Francisco with a distributed team around the world. To learn more, visit airbyte.com.
