BIG DATA MANAGEMENT

GE Digital Releases New Software Solution to Optimize Airline Operations via Teradata Vantage

GE Digital | February 16, 2021

GE Digital and Teradata today announced the release of a new GE Digital software solution that integrates with Teradata Vantage, the cloud data analytics platform, to deliver blended enterprise and operations data in an aviation-specific data model. The solution, Flight Data Link, accelerates airlines' digital transformation, driving reduced costs and greater insight into operations by integrating customer, maintenance, and supply chain data.

Flight Data Link allows flight data from GE Digital's Event Measurement System (EMS), a comprehensive flight data processing system delivering fast, accurate, and actionable insights, to be made available through Teradata Vantage. Vantage powers an enterprise's data analytics ecosystem, integrating all relevant data into a single platform to make analytics available across the business. By blending flight data with operational data in Vantage, airlines can perform complex analysis of multiple flights across their fleets. Combining EMS and Vantage helps airlines connect the dots between operational data and flight data, enabling them to gain new insight across the airline.

“We’re proud that GE Digital’s innovative new product will utilize our Vantage platform, leveraging Teradata’s world-class platform for aviation analytics so that leading airlines across the globe will be able to use data as their greatest asset,” said Scott Collins, VP Global Partnership Organization at Teradata. “Vantage provides GE Digital with the flexibility of a multi-cloud platform that makes it easy to deliver their new offerings, and the two companies are also working together on integrated software that makes it easier for our joint customers to consume more data and insights.”

Connecting full flight data with Teradata Vantage provides the ability to run complex analytical functions across multiple flights, routes, and assets. Business users can apply a BI tool or a programming language (for example, Python, R, or Java) to perform analysis, and airlines can then leverage the integrated data across their ecosystem to recommend specific actions based on flight behavior and data insight. Further, airlines can extend the value of equipment and operational data (outside of, and in addition to, the safety group) by using full flight data for predictive maintenance and, with the integration of supply chain data, improve operations.
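To make that workflow concrete, here is a minimal sketch of querying blended flight and maintenance data in Vantage from Python using the teradatasql driver. The connection details, table names, and columns (flight_data, maintenance_events, and so on) are hypothetical placeholders, not a published GE Digital or Teradata schema.

```python
# Minimal sketch: analyzing blended flight and maintenance data in Teradata
# Vantage from Python. Requires the teradatasql driver (pip install teradatasql).
# All table/column names and credentials below are hypothetical placeholders.
import teradatasql

with teradatasql.connect(host="vantage.example.com", user="analyst", password="...") as con:
    with con.cursor() as cur:
        # Rank aircraft by average exceedance count in the 30 days preceding an
        # unscheduled maintenance event -- the kind of cross-domain question that
        # becomes possible once flight and maintenance data share one platform.
        cur.execute("""
            SELECT f.tail_number,
                   AVG(f.exceedance_count)    AS avg_exceedances,
                   COUNT(DISTINCT m.event_id) AS unscheduled_events
            FROM flight_data f
            JOIN maintenance_events m
              ON m.tail_number = f.tail_number
             AND m.event_type = 'UNSCHEDULED'
             AND f.flight_date BETWEEN m.event_date - 30 AND m.event_date
            GROUP BY f.tail_number
            ORDER BY avg_exceedances DESC
        """)
        for tail, avg_exc, events in cur.fetchall():
            print(tail, avg_exc, events)
```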

“The combination of GE’s aviation experience and GE Digital’s software expertise, integrated with the world’s most robust platform for scalable analytics, allows airlines to prioritize passenger experience as well as revenue growth,” said Andrew Coleman, General Manager for GE Digital’s Aviation Software group. “Our deep partnership with Teradata allows us to build exciting solutions that integrate multiple data types, both from the enterprise as well as operations, to drive powerful outcomes.”

Flight Data Link allows flight safety teams to easily share flight data with other teams across an airline for their analytical purposes. The solution automates this flow of valuable flight data while still providing governance. By using near-real-time flight data for predictive maintenance and integrating supply chain data into maintenance planning and events, Flight Data Link drives efficiency in operations and reduces costs through better management of flights and assets.

Teradata Vantage is the leading multi-cloud data analytics software platform that enables ecosystem simplification by unifying analytics, data lakes, and data warehouses. With the enterprise scale of Vantage, organizations can eliminate silos and cost-effectively query all their data to get a complete view of their business.

About GE Digital

GE Digital is transforming how industry solves its toughest challenges. GE Digital’s mission is to bring simplicity, speed, and scale to its customers’ digital transformation activities, with software that helps them to better operate, analyze and optimize their business processes. GE Digital’s product portfolio – including grid optimization and analytics, asset and operations performance management, and manufacturing operations and automation – helps industrial companies in the utility, power generation, oil & gas, aviation, and manufacturing sectors put their industrial data to work.

About Teradata

Teradata is the cloud data analytics platform company, built for a multi-cloud reality, solving the world’s most complex data challenges at scale. We help businesses unlock value by turning data into their greatest asset.

Spotlight

For many companies, risk aversion still drives IT. Given that today’s businesses depend upon digital services, IT organizations have traditionally and prudently prioritized reliability and predictability. If an IT service becomes unavailable, often some portion of the business stops. And if the business stops even for just a few moments, revenue can be impacted. Modern business demands fast and flexible IT. This new reality is causing a shift in priorities – Modern IT organizations favor speed and agility over predictability and reliability.


Other News
BIG DATA MANAGEMENT

Arcion Partners With Databricks for Real-Time Data Replication on the Lakehouse

Arcion | April 21, 2022

Arcion today announced a partnership to bring the world’s only cloud-native, CDC-based data replication platform to Databricks. Arcion is the first partner to offer preconfigured, validated data replication for users of Databricks through that company’s new Partner Connect program. Arcion’s product enables faster, more agile analytics and AI/ML by empowering enterprises to integrate mission-critical transactional systems with their Databricks Lakehouse in real time, at scale, and with guaranteed transactional integrity. It is the only fully managed, distributed data replication as a service on the market today, offering zero-code, zero-maintenance change data capture (CDC) pipelines that can be deployed in just minutes. It empowers data teams to move high-volume data from transactional databases like Oracle and MySQL without a single line of code.

Partner Connect makes it possible for customers to implement Arcion’s technology directly within their Databricks Lakehouse. With just a few clicks, Partner Connect automatically configures the resources necessary to begin using streaming data pipelines, enabling real-time data ingestion with pipelines from Oracle, MySQL, and Snowflake (additional sources coming soon) to the Databricks Lakehouse.

“Through Partner Connect, Arcion and Databricks are deepening our thriving relationship and working together to deliver a unified experience for our customers that offers simplicity, security, rock-solid reliability, and scale. Companies across the globe are using ML and advanced analytics to turn raw data into tangible business value, but they need the right tools to help them get there. Arcion helps companies unify their data by delivering it to Databricks, where everything is available in one place, with zero delay.”

Gary Hagmueller, Arcion CEO

Arcion Cloud uses CDC to identify and track changes to data in transactional systems, whether they are deployed on-premises, in the cloud, or across a hybrid landscape. Arcion detects any changes made within those systems and replicates them to Databricks in real time. Capable of handling petabyte-scale integration, Arcion handles high transaction volumes easily, without adversely impacting the source system’s performance.

“Arcion’s replication for Databricks’ Lakehouse provides extraordinarily rapid time to value for analytics and AI/ML,” said Adam Conway, SVP of Products at Databricks. “By making Arcion available via Partner Connect, we’re enabling thousands of Databricks customers to discover and take advantage of Arcion’s highly scalable, efficient and flexible CDC technology. With just a few clicks, users can set up a trial account and start streaming real-time data from transactional systems to their Lakehouse.”

About Arcion

Fortune 500 companies around the world rely on Arcion’s distributed, CDC-based data replication solution to drive fast and accurate data insights. Arcion helps enterprises eliminate slow, brittle data pipelines and high-maintenance overheads. Break down data silos through high-volume, scalable change data capture pipelines with guaranteed transactional integrity.
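As a conceptual footnote to the CDC description above: log-based change data capture reads committed changes from the source database's transaction log and applies them to the target in commit order. The Python sketch below illustrates only that general pattern; it is not Arcion's implementation or API, and every type and helper in it (ChangeEvent, read_committed, the target methods) is hypothetical.

```python
# Conceptual sketch of log-based change data capture (CDC): stream committed
# changes from a source transaction log and apply them to a target in order,
# checkpointing progress so replication can resume without loss or replay.
# This illustrates the general pattern only; all names here are hypothetical.
from dataclasses import dataclass
from typing import Optional

@dataclass
class ChangeEvent:
    lsn: int             # log sequence number: position in the source log
    op: str              # "insert", "update", or "delete"
    table: str
    key: dict            # primary-key columns identifying the row
    row: Optional[dict]  # full row image for insert/update; None for delete

def replicate(source_log, target, checkpoint: int) -> int:
    """Apply all committed changes after `checkpoint` to the target, in order."""
    for event in source_log.read_committed(after_lsn=checkpoint):
        if event.op == "insert":
            target.insert(event.table, event.row)
        elif event.op == "update":
            target.update(event.table, event.key, event.row)
        else:
            target.delete(event.table, event.key)
        checkpoint = event.lsn  # persist this value to make the pipeline resumable
    return checkpoint
```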


BIG DATA MANAGEMENT

Datafold Launches Open Source data-diff to Compare Tables of Any Size Across Databases

Datafold | June 23, 2022

Datafold, a data reliability company, today announced data-diff, a new open source cross-database diffing package. The new product is an open source extension to Datafold’s original Data Diff tool for comparing data sets. Open source data-diff validates the consistency of data across databases using high-performance algorithms.

In the modern data stack, companies extract data from sources, load that data into a warehouse, and transform it so that it can be used for analysis, activation, or data science use cases. Datafold has been focused on automated testing during the transformation step with Data Diff, ensuring that any change made to a data model does not break a dashboard or cause a predictive algorithm to have the wrong data. With the launch of open source data-diff, Datafold can now help with the extract and load parts of the process: open source data-diff verifies that the loaded data matches the source from which it was extracted. All parts of the data stack need testing for data engineers to create reliable data products, and Datafold now gives them coverage throughout the extract, load, transform (ELT) process.

“data-diff fulfills a need that wasn’t previously being met. Every data-savvy business today replicates data between databases in some way, for example, to integrate all available data in a warehouse or data lake to leverage it for analytics and machine learning. Replicating data at scale is a complex and often error-prone process, and although multiple vendors and open source tools provide replication solutions, there was no tooling to validate the correctness of such replication. As a result, engineering teams resorted to manual one-off checks and tedious investigations of discrepancies, and data consumers couldn’t fully trust the data replicated from other systems.”

Gleb Mezhanskiy, Datafold founder and CEO

Mezhanskiy continued, “data-diff solves this problem elegantly by providing an easy way to validate consistency of data sets across databases at scale. It relies on state-of-the-art algorithms to achieve incredible speed: e.g., comparing one-billion-row data sets across different databases takes less than five minutes on a regular laptop. And, as an open source tool, it can be easily embedded into existing workflows and systems.”

Answering an Important Need

Today’s organizations are using data replication to consolidate information from multiple sources into data warehouses or data lakes for analytics. They’re integrating operational systems with real-time data pipelines, consolidating data for search, and migrating data from legacy systems to modern databases. Thanks to tools like Fivetran, Airbyte, and Stitch, it’s easier than ever to sync data across multiple systems and applications. Most data synchronization scenarios call for 100% guaranteed data integrity, yet the practical reality is that in any interconnected system, records are sometimes lost due to dropped packets, general replication issues, or configuration errors. To ensure data integrity, it’s necessary to perform validation checks using a data diff tool. Datafold’s approach constitutes a significant step forward for developers and data analysts who wish to compare multiple databases rapidly and efficiently, without building a makeshift diff tool themselves.

Currently, data engineers use multiple comparison methods, ranging from simple row counts to comprehensive row-level analysis. The former is fast but not comprehensive, whereas the latter is slow but guarantees complete validation. Open source data-diff is fast and provides complete validation.
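As a rough illustration of that claim, the divide-and-conquer checksum technique can be sketched in a few lines of Python: compare aggregate hashes of key ranges on both databases, and recurse only into ranges whose checksums disagree. This is a conceptual sketch, not data-diff's actual code; the db handles and their checksum_range and fetch_rows helpers are hypothetical.

```python
# Conceptual sketch of checksum-based cross-database diffing: identical key
# ranges are dismissed with a single aggregate-hash comparison per side, so
# mostly-identical tables need only O(log n) round trips instead of a full
# row-by-row transfer. The db handles and their helpers are hypothetical.

def diff_range(db_a, db_b, table, key, lo, hi, threshold=10_000):
    """Yield (row_a, row_b) pairs that differ for primary keys in [lo, hi)."""
    if db_a.checksum_range(table, key, lo, hi) == db_b.checksum_range(table, key, lo, hi):
        return  # checksums match: the whole range is identical, skip it
    if hi - lo <= threshold:
        # Small enough to fall back to an exact row-level comparison.
        rows_a = {r[key]: r for r in db_a.fetch_rows(table, key, lo, hi)}
        rows_b = {r[key]: r for r in db_b.fetch_rows(table, key, lo, hi)}
        for k in rows_a.keys() | rows_b.keys():
            if rows_a.get(k) != rows_b.get(k):
                yield rows_a.get(k), rows_b.get(k)
        return
    mid = (lo + hi) // 2  # checksums differ: split the range and recurse
    yield from diff_range(db_a, db_b, table, key, lo, mid, threshold)
    yield from diff_range(db_a, db_b, table, key, mid, hi, threshold)
```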
Open Source data-diff for Building and Managing Data Quality

Available today, data-diff uses checksums to verify 100% consistency between two different data sources quickly and efficiently. This method allows a row-level comparison of 100 million records to be done in just a few seconds, without sacrificing the granularity of the resulting comparison. Datafold has released data-diff under the MIT license. Currently, the software includes connectors for Postgres, MySQL, Snowflake, BigQuery, Redshift, Presto, and Oracle. Datafold plans to invite contributors to build connectors for additional data sources and for specific business applications.

About Datafold

Datafold is a data reliability platform that helps data teams deliver reliable data products faster. It has a unique ability to identify, prioritize, and investigate data quality issues proactively, before they affect production. Founded in 2020 by veteran data engineers, Datafold has raised $22 million from investors including NEA, Amplify Partners, and Y Combinator. Customers include Thumbtack, Patreon, Truebill, Faire, and Dutchie.


DATA SCIENCE

J.P. Morgan launches U.S. Applied Data Science Value Fund, harnessing the power of data science to amplify expertise in fundamental investing

J.P. Morgan | December 18, 2021

J.P. Morgan Asset Management has recently launched its first mutual fund employing a data science-driven investment process, combining fundamental research, data insights, and risk management to identify attractively priced equity securities. The J.P. Morgan U.S. Applied Data Science Value Fund (JPIVX) combines the firm's decades of information and data sets accumulated by equity research analysts with the breadth and scale provided by J.P. Morgan's data science capabilities.

"The end-to-end data science investment process behind the product is the culmination of many years of building, iterating, and improving on the application of AI and ML techniques," said Hamilton Reiner, Head of U.S. Structured Equity for J.P. Morgan Asset Management. "The investment process is driven by machine learning and works off the core belief that there is significant alpha potential in portfolio construction, creating value through both security selection and allocation decisions."

The fund represents the collaboration of existing teams including data scientists, technologists, and fundamental analysts. Together the teams created infrastructure to construct a cloud-based process that can analyze information at tremendous scale.

"Our investors have long used data as part of their research process. We have used our decades of proprietary data and expertise to build upon that tradition, leveraging the power of the cloud and our data science capabilities in order to analyze an ever-increasing volume of information. We're able to apply that scale of information and insight to our investment decision-making processes," said Mr. Reiner.

The firm has been working toward this combination of fundamental investment management and data science for some time, culminating in a new business unit to focus on the application of AI/ML to its business.

"We started the build-out of our data science and equity data science teams about six years ago and have been applying capabilities across our investment teams. Our new Investment Platform unit seeks to amplify the application of those capabilities, creating future-state strategies for our clients, and bringing under one roof our unique talent in investment data, data science, equity trading & analytics, derivatives and broker relationships."

Kristian West, Global Head of Investment Platform for J.P. Morgan Asset Management

The U.S. Applied Data Science Value Fund is managed by portfolio managers Eric Moreau, Wonseok Choi, and Andrew Stern, part of the U.S. Structured Equity team led by Hamilton Reiner.

About J.P. Morgan Asset Management

J.P. Morgan Asset Management, with assets under management of $2.7 trillion (as of 9/30/2021), is a global leader in investment management. J.P. Morgan Asset Management's clients include institutions, retail investors and high net worth individuals in every major market throughout the world. J.P. Morgan Asset Management offers global investment management in equities, fixed income, real estate, hedge funds, private equity and liquidity.


BIG DATA MANAGEMENT

Databricks Recognized as a Leader in 2021 Gartner® Magic Quadrant™ for Cloud Database Management Systems

Databricks | December 20, 2021

Databricks, the Data and AI company and pioneer of data lakehouse architecture, today announced that Gartner has positioned Databricks as a Leader in its 2021 Magic Quadrant for Cloud Database Management Systems (DBMS) report for the first time. In combination with its positioning as a Leader in the 2021 Gartner® Magic Quadrant™ for Data Science and Machine Learning Platforms (DSML) report earlier this year, Databricks is now the only cloud-native vendor to be recognized as a Leader in both Magic Quadrant reports. The Gartner report evaluated 20 different vendors based on completeness of vision and ability to execute within the rapidly evolving market.

"We consider our positioning as a Leader in both of these reports to be a defining moment for the Databricks Lakehouse Platform and confirmation of the vision for lakehouse as the data architecture of the future. We're honored to be recognized by Gartner as we've brought this lakehouse vision to life. We will continue to invest in simplifying customers' data platform through our unified, open approach."

Ali Ghodsi, CEO and Co-Founder of Databricks

We believe the uniqueness of the achievement is in how it was accomplished. It is not uncommon for vendors to show up in multiple Magic Quadrants each year across many domains, but they are assessed on disparate products in their portfolio that individually meet the specific criteria of each report. The results definitively show that one copy of data, one processing engine, and one approach to management and governance, built on open source and open standards across all clouds, can deliver class-leading outcomes for both data warehousing and data science/machine learning workloads.

We feel our position as a Leader in the 2021 Magic Quadrant for DBMS underscores a year of substantial growth for the company. Milestones include the announcement of the company's fifth major open source project, Delta Sharing; the acquisition of cutting-edge German low-code/no-code startup 8080 Labs; and raising a total of $2.6 billion in funding in 2021 at a current valuation of $38 billion to accelerate the global adoption of its lakehouse platform.

Gartner, "2021 Cloud Database Management Systems," Henry Cook, Merv Adrian, Rick Greenwald, Adam Ronthal, Philip Russom, December 14, 2021

Gartner, "2021 Magic Quadrant for Data Science and Machine Learning Platforms," Peter Krensky, Carlie Idoine, Erick Brethenoux, Pieter den Hamer, Farhan Choudhary, Afraz Jaffri, Shubhangi Vashisth, March 1, 2021

Gartner does not endorse any vendor, product or service depicted in its research publications and does not advise technology users to select only those vendors with the highest ratings or other designation. Gartner research publications consist of the opinions of Gartner's Research & Advisory organization and should not be construed as statements of fact. Gartner disclaims all warranties, expressed or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose.

About Databricks

Databricks is the data and AI company. More than 5,000 organizations worldwide — including Comcast, Condé Nast, H&M, and over 40% of the Fortune 500 — rely on the Databricks Lakehouse Platform to unify their data, analytics and AI. Databricks is headquartered in San Francisco, with offices around the globe. Founded by the original creators of Apache Spark™, Delta Lake and MLflow, Databricks is on a mission to help data teams solve the world's toughest problems.


