Virtualized Hadoop Performance with VMware vSphere® 6 on High-Performance Servers

November 2, 2015

Large advances have been made in hardware and at every level of the software stack since the virtualized Hadoop tests published in April 2013. This paper shows how to take advantage of these advances to achieve maximum performance. The cluster size remains at 32 two-processor 2U hosts; however, the processor, memory, network, and storage capabilities are all roughly double those reported in the earlier paper. The performance of native and several VMware vSphere® 6 virtualized configurations was compared using the same TeraSort application suite as before. The more powerful hosts were found to give a larger advantage to multi-VM-per-host configurations: virtualized TeraSort is now up to 12% faster than the optimized native configuration. The apples-to-apples case of a single virtual machine per host again shows performance close to that of native Linux. The origins of the improvements are examined, and recommendations for optimal hardware and software configurations are given.

Spotlight

TADA Cognitive Solutions

TADA's cloud-based platform outperforms alternative products by delivering business solutions ten times faster at one-tenth the cost. The magic of TADA starts with the creation of a digital duplicate of your entire operation that is structured using the language of your own business. Then data is harmonized from disparate sources to enable a completely elastic 360 view of your business. TADA aligns organizational thinking, inspiring real-time collaboration and problem solving on any device. By revolutionizing the way your organization utilizes its data, TADA transforms business complexity into a massive advantage.

OTHER WHITEPAPERS

The State of Security Data Management 2022

whitePaper | October 20, 2022

Over the past two years, security has emerged as the most important aspect of data management. Data volumes are increasing at a 23% CAGR, according to IDC. Digital transformation means that organizations across all industries have become tech companies tasked with managing massive, sprawling data sources.


Architecting for HIPAA Security and Compliance on Amazon Web Services

whitePaper | January 27, 2020

AWS maintains a standards-based risk management program to ensure that the HIPAA-eligible services specifically support the administrative, technical, and physical safeguards required under HIPAA. Using these services to store, process, and transmit PHI allows our customers and AWS to address the HIPAA requirements applicable to the AWS utility-based operating model.


The Evolution of Clinical Data Management into Clinical Data Science

whitePaper | September 12, 2022

Clinical Data Management (CDM) evolved over the last two decades from managing data entered on paper Case Report Forms (CRFs) to managing data transcribed into Electronic Data Capture (EDC) systems. The Society for Clinical Data Management (SCDM) strongly contributed to this first significant CDM evolution through its Good Clinical Data Management Practice (GCDMP©) Chapters, first published in 2000, and its certification program for Clinical Data Managers, subsequently released in 2004.


A Review of BioPharma Sponsor Data Sharing Policies and Protection Methodologies

whitePaper | September 12, 2022

This whitepaper examines clinical trial data contribution policies and the data protection methodologies applied to protect patient privacy. Information published by 29 biopharma sponsors was collected across three data-sharing platforms, collated by sponsor size. Results showed that large sponsor contribution policies can provide helpful benchmarks for medium and smaller sponsors.


Future of care: Patient-centricity with real-world predictive analytics

whitePaper | February 8, 2023

For centuries, patients have sought medical help for their ailments. Just as in the past, however, there are still many illnesses – both well-known, widespread diseases and rare conditions – that initially cause few or inconclusive symptoms, and many patients leave the doctor’s office with an incorrect diagnosis. In addition, diseases may progress slowly or quickly depending on the individual.


Build a thriving business using data and analytics at scale

whitePaper | February 7, 2020

The problem is that growing volumes of data can lead to data paralysis, because no one knows quite where to begin or how to use the data. Siloed information sources, the lack of a data management strategy, disconnected spreadsheets or analytical tools, and poor data quality all compound the issues you’re facing.

