Virtualized Hadoop Performance with VMware vSphere® 5.1

June 26, 2013

A cluster of 32 high-performance hosts was used to run three demanding Hadoop applications, and the performance of a native configuration was compared with that of several VMware vSphere® configurations. The apples-to-apples case of a single virtual machine per host shows performance close to that of native. Partitioning each host into two or four virtual machines can improve elapsed time by up to 13%, resulting in performance that is competitive with, or even better than, native. The origins of the improvements are examined, and recommendations for optimal hardware and software configuration are given.

Apache Hadoop provides a platform for building distributed systems for massive data storage and analysis [1] using a large cluster of standard x86-based servers. It replicates data across hosts and racks of hosts to protect against individual disk, host, and even rack failures.

Spotlight

Analytics8

Analytics8 is a pure BI company. Everything we do revolves around Business Intelligence, and we are extremely proud of the experience and expertise we bring to the field and the quality of our people and our customer base. We go beyond the tools and technology to drive the BI Initiative by aligning business expectations with BI capability. We have strong relationships with large and small clients in Australia and the US from a diverse cross-section of industries, including finance (banking and insurance), manufacturing, retail, publishing, utilities, pharmaceuticals, and government. Our team is committed to a process that centers on partnering with your staff to blend our product and systems knowledge with your business expertise.

OTHER WHITEPAPERS

Understanding The Right Fit for Your Organization: Data Fabric or Data Mesh?

whitePaper | December 22, 2022

The key objective of setting up data mesh or data fabric architecture is to enable the availability of quality data in a timely fashion to the right people in the right format. A data fabric is an architecture framework and a set of data services that provide frictionless data capabilities across a choice of endpoint applications or services spanning hybrid or multi-cloud and on-premises, by using rich metadata foundation and artificial intelligence/machine learning (AI/ML) automation.

Enhancing data mesh: How distributed ledger solutions empower decentralized governance

whitePaper | June 27, 2023

Data mesh is an innovative approach to data management that offers a solution to the most common challenges organizations face when dealing with large-scale data management. The data mesh concept consists of four key principles for large-scale data management in a multi-tiered, multi-domain organizational structure: domain-oriented ownership, data as a product, federated computational governance, and a self-service data platform. These all promote a more agile, efficient, and scalable data architecture. However, one of the central challenges to implementing data mesh is designing decentralized governance mechanisms that align with its core principles.

Data Analytics Techniques for Internal Audit

whitePaper | April 27, 2023

Data analytics are used to test controls and validate that business risks are managed. This would generally occur at a point-in-time when an assurance activity is scheduled. Rather than test a number of transactions, the entire population of transactions can be reviewed for greater coverage. Data analytics includes automated tools such as generalised audit software, test data generators, computerised audit programs, specialised audit utilities and computer-assisted audit techniques (CAATs).

Effective Data Management Is Essential for Taming the 5G Network Complexity Beast

whitePaper | September 26, 2022

Communications service provider (SP) traffic volumes are progressively increasing as networks evolve, user devices gain greater capacity, and applications engaging with advanced network data such as that from the 5G standalone (5G SA) network exposure function (NEF) start to deliver on customer expectations. Cloudification of network functions — virtualized network functions (VNFs) and cloud-native network functions (CNFs) — is placing new burdens on old systems and processes not designed to manage intent-based network architecture. Silos of data locked within many installed systems keep transformation efforts from reaching anticipated objectives.

The Evolution of Data 3.0

whitePaper | October 17, 2022

Today, massive amounts of data are collected, processed, and stored for a range of analytical purposes around the world. Every customer, device, transaction, email, and image leaves a data trail. At present, this data is growing too big, changing too fast, and becoming hyper distributed. The traditional ways of doing integration and analytics are no longer viable or scalable. It is not feasible to create millions of data pipelines and to continue moving large amounts of raw data to a data lake or a centralized data warehouse.

Why Open Architectures Matter in BI: A White Paper on Openness

whitePaper | September 1, 2022

Lock-in occurs when the cost or effort of moving away from a particular choice (platform, vendor) outweighs the benefit of doing so, even if that choice is good for the business overall. The pain of moving is simply too great to consider doing it.
