The Biggest Challenge of Hadoop Analytics: It's All About Query Performance

As Big Data gets bigger and more complex, scalability and performance become major concerns for business users. When organizations use SQL on Hadoop for business intelligence, they often struggle to keep up with the growing demands of business users who expect instant answers to complex queries. Forrester estimates that more than 20% of Big Data projects fail every year. So why do these projects fail?

Business users need answers, and they need them fast. Historically, that meant two technologies were put in place: an Enterprise Data Warehouse (EDW) that stored vast amounts of data, and a SQL engine that "power users" could use to access that data. In most cases, the data warehouse was difficult to modify when needed, so it became a "black box" to the data consumer. The SQL engine gave power users access to the underlying data, but it was potentially hazardous: what if a user runs an "endless" query, or runs the wrong query on the wrong data and returns the wrong answer?

Why Does Hadoop Analytics Get Difficult?

The promise of Big Data is performance and storage, so IT groups see it as the answer to the EDW/SQL conundrum. IT stands up a Hadoop platform and opens the data to everyone. Data analytics tools can connect directly to Hadoop, or users can write SQL against it, as in the sketch below. And because the Hadoop cluster supposedly has "infinite storage and processing," it is expected to provide answers to every question.
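To make "writing SQL against Hadoop" concrete, here is a minimal sketch assuming Spark SQL over Parquet files stored in HDFS; the HDFS path, table, and column names are hypothetical placeholders, and other engines such as Hive or Impala follow a similar pattern.

    # Minimal sketch: querying data stored in Hadoop (HDFS) with Spark SQL.
    # The path and column names below are hypothetical placeholders.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("hadoop-sql-sketch").getOrCreate()

    # Register Parquet files in HDFS as a temporary view so analysts can query them with SQL.
    orders = spark.read.parquet("hdfs:///data/warehouse/orders")
    orders.createOrReplaceTempView("orders")

    # A typical ad-hoc business query: a full scan plus aggregation.
    # On a shared cluster this still competes for resources with every other user,
    # which is why "infinite storage and processing" does not mean instant answers.
    result = spark.sql("""
        SELECT region, SUM(amount) AS total_revenue
        FROM orders
        GROUP BY region
        ORDER BY total_revenue DESC
    """)
    result.show()

The ease of writing such a query is exactly what makes the setup appealing to business users, and also what exposes the performance problem: every ad-hoc query scans and shuffles data across the cluster, so response time depends on cluster load rather than on the user's expectations.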
