Five Common Hadoopable Problems

November 8, 2016

Apache Hadoop has evolved into the standard platform solution for data storage and analysis. Large, successful companies are increasingly adopting Hadoop to perform powerful analyses of their ever-growing business data. Two key aspects of Hadoop have driven its rapid adoption by companies hungry for improved insights into the data they collect. First, Hadoop can store data of any type and from any source, inexpensively and at very large scale. Second, Hadoop enables the sophisticated analysis of even very large data sets, easily and quickly. However, Hadoop concepts are unfamiliar to many people with a background in traditional database and data warehousing systems, and its business value is often underappreciated.

Spotlight

Sitecore

Sitecore is the global leader in experience management software that combines content management, commerce, and customer insights. The Sitecore Experience Cloud™ empowers marketers to deliver personalized content in real time and at scale across every channel: before, during, and after a sale. More than 5,200 brands, including American Express, Carnival Cruise Lines, Dow Chemical, and L’Oréal, have trusted Sitecore to deliver the personalized interactions that delight audiences, build loyalty, and drive revenue.

OTHER ARTICLES

A Tale of Two Data-Centric Services

Article | April 13, 2020

The acronym DMaaS can refer to two related but separate things: data center management-as-a-service (referred to here by its other acronym, DCMaaS) and data management-as-a-service. The former looks at infrastructure-level questions such as optimization of data flows in a cloud service; the latter refers to master data management and data preparation as applied to federated cloud services. DCMaaS has been under development for some years; DMaaS is slightly younger and is a product of the growing interest in machine learning and big data analytics, along with increasing concern over privacy, security, and compliance in a cloud environment. DMaaS responds to a developing concern over data quality in machine learning, due to the large amount of data that must be used for training and the inherent dangers posed by divergence in data structure across multiple sources. To use the rapidly growing array of cloud data, including public cloud information and corporate internal information from hybrid clouds, you must aggregate data in a normalized way so that it is available for model training and for processing with ML algorithms. As data volumes and data diversity increase, this becomes increasingly difficult.

Modernized Requirements of Efficient Data Science Success Across Organizations

Article | February 23, 2020

Does the success of companies like Google depend on algorithms or on data? Today’s fascination with artificial intelligence (AI) reflects both our appetite for data and our excitement about the new opportunities in machine learning. Amalio Telenti, Chief Data Scientist and Head of Computational Biology at Vir Biotechnology Inc., argues that newcomers to the field of data science are blinded by the shiny object of magical algorithms and forget the critical infrastructure needed to create and manage data in the first place. Data management and infrastructure are the ugly duckling of data science, but they are necessary for a successful program and therefore need to be built with purpose. This requires careful consideration of strategies for data capture, storage of raw and processed data, and instruments for retrieval. Beyond the virtues of analysis, there are also the benefits of facilitated retrieval. While there are many solutions for visualizing corporate or industrial data, there is still a need for flexible retrieval tools in the form of search engines that query the diverse sources and forms of data and information generated at a given company or institution.

Can Quantum Computing Be the New Buzzword

Article | March 30, 2020

Quantum mechanics wrote its own chapter in the history of the early 20th century. With its regular binary computing twin going out of style, quantum mechanics has made quantum computing the new belle of the ball! While the memory in a classical computer encodes binary "bits" (one and zero), quantum computers use qubits (quantum bits). A qubit is not confined to two states; it can also exist in superposition, i.e., it can represent 0, 1, or both 1 and 0 at the same time.

Why Data Science Needs DataOps

Article | March 31, 2020

DataOps helps reduce the time data scientists spend preparing data for use in applications. Such tasks currently consume roughly 80% of their time. We’re still hopeful that the digital transformation will provide the insights businesses need from big data. As a data scientist, you’re probably aware of the growing pressure from companies to extract meaningful insights from data and find the stories needed for impact. No matter how in-demand data science is in the employment numbers, equal pressure is rising for data scientists to deliver business value, and no wonder. We’re approaching the age where data science and AI draw a line in the sand between the companies that remain competitive and the ones that collapse. One answer to this pressure is the rise of DataOps. Let’s take a look at what it is and how it could provide a path for data scientists to give businesses what they’ve been after.
