How to Become a Citizen Data Scientist

alteryx.com

As organizations look to enrich their approach to advanced analytics, employees company-wide are filling the skills gap. These are today's citizen data scientists, dedicated to solving problems and building solutions with data, no matter what their background. For them to truly excel, they need powerful, enabling technology and a framework for using it.
Watch Now

Spotlight

This white paper introduces SparkR, a package for the R statistical programming language that enables programmers and data scientists to access large-scale in-memory data processing. The R runtime is single-threaded and can therefore normally only run on a single computer, processing data sets that fit within that machine’s memory. By providing a bridge to Spark’s distributed computation engine, SparkR enables large R jobs to run across multiple cores within a single machine or across nodes in massively parallel clusters, with access to all the memory in the cluster.
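As a rough illustration of the workflow the paper describes, a minimal SparkR session might look like the following. This is a sketch, not taken from the white paper itself: it assumes a local Spark installation with the SparkR package available, and uses SparkR's standard public API (`sparkR.session`, `as.DataFrame`, `collect`).

```r
# Minimal SparkR sketch: move computation from the single-threaded R
# runtime onto Spark's distributed engine.
library(SparkR)

# Start a Spark session. "local[*]" uses all cores on this machine;
# pointing master at a cluster runs the same code across many nodes.
sparkR.session(master = "local[*]", appName = "sparkr-sketch")

# Convert a local R data.frame (the built-in 'faithful' dataset)
# into a distributed SparkDataFrame.
df <- as.DataFrame(faithful)

# Aggregations are planned and executed by Spark, not by the R process.
result <- summarize(groupBy(df, df$waiting), count = n(df$waiting))

# collect() brings the (small) aggregated result back into local R memory.
head(collect(result))

sparkR.stop()
```

The key point the paper makes is visible here: only `collect()` materializes data in the local R process, so the grouped aggregation can operate on data far larger than one machine's memory.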

OTHER ON-DEMAND WEBINARS

Activate Your Data Governance Policy

DATAVERSITY

What does it mean to activate a Data Governance policy? Can an inactive policy be effective? Data Governance policies can address different things depending on the organization. Some policies are very general and introduce the awareness of formal Data Governance to the organization. Other policies address specific needs such as data quality, data documentation, and data protection. Join Bob Seiner and a special guest for this RWDG webinar, where they will tackle the subject of how to develop and deploy an active Data Governance policy. Bob and his guest will provide specific examples of policy components and of how organizations use policies to govern their data.
Watch Now

Modern Data Analytics in the Cloud: Achieving an End-to-End Strategy

TDWI

Businesses today need fast, scalable, and agile data and analytics, and cloud-based solutions are proving critical to satisfying these requirements. They enable organizations to rapidly and easily spin up systems and services for collecting, managing, and analyzing data. More important, cloud-based solutions deliver value from "data gravity": the surging volumes of new data created in the cloud by social media, the IoT, multichannel customer behavior, and other activity.
Watch Now

TiVo: How to Scale New Products with a Data Lake on AWS and Qubole

Amazon Web Services

Big data technologies can be complex and can involve time-consuming manual processes. Organizations that intelligently automate big data operations lower their costs, make their teams more productive, scale more efficiently, and reduce the risk of failure. In our webinar, representatives from TiVo, creator of a digital recording platform for television content, will explain how they implemented a new big data and analytics platform that dynamically scales in response to changing demand. You'll learn how the solution enables TiVo to easily orchestrate big data clusters using Amazon Elastic Compute Cloud (Amazon EC2) and Amazon EC2 Spot Instances that read data from a data lake on Amazon Simple Storage Service (Amazon S3), and how this reduces the development cost and effort needed to support its network and advertiser users. TiVo will share lessons learned and best practices for quickly and affordably ingesting, processing, and making available for analysis terabytes of streaming and batch viewership data from millions of households.
Watch Now

Modernize Your Data Architecture to Deliver Oracle as a Service

robin.io

The migration to cloud-based data architectures continues at a rapid pace, including databases and data management. Oracle databases are part of this trend, and during this webinar you will learn how to automate the provisioning and management of Oracle databases so that you can deliver an "as-a-service" experience with one-click simplicity. Experts will walk you through the process.
Watch Now
