Much like project management and home improvements, Data Governance sounds a lot simpler than it actually is. In a nutshell, Data Governance can be explained as “managing data with guidance.” In general, the perceived utility of Data Governance programs increases with the specificity of the desired data and processing improvements. Whether you are starting or restarting a Data Governance program, it is critical to be guided by a periodically revised Data Strategy that links support for organizational strategy to specific operational data improvements. Understanding these and other aspects of governance is necessary to eliminate the ambiguity that often surrounds the implementation of effective Data Management and stewardship programs.
To preview your data, you must first have access to it. Access can be difficult when data is distributed across different systems and silos, which is why many companies choose to bring it together in a data lake. However, without governance or stewardship, these data lakes rapidly turn into swamps that act as brakes on innovation.
Data tends to pile up and can become unusable or obsolete without careful maintenance processes. Reference and Master Data Management (MDM) has been a popular Data Management approach for gaining mastery over not just the data but also the architecture that supports processing it. This webinar presents MDM as a strategic approach to improving and formalizing practices around the data items that provide context for many organizational transactions: the organization's master data. Too often, MDM has been implemented technology-first and achieved a very poor track record (one-third succeeding on time, within budget, and achieving planned functionality). MDM success depends on a coordinated approach that typically involves Data Governance and Data Quality activities.
In this webinar we will discuss a more modern view of the data lake and consider best practices for planning and implementing a scalable enterprise data lake. The flaws in early data lakes were often rooted in the expectations of data consumers who put a premium on self-service data analytics. However, with no data governance mechanisms, data lakes quickly became more of a glorified “dumping ground,” “data swamp,” or “beta lake” for organizational data. In recent years, though, some innovations have allowed the data lake to evolve into an agile yet managed environment for accumulating shared data resources that can be used for competitive advantage. Data lakes have evolved beyond the original on-premises concept based solely on Hadoop and now include almost any distributed computing platform (Hadoop, Spark, EMR, serverless, etc.) and any storage mechanism (HDFS, S3, ADLS), either on-premises or in the cloud.