Planning for a Scalable Enterprise Data Lake
In this webinar, we will discuss a more modern view of the data lake and consider best practices for planning and implementing a scalable enterprise data lake. The flaws in early data lakes were often rooted in the expectations of data consumers who put a premium on self-service data analytics. However, with no data governance mechanisms in place, data lakes quickly became little more than a glorified “dumping ground,” “data swamp,” or “beta lake” for organizational data.
In recent years, though, innovations have allowed the data lake to evolve into an agile yet managed environment for accumulating shared data resources that can be used for competitive advantage. Data lakes have moved beyond the original on-premises concept based solely on Hadoop and now include nearly any distributed computing platform (Hadoop, Spark, EMR, serverless, etc.) and any storage mechanism (HDFS, S3, ADLS), whether on-premises or in the cloud.