Google Introduces Cloud Storage Connector for Hadoop Big Data Workloads

In a recent blog post, Google announced a new Cloud Storage connector for Hadoop. The connector lets organizations replace the traditional Hadoop Distributed File System (HDFS) with Google Cloud Storage. Columnar file formats such as Parquet and ORC may see increased throughput, and customers also benefit from Cloud Storage directory isolation, lower latency, increased parallelization, and intelligent defaults.

While HDFS is a popular storage solution for Hadoop customers, it can be operationally complex, for example when maintaining long-running HDFS clusters. Google Cloud Storage, by contrast, is a unified object store that exposes data through a single API. It is a managed solution that supports both high-performance and archival use cases.

The Cloud Storage connector is an open-source Java client library that implements the Hadoop Compatible FileSystem (HCFS) interface and runs inside Hadoop JVMs, giving big-data processes such as Hadoop and Spark jobs access to the underlying data in Google Cloud Storage.
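
Because the connector implements HCFS, applications can address Cloud Storage through the standard Hadoop FileSystem API using a gs:// scheme. The sketch below illustrates this; the bucket name, project ID, and paths are hypothetical, and the configuration keys and class name are taken from the connector's open-source documentation rather than from the blog post itself.

```java
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class GcsConnectorSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();

        // Register the connector as the handler for gs:// URIs
        // (class and property names per the connector's docs).
        conf.set("fs.gs.impl",
                 "com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem");
        conf.set("fs.gs.project.id", "my-gcp-project"); // hypothetical project ID

        // Since the connector is an HCFS implementation, the same
        // FileSystem API used against hdfs:// works against a bucket.
        FileSystem fs = FileSystem.get(URI.create("gs://my-bucket/"), conf);
        for (FileStatus status : fs.listStatus(new Path("gs://my-bucket/warehouse/"))) {
            System.out.println(status.getPath() + " (" + status.getLen() + " bytes)");
        }
    }
}
```

In practice these properties would typically live in core-site.xml, so existing Hadoop and Spark jobs can switch from hdfs:// to gs:// paths without code changes.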
