Article | October 27, 2020
Data platforms and frameworks are constantly evolving. At one point we were excited by Hadoop (for almost ten years, in fact); then by Snowflake, or as I call it the Snowflake Blizzard, which pulled off one of the biggest software IPOs in history; and now by Google, which solves problems and serves use cases in a way that few companies can match.
The end of the data warehouse
Once upon a time, life was simple; or at least, the basic approach to Business Intelligence was fairly easy to describe: collect information from source systems, build a repository of consistent data, and bolt on one or more reporting and visualisation tools that present information to users. Data was managed in expensive, slow, hard-to-access SQL data warehouses, systems notorious for their lack of scalability. Their demise has been driven by a few technological advances; one of these is the ubiquitous, and still growing, Hadoop.
On April 1, 2006, Apache Hadoop was unleashed upon Silicon Valley. Inspired by Google's papers on MapReduce and the Google File System, Hadoop's primary purpose was to improve the flexibility and scalability of data processing by splitting a job into smaller functions that run on commodity hardware.
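The core idea, splitting a job into small map and reduce functions that the framework can fan out across many machines, can be sketched in miniature. This is an illustrative word count, not Hadoop's actual API:

```python
from collections import defaultdict

# Minimal sketch of the MapReduce model Hadoop popularised (illustrative
# only; real Hadoop distributes these phases across commodity machines).

def map_phase(document):
    # Emit (key, value) pairs: here, (word, 1) for a word count.
    return [(word.lower(), 1) for word in document.split()]

def shuffle(mapped):
    # Group all values by key, as the framework does between phases.
    groups = defaultdict(list)
    for key, value in mapped:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Aggregate each key's values independently (hence parallelisable).
    return {key: sum(values) for key, values in groups.items()}

docs = ["the quick brown fox", "the lazy dog"]
mapped = [pair for doc in docs for pair in map_phase(doc)]
counts = reduce_phase(shuffle(mapped))
print(counts["the"])  # 2
```

Because each map call and each reduce call is independent, the framework can run them on whichever machine holds the data, which is what made commodity-hardware scaling possible.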
Hadoop’s intent was to replace enterprise data warehouses based on SQL. Unfortunately, a technology used by Google may not be the best solution for everyone else. It’s not that others are incompetent: Google solves problems and serves use cases in a way that few companies can match. Google has been running massive-scale applications such as its eponymous search engine, YouTube and the Ads platform, and the technologies and infrastructure that make these geographically distributed offerings perform at scale are what make various components of Google Cloud Platform enterprise-ready and well-featured.

Google has also shown leadership in developing innovations that have been made available to the open-source community and are used extensively by other public cloud vendors and Gartner clients. Examples include the Kubernetes container management framework, the TensorFlow machine learning platform and the Apache Beam data processing programming model. GCP likewise uses open-source offerings in its cloud while treating third-party data and analytics providers as first-class citizens and providing unified billing for its customers; examples of the latter include DataStax, Redis Labs, InfluxData, MongoDB, Elastic, Neo4j and Confluent.
Silicon Valley tried to make Hadoop work. The technology was extremely complicated and nearly impossible to use efficiently. Hadoop’s lack of speed was compounded by its focus on unstructured data — you had to be a “flip-flop wearing” data scientist to truly make use of it.
Unstructured datasets are very difficult to query and analyze without deep knowledge of computer science. At one point, Gartner estimated that 70% of Hadoop deployments would not achieve the goal of cost savings and revenue growth, mainly due to insufficient skills and technical integration difficulties. And seventy percent seems like an understatement.
Data storage through the years: from GFS to the Snowflake blizzard
Developing in parallel with Hadoop’s journey was that of Marcin Zukowski, co-founder and CEO of Vectorwise, who took the data warehouse in another direction: the world of advanced vector processing. Marcin later went on to co-found Snowflake. Despite being almost unheard of among the general public, Snowflake was actually founded back in 2012. Snowflake is not a consumer tech firm like Netflix or Uber; it is business-to-business only, which may explain its high valuation, since enterprise companies are often seen as a more "stable" investment. In short, Snowflake helps businesses manage data stored in the cloud. The firm's motto is "mobilising the world's data", because it allows big companies to make better use of their vast data stores.
Marcin and his teammates rethought the data warehouse by leveraging the elasticity of the public cloud in an unexpected way: separating storage and compute. Their message was this: don’t pay for a data warehouse you don’t need. Only pay for the storage you need, and add compute capacity as you go. This is considered one of Snowflake’s key innovations: separating storage (where the data is held) from compute (the machines that run the queries). By offering this service before Google, Amazon and Microsoft had equivalent products of their own, Snowflake was able to attract customers and build market share in the data warehousing space.
Naming the company after a discredited database concept was very brave. For those of us not in the details, a snowflake schema is a logical arrangement of tables in a multidimensional database such that the entity-relationship diagram resembles a snowflake shape: when the dimension tables are completely normalised, the resulting structure has the fact table in the middle with dimensions branching outward. Needless to say, the snowflake schema is about as far from Hadoop’s design philosophy as technically possible.
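To make the shape concrete, here is a tiny snowflake schema built in an in-memory SQLite database. All table and column names are invented for the example; the point is the fact table in the middle, with a dimension that is itself normalised into a further table:

```python
import sqlite3

# Illustrative snowflake schema: a central fact table whose dimension
# (product) is itself normalised into a further table (category).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE category (category_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE product  (product_id  INTEGER PRIMARY KEY, name TEXT,
                       category_id INTEGER REFERENCES category(category_id));
CREATE TABLE sales_fact (sale_id    INTEGER PRIMARY KEY,
                         product_id INTEGER REFERENCES product(product_id),
                         amount REAL);
INSERT INTO category VALUES (1, 'Beverages');
INSERT INTO product  VALUES (10, 'Coffee', 1);
INSERT INTO sales_fact VALUES (100, 10, 4.50), (101, 10, 3.00);
""")
# Queries walk outward from the fact table through the normalised dimensions.
row = conn.execute("""
    SELECT c.name, SUM(f.amount)
    FROM sales_fact f
    JOIN product  p ON p.product_id  = f.product_id
    JOIN category c ON c.category_id = p.category_id
    GROUP BY c.name
""").fetchone()
print(row)  # ('Beverages', 7.5)
```

Fully normalising the dimensions keeps each attribute in exactly one place, which is the opposite of Hadoop's denormalised, schema-on-read approach.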
While Silicon Valley was headed toward a dead end, Snowflake captured an entire cloud data market.
Article | July 23, 2020
2 Top data storage trends that simplify data management
2.1 AI storage continues to be the chief
2.2 Price markdown in flash storage
2.3 Hybrid multi-cloud for the win
2.4 Increased significance of software-defined storage
2.5 Non-Volatile Memory Express (NVMe) over data center fabrics
2.6 Acceleration of storage class memory
2.7 Hyperconverged storage – a push to edge computing
3 The future of data storage
There’s more to data than just storing it. Organizations are not only responsible for dealing with a plethora of data, but are also expected to safeguard it. One of the primary ways enterprises keep up with continuous data expansion is through dedicated data storage platforms and applications.
A recent study by Statista revealed that worldwide spending on data storage units is expected to exceed 78 billion U.S. dollars by 2021. Going by these storage stats, it can safely be said that data will keep growing at an ever faster rate, and companies have no choice but to gear up for the data boom if they want to stay relevant.
When it comes to data management and storage, information technology has risen to the occasion with concepts like machine learning. While the idea of such profound approaches is thrilling, the real question boils down to whether organizations are ready and equipped to handle them. The answer to this might be NO.
But, can companies make changes and still thrive? Most definitely, YES!
To make this concept more understandable, here is a list of changes and trends that companies should adopt to make data storage a lot easier and more secure.
2. Top data storage trends that simplify data management
Data corruption is one big issue most companies face, and the complications that unfold from it are even harder to resolve. To fix this and other such data storage problems, companies have turned to trends that are resilient and flexible. These trends have the potential to make history in the world of technology, so you had better gear up to learn them and then adopt them.
2.1 AI storage continues to be the chief
The speed with which AI hit the IT world just doesn’t seem to slow down even after all these years. We say this because, among all the concepts that were and are constantly being introduced, artificial intelligence is the applied science that has produced the most innovation. Adding to this, AI is now making the enterprise data storage process easier through its subsets, machine learning and deep learning. The technology is helping companies accumulate multiple layers of data in more varied formats, and it is automating storage tasks including data migration, archiving and protection. With AI, companies will be able to control data storage across multiple locations and storage platforms.
2.2 Price markdown in Flash storage
As per a report by Markets and Markets, the overall all-flash array market was valued at USD 5.9 billion in 2018 and is expected to reach USD 17.8 billion by 2023, at a CAGR of 24.53% over the period. This growth indicates that the need for all-flash storage is only going to broaden. Flash storage has long been an option most companies stayed away from, mainly because of the price. But with the trend toward flexible data storage, flash is now being offered at much lower prices. The drop in the cost of this storage technology will finally enable businesses of all sizes to invest in this high-performance solution.
2.3 Hybrid multi cloud for the win
With data growing every minute, a single-cloud strategy will not be enough. In this wave of data storage services, the hybrid multi-cloud is the concept helping to manage off-premises data. With it, IT teams can collect, segregate and store on-premises and off-premises data in a much more sophisticated manner, managing it centrally while reducing effort through automated, policy-based data placement across multiple clouds and storage types.
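Policy-based placement can be pictured as a small, ordered rules engine that maps each object's metadata to a storage tier. The tier names, metadata fields and thresholds below are invented for illustration; real hybrid-cloud products express similar policies through their own management planes:

```python
# Hypothetical sketch of policy-based data placement across storage tiers.
# Tier names and rules are invented for illustration only.

POLICIES = [
    # (predicate over the object's metadata, target tier)
    (lambda obj: obj["sensitive"],              "on_prem_encrypted"),
    (lambda obj: obj["days_since_access"] > 90, "cloud_archive"),
    (lambda obj: obj["size_gb"] > 100,          "cloud_object_store"),
]
DEFAULT_TIER = "on_prem_flash"

def place(obj):
    # First matching policy wins, mirroring rule-ordered placement engines.
    for predicate, tier in POLICIES:
        if predicate(obj):
            return tier
    return DEFAULT_TIER

# A cold, non-sensitive object is automatically sent to cheap archive storage.
print(place({"sensitive": False, "days_since_access": 200, "size_gb": 1}))
# cloud_archive
```

The appeal of the approach is that operators edit the policy list, not the placement of individual objects, and the engine re-applies the rules across every cloud and storage type it manages.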
2.4 Increased significance of software-defined storage
The more the data, the less companies want to rely on individual hardware devices – and this growing attitude is well founded. Hence, one addition companies can make to their cybersecurity strategy is adopting software-defined storage. This approach decouples storage services from the underlying physical hardware. It is built around policy-based management of resources, automated provisioning and automated reassignment of storage capacity; because these functions are automated, scaling capacity up and down is also faster. Some of the biggest advantages of this trend are the governance, data protection and security it brings to the entire loop.
2.5 Non-Volatile Memory Express (NVMe) over data center fabrics
NVMe – as ornate as the name sounds – is a protocol built to make access to high-speed storage media simpler and faster. Non-Volatile Memory Express has shown great results in the short time since its inception: it not only increases the performance of existing applications, but also enables new applications that demand real-time workload processing. This combination of high performance and low latency is the highlight of the concept. All in all, the trend seems to have a lot of potential yet to be explored.
2.6 Acceleration of storage class memory
Storage class memory is a natural companion to flash storage and NVMe, because it fills the gap between server memory and external storage. As data protection is one of the major concerns of enterprises, this upcoming trend not only protects data but also continually stores and organizes it for easier segregation. A clear advantage that storage class memory has over flash and NVMe storage is that it provides memory-like, byte-addressable access to data, reducing the pile-up of irrelevant data. Another benefit is that it allows deeper integration of data, ensuring high performance and top-level data security.
2.7 Hyperconverged storage – a push to edge computing
The increased demand for hyperconverged storage is a result of the growth of hybrid cloud and software-defined infrastructure. Beyond these technologies, its suitability for retail settings and remote offices adds to its existing set of features, and its capability to capture data at a distance makes it cost-effective and scales down the need to store everything in a public cloud. Used to its full potential, hyperconverged storage can simplify IT operations and data storage for enterprises of all sizes.
3. The future of data storage
According to Internet World Stats, more than 4.5 billion internet users around the world relentlessly create an astronomical amount of data. This propels companies to discover methods and applications that keep this data safe from harmful ransomware attacks while still using it productively to their advantage. One of the prime predictions about the future of data storage is that companies will have to adapt to rapid changes and mould their processes to enable quick and seamless storage of data. Another is that IT managers and responsible authorities will have to stay updated and proactive at all times, knowing what new data storage technology has been introduced and how it can be used to the company’s advantage.
Here’s the thing: among all the research that enterprises are conducting, not every data storage technology will end up becoming a hit or fulfil the specifications of high-speed storage. But looking at all the efforts researchers are making, we don’t think they are going to stop any time soon – and neither will the growth of data!
Article | February 25, 2020
The Internet of Things, according to a 2020 Congressional Research Service (CRS) report, is a system of interrelated devices connected to a network and/or to one another, exchanging data without necessarily requiring human-to-machine interaction. The report cites smart factories, smart home devices, medical monitoring devices, wearable fitness trackers, smart city infrastructures, and vehicular telematics as examples of IoT.
Article | March 30, 2020
Most businesses do not have contingency or business continuity plans that correlate to the world we see unfold before us—one in which we seem to wake up to an entirely new reality each day. Broad mandates to work at home are now a given. But how do we move beyond this and strategically prepare for—and respond to—business implications resulting from the coronavirus pandemic? Some of our customers are showing us how. These organizations have developed comprehensive, real-time operational intelligence views of their global teams—some in only 24-48 hours—that help them better protect their remote workforces, customers, and business at hand.