Article | February 28, 2020
An enormous amount of data is generated daily through various mediums, and storing it has become a major concern for organizations. Currently, two significant styles of data storage are available: the cloud and the data centre. The main difference between the two is that a data centre refers to on-premise hardware, while the cloud refers to off-premise computing. With the cloud, data is stored in a provider's public cloud, while a data centre stores data on a company's own hardware. Many businesses are turning to the cloud. In fact, Gartner, Inc. predicted that the worldwide public cloud services market would grow 17.5 percent in 2019 to total US$214.3 billion. For many businesses, utilizing the cloud makes sense, while in many other cases an in-house data centre is the better option. Maintaining an in-house data centre is often expensive, but it can be beneficial to be in total control of the computing environment.
Article | February 18, 2021
While digital transformation is proving to have many benefits for businesses, perhaps the most significant is the vast amount of data it makes available. And now, with an increasing number of businesses turning their focus online, there is even more to be collected on competitors and markets than ever before.
Having all this information to hand may seem like any business owner’s dream, as they can now make insightful and informed commercial decisions based on what others are doing, what customers want and where markets are heading.
But according to Nate Burke, CEO of Diginius, a proprietary software and solutions provider for ecommerce businesses, data should not be all a company relies upon when making important decisions.
Instead, there is a line to be drawn between where data is required and where human expertise and judgement can provide greater value.
Undeniably, the power of data is unmatched. With an abundance of data collection opportunities available online, and with an increasing number of businesses taking them, the potential and value of such information is richer than ever before.
And businesses are benefiting, particularly where data concerns customer behaviour and market patterns. For instance, over the recent Christmas period, the data clearly suggested a preference for ecommerce, with marketplaces such as Amazon leading the way due to greater convenience and price advantages.
Businesses that recognised and understood the trend could better prepare for the digital shopping season, placing greater emphasis on their online marketing tactics to encourage purchases and allocating resources to ensure product availability and on-time delivery.
On the other hand, businesses that ignored, or simply did not utilise, the information available to them would have been left with overstocked shops and, now, out-of-season items that would have to be heavily discounted or, worse, disposed of.
Similarly, search and sales data can be used to understand changing consumer needs, and consequently, what items businesses should be ordering, manufacturing, marketing and selling for the best returns.
For instance, DIY was understandably at its peak in 2020, with increases in searches for “DIY facemasks”, “DIY decking” and “DIY garden ideas”. Those who recognised the trend early had the chance to shift their offerings and marketing accordingly, and in turn really reap the rewards.
So, paying attention to data certainly does pay off. And thanks to smarter and more sophisticated ways of collecting data online, such as cookies, and through AI and machine learning technologies, the value and use of such information is only likely to increase.
The future, therefore, looks bright. But even with all this potential at our fingertips, there are a number of issues businesses may face if they rely entirely on a data- and insight-driven approach. Just as disregarding its power and potential can be damaging, so can treating data as the sole basis for important decisions.
While the value of data for understanding the market and consumer patterns is undeniable, that value is only as rich as the quality of the data being put in. So, if businesses are collecting and analysing data on their own activity and then using it to draw meaningful insight, there should be a strong focus on the data-gathering phase, with attention given to what needs to be collected, why and how it should be collected, and whether it accurately represents what you are trying to monitor or measure.
Human error can become an issue when this is done by individuals or teams who do not completely understand the numbers and patterns they are seeing. Another obstacle arises when various channels and platforms are generating leads or sales for the business: any omission can skew results and paint an inaccurate picture. When such data is then used in decision making, the changes it drives may prove ineffective and unsuccessful.
But as data gathering becomes more and more autonomous, the possibility of human error lessens. That, however, may add fuel to the next issue.
Drawing a line
The benefits of data and insights are clear, particularly as the tasks of collection and analysis become less of a burden for businesses and their people thanks to automation and AI advancements. But because data collection and analysis are becoming so effortless, we can only expect more businesses to be doing it, meaning its ability to offer each individual company something unique is also diminishing.
So, businesses need to look elsewhere for their edge. And interestingly, this is where a line should be drawn and human judgement should be used in order to set them apart from the competition and differentiate from what everyone else is doing.
It makes perfect sense when you think about it. Your business is unique for a number of reasons, but mainly because of the brand, its values, its reputation and the perceptions of the service it upholds. And it’s usually these aspects that encourage consumers to choose your business rather than a competitor.
But often, these intangible aspects are much more difficult to measure and monitor through data collection and analysis, especially in the autonomous, number-driven format that many platforms utilise.
Here then, there is a great case for businesses to use their own judgements, expertise and experiences to determine what works well and what does not. For instance, you can begin to determine consumer perceptions towards a change in your product or services, which quantitative data may not be able to pick up until much later when sales figures begin to rise or fall. And while the data will eventually pick it up, it might not necessarily be able to help you decide on what an appropriate alternative solution may be, should the latter occur.
Human judgement, however, can listen to and understand qualitative feedback and consumer sentiments which can often provide much more meaningful insights for businesses to base their decisions on.
So, when it comes to competitor analysis, using insights generated from figure-based data sets and performance metrics is key to ensuring you are keeping pace with the competition.
But if you are looking to get ahead, you may want to consider taking a human approach too.
Article | October 27, 2020
Data platforms and frameworks have been constantly evolving. For a time we were excited by Hadoop (for almost 10 years, in fact); then came Snowflake, or as I call it the Snowflake blizzard (which managed to pull off the biggest software IPO in history); and then Google, which solves problems and serves use cases in a way that few companies can match.
The end of the data warehouse
Once upon a time, life was simple; or at least, the basic approach to Business Intelligence was fairly easy to describe: a process of collecting information from systems, building a repository of consistent data, and bolting on one or more reporting and visualisation tools that presented information to users. Data used to be managed in expensive, slow, inaccessible SQL data warehouses, and those systems were notorious for their lack of scalability. Their demise has been driven by a few technological advances, one of which is the ubiquitous, and growing, Hadoop.
On April 1, 2006, Apache Hadoop was unleashed upon Silicon Valley. Inspired by Google, Hadoop’s primary purpose was to improve the flexibility and scalability of data processing by splitting the process into smaller functions that run on commodity hardware.
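To make that idea concrete, here is a minimal, single-machine sketch of the map/shuffle/reduce pattern Hadoop popularised, written in Python rather than Hadoop's own Java API; in a real cluster each of these functions would run on a different commodity machine over a different chunk of the data.

```python
from collections import defaultdict
from typing import Iterable

# Map step: each worker turns a chunk of raw records into (key, value) pairs.
def map_words(chunk: str) -> Iterable[tuple[str, int]]:
    for word in chunk.split():
        yield word.lower(), 1

# Shuffle step: group the intermediate pairs by key.
def shuffle(pairs: Iterable[tuple[str, int]]) -> dict[str, list[int]]:
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

# Reduce step: each worker aggregates the values for its share of the keys.
def reduce_counts(grouped: dict[str, list[int]]) -> dict[str, int]:
    return {key: sum(values) for key, values in grouped.items()}

if __name__ == "__main__":
    # In a real cluster these chunks would live on different machines.
    chunks = ["the quick brown fox", "the lazy dog", "the fox"]
    intermediate = [pair for chunk in chunks for pair in map_words(chunk)]
    print(reduce_counts(shuffle(intermediate)))  # {'the': 3, 'quick': 1, ...}
```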
Hadoop’s intent was to replace enterprise data warehouses based on SQL. Unfortunately, a technology used by Google may not be the best solution for everyone else. It’s not that others are incompetent: Google solves problems and serves use cases in a way that few companies can match. Google has been running massive-scale applications such as its eponymous search engine, YouTube and the Ads platform. The technologies and infrastructure that make these geographically distributed offerings perform at scale are what make various components of Google Cloud Platform enterprise-ready and well-featured. Google has shown leadership in developing innovations that have been made available to the open-source community and are being used extensively by other public cloud vendors and Gartner clients. Examples include the Kubernetes container management framework, the TensorFlow machine learning platform and the Apache Beam data processing programming model. GCP also uses open-source offerings in its cloud, treats third-party data and analytics providers as first-class citizens on its cloud and provides unified billing for its customers; examples of the latter include DataStax, Redis Labs, InfluxData, MongoDB, Elastic, Neo4j and Confluent.
Silicon Valley tried to make Hadoop work. The technology was extremely complicated and nearly impossible to use efficiently. Hadoop’s lack of speed was compounded by its focus on unstructured data — you had to be a “flip-flop wearing” data scientist to truly make use of it.
Unstructured datasets are very difficult to query and analyze without deep knowledge of computer science. At one point, Gartner estimated that 70% of Hadoop deployments would not achieve the goal of cost savings and revenue growth, mainly due to insufficient skills and technical integration difficulties. And seventy percent seems like an understatement.
Data storage through the years: from GFS to Snowflake or Snowflake blizzard
Developing in parallel with Hadoop’s journey was that of Marcin Zukowski, co-founder and CEO of Vectorwise. Marcin took the data warehouse in another direction, to the world of advanced vector processing. Despite being almost unheard of among the general public, Snowflake, which Marcin went on to co-found, was actually started back in 2012. Snowflake is not a consumer tech firm like Netflix or Uber; it's business-to-business only, which may explain its high valuation, as enterprise companies are often seen as a more "stable" investment. In short, Snowflake helps businesses manage data that's stored in the cloud. The firm's motto is "mobilising the world's data", because it allows big companies to make better use of their vast data stores.
Marcin and his teammates rethought the data warehouse by leveraging the elasticity of the public cloud in an unexpected way: separating storage and compute. Their message was this: don’t pay for a data warehouse you don’t need. Only pay for the storage you need, and add compute capacity as you go. This is considered one of Snowflake’s key innovations: separating storage (where the data is held) from compute (the act of querying). By offering this service before Google, Amazon and Microsoft had equivalent products of their own, Snowflake was able to attract customers and build market share in the data warehousing space.
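As a rough illustration of why that separation matters, the toy Python model below keeps the data in one shared storage layer and lets any number of independently sized compute clusters query it without copying anything. The class names are invented for the sketch; this is not Snowflake's actual architecture or API.

```python
# Toy model of the storage/compute split: data lives once in shared storage,
# and compute clusters are created, sized and billed independently of it.

class SharedStorage:
    """Cheap, durable storage paid for by volume, not by query."""
    def __init__(self) -> None:
        self.tables: dict[str, list[dict]] = {}

    def write(self, table: str, rows: list[dict]) -> None:
        self.tables.setdefault(table, []).extend(rows)


class VirtualWarehouse:
    """A compute cluster that can be spun up, resized or dropped on demand."""
    def __init__(self, name: str, size: str, storage: SharedStorage) -> None:
        self.name, self.size, self.storage = name, size, storage

    def query(self, table: str, predicate) -> list[dict]:
        # Compute reads straight from shared storage; no per-cluster data copy.
        return [row for row in self.storage.tables.get(table, []) if predicate(row)]


storage = SharedStorage()
storage.write("sales", [{"region": "EU", "amount": 120}, {"region": "US", "amount": 80}])

# Two teams run separately sized, separately billed warehouses over the same data.
finance = VirtualWarehouse("finance_wh", "X-Small", storage)
analytics = VirtualWarehouse("analytics_wh", "Large", storage)
print(finance.query("sales", lambda row: row["region"] == "EU"))
```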
Naming the company after a discredited database concept was very brave. For those of us not versed in the details, a snowflake schema is a logical arrangement of tables in a multidimensional database such that the entity-relationship diagram resembles a snowflake: when the dimension tables are completely normalised, the resulting structure looks like a snowflake with the fact table in the middle. Needless to say, the snowflake schema is as far from Hadoop’s design philosophy as technically possible.
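A tiny worked example may help. In the Python sketch below, a central fact table holds the measures plus foreign keys, and the dimensions are normalised into further sub-dimensions, so a query has to walk several joins outwards; the table and column names are made up purely for illustration.

```python
# Snowflake schema in miniature: fact table in the middle, normalised
# dimensions fanning outwards like the arms of a snowflake.

dim_category = {1: {"category_name": "Electronics"}}
dim_product = {10: {"product_name": "Laptop", "category_id": 1}}    # -> dim_category
dim_city = {100: {"city_name": "London"}}
dim_store = {5: {"store_name": "Oxford St", "city_id": 100}}        # -> dim_city

fact_sales = [
    {"product_id": 10, "store_id": 5, "units": 3, "revenue": 2400.0},
]

# Answering "revenue by category and city" means walking the normalised joins.
for row in fact_sales:
    category = dim_category[dim_product[row["product_id"]]["category_id"]]["category_name"]
    city = dim_city[dim_store[row["store_id"]]["city_id"]]["city_name"]
    print(category, city, row["revenue"])  # Electronics London 2400.0
```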
While Silicon Valley was headed toward a dead end, Snowflake captured an entire cloud data market.
Article | March 9, 2021
For many, 2021 has brought hope that they can cautiously start to prepare for a world after Covid. That includes living with the possibility of future pandemics, and starting to reflect on what has been learned from such a brutal shared experience. One of the areas that has come into its own during Covid has been artificial intelligence (AI), a technology that helped bring the pandemic under control and allowed life to continue through lockdowns and other disruptions.
Plenty has been written about how AI has supported many aspects of life at work and home during Covid, from videoconferencing to online food ordering. But the role of AI in preventing Covid from causing even more havoc is not as widely known. Perhaps even more importantly, little has been said about the role AI is likely to play in preparing for, responding to and even preventing future pandemics.
From what we saw in 2020, AI will help prevent global outbreaks of new diseases in three ways: prediction, diagnosis and treatment.
Predicting pandemics is all about tracking data that could be an early sign that a new disease is spreading in a disturbing way. The kind of data we’re talking about includes public health information about symptoms presenting to hospitals and doctors around the world. Plenty of this is already captured in healthcare systems globally, and it is consolidated into datasets such as the Johns Hopkins reports that many of us are familiar with from news briefings.
Firms like BlueDot and Metabiota are part of a growing number of organisations which use AI to track both publicly available and private data and make relevant predictions about public health threats. Both received attention in 2020 for reporting the appearance of Covid before it had been officially acknowledged. Boston Children’s Hospital is an example of a healthcare institution doing something similar with its HealthMap resource.
In addition to conventional healthcare data, AI is uniquely able to make use of informal data sources such as social media, news aggregators and discussion forums. This is because of AI techniques such as natural language processing and sentiment analysis. Firms such as Stratifyd use AI to do this in other business settings such as marketing, but also talk publicly about the use of their platform to predict and prevent pandemics. This is an example of so-called augmented intelligence, where AI is used to guide people to noteworthy data patterns, but stops short of deciding what it means, leaving that to human judgement.
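As a rough sketch of that kind of text mining, the Python example below scores a handful of made-up posts with NLTK's off-the-shelf VADER sentiment model and flags negative, health-related ones for a human analyst to review. It illustrates the augmented-intelligence pattern described above; it is not any vendor's actual pipeline, and the posts and threshold are invented.

```python
# Score informal posts and surface candidates for human review.
import nltk
from nltk.sentiment.vader import SentimentIntensityAnalyzer

nltk.download("vader_lexicon", quiet=True)  # one-off lexicon download
analyzer = SentimentIntensityAnalyzer()

posts = [
    "Half my office is off sick with a strange cough this week",
    "Lovely sunny day at the park",
    "Hospital queues near me are getting really worrying",
]

flagged = []
for post in posts:
    score = analyzer.polarity_scores(post)["compound"]  # -1 (negative) .. +1 (positive)
    health_related = any(word in post.lower() for word in ("sick", "cough", "hospital"))
    if score < -0.3 and health_related:
        flagged.append((score, post))

# The system only flags noteworthy patterns; judging whether they matter is left to people.
for score, post in sorted(flagged):
    print(f"{score:+.2f}  {post}")
```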
Another important part of preventing a pandemic is keeping track of the transmission of disease through populations and geographies. A significant issue in 2020 was difficulty tracing people who had come into contact with infection. There was some success using mobile phones for this, and AI was critical in generating useful knowledge from mobile phone data.
The emphasis of Covid tracing apps in 2020 was keeping track of how the disease had already spread, but future developments are likely to be about predicting future spread patterns from such data. Prediction is a strength of AI, and the principles used to great effect in weather forecasting are similar to those used to model likely pandemic spread.
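To give a flavour of what such modelling looks like, here is a compact SIR (susceptible-infected-recovered) simulation in Python, the classic compartmental model that many spread forecasts build on. The population and rates are illustrative, not fitted to any real outbreak.

```python
# Minimal SIR simulation: step the compartments forward one day at a time.

def simulate_sir(population: int, infected: int, beta: float, gamma: float, days: int):
    s, i, r = population - infected, infected, 0.0
    history = []
    for _ in range(days):
        new_infections = beta * s * i / population   # contacts that transmit today
        new_recoveries = gamma * i                   # infections that resolve today
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((s, i, r))
    return history

if __name__ == "__main__":
    # beta: transmission rate per day, gamma: recovery rate per day (R0 = beta / gamma)
    results = simulate_sir(1_000_000, infected=10, beta=0.3, gamma=0.1, days=120)
    for day, (s, i, r) in enumerate(results, start=1):
        if day % 30 == 0:
            print(f"day {day:3d}: susceptible={s:,.0f} infected={i:,.0f} recovered={r:,.0f}")
```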
To prevent future pandemics, it won’t be enough to predict when a disease is spreading rapidly. To make the most of this knowledge, it’s necessary to diagnose and treat cases. One of the greatest early challenges with Covid was the lack of speedy, reliable tests.
For future pandemics, AI is likely to be used to create such tests more quickly than was possible in 2020. Creating a useful test involves modelling a disease’s response to different testing reagents, finding the right balance between speed, convenience and accuracy. AI modelling simulates in a computer how individual cells respond to different stimuli, and could be used to perform virtual testing of many different types of test to accelerate how quickly the most promising ones reach laboratory and field trials.
In 2020 there were also several novel uses of AI to diagnose Covid, but there were few national and global mechanisms to deploy these at scale. One example was the use of AI imaging, diagnosing Covid by analysing chest x-rays for features specific to Covid. This would have been especially valuable in places that didn’t have access to lab testing equipment. Another example was using AI to analyse the sound of coughs to identify unique characteristics of a Covid cough.
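As an indication of how such an imaging approach might be prototyped, the sketch below fine-tunes a standard pretrained CNN (ResNet-18 via PyTorch) to separate "Covid-like" from "other" chest x-rays. The folder layout and labels are hypothetical placeholders, and real clinical use would demand far more data, validation and regulatory rigour than this toy loop.

```python
# Fine-tune a pretrained CNN on a small, hypothetical x-ray dataset.
import torch
import torch.nn as nn
from torchvision import datasets, models, transforms

transform = transforms.Compose([
    transforms.Grayscale(num_output_channels=3),  # x-rays are single channel
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

# Expects a hypothetical folder layout like xrays/covid/*.png and xrays/other/*.png.
dataset = datasets.ImageFolder("xrays", transform=transform)
loader = torch.utils.data.DataLoader(dataset, batch_size=16, shuffle=True)

model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
model.fc = nn.Linear(model.fc.in_features, 2)  # two classes: covid-like vs other

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

model.train()
for images, labels in loader:          # one illustrative pass over the data
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
```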
AI research to systematically investigate innovative diagnosis techniques such as these should result in better planning for alternatives to laboratory testing. Faster and wider rollout of this kind of diagnosis would help control the spread of a future disease during the critical period while other tests are being developed or shared. This would be another contribution of AI to preventing a localised outbreak becoming a pandemic.
Historically, vaccination has proven to be an effective tool for dealing with pandemics, and it was the long-term solution to Covid for most countries. AI was used to accelerate the development of Covid vaccines, helping cut the development time from years or decades to months. In principle, the use of AI was similar to that described above for developing diagnostic tests.
Different drug development teams used AI in different ways, but they all relied on mathematical modelling of how the Covid virus would respond to many forms of treatment at a microscopic level.
Much of the vaccine research and modelling focused on the “spike” proteins that allow Covid to attach to human cells and enter the body. These are also found in other viruses, and were already the subject of research before the 2020 pandemic. That research allowed scientists to quickly develop AI models to represent the spikes, and simulate the effects of different possible treatments. This was crucial in trialling thousands of possible treatments in computer models, pinpointing the most likely successes for further investigation.
This kind of mathematical simulation using AI continued during drug development, and moved substantial amounts of work from the laboratory to the computer.
This modelling also allowed the impact of Covid mutations on vaccines to be assessed quickly. It is why scientists were reasonably confident of developing variants of vaccines for new Covid mutations in days and weeks rather than months.
As a result of the global effort to develop Covid vaccines, the body of data and knowledge about virus behaviour has grown substantially. This means it should be possible to understand new pathogens even more rapidly than Covid, potentially in hours or days rather than weeks.
AI has also helped create new ways of approaching vaccine development, for example the use of pre-prepared generic vaccines designed to treat viruses from the same family as Covid. Modifying one of these to the specific features of a new virus is much faster than starting from scratch, and AI may even have already simulated exactly such a variation.
AI has been involved in many parts of the fight against Covid, and we now have a much better idea than in 2020 of how to predict, diagnose and treat pandemics, especially those caused by viruses similar to Covid. So we can be cautiously optimistic that vaccine development for any future Covid-like virus will be possible before it becomes a pandemic. Perhaps a trickier question is how well we will be able to respond if the next pandemic is from a virus that is nothing like Covid.
Was Rahman is an expert in the ethics of artificial intelligence, the CEO of AI Prescience and the author of AI and Machine Learning. See more at www.wasrahman.com