Article | September 2, 2021
Massive amounts of data are collected and stored by companies in the search for the “Holy Grail”. One crucial component is the discovery and application of novel approaches to achieve a more complete picture of the datasets provided by the local (sometimes global) event-based analytic strategy that currently dominates a specific field.
Bringing qualitative data to life is essential since it gives management decisions context and nuance. An NLP perspective for uncovering word-based themes across documents will facilitate the exploration and exploitation of qualitative data, which is often hard to “identify” in a global setting. NLP can be used to perform different analyses for mapping drivers.
Broadly speaking, drivers are factors that cause change and affect institutions, policies and management decision making. More precisely, a “driver” is a force that has a material impact on a specific activity or entity, which is contextually dependent, and which affects the financial market at a specific time (Litterio, 2018). Major drivers often lie outside the immediate institutional environment, such as elections or regional upheavals, or are non-institutional factors such as Covid or climate change. In Total global strategy: Managing for worldwide competitive advantage, Yip (1992) develops a framework based on a set of four industry globalization drivers, which highlights the conditions for a company to become more global and also reflects differentials in a competitive environment. In The lexicons: NLP in the design of Market Drivers Lexicon in Spanish, I have proposed a categorization into micro and macro drivers and temporality, and a distinction among social, political, economic and technological drivers. Considering the “big picture” and “digging” beyond the usual sectors and timeframes is key to state-of-the-art findings.
Working with qualitative data.
There is certainly not a unique “recipe” when applying NLP strategies. Different pipelines could be used to analyse any sort of textual data, from social media and reviews to focus group notes, blog comments and transcripts, to name just a few, when a MetaQuant team is looking for drivers.
Generally, since the source is textual data, it is preferable to avoid manual tasks on the part of the analyst, though sometimes, depending on the domain, content, cultural variables, etc., they might be required. If qualitative data is the core, then the preferred format is .csv because of its plain-text nature, which typically handles written responses better. Once the data has been collected and exported, the next step is pre-processing. The basics include normalisation, morphosyntactic analysis, sentence structure analysis, tokenization, lexicalization and contextualization. The aim is simply to reduce the data to a form that makes analysis easier.
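As a rough illustration of the first pre-processing steps, the sketch below reads one free-text column from a .csv export, normalises it and tokenises it. The column name, the sample rows and the tiny stop-word list are all invented for the example, and stripping accents is only one possible normalisation choice; a real pipeline would add the morphosyntactic and lexicon-based steps on top.

```python
import csv
import io
import re
import unicodedata
from collections import Counter

# Hypothetical stop-word list; a real project would use a domain-specific one.
STOP_WORDS = {"the", "a", "an", "and", "are", "of", "to", "is", "in", "on", "for"}


def normalise(text):
    """Lower-case, strip accents, and replace non-letters with spaces."""
    text = unicodedata.normalize("NFKD", text.lower())
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    return re.sub(r"[^a-z\s]", " ", text)


def tokenise(text):
    """Split normalised text into tokens, dropping stop words and very short tokens."""
    return [
        tok for tok in normalise(text).split()
        if tok not in STOP_WORDS and len(tok) > 2
    ]


def load_responses(csv_file, column):
    """Read one free-text column from a CSV file object and tokenise each row."""
    return [tokenise(row[column]) for row in csv.DictReader(csv_file)]


# Inline sample standing in for an exported .csv file (invented data).
sample = io.StringIO(
    "id,response\n"
    "1,The election results moved the markets.\n"
    "2,Inflación and climate change are affecting prices!\n"
)
docs = load_responses(sample, "response")
term_counts = Counter(tok for doc in docs for tok in doc)
```

Here `docs` holds one token list per response and `term_counts` gives a first, crude view of which words dominate the corpus.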
Topic modelling refers to the task of identifying the words and main topics that best describe a document or a corpus. LDA (Latent Dirichlet Allocation) is one of the most powerful algorithms, with an excellent implementation in Python’s Gensim package.
The challenge: how to extract good-quality topics that are clear and meaningful. Of course, this depends mostly on the nature of the text pre-processing, the strategy for finding the optimal number of topics, and the creation of the lexicon(s) and the corpora. We can say that a topic is defined or construed around its most representative keywords. But are keywords enough? Well, there are some other factors to be observed, such as:
1. The variety of topics included in the corpora.
2. The choice of topic modelling algorithm.
3. The number of topics fed to the algorithm.
4. The algorithm’s tuning parameters.
As you have probably noticed, finding “the needle in the haystack” is not that easy. Only those who can use NLP creatively will have the advantage of positioning themselves for global success.
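To make factors 3 and 4 concrete, here is a toy LDA sketch in pure Python. It is not Gensim's implementation: it uses collapsed Gibbs sampling, a different inference method chosen here only because it fits in a few lines. The function name, the default values of `alpha` and `beta`, and the sample documents are assumptions made for illustration.

```python
import random
from collections import Counter


def toy_lda(docs, num_topics, alpha=0.1, beta=0.01, iterations=200, top_n=3, seed=0):
    """Collapsed Gibbs sampling for LDA on tokenised documents (toy scale only)."""
    rng = random.Random(seed)
    vocab_size = len({word for doc in docs for word in doc})
    # Count tables: topics per document, words per topic, tokens per topic.
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [Counter() for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assignments = []
    # Start from a random topic for every token.
    for d, doc in enumerate(docs):
        row = []
        for word in doc:
            k = rng.randrange(num_topics)
            row.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word] += 1
            topic_total[k] += 1
        assignments.append(row)
    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, word in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][word] -= 1
                topic_total[k] -= 1
                # Resample its topic given all other assignments.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][word] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][word] += 1
                topic_total[k] += 1
    # Report the most frequent words currently assigned to each topic.
    return [
        [w for w, c in topic_word[t].most_common() if c > 0][:top_n]
        for t in range(num_topics)
    ]


# Invented mini-corpus mixing political and economic vocabulary.
docs = [
    ["election", "vote", "policy", "election"],
    ["vote", "policy", "government"],
    ["inflation", "prices", "market", "inflation"],
    ["market", "prices", "rates"],
]
topics = toy_lda(docs, num_topics=2)
```

Re-running with a different `num_topics`, `alpha` or `beta` changes the keyword lists, which is exactly why the number of topics and the tuning parameters need to be chosen with care rather than left at defaults.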
Article | September 2, 2021
2 Top Data Storage Trends That Simplify Data Management
2.1 AI Storage Continues to be The Chief
2.2 Price Markdown in Flash Storage
2.3 Hybrid Multi Cloud for The Win
2.4 Increased Significance of Software-Defined Storage
2.5 Non-Volatile Memory Express (NVMe) Beats Data Center Fabrics
2.6 Acceleration of Storage Class Memory
2.7 Hyperconverged Storage – A Push to Edge Computing
3 The Future of Data Storage
There’s more to data than just storing it. Organizations not only have the responsibility of dealing with a plethora of data, but are also expected to safeguard it. One of the primary options that enterprises are turning to in order to keep up with continuous data expansion is data storage systems and applications.
A recent study conducted by Statista revealed that worldwide spending on data storage units is expected to exceed 78 billion U.S. dollars by 2021. Going by these storage stats, it can certainly be said that data is going to grow at a much faster rate, and companies have no choice but to gear up for a data boom if they want to stay relevant.
When it comes to data management/storage, information technology has risen to all its glory with concepts like machine learning. While the idea of such profound approaches is thrilling, the real question boils down to whether organizations are ready as well as equipped enough to handle them. The answer to this might be NO.
But, can companies make changes and still thrive? Most definitely, YES!
To make this concept more understandable, here is a list of changes/trends that companies should adopt to make data storage a lot easier and more secure.
2. Top data storage trends that simplify data management
Data corruption is one big issue that most companies face. The complications that follow from corrupted data are even harder to resolve. To fix this and other such data storage problems, companies have come up with trends that are resilient and flexible. These trends have the capability of making history in the world of technology, so you had better gear up to learn them and later adapt to them.
2.1 AI storage continues to be the chief
The speed with which AI hit the IT world just doesn’t seem to slow down even after all these years. We say this because, amongst all the other concepts that were and are constantly being introduced, artificial intelligence is the one applied science that has produced the most innovations. To add to this, AI is now making the enterprise data storage process easier with its various subsets, such as machine learning and deep learning. This technology is helping companies accumulate multiple layers of data in a more assorted format. It is automating IT storage tasks, including data migration, archiving and protection. With AI, companies will be able to control data storage across multiple locations and storage platforms.
2.2 Price markdown in Flash storage
As per a report by Markets and Markets, the overall All-Flash Array Market was valued at USD 5.9 billion in 2018 and is expected to reach USD 17.8 billion by 2023, at a CAGR of 24.53% during this period. This growth suggests that the need for all-flash storage is only going to broaden. Flash storage has always been a choice that most companies stayed away from, mainly because of the price. But with this new trend of adopting flexible data storage, flash storage is being offered at a much lower price. The drop in the cost of this storage technology will finally enable businesses of all sizes to invest in this high-performance solution.
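The growth figures quoted above can be sanity-checked with the standard compound annual growth rate formula. The short sketch below uses only the three numbers from the report; the function names are ours.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate implied by a start value, an end value and a span in years."""
    return (end_value / start_value) ** (1 / years) - 1


def project(start_value, rate, years):
    """Value reached after compounding `rate` annually for `years` years."""
    return start_value * (1 + rate) ** years


years = 2023 - 2018
implied_rate = cagr(5.9, 17.8, years)        # rate implied by the two endpoints
projected_2023 = project(5.9, 0.2453, years)  # endpoint implied by the quoted CAGR
```

The endpoints imply a rate of roughly 24.7% a year, and compounding the quoted 24.53% from USD 5.9 billion gives about USD 17.7 billion, so the reported figures are consistent with each other to within rounding.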
2.3 Hybrid multi cloud for the win
With data growing every minute, just a “cloud” strategy will not be enough. In this wave of data storage services, hybrid multi-cloud is one concept that is helping manage off-premises data. With this growing concept, IT authorities will be able to collect, segregate and store on-premises and off-premises data in a much more sophisticated manner. This enables central management while reducing the effort of data storage by automating policy-based data placement across a mix of clouds and storage types.
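Policy-based data placement can be pictured as a small set of rules that map a dataset's attributes to a storage location. The sketch below is a minimal illustration under invented assumptions: the tier names, the 30-day threshold and the `DataSet` fields are hypothetical and do not come from any particular product.

```python
from dataclasses import dataclass


@dataclass
class DataSet:
    name: str
    days_since_access: int
    sensitive: bool


def place(dataset):
    """Return the storage tier for a dataset under a simple, hypothetical policy."""
    if dataset.sensitive:
        return "on-premises"           # regulated data stays in the private data centre
    if dataset.days_since_access <= 30:
        return "public-cloud-hot"      # recently used data stays on fast cloud storage
    return "public-cloud-archive"      # cold data moves to cheaper archive storage


# Invented datasets to exercise each rule once.
datasets = [
    DataSet("customer-records", days_since_access=2, sensitive=True),
    DataSet("web-logs-current", days_since_access=1, sensitive=False),
    DataSet("web-logs-2019", days_since_access=400, sensitive=False),
]
placement = {d.name: place(d) for d in datasets}
```

Because the rules live in one function, changing the policy (for example, the access threshold) changes where every dataset lands without anyone moving data by hand, which is the effort reduction the paragraph describes.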
2.4 Increased significance of software-defined storage
The more the data, the less the reliance on hardware devices: this is the growing attitude of most companies, and it certainly has the possibility of becoming a reality. Hence, an addition to the cybersecurity strategy that companies can make is adopting software-defined storage. This approach decouples storage management from the underlying physical storage hardware. It is programmed in a way that can function on policy-based management of resources, automated provisioning, and automated storage capacity reassignment. Due to this automation, scaling storage up and down is also faster. Some of the biggest advantages of this trend will be the governance, data protection, and security it will provide to the entire loop.
2.5 Non-Volatile Memory Express (NVMe) beats data center fabrics
NVMe, as ornate as the name sounds, was introduced with the aim of making data storage simpler. Non-Volatile Memory Express is a protocol that enables access to high-speed storage media, and it has shown great results in the short time since its inception. NVMe not only increases the performance of existing applications, but also enables new applications for real-time workload processing. This combination of high performance and low latency is surely a highlight of the concept. All in all, this trend seems to have a lot of potential that is yet to be explored.
2.6 Acceleration of storage class memory
Storage class memory is a perfect combination of flash storage and NVMe. This is because it perfectly fills the gap between server storage and external storage. As data protection is one of the major concerns of enterprises, this upcoming trend not only protects data but also continually stores and improves it for easier segregation. A clear advantage that storage class memory has over flash and NVMe storage is that it provides memory-like, byte-addressable access to data, thus reducing the piling up of irrelevant data. Another benefit of this trend is that it allows deeper integration of data to ensure high performance and top-level data security.
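The difference between block access and byte-addressable access can be felt even on ordinary hardware. The sketch below is only an analogy: it uses a temporary file and the operating system's memory mapping rather than real storage class memory, and the 4096-byte block size is an assumed, typical value.

```python
import mmap
import os
import tempfile

BLOCK_SIZE = 4096  # assumed block size, for illustration only

# Create a small scratch file standing in for a storage device.
fd, path = tempfile.mkstemp()
os.write(fd, bytes(BLOCK_SIZE * 2))
os.close(fd)

# Block-style access: read a whole block, change one byte, write the block back.
with open(path, "r+b") as f:
    f.seek(BLOCK_SIZE)
    block = bytearray(f.read(BLOCK_SIZE))
    block[10] = 0x41
    f.seek(BLOCK_SIZE)
    f.write(block)

# Byte-addressable style: map the file and touch single bytes directly.
with open(path, "r+b") as f:
    with mmap.mmap(f.fileno(), 0) as mem:
        mem[20] = 0x42                       # update one byte in place
        direct_byte = mem[20]                # read it back by address
        block_byte = mem[BLOCK_SIZE + 10]    # the byte written via the block path

os.remove(path)
```

In the block-style path a single-byte change costs a full 4096-byte read and write, whereas the mapped path addresses the byte directly; that is the programming model storage class memory is meant to offer natively.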
2.7 Hyperconverged storage – a push to edge computing
The increased demand for hyperconverged storage is a result of the growth of hybrid cloud and software-defined infrastructure. Besides these technologies, its suitability for retail settings and remote offices adds to its existing set of features. Its capability of capturing data from a distance also enables cost-effectiveness and scales down the need to store everything on a public cloud. Hyperconverged storage, if used in its true essence, can simplify IT operations and data storage for enterprises of all sizes.
3. The future of data storage
According to Internet World Stats, more than 4.5 billion internet users around the world relentlessly create an astronomical amount of data. This propels companies to discover methods or applications that help them keep this data safe from harmful ransomware attacks and still use it productively to their advantage. One of the prime changes that can be expected in the future of data storage is that companies will have to adapt to rapid change and mould their processes to enable quick and seamless storage of data. Another is that IT managers and responsible authorities will have to stay updated and proactive at all times to know what data storage technology has been newly introduced, and how it can be used to the company’s advantage.
Here’s the thing: amongst all the research that enterprises are conducting, not all data storage technologies will end up becoming a hit or fulfilling the requirement of high-speed storage. But, looking at all the efforts that researchers are making, we don’t think they are going to stop any time soon, and neither is the growth of data!
Article | September 2, 2021
In 2020, the gaming market generated over 177 billion dollars, marking an astounding 23% growth from 2019. While it may be incredible how much revenue the industry generates, what’s more impressive is the massive amount of data generated by today’s games.
There are more than 2 billion gamers globally, generating over 50 terabytes of data each day. The largest game companies in the world can host 2.5 billion unique gaming sessions in a single month and host 50 billion minutes of gameplay in the same period.
The gaming industry and big data are intrinsically linked. Companies that develop capabilities in using that data to understand their customers will have a sizable advantage in the future. But doing this comes with its own unique challenges.
Games have many permutations, with different game types, devices, user segments, and monetization models. Traditional analytics approaches, which rely on manual processes and interventions by operators viewing dashboards, are insufficient in the face of the sheer volume of complex data generated by games.
Unchecked issues lead to costly incidents or missed opportunities that can significantly impact the user experience or the company’s bottom line. That’s why many leading gaming companies are turning to AI and Machine Learning to address these challenges.
Gaming Analytics AI
Gaming companies have all the data they need to understand who their users are, how they engage with the product, and whether they are likely to churn. The challenge is gaining valuable business insights into the data and taking action before opportunities pass and users leave the game.
AI/ML helps bridge this gap by providing real-time, actionable insights on near-limitless data streams so companies can design around these analytics and act more quickly to resolve issues. There are two fundamental categories that companies should hone in on to make the best use of their gaming data: customer engagement and user experience, and monetization and targeted advertising.
The revenue-generating opportunities in the gaming industry are one reason it’s a highly competitive market. Keeping gamers engaged requires emphasizing the user experience and continuous delivery of high-quality content personalized to a company’s most valued customers.
Customer Engagement and User Experience
Graphics and creative storylines are still vital, and performance issues, in particular, can be a killer for user enjoyment and drive churn. But with a market this competitive, it might not be enough to focus strictly on these issues.
Games can get an edge on the competition by investing in gaming AI analytics to understand user behaviors, likes, dislikes, seasonality impacts and even hone in on what makes them churn or come back to the game after a break.
AI-powered business monitoring solutions deliver value to the customer experience and create actionable insights to drive future business decisions and game designs to acquire new customers and prevent churn.
AI-Enhanced Monetization and Targeted Advertising
All games need a way to monetize. It’s especially true in today’s market, where users expect games to always be on and regularly deliver new content and features. A complex combination of factors influences how monetization practices and models enhance or detract from a user’s experience with a game.
When monetization frustrates users, it’s typically because of aggressive, irrelevant advertising campaigns or models that aren’t well suited to the game itself or its core players. Observe the most successful products in the market, and one thing you will consistently see is highly targeted interactions.
Developers can use metrics gleaned from AI analytics combined with performance marketing to appeal to their existing users and acquire new customers. With AI/ML, games can use personalized ads that cater to users’ or user segments’ behavior in real-time, optimizing the gaming experience and improving monetization outcomes.
Using AI-based solutions, gaming studios can also quickly identify growth opportunities and trends with real-time insight into high-performing monetization models and promotions.
Mobile Gaming Company Reduces Revenue Losses from Technical Incident
One mobile gaming company suffered a massive loss when a bug in a software update disrupted a marketing promotion in progress. The promotion involved automatically pushing special offers and opportunities for in-app purchases across various gaming and marketing channels. When a bug in an update disrupted the promotions process, the analytics team couldn’t take immediate action because they were unaware of the issue.
Their monitoring process was ad hoc, relying on the manual review of multiple dashboards, and unfortunately, by the time they discovered the problem, it was too late. The result was a massive loss for the company – a loss of users, a loss of installations, and in the end, more than 15% revenue loss from in-app purchases.
The company needed a more efficient and timely way to track its cross-promotional metrics, installations, and revenue. A machine learning-based approach, like Anodot’s AI-powered gaming analytics, provides notifications in real-time to quickly find and react to any breakdowns in the system and would have prevented the worst of the impacts.
Anodot’s AI-Powered Analytics for Gaming
The difference between success and failure is how companies respond to the ocean of data generated by their games and their users. Anodot’s AI-powered Gaming Analytics solutions can learn expected behavior in the complex gaming universe across all permutations of gaming, including devices, levels, user segments, pricing, and ads.
Anodot’s Gaming AI platform is specifically designed to monitor millions of gaming metrics and help ensure a seamless gaming experience. Anodot monitors every critical metric and establishes a baseline of standard behavior patterns to quickly alert teams to anomalies that might represent issues or opportunities.
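The general idea of learning a baseline and alerting on deviations can be sketched in a few lines. The example below is not Anodot's algorithm, which the article does not describe; it is a deliberately simple trailing-window z-score check, with the window size, threshold and revenue figures all invented for illustration.

```python
from statistics import mean, stdev


def find_anomalies(values, window=7, threshold=3.0):
    """Flag points that deviate from the trailing-window baseline by more than
    `threshold` standard deviations."""
    anomalies = []
    for i in range(window, len(values)):
        baseline = values[i - window:i]
        mu, sigma = mean(baseline), stdev(baseline)
        if sigma == 0:
            continue  # a flat baseline gives no scale to judge deviations against
        z = (values[i] - mu) / sigma
        if abs(z) > threshold:
            anomalies.append((i, values[i], round(z, 1)))
    return anomalies


# Invented hourly in-app purchase revenue; the drop at index 10 mimics a broken promotion.
revenue = [100, 104, 98, 101, 99, 103, 100, 102, 97, 101, 40, 99, 102]
alerts = find_anomalies(revenue)
```

On this sample only the sudden drop is flagged. A production system would also need to handle seasonality and keep past anomalies from distorting the baseline, which is where learned models earn their keep over a fixed rule like this one.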
Analytics teams see how new features impact user behavior, with clear, contextual alerts for spikes, drops, purchases, and app store reviews without the need to comb over dashboards trying to find helpful information.
The online gaming space represents one of the more recent areas where rapid data collection and analysis can provide competitive differentiation. Studios using AI-powered analytics will keep themselves and their players ahead of the game.
Article | September 2, 2021
Nowadays, everyone with some technical expertise and a data science bootcamp under their belt calls themselves a data scientist. Also, most managers don't know enough about the field to distinguish an actual data scientist from a make-believe one, someone who calls themselves a data science professional today but may work as a cab driver next year. As data science is a field that carries great responsibility, dealing with complex problems that require serious attention and work, the data scientist role has never been more significant. So, perhaps instead of arguing about which programming language or which all-in-one solution is the best one, we should focus on something more fundamental. More specifically, the thinking process of a data scientist.
The challenges of the Data Science professional
Any data science professional, regardless of his specialization, faces certain challenges in his day-to-day work. The most important of these involves decisions regarding how he goes about his work. He may have planned to use a particular model for his predictions, but that model may not yield adequate performance (e.g., not high enough accuracy or too high computational cost, among other issues). What should he do then? Also, it could be that the data doesn't have a strong enough signal, and last time I checked, there wasn't a fool-proof method in any data science programming library that provided a clear-cut view on this matter. These are calls that the data scientist has to make and shoulder all the responsibility that goes with them.
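There is indeed no fool-proof test for signal, but rough checks can inform the call. One such check, sketched below under invented data, is a permutation test: shuffle the target many times and see how often chance alone produces a correlation as strong as the observed one. The function names and the number of trials are our own choices, and a low p-value here says nothing about non-linear or multi-feature signal.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def permutation_p_value(xs, ys, trials=2000, seed=0):
    """Share of shuffled targets whose |correlation| matches or beats the observed one."""
    rng = random.Random(seed)
    observed = abs(pearson(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(trials):
        rng.shuffle(shuffled)
        if abs(pearson(xs, shuffled)) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (trials + 1)


# Invented feature/target pair with an obvious linear relationship.
feature = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
target = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2, 13.9, 16.1, 18.0, 20.2]
p_value = permutation_p_value(feature, target)
```

A small `p_value` suggests the relationship is unlikely to be noise; a large one is a hint, not proof, that the feature carries little linear signal. Either way the decision about what to do next remains the data scientist's.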
Why Data Science automation often fails
Then there is the matter of automation of data science tasks. Although the idea sounds promising, it's probably the most challenging task in a data science pipeline. It's not unfeasible, but it takes a lot of work and a lot of expertise that's usually impossible to find in a single data scientist. Often, you need to combine the work of data engineers, software developers, data scientists, and even data modelers. Since most organizations don't have all that expertise or don't know how to manage it effectively, automation doesn't happen as they envision, resulting in a large part of the data science pipeline needing to be done manually.
The Data Science mindset overall
The data science mindset is the thinking process of the data scientist, the operating system of her mind. Without it, she can't do her work properly, in the large variety of circumstances she may find herself in. It's her mindset that organizes her know-how and helps her find solutions to the complex problems she encounters, whether it is wrangling data, building and testing a model or deploying the model on the cloud. This mindset is her strategy potential, the think tank within, which enables her to make the tough calls she often needs to make for the data science projects to move forward.
Specific aspects of the Data Science mindset
Of course, the data science mindset is more than a general thing. It involves specific components, such as specialized know-how, tools that are compatible with each other and relevant to the task at hand, a deep understanding of the methodologies used in data science work, problem-solving skills, and most importantly, communication abilities. The latter involves both the data scientist expressing himself clearly and his understanding of what the stakeholders need and expect of him. Naturally, the data science mindset also includes organizational skills (project management), the ability to work well with other professionals (even those not directly related to data science), and the ability to come up with creative approaches to the problem at hand.
The Data Science process
The data science process/pipeline is a distillation of data science work in a comprehensible manner. It's particularly useful for understanding the various stages of a data science project and helps you plan accordingly. You can view one version of it in Fig. 1 below. If the data science mindset is one's ability to navigate the data science landscape, the data science process is a map of that landscape. It's not 100% accurate but good enough to help you gain perspective if you feel overwhelmed or need to get a better grip on the bigger picture.
Learning more about the topic
Naturally, it's impossible to exhaust this topic in a single article (or even a series of articles). The material I've gathered on it can fill a book! If you are interested in such a book, feel free to check out the one I put together a few years back; it's called Data Science Mindset, Methodologies, and Misconceptions and it's geared towards data scientists, data science learners, and people involved in data science work in some way (e.g. project leaders or data analysts). Check it out when you have a moment. Cheers!