Article | July 13, 2021
All business functions whether it is finance, marketing, procurement, or others find using data and analytics to drive success an imperative for today. They want to make informed decisions and be able to predict trends that are based on trusted data and insights from the business, operations, and customers. The criticality of delivering these capabilities was emphasised in a recent report, “The Importance of Unified Data and Analytics, Why and How Preintegrated Data and Analytics Solutions Drive Busines Success,” from Forrester Consulting. For approximately two-thirds of the global data warehouse and analytics strategy decision-makers surveyed in the research, their key data and analytics priorities are:
Article | September 2, 2021
Massive amount of data is collected and stored by companies in the search for the “Holy Grail”. One crucial component is the discovery and application of novel approaches to achieve a more complete picture of datasets provided by the local (sometimes global) event-based analytic strategy that currently dominates a specific field.
Bringing qualitative data to life is essential since it provides management decisions’ context and nuance. An NLP perspective for uncovering word-based themes across documents will facilitate the exploration and exploitation of qualitative data which are often hard to “identify” in a global setting. NLP can be used to perform different analysis mapping drivers.
Broadly speaking, drivers are factors that cause change and affect institutions, policies and management decision making. Being more precise, a “driver” is a force that has a material impact on a specific activity or an entity, which is contextually dependent, and which affects the financial market at a specific time. (Litterio, 2018). Major drivers often lie outside the immediate institutional environment such as elections or regional upheavals, or non-institutional factors such as Covid or climate change. In Total global strategy: Managing for worldwide competitive advantage, Yip (1992) develops a framework based on a set of four industry globalization drivers, which highlights the conditions for a company to become more global but also reflecting differentials in a competitive environment. In The lexicons: NLP in the design of Market Drivers Lexicon in Spanish, I have proposed a categorization into micro, macro drivers and temporality and a distinction among social, political, economic and technological drivers. Considering the “big picture”, “digging” beyond usual sectors and timeframes is key in state-of-the-art findings.
Working with qualitative data.
There is certainly not a unique “recipe” when applying NLP strategies. Different pipelines could be used to analyse any sort of textual data, from social media and reviews to focus group notes, blog comments and transcripts to name just a few when a MetaQuant team is looking for drivers.
Generally, being textual data the source, it is preferable to avoid manual task on the part of the analyst, though sometimes, depending on the domain, content, cultural variables, etc. it might be required. If qualitative data is the core, then the preferred format is .csv. because of its plain nature which typically handle written responses better. Once the data has been collected and exported, the next step is to do some pre-processing. The basics include normalisation, morphosyntactic analysis, sentence structural analysis, tokenization, lexicalization, contextualization. Just simplify the data to make analysis easier.
Topic modelling refers to the task of recognizing words from the main topics that best describe a document or the corpus of data. LAD (Latent Dirichlet Allocation) is one of the most powerful algorithms with excellent implementations in the Python’s Gensim package.
The challenge: how to extract good quality of topics that are clear and meaningful. Of course, this depends mostly on the nature of text pre-processing and the strategy of finding the optimal number of topics, the creation of a lexicon(s) and the corpora. We can say that a topic is defined or construed around the most representative keywords. But are keywords enough? Well, there are some other factors to be observed such as:
1. The variety of topics included in the corpora.
2. The choice of topic modelling algorithm.
3. The number of topics fed to the algorithm.
4. The algorithms tuning parameters.
As you probably have noticed finding “the needle in the haystack” is not that easy. And only those who can use creatively NLP will have the advantage of positioning for global success.
Article | March 31, 2020
The analysis of a large volume of data is already an indispensable part of the decision-making process for any business, regardless of its volume. Big data is used to resolve routine problems, such as improving the conversion rate or to achieve customer loyalty for an eCommerce business. But did you know that you can also use it to predict situations before they occur? This is the added value of predictive analytics, the use of big data to anticipate user behaviour based on historical data and act accordingly to optimise sales.For online businesses, periodically performing predictive analytics is synonymous with improving your understanding of the customer and identifying changes in the market before they happen. The predictive models extract patterns from historical and transactional data to identify risks and opportunities. Self-learning software will automatically analyse the data at hand and offer solutions for future problems. This will allow you to design new sales strategies to adapt to changes and boost profit growth.
Article | September 7, 2021
Data has settled into regular business practices. Executives in every industry are looking for ways to optimize processes through the implementation of data. Doing business without analytics is just shooting yourself in the foot.
Yet, global business efforts to embrace data-transformation haven't had resounding success. There are many reasons for the challenging course, however, people and process management has been cited as the common thread.
A combination of people touting data as the “new oil” and everyone scrambling to obtain business intelligence has led to information being considered an end in itself. While the idea of becoming a data-driven organization is extremely beneficial, the execution is often lacking. In some areas of business, action over strategy can bring tremendous results.
However, in data governance such an approach often results in a hectic period of implementations, new processes, and uncoordinated decision-making. What I propose is to proceed with a good strategy and sound data governance principles in mind.
Auditing data for quality
Within a data governance framework, information turns into an asset. Proper data governance is essentially informational accounting. There are numerous rules, regulations, and guidelines to make governance ensure quality.
While boiling down the process into one concept would be reductionist, by far the most important topic in all information management and governance is data quality. Data quality can be loosely defined as the degree to which data is accurate, complete, timely, consistent, adherent to rules and requirements, and relevant.
Generally, knowledge workers (i.e. those who are heavily involved in data) have an intuitive grasp of when data quality is lacking. However, pinpointing the problem should be the goal. Only if the root cause, which is generally behavioral or process-based rather than technical, of the issue is discovered can the problem be resolved.
Lack of consistent data quality assurance leads to the same result with varying degrees of terribleness - decision making based on inaccurate information. For example, mismanaging company inventory is most often due to lack of data quality. Absence of data governance is all cost and no benefit. In the coming years, the threat of a lack of quality assurance will only increase as more businesses try to take advantage of data of any kind.
Luckily, data governance is becoming a more well-known phenomenon. According to a survey we conducted with Censuswide, nearly 50% of companies in the financial sector have put data quality assurement as part of their overall data strategy for the coming year.
Data governance prerequisites
Information management used to be thought of as an enterprise-level practice. While that still rings true in many cases today, overall data load within companies has significantly risen in the past few years. With the proliferation of data-as-a-service companies and overall improvement in information acquisition, medium-size enterprises can now derive beneficial results from implementing data governance if they are within a data-heavy field.
However, data governance programs will differ according to several factors. Each of these will influence the complexity of the strategy:
Business model - the type of organization, its hierarchy, industry, and daily activities.
Content - the volume, type (e.g. internal and external data, general information, documents, etc.) and location of content being governed.
Federation - the extent and intensity of governance.
Smaller businesses will barely have to think about the business model as they will usually have only one. Multinational corporations, on other hand, might have several branches and arms of action, necessitating different data governance strategies for each.
However, the hardest prerequisite for data governance is proving its efficacy beforehand. Since the process itself deals with abstract concepts (e.g. data as an asset, procedural efficiency), often only platitudes of “improved performance” and “reduced operating costs” will be available as arguments. Regardless of the distinct data governance strategy implemented, the effects become visible much later down the line. Even then, for people who have an aversion to data, the effects might be nearly invisible.
Therefore, while improved business performance and efficiency is a direct result of proper data governance, making the case for implementing such a strategy is easiest through risk reduction. Proper management of data results in easier compliance with laws and regulations, reduced data breach risk, and better decision making due to more streamlined access to information.
“Why even bother?”
Data governance is difficult, messy, and, sometimes, brutal. After all, most bad data is created out of human behavior, not technical error. That means telling people they’re doing something wrong (through habit or semi-intentional action). Proving someone wrong, at times repeatedly, is bound to ruffle some feathers.
Going to a social war for data might seem like overkill. However, proper data governance prevents numerous invisible costs and opens up avenues for growth. Without it, there’s an increased likelihood of:
Costs associated with data. Lack of consistent quality control can lead to the derivation of unrealistic conclusions. Noticing these has costs as retracing steps and fixing the root cause takes a considerable amount of time. Not noticing these can cause invisible financial sinks.
Costs associated with opportunity. All data can deliver insight. However, messy, inaccurate, or low-quality data has its potential significantly reduced. Some insights may simply be invisible if a business can’t keep up with quality.
As data governance is associated with an improvement in nearly all aspects of the organization, its importance cannot be overstated. However, getting everyone on board and keeping them there throughout the implementation will be painful. Delivering carefully crafted cost-benefit and risk analyses of such a project will be the initial step in nearly all cases.
Luckily, an end goal to all data governance programs is to disappear. As long as the required practices and behaviors remain, data quality can be maintained. Eventually, no one will even notice they’re doing something they may have considered “out of the ordinary” previously.