5 Key Requirements of Big Data Projects

May 13, 2016

Thor Olavsrud recently wrote in CIO.com, “Successful big data projects have five key requirements, says Amy Gaskins, a data scientist with more than a decade of experience designing and implementing data and intelligence projects for the private sector, government agencies and the U.S. military. In her keynote presentation at the Apache: Big Data North America conference in Vancouver on Monday, Gaskins stressed that five factors can make or break big data projects.”

Spotlight

Tembo Inc.

Smart analytics. Engaging designs. Better decisions. Tembo is a leading provider of data-related products and services in K-12 education. We give state and local education agencies the tools they need to make more informed choices. We have particular expertise in data and reporting on accountability systems, school choice, assessments, and school quality & equity.

OTHER ARTICLES

Rethinking and Recontextualizing Context(s) in Natural Language Processing

Article | June 10, 2021

We discursive creatures are construed within a meaningful, bounded communicative environment, namely context(s), and not in a vacuum. Context(s) co-occur in different scenarios, that is, in mundane talk as well as in academic discourse, where the goal of natural language communication is mutual intelligibility and hence the negotiation of meaning. Discursive research focuses on the context-sensitive use of the linguistic code and its social practice in particular settings, such as medical talk, courtroom interactions, financial/economic and political discourse, which may restrict its validity when ascribing to a theoretical framework and its propositions regarding its application. This is also reflected in artificial intelligence approaches to context(s), such as the development of context-sensitive parsers, context-sensitive translation machines and context-sensitive information systems, where the validity of an argument and its propositions is at stake. Context is at the heart of pragmatics or, better said, context is the anchor of any pragmatic theory: sociopragmatics, discourse analysis and ethnomethodological conversation analysis. Academic disciplines such as linguistics, philosophy, anthropology, psychology and literary theory have also studied various aspects of context phenomena. Yet the concept of context has remained fuzzy or is generally left undefined. It seems that the denotation of the word [context] has become murkier as its uses have been extended in many directions. Context and/or contexts? In order to be “felicitously” integrated into the pragmatic construct, the definition of context needs some delimitation.
Depending on the frame of research, context is delimited to the global surroundings of the phenomenon to be investigated: if its surrounding is of an extra-linguistic nature, it is called the socio-cultural context; if it comprises features of a speech situation, it is called the linguistic context; and if it refers to cognitive material, that is, a mental representation, it is called the cognitive context. Context is a transcendental notion which plays a key role in interpretation. Language is no longer considered as a set of decontextualized sentences. Instead, language is seen as embedded in larger activities, through which it becomes meaningful. In a dynamic outlook on communication, the acts of speaking (which generate a form of discourse, for instance conversational discourse, a lecture or a speech) and interpreting build contexts and at the same time constrain the building of such contexts. In Heritage’s terminology, “the production of talk is doubly contextual” (Heritage 1984: 242). An utterance relies upon the existing context for its production and interpretation, and it is, in its own right, an event that shapes a new context for the action that will follow. A linguistic context can be decontextualized at a local level, and it can be recontextualized at a global level. There is intra-discursive recontextualization anchored to local decontextualization, and there is interdiscursive recontextualization anchored to global recontextualization. “A given context not only 'legislates' the interpretation of indexical elements; indexical elements can also mold the background of the context” (Ochs, 1990). In the case of recontextualization, in a particular scenario, it is valid to ask “what do you mean?” or “how do you mean?”. Making a reference to context and a reference to meaning helps to clarify matters when there is a controversy about the communicative status, and at the same time provides a frame for the recontextualization.
A linguistic context is intrinsically linked to a social context and to a subcategory of the latter, the socio-cultural context. The social context can be considered as unmarked, hence a default context, whereas a socio-cultural context can be conceived as a marked type of context in which specific variables are interpreted in a particular mode. Culture provides us, the participants, with a filter mechanism which allows us to interpret a social context in accordance with particular socio-cultural constraints and requirements. Besides, the socially constitutive qualities of context are unavoidable, since each interaction updates the existing context and prepares new ground for subsequent interaction. Now, how are these conceptualizations and views reflected in NLP? Most of the research work has focused on the linguistic context, that is, on the word-level surroundings and the lexical meaning. One approach produces sense embeddings for the lexical meanings within a lexical knowledge base, such that they lie in a space comparable to that of contextualized word vectors. Contextualized word embeddings have been used effectively across several tasks in Natural Language Processing, as they have proved to carry useful semantic information. The task of associating a word in context with the most suitable meaning from a predefined sense inventory is better known as Word Sense Disambiguation (Navigli, 2009). Linguistically speaking, “context encompasses the total linguistic and non-linguistic background of a text” (Crystal, 1991). Notice that the nature of context(s) is clearly crucial when reconstructing the meaning of a text. Therefore, “meaning-in-context should be regarded as a probabilistic weighting, of the list of potential meanings available to the user of the language.” The so-called disambiguating role of context should be taken with a pinch of salt.
The main reason language models such as BERT (Devlin et al., 2019), RoBERTa (Liu et al., 2019) and SBERT (Reimers, 2019) have proved beneficial in most NLP tasks is that contextualized embeddings of words encode the semantics defined by their input context. In the same vein, a novel method for contextualized sense representations has recently been employed: SensEmBERT (Scarlini et al., 2020), which computes sense representations that can be applied directly to disambiguation. Still, there is a long way to go regarding research on context(s). The linguistic context is just one of the necessary conditions for sentence embeddedness in “a” context. For interpretation to take place, well-formed sentences and well-formed constructions are needed, that is, linguistic strings which must be grammatical but may be constrained by cognitive sentence-processability and pragmatic relevance, together with the particular linguistic-context and social-context configurations that make their production and interpretation meaningful.
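The idea of meaning-in-context as a probabilistic weighting over candidate senses can be illustrated with a small sketch. This is not how SensEmBERT or any of the cited models is implemented; it only assumes that a context embedding and one embedding per sense already exist in the same vector space. The three-dimensional vectors, the sense labels and the `temperature` parameter below are made up for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def sense_weights(context_vec, sense_vecs, temperature=0.1):
    """Turn context-to-sense similarities into a probability weighting (softmax)."""
    scores = {s: cosine(context_vec, v) / temperature for s, v in sense_vecs.items()}
    top = max(scores.values())  # subtract the maximum for numerical stability
    exps = {s: math.exp(x - top) for s, x in scores.items()}
    total = sum(exps.values())
    return {s: e / total for s, e in exps.items()}

# Hypothetical toy sense inventory for the word "bank" (vectors are invented).
SENSES = {
    "bank (financial institution)": [0.9, 0.1, 0.0],
    "bank (river side)": [0.1, 0.9, 0.2],
}

# Hypothetical embedding of "bank" in a sentence about depositing money.
context = [0.8, 0.2, 0.1]

weights = sense_weights(context, SENSES)
best = max(weights, key=weights.get)
print(best, weights)
```

The output is a weighting over all senses rather than a single hard label, which matches the caution above: context shifts the probabilities, but it does not always settle the meaning outright.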


Will Quantum Computers Make Supercomputers Obsolete in the Field of High Performance Computing?

Article | May 12, 2021

If you want an explicit answer without the extra details, here it is: yes, there is a possibility that quantum computers can replace supercomputers in the field of high performance computing, under certain conditions. If you want to know how and why this scenario is possible and what those conditions are, I’d encourage you to read the rest of this article. To start, we will run through some very simple definitions.

Definitions

If you work in the IT sector, you have probably heard the terms ‘high performance computing’, ‘supercomputers’ and ‘quantum computers’ many times. These words are thrown around quite often nowadays, especially in the area of data science and artificial intelligence. Perhaps you have deduced their meanings from context, but you may not have had the opportunity to sit down and research what they are and why they are used. It is therefore a good idea to go through their definitions, so that you have a better understanding of each concept.

• High Performance Computing: It is the process of carrying out complex calculations and computations on data at a very high speed. It is much faster than regular computing.
• Supercomputer: It is a type of computer that is used to efficiently perform powerful and quick computations.
• Quantum Computer: It is a type of computer that makes use of quantum-mechanical concepts like entanglement and superposition in order to carry out powerful computations.

Now that you’ve got the gist of these concepts, let’s dive in a little more to get a wider view of how they are implemented throughout the world.

Background

High performance computing is a thriving area of information technology, and rightly so, due to the rapid surge in the amount of data that is produced, stored, and processed every second. Over the last few decades, data has become increasingly significant to large corporations, small businesses, and individuals, as a result of its tremendous potential for growth and profit. By properly analysing data, it is possible to make beneficial predictions and determine optimal strategies.

The challenge is that huge amounts of data are being generated every day. If traditional computers were used to manage and process all of this data, the results would take an unreasonably long time to produce. Massive amounts of time, computational power, and money would also be required to carry out such computations.

Supercomputers were therefore introduced to tackle this issue. These computers facilitate the computation of huge quantities of data at much higher speeds than a regular computer. They are a great investment for businesses that need data to be processed often and in large amounts at a time. The main advantage of supercomputers is that they can do what regular computers do, but much more quickly and efficiently. They have an overall high level of performance. To date, they have been applied in the following domains:

• Nuclear Weapon Design
• Cryptography
• Medical Diagnosis
• Weather Forecasting
• Online Gaming
• Study of Subatomic Particles
• Tackling the COVID-19 Pandemic

Quantum computers, on the other hand, function on a completely different principle. Unlike regular computers, which use bits as the smallest units of data, quantum computers generate and manipulate ‘qubits’ or ‘quantum bits’, which are typically realised with subatomic particles like electrons or photons. These qubits have two interesting quantum properties which allow them to powerfully compute data:

• Superposition: Qubits, like regular computer bits, can be in a state of 1 or 0. However, they also have the ability to be in both states of 1 and 0 simultaneously. This combined state allows quantum computers to calculate a large number of possible outcomes, all at once. When the final outcome is determined, the qubits fall back into a state of either 1 or 0. This property is called superposition.
• Entanglement: Pairs of qubits can exist in such a way that the two members of a pair share a single quantum state. In such a situation, changing the state of one of the qubits can instantly change the state of the other qubit. This property is called entanglement.

Their most promising applications so far include:

• Cybersecurity
• Cryptography
• Drug Designing
• Financial Modelling
• Weather Forecasting
• Artificial Intelligence
• Workforce Management

Despite their distinct features, both supercomputers and quantum computers are immensely capable of providing users with strong computing facilities. The question is, how do we know which type of system would be the best for high performance computing?

A Comparison

High performance computing requires robust machines that can deal with large amounts of data. This involves the collection, storage, manipulation, computation, and exchange of data in order to derive insights that are beneficial to the user. Supercomputers have successfully been used so far for such operations.

When the concept of a quantum computer first came about, it caused quite a revolution within the scientific community. People recognised its innumerable and widespread abilities, and began working on ways to convert this theoretical innovation into a realistic breakthrough.

What makes a quantum computer so different from a supercomputer? Let’s have a look at Table 1.1 below.

From the table, we can draw the following conclusions about supercomputers and quantum computers:

1. Supercomputers have been around for longer, and are therefore more advanced. Quantum computers are relatively new and still require a great depth of research to sufficiently comprehend their working and develop a sustainable system.
2. Supercomputers are easier to provide inputs to, while quantum computers need a different input mechanism.
3. Supercomputers are fast, but quantum computers are much faster.
4. Supercomputers and quantum computers have some similar applications.
5. Quantum computers can be perceived as extremely powerful and highly advanced supercomputers.

Thus, we find that while supercomputers surpass quantum computers in terms of development and span of existence, quantum computers are comparatively much better in terms of capability and performance.

The Verdict

We have seen what supercomputers and quantum computers are, and how they can be applied in real-world scenarios, particularly in the field of high performance computing. We have also gone through their differences and made significant observations in this regard.

We find that supercomputers have worked well so far, and they continue to serve researchers, organisations, and individuals who require intense computational power for the quick processing of enormous amounts of data. Quantum computers, however, have the potential to perform much better and to provide faster and more adequate results.

Thus, quantum computers can potentially make supercomputers obsolete, especially in the field of high performance computing, if and only if researchers are able to come up with a way to make the development, deployment, and maintenance of these computers scalable, feasible, and optimal for consumers.
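The superposition and entanglement properties described in the Background section can be made a little more concrete with a toy state-vector calculation. The sketch below is an ordinary classical simulation in plain Python, not code for real quantum hardware, and the function names (`apply_hadamard`, `apply_cnot`, `probabilities`) are illustrative choices rather than part of any quantum library. It assumes the standard textbook Hadamard and CNOT gates, with qubit 0 as the leftmost bit of each basis label.

```python
import math

def apply_hadamard(state, target, n_qubits):
    """Apply a Hadamard gate to qubit `target` of an n-qubit state vector."""
    h = 1 / math.sqrt(2)
    bit = 1 << (n_qubits - 1 - target)  # mask selecting the target qubit
    new = [0.0] * len(state)
    for i, amp in enumerate(state):
        zero_index = i & ~bit  # same basis state with the target bit cleared
        one_index = i | bit    # same basis state with the target bit set
        new[zero_index] += h * amp
        # |1> picks up a minus sign on its |1> component under Hadamard.
        new[one_index] += -h * amp if i & bit else h * amp
    return new

def apply_cnot(state, control, target, n_qubits):
    """Flip the target qubit in every basis state where the control qubit is 1."""
    cbit = 1 << (n_qubits - 1 - control)
    tbit = 1 << (n_qubits - 1 - target)
    new = [0.0] * len(state)
    for i, amp in enumerate(state):
        j = i ^ tbit if i & cbit else i
        new[j] += amp
    return new

def probabilities(state):
    """Measurement probability of each basis state."""
    return [abs(a) ** 2 for a in state]

# Superposition: one qubit starting in |0> becomes an equal mix of |0> and |1>.
plus = apply_hadamard([1.0, 0.0], 0, 1)

# Entanglement: Hadamard on qubit 0, then CNOT, gives the Bell state (|00> + |11>)/sqrt(2).
bell = apply_cnot(apply_hadamard([1.0, 0.0, 0.0, 0.0], 0, 2), 0, 1, 2)

print(probabilities(plus))  # weights for |0>, |1>
print(probabilities(bell))  # weights for |00>, |01>, |10>, |11>
```

In the Bell state only the outcomes 00 and 11 carry any probability, so measuring one qubit tells you the result for the other. Note also that the simulated vector doubles in length with every added qubit, which is one reason classical machines struggle to imitate large quantum systems.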


Soft Skills in Data Science

Article | April 29, 2021

We live in a world convulsed by new technologies, and we are witnessing how more and more processes are automated so that they can be executed with the same skill as a human, or with even better results, all in order to be more efficient and effective. In this context the world of work is becoming increasingly competitive, because to remain employable we need to learn to manage new technologies or find a way to adapt our knowledge and skills to them. With the spread of e-learning platforms and the tutorials available on the internet, acquiring new knowledge is within everyone's reach. For this reason, it is necessary to differentiate ourselves in order to stand out from other professionals who have hard skills similar to ours, and this is precisely where Soft Skills play a very important role.

What are Soft Skills?

Soft skills are a combination of individual social skills, communication skills, personality traits, attitudes, social intelligence and emotional intelligence, which facilitate relationships with others and make us more effective when interacting with other people. We could say that Soft Skills are the human interface that allows us to adapt to different working environments and industries. They are powerful tools for personal and professional growth.

Why are Soft Skills key in our professional growth?

Nowadays, standing out in the world of work is getting increasingly difficult, regardless of whether you are part of a corporation or work independently, due to the great competition within the labor market. That is why we must develop certain skills and attitudes that help us to function properly and successfully meet professional demands. Soft Skills are the point of differentiation that allows us to be selected for a position.

The reason is very simple: we could be applying for a position and competing with people who are equally or even more qualified than us at a technical level, but achieving the collaborative objectives of the company requires more than just the technical and rational part. The way of communicating, values, ethics, and personality traits are also highly valued factors, since they help to drive organizations through high-performance teams, guaranteeing the achievement of their objectives.

The background of Soft Skills that we have trained throughout our lives makes us unique, because it is unlikely that two people have the same combination of Soft Skills and have trained them in a similar way. That makes us more competitive for job opportunities where many candidates may have the same Hard Skills, but where our Soft Skills will be the ones that make us stand out and keep advancing in our professional career.

How to sharpen our Soft Skills?

To perform in any job we necessarily need to interact with other people, even if we work independently or remotely, so we must have the skills that allow us to connect successfully with our teammates and stakeholders. Starting from the fact that Soft Skills are human skills, we can say that we have them pre-installed, and the way to start using them (installing them) is through the experiences we undergo every day.

Imagine being able to communicate assertively in your work environment and in your personal life. Imagine mastering the tools installed in you to improve your interpersonal relationships within your work teams and reduce conflict. This would allow you to foster a healthy working environment and to lead any team in any environment in a strategic and effective way.

Think of Soft Skills as a set of Apps that are ready to be used (like a toolbox). Depending on the experiences that arise in our personal and/or professional lives, we choose which of these applications to use to achieve our goals. Every time we open one of these applications, we give it the opportunity to collect data that allows it to personalize its insights according to our needs and to fine-tune its effectiveness each time we use it.

One of the best ways to train our Soft Skills is by leaving our comfort zone, because that will allow us to 'install' more and more Soft Skills. Another way to refine our Soft Skills is by participating in activities that involve people we do not know, and even better if we involve people from other cultures, because the exchange of experiences and knowledge benefits both parties and makes the training of our Soft Skills even more valuable.

Some examples of activities that will enhance your Soft Skills:

• Participate in competitions (e.g. Hackathons).
• Found or lead a community that shares your interests and organizes small or large projects.
• Organize a study group aimed at carrying out a technical or business project in order to bring together professionals from various fields or industries.
• Find resources and experts to help you. There are Soft Skills trainers who know useful techniques and tips to develop and sharpen your skills.
• Participate in volunteer activities. You will meet new people with whom to put your Soft Skills into action.

These activities will train and sharpen your leadership, teamwork, delegation, interpersonal communication, persuasion and other skills. These are skills that we find harder to train while we are students or when we have just started working after finishing our studies, and that are required in the labor market to keep climbing in our professional career.

Why do Soft Skills matter in the Data Science universe?

A consequence of the use of Artificial Intelligence and Data Science is that many of the jobs we know today will be automated, and this is a matter of concern for many professionals who see their careers in danger. The good news is that in many of the new jobs of the future, Soft Skills will be the main protagonists; this is what John Thompson explains in his book "Building Analytics Teams". In other words, it is precisely our human skills that will make us more employable in the future. They will be highly requested skills because, according to what the experts envision, machines will not be able to match us in this field, and that is why training our Soft Skills becomes a priority: they will allow us to be the key players of the future.

On the other hand, Data Science is an interdisciplinary field where Soft Skills such as cooperation and communication are essential to achieve the goals set. Denis Rothman, author of the book "Transformers for Natural Language Processing", mentioned in an interview that I conducted that human quality is the most important thing for him when choosing the members of his work team.

These are some considerations to take into account to generate a culture of cooperation:

• People work harder and need less supervision when they themselves control their work and have more freedom to choose how to do it. When they work as a team, they show greater motivation, their sense of pride increases and productivity reaches higher levels.
• Solid teams that seek quality and excellence correct themselves; that is, they identify problems and correct them very quickly. Thus, they gain work experience and increase their performance.
• Forming a solid and efficient work team requires patience. You need to give the team time before you see results. Its members will have to establish procedures to complete tasks, handle administrative functions and work together efficiently; they will even have to adapt to their own decisions and accept their consequences.
• A manager or team leader must recognize the team building process without expecting immediate results. The group will have to go through a learning process, and this will take longer in some groups than in others.

Another key component in achieving high levels of cooperation is fluid communication among team members and stakeholders. For instance, defining the communication channels and the contact points in the different teams involved guarantees a constant flow of communication during the life cycle of a Data Science project.

One of the most critical moments is the presentation of the results to the stakeholders. In some cases the results of a project are not taken into consideration, not so much because the expected results were not achieved, but because the way in which they are presented is not meaningful for the stakeholders. In most cases this is due to communication barriers that arise from using terminology that is common in the technical world but not in the business world.

After taking a tour of the world of Soft Skills, we can conclude by saying that Soft Skills are like superpowers that are waiting for the opportunity to be put into action, to make you a superhero or superheroine. Whether you keep climbing in your professional career depends on you: on how much you use these superpowers, but above all on your ability to refine them and make them available to the work team of which you are part. Don't wait any longer and start discovering your potential: start training your Soft Skills!

If you want to know more about Soft Skills, I invite you to visit The Soft Skills Show.


Here’s How Analytics are Transforming the Marketing Industry

Article | July 13, 2021

When it comes to marketing today, big data analytics has become a powerful force. It is the raw material marketers need to make sense of the information they are presented with, so they can do their jobs with accuracy and excellence. Big data is what empowers marketers to understand their customers based on any online action they take. Thanks to the boom of big data, marketers have learned more about new marketing trends and about the preferences and behaviors of the consumer. For example, marketers know everything from what their customers are streaming to what groceries they are ordering, thanks to big data.

Data is readily available in abundance due to digital technology. Data is created through mobile phones, social media, digital ads, weblogs, electronic devices, and sensors connected through the internet of things (IoT). Data analytics helps organizations discover newer markets, learn how new customers interact with online ads, and draw conclusions about the effects of new strategies.

Newer, more sophisticated marketing analytics software and analytics tools are now being used to determine consumers’ buying patterns and the key influencers in decision-making, and to validate the data marketing approaches that yield the best results. With the integration of product management with data science, real-time data capture, and analytics, big data analytics is helping companies increase sales and improve the customer experience.

In this article, we will examine how big data analytics are transforming the marketing industry.

Personalized Marketing

Personalized marketing has taken an essential place in direct marketing to consumers. Greeting consumers by their first name whenever they visit the website, sending them promotional emails about their favorite products, or notifying them of personalized recipes based on their grocery shopping are some examples of data-driven marketing.

When marketers collect critical pieces of data about customers at different marketing touchpoints, such as their interests, their name, what they like to listen to, what they order most, what they’d like to hear about, and who they want to hear from, they can plan their campaigns strategically. Marketers aim for churn prevention and for onboarding new customers. The insights gained from customers’ marketing touchpoints can be used to improve acquisition rates, drive brand loyalty, increase revenue per customer, and improve the effectiveness of products and services.

With these data marketing touchpoints, marketers can build an ideal customer profile. Furthermore, these customer profiles can help them strategize and execute personalized campaigns accordingly.

Predictive Analytics

Customer behavior can be traced through historical data, which is the best way to predict how customers will behave in the future. It allows companies to correctly predict which customers are interested in their products at the right time and place. Predictive analytics applies data mining, statistical techniques, machine learning, and artificial intelligence to analyze data and predict the customer’s future behavior and activities.

Take the example of an online grocery store. If a customer tends to buy healthy and sugar-free snacks from the store now, they are likely to keep buying them in the future too. Analytics tools make it easy for brands to capitalize on this predictable behavior. They can automate their sales and target the said customer. In doing so, they give the customer chances to make “repeat purchases” based on their predicted behavior. Marketers can also suggest products related to those repeat purchases to get customers on board with new products.

Customer Segmentation

Customer segmentation means dividing your customers into strata to identify a specific pattern. For example, customers from a particular city may buy your products more than others, or customers from a certain age demographic may prefer some products more than other age demographics do.

Specific marketing analytics software can help you segment your audience. For example, you can gather data like specific interests, how many times customers have visited a place, unique preferences, and demographics such as age, gender, work, and home location.

These insights are a golden opportunity for marketers to create bold campaigns that optimize their return on investment. They can cluster customers into specific groups and target these segments with highly relevant data marketing campaigns.

The main goal of customer segmentation is to identify any interesting information that can help marketers increase revenue and meet their goals. Effective customer segmentation can help marketers with:

• Identifying most profitable and least profitable customers
• Building loyal relationships
• Predicting customer patterns
• Pricing products accordingly
• Developing products based on their interests

Businesses continue to invest in collecting high-quality data for accurate customer segmentation, which results in successful efforts.

Optimized Ad Campaigns

Customers’ social media data from platforms like Facebook, LinkedIn, and Twitter makes it easier for marketers to create customized ad campaigns on a larger scale. This means that they can create specific ad campaigns for particular groups and execute them successfully.

Big data also makes it easier for marketers to run ‘remarketing’ campaigns. Remarketing ads follow your customers online, wherever they browse, once they have visited your website.

The execution of an online ad campaign makes all the difference in its success. Chasing customers with paid ads can work as an effective strategy if executed well. According to the Rule of 7, prospective customers need to be exposed to an ad a minimum of seven times before they act on it.

When creating online ad campaigns, do keep one thing in mind. Your customers should not feel as if they are being stalked when you run remarketing campaigns. Space out your ads and their exposure, so they appear naturally rather than coming across as pushy.

Consumer Impact

Advancements in data science have vastly impacted consumers. Every move they make online is saved and measured. In addition, websites now use cookies to store consumer data, so whenever these consumers visit these websites, product lists based on their shopping habits pop up on the site.

Search engines and social media data enhance this. This data can be used to analyze consumers' behavior patterns and market to them accordingly. The information gained from search engines and social media can be used to influence consumers into staying loyal, which in turn benefits the business.

These implications can be frightening for consumers, such as seeing personalized ads crop up on their Facebook page or search engine. When consumer data is so openly available to marketers, they need to use it wisely and safeguard it from falling into the wrong hands. Fortunately, businesses are taking note and making sure that this information remains secure.

Conclusion

Thanks to big data and analytics, the future of marketing seems bright. Businesses are collecting high-quality data in real time and analyzing it with the help of machine learning and AI, and the marketing world seems to be in for massive changes. Analytics are taking the marketing industry to a different level. And with sophisticated marketers behind the wheel, the sky is the only limit.

Frequently Asked Questions

Why is marketing analytics so important these days?

Marketing analytics helps us see how everything plays off each other, and decide how we might want to invest moving forward. Re-prioritizing how you spend your time, how you build out your team, and the resources you invest in channels and efforts are critical steps to achieving marketing team success.

What is the use of marketing analytics?

Marketing analytics is used to measure how well your marketing efforts are performing and to determine what can be done differently to get better results across marketing channels.

Which companies use marketing analytics?

Marketing analytics enables you to improve your overall marketing program performance by identifying channel deficiencies, adjusting strategies and tactics as needed, optimizing processes, etc. Companies like Netflix, Sephora, EasyJet, and Spotify use marketing analytics to improve their marketing performance as well.
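The customer segmentation idea from the article, grouping customers by attributes such as city and age and then comparing the groups, can be sketched in a few lines. The customer records, the age bands and the helper names below are all invented for illustration; a real marketing analytics tool would work with far richer data and possibly a clustering algorithm instead of fixed rules.

```python
from collections import defaultdict

# Hypothetical customer records (made-up data, not from any real system).
CUSTOMERS = [
    {"id": 1, "city": "Austin", "age": 24, "spend": 120.0},
    {"id": 2, "city": "Austin", "age": 27, "spend": 80.0},
    {"id": 3, "city": "Austin", "age": 41, "spend": 300.0},
    {"id": 4, "city": "Boston", "age": 35, "spend": 150.0},
    {"id": 5, "city": "Boston", "age": 38, "spend": 250.0},
    {"id": 6, "city": "Boston", "age": 62, "spend": 60.0},
]

def age_band(age):
    """Map an age to a coarse demographic band (the cut-offs are an assumption)."""
    if age < 30:
        return "under 30"
    if age < 50:
        return "30-49"
    return "50+"

def segment(customers):
    """Group customers by (city, age band) and summarize spend per segment."""
    groups = defaultdict(list)
    for customer in customers:
        key = (customer["city"], age_band(customer["age"]))
        groups[key].append(customer["spend"])
    return {
        key: {"customers": len(spends), "avg_spend": sum(spends) / len(spends)}
        for key, spends in groups.items()
    }

segments = segment(CUSTOMERS)

# Rank segments from most to least profitable by average spend.
ranked = sorted(segments.items(), key=lambda item: item[1]["avg_spend"], reverse=True)
most_profitable = ranked[0][0]
least_profitable = ranked[-1][0]

for key, stats in ranked:
    print(key, stats)
```

Ranking the segments this way corresponds to the first item in the article's list, identifying the most and least profitable customers, and the resulting groups could then be targeted with separate campaigns.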

