The Future of Business Intelligence

August 4, 2016

Josh James, CEO of Domo and Co-Founder of Omniture, discusses turning his business venture into a high-growth company and shares some best practices for fellow entrepreneurs.

Spotlight

CrowdFlower Inc.

CrowdFlower is the essential human-in-the-loop platform for data science teams. CrowdFlower helps customers generate high-quality, customized training data for their machine learning initiatives or automate business processes with easy-to-deploy models and integrated human-in-the-loop workflows. The CrowdFlower platform supports a wide range of use cases, including self-driving cars, intelligent personal assistants, medical image labeling, content categorization, customer support ticket classification, social data insight, CRM data enrichment, product categorization, and search relevance.

OTHER ARTICLES

How the Coronavirus (COVID-19) Might Be Stopped by Data Science

Article | March 16, 2020

We know that data and analytics play a role in everyday products, from recommendations on what music we might like to hear to automated re-routing by our GPS systems. But how might the power of analytics be brought to bear on a disease that is currently threatening the health and economic welfare of people across the globe? If we rewind the clock to the 1850s, there are two significant examples of how early pioneers in data science made incredible impacts on the world, and they can provide some insight into what we might see happen next.

Read More

7 Data Storage Trends You Cannot Miss in a Data Center

Article | July 23, 2020

Contents
1. Introduction
2. Top Data Storage Trends That Simplify Data Management
2.1 AI Storage Continues to Be the Chief
2.2 Price Markdown in Flash Storage
2.3 Hybrid Multi-Cloud for the Win
2.4 Increased Significance of Software-Defined Storage
2.5 Non-Volatile Memory Express (NVMe) Beats Data Center Fabrics
2.6 Acceleration of Storage Class Memory
2.7 Hyperconverged Storage – A Push to Edge Computing
3. The Future of Data Storage

1. Introduction

There's more to data than just storing it. Organizations not only have to deal with a plethora of data, they are also expected to safeguard it. One of the primary ways enterprises keep up with continuous data expansion is through data storage units and applications. A recent study by Statista revealed that worldwide spending on data storage units is expected to exceed 78 billion U.S. dollars by 2021. Going by these storage stats, data is going to be amplified at a much faster rate, and companies have no choice but to gear up for a data boom while staying relevant. When it comes to data management and storage, information technology has risen to the occasion with concepts like machine learning. While the idea of such profound approaches is thrilling, the real question boils down to whether organizations are ready and equipped to handle them. The answer today may well be no. But can companies make changes and still thrive? Most definitely, yes. To make this concrete, here is a list of trends that companies should adopt to make data storage easier and more secure.

2. Top Data Storage Trends That Simplify Data Management

Data corruption is one big issue most companies face, and the complications that unfold from corrupted data are even harder to resolve. To fix this and other data storage problems, the industry has developed trends that are resilient and flexible. These trends have the capability of making history in the world of technology, so gear up to learn and adapt to them.

2.1 AI Storage Continues to Be the Chief

The speed with which AI hit the IT world hasn't slowed even after all these years. Among all the concepts that have been introduced, artificial intelligence is the applied science that has driven the most innovation. AI is now making the enterprise data storage process easier through its subsets, machine learning and deep learning. The technology helps companies accumulate multiple layers of data in a more assorted format, and it automates IT storage tasks such as data migration, archiving, and protection. With AI, companies can control data storage across multiple locations and storage platforms.

2.2 Price Markdown in Flash Storage

As per a report by MarketsandMarkets, the all-flash array market was valued at USD 5.9 billion in 2018 and is expected to reach USD 17.8 billion by 2023, a CAGR of 24.53% over that period. This growth indicates that the need for all-flash storage is only going to broaden. Flash storage has long been an option most companies avoided, mainly because of the price; with the trend toward flexible data storage, it is now offered at a much lower price.
The drop in the cost of this storage technology will finally enable businesses of all sizes to invest in this high-performance solution.

2.3 Hybrid Multi-Cloud for the Win

With data growing every minute, a plain "cloud" strategy will not be enough. In this wave of data storage services, hybrid multi-cloud is the concept helping to manage off-premises data. With it, IT teams can collect, segregate, and store on-premises and off-premises data in a much more sophisticated manner. This enables central management while reducing the effort of data storage by automating policy-based data placement across a mix of multi-cloud and storage types.

2.4 Increased Significance of Software-Defined Storage

The more data there is, the less companies want to rely on hardware devices alone, and that worry can easily become reality. Hence, one addition companies can make to their cybersecurity strategy is adopting software-defined storage. This approach decouples data storage from the underlying physical storage hardware. It is built around policy-based management of resources, automated provisioning, and programmatic reassignment of storage capacity. Because these functions are automated, scaling data up and down is also faster. Some of the biggest advantages of this trend are the governance, data protection, and security it brings to the entire stack.
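As a toy illustration of the policy-based data placement that sections 2.3 and 2.4 describe, here is a minimal Python sketch; the tier names, thresholds, and fields are invented for this example and do not reflect any particular product:

```python
from dataclasses import dataclass

@dataclass
class DataSet:
    name: str
    size_gb: float
    days_since_access: int
    sensitive: bool

def place(ds: DataSet) -> str:
    """Return a storage tier under a toy placement policy: sensitive data
    stays on-premises, hot data goes to flash, cold data to cheap cloud."""
    if ds.sensitive:
        return "on_prem_encrypted"     # compliance: never leaves the premises
    if ds.days_since_access <= 7:
        return "flash_tier"            # hot data: low-latency flash
    if ds.days_since_access <= 90:
        return "private_cloud"         # warm data: mid-cost tier
    return "public_cloud_archive"      # cold data: cheapest capacity

if __name__ == "__main__":
    for ds in (
        DataSet("payroll", 20, days_since_access=1, sensitive=True),
        DataSet("web_logs", 500, days_since_access=3, sensitive=False),
        DataSet("2019_archive", 4000, days_since_access=400, sensitive=False),
    ):
        print(f"{ds.name} -> {place(ds)}")
```

In a real software-defined storage product, rules like these are evaluated continuously by the platform rather than hand-coded, but the shape of the decision is the same.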
2.5 Non-Volatile Memory Express (NVMe) Beats Data Center Fabrics

NVMe, as ornate as the name sounds, is a freshly introduced protocol aimed at making data storage simpler. Non-Volatile Memory Express enables access to high-speed storage media and has shown great results in the short time since its inception. NVMe not only increases the performance of existing applications but also enables new applications to process workloads in real time. This combination of high performance and low latency is the highlight of the protocol, and the trend has plenty of potential yet to be explored.

2.6 Acceleration of Storage Class Memory

Storage class memory combines the strengths of flash storage and NVMe, filling the gap between server storage and external storage. As data protection is one of the major concerns of enterprises, this upcoming trend not only protects data but also continually stores and organizes it for easier segregation. A clear advantage storage class memory has over flash and NVMe storage is that it provides memory-like, byte-addressable access to data, reducing the pile-up of irrelevant data. It also allows deeper integration of data, ensuring high performance and top-level data security.

2.7 Hyperconverged Storage – A Push to Edge Computing

The increased demand for hyperconverged storage is a result of the growth of hybrid cloud and software-defined infrastructure. Beyond these technologies, its suitability for retail settings and remote offices adds to its existing set of features. Its capability of capturing data from a distance also makes it cost-effective and scales down the need to store everything on a public cloud. Used to its full potential, hyperconverged storage can simplify IT operations and data storage for enterprises of all sizes.

3. The Future of Data Storage

According to Internet World Stats, more than 4.5 billion internet users around the world relentlessly create an astronomical amount of data. This propels companies to discover methods and applications that keep this data safe from harmful ransomware attacks while still using it productively to their advantage. One of the prime predictions about the future of data storage is that companies will have to adapt to rapid changes and mould their processes to enable quick, seamless storage of data. Another is that IT managers and responsible authorities will have to stay updated and proactive, knowing what data storage technology has been newly introduced and how it can be used to the company's advantage. Here's the thing: among all the research enterprises are conducting, not every data storage technology will end up a hit or fulfil the specification of high-speed storage. But looking at the efforts researchers are making, they are not going to stop anytime soon, and neither is the augmentation of data!

Read More

Man Vs. Machine: Peeking into the Future of Artificial Intelligence

Article | March 15, 2021

Stephen Hawking, one of the finest minds to have ever lived, once famously said, "AI is likely to be either the best or the worst thing to happen to humanity." This is of course true, with valid arguments both for and against the proliferation of AI. As a practitioner, I have witnessed the AI revolution at close quarters as it unfolded at a breathtaking pace over the last two decades. My personal view is that there is no clear black and white in this debate. The pros and cons are very contextual: who is developing it, for what application, in what timeframe, towards what end? It always helps to understand both sides of the debate, so let's take a closer look at what the naysayers say. The most common apprehensions can be grouped into three main categories:

A. Large-scale Unemployment: This is the most widely acknowledged of all the risks of AI. Technology and machines replacing humans for certain types of work isn't new. We all know about entire professions dwindling, and even disappearing, due to technology. The Industrial Revolution, too, led to large-scale job losses, although many believe these were eventually compensated for by creating new avenues, lowering prices, increasing wages, and so on. However, a growing number of economists no longer subscribe to the belief that, over the longer term, technology has positive ramifications for overall employment. In fact, multiple studies have predicted large-scale job losses due to technological advancements. A 2016 UN report concluded that 75% of jobs in the developing world are expected to be replaced by machines. Unemployment, particularly at a large scale, is a very perilous thing, often resulting in widespread civil unrest. AI's potential impact in this area therefore calls for very careful political, sociological, and economic thinking to counter it effectively.

B. Singularity: The concept of Singularity is one of those things one would have imagined seeing only in the pages of a futuristic sci-fi novel. However, in theory, today it is a real possibility. In a nutshell, Singularity refers to the point in human civilization when artificial intelligence reaches a tipping point beyond which it evolves into a superintelligence that surpasses human cognitive powers, thereby potentially posing a threat to human existence as we know it. While this explosion of machine intelligence is a pertinent and widely discussed topic, unlike technology-driven unemployment the concept remains primarily theoretical; there is as yet no consensus among experts on whether this tipping point can ever really be reached.

C. Machine Consciousness: Unlike the previous two points, which can be regarded as risks associated with the evolution of AI, machine consciousness is perhaps best described as an ethical conundrum. The idea deals with the possibility of implanting human-like consciousness into machines, taking them beyond the realm of 'thinking' to that of 'feeling, emotions and beliefs'. It is a complex topic that requires delving into an amalgamation of philosophy, cognitive science, and neuroscience. 'Consciousness' itself can be interpreted in multiple ways, bringing together a plethora of attributes like self-awareness, cause and effect in mental states, memory, and experience. To bring machines to a state of human-like consciousness would entail replicating all the activity that happens at a neural level in a human brain, by no means a meagre task.
If and when this were to be achieved, it would require a paradigm shift in the functioning of the world. Human society as we know it would need a major redefinition to incorporate conscious machines co-existing with humans. It sounds far-fetched today, but such questions need pondering right now, so as to influence the direction we take on AI and machine consciousness while things are still in the 'design' phase, so to speak.

While all of the above are pertinent questions, I believe they don't necessarily outweigh the advantages of AI. Of course, there is a need to address them systematically, control the path of AI development, and minimize adverse impact. In my opinion, the greatest and most imminent risk is actually a fourth item, not often taken into consideration when discussing the pitfalls of AI.

D. Oligarchy: Or, to put it differently, the question of control. Due to the very nature of AI, which requires immense investments in technology and science, there are realistically only a handful of organizations (private or government) that can take AI into the mainstream in a scalable manner and across a vast array of applications. There is going to be very little room for small upstarts, however smart they might be, to compete at scale against these. Given the massive aspects of our lives that will likely be steered by AI-enabled machines, those who control that 'intelligence' will hold immense power over the rest of us. The all-too-familiar phrase 'with great power comes great responsibility' takes on a whole new meaning: the organizations and individuals at the forefront of generally available AI applications would likely wield more power than the most despotic autocrats in history. This is a true and real hazard, aspects of which are already surfacing as areas of concern in discussions around things like privacy.

In conclusion, AI, like all major transformative events in human history, is certain to have wide-reaching ramifications. But with careful forethought these can be addressed, and in the short to medium term the advantages of AI in enhancing our lives will likely outweigh the risks. Any major conception that touches human lives broadly can pose immense danger if not handled properly. The best analogy I can think of is religion: when not channelled appropriately, it probably poses a greater threat than any technological advancement ever could.

Read More

What is Data Integrity and Why is it Important?

Article | July 19, 2021

In an era of big data, with more and more data being stored and processed, data health has become a pressing issue, and preserving the integrity of collected data is increasingly necessary. Understanding the fundamentals of data integrity and how it works is the first step in safeguarding your data.

Data integrity is essential for the smooth running of a company. If a company's data is altered, deleted, or changed with no way of knowing how, it can have a significant impact on any data-driven business decision.

Data integrity is the reliability and trustworthiness of data throughout its lifecycle: the overall accuracy, completeness, and consistency of data. It can be indicated by the lack of alteration between two updates of a data record, meaning the data is unchanged and intact. Data integrity also refers to the safety of data with regard to regulatory compliance (such as GDPR) and security. A collection of processes, rules, and standards implemented during the design phase maintains the safety and security of data. When the information stored in the database remains secure, complete, and reliable no matter how long it has been stored, you know the integrity of the data is safe. A data integrity framework also ensures that no outside forces are harming this data.

The term data integrity may refer to either a state or a process. As a state, it defines a data set that is valid and accurate. As a process, it describes the measures used to ensure the validity and accuracy of a data set, or of all the data contained in a database or other construct. Data integrity can be enforced at both the physical and logical levels. Let us understand the fundamentals of data integrity in detail.

Types of Data Integrity

There are two types of data integrity, physical and logical. Both are collections of processes and methods that enforce data integrity in hierarchical and relational databases.

Physical Integrity

Physical integrity protects the wholeness and accuracy of data as it is stored and retrieved. It refers to storing and collecting data as accurately as possible while maintaining its accuracy and reliability, and it includes protecting data against external forces such as power cuts, data breaches, unexpected catastrophes, and human-caused damage.

Logical Integrity

Logical integrity keeps data unchanged as it is used in different ways in a relational database, checking data accuracy in a particular context. Logical integrity is compromised when a human operator makes errors while entering data manually into the database; other causes include bugs, malware, and transferring data from one site within the database to another when some fields are missing. There are four types of logical integrity:

Entity Integrity

A database has columns, rows, and tables. These elements need to be as numerous as required for the data to be accurate, but no more than necessary. Entity integrity relies on the primary key, the unique values that identify pieces of data, to make sure each piece of data is listed just once and no key field is null. This is the feature of relational systems that lets data stored in tables be linked and used in different ways.
Referential Integrity

Referential integrity is a series of processes that ensure data is stored and used uniformly. Rules embedded in the database structure govern how foreign keys are used, so that only proper changes, additions, or deletions of data occur. These rules can include constraints that eliminate duplicate data entry, guarantee accurate data, and disallow entries that do not apply. Foreign keys relate data that can be shared or null; for example, employees who do the same work or work in the same department.

Domain Integrity

Domain integrity is a collection of processes ensuring the accuracy of each piece of data in a domain, where a domain is the set of acceptable values a column is allowed to contain. It includes constraints that limit the format, type, and amount of data entered; all categories and values in the database are set, including the nulls.

User-Defined Integrity

This type of logical integrity involves constraints and rules the user defines to fit their specific requirements, for data that isn't fully covered by entity, referential, or domain integrity. For example, if an employer creates a column to record corrective actions for employees, that data falls under user-defined integrity.
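The first three kinds of logical integrity map directly onto standard relational constraints, and user-defined integrity is typically expressed with custom rules such as triggers. Below is a minimal, runnable sketch using Python's built-in sqlite3 module; the schema, table names, and rules are invented for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.executescript("""
CREATE TABLE department (
    dept_id INTEGER PRIMARY KEY,              -- entity integrity: unique, non-null key
    name    TEXT NOT NULL UNIQUE
);
CREATE TABLE employee (
    emp_id  INTEGER PRIMARY KEY,              -- entity integrity
    name    TEXT NOT NULL,
    phone   TEXT CHECK (length(phone) = 10),  -- domain integrity: acceptable format
    dept_id INTEGER REFERENCES department(dept_id)  -- referential integrity
);
CREATE TABLE audit_log (
    ts     TEXT DEFAULT CURRENT_TIMESTAMP,
    action TEXT
);
-- user-defined integrity: a custom rule, here doubling as a tiny audit trail
CREATE TRIGGER log_employee_insert
AFTER INSERT ON employee
BEGIN
    INSERT INTO audit_log (action) VALUES ('inserted employee ' || NEW.emp_id);
END;
""")

conn.execute("INSERT INTO department VALUES (1, 'Engineering')")
conn.execute("INSERT INTO employee VALUES (1, 'Ada', '5551234567', 1)")

# Referential integrity: an employee pointing at a missing department is rejected.
try:
    conn.execute("INSERT INTO employee VALUES (2, 'Bob', '5559876543', 99)")
except sqlite3.IntegrityError as e:
    print("rejected:", e)   # FOREIGN KEY constraint failed

# Domain integrity: a malformed phone number violates the CHECK constraint.
try:
    conn.execute("INSERT INTO employee VALUES (3, 'Eve', '123', 1)")
except sqlite3.IntegrityError as e:
    print("rejected:", e)   # CHECK constraint failed

print(conn.execute("SELECT action FROM audit_log").fetchall())
```

The point of the sketch is that logical integrity is enforced declaratively by the database itself, so bad data is rejected at the door rather than cleaned up afterwards.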
Difference between Data Integrity and Data Security

The terms data security and data integrity often get muddled and are used interchangeably, but each has a distinct meaning, and the two play an essential role in each other's success. Data security means protecting data against unauthorized access or breaches, and it is necessary to ensure data integrity. Data integrity is a result of successful data security, but the term refers only to the validity and accuracy of data, not to the act of protecting it. Data security is one of many ways to maintain data integrity: it focuses on reducing the risk of leaking intellectual property, business documents, healthcare data, emails, trade secrets, and more, using tactics such as permissions management, data classification, identity and access management, threat detection, and security analytics. For modern enterprises, data integrity is necessary for accurate and efficient business processes and well-founded decisions, and it remains manageable through backup and replication processes, database integrity constraints, validation processes, and other system protocols and data protection methods.

Threats to Data Integrity

Data integrity can be compromised by human error or malicious acts, and data can be accidentally altered during transfer from one device to another. The following are a few of the factors that can affect the integrity of data stored in databases:

Human Error

Data integrity is put in jeopardy when individuals enter information incorrectly, duplicate or delete data, don't follow the correct protocols, or make mistakes in implementing procedures meant to protect data.

Transfer Error

A transfer error occurs when data is incorrectly transferred from one location in a database to another, for example when a piece of data is present in the destination table but not in the source table of a relational database.

Bugs and Viruses

Data can be stolen, altered, or deleted by spyware, malware, or viruses.

Compromised Hardware

Hardware gets compromised when a computer crashes, a server goes down, or some other malfunction occurs. Compromised hardware can render data incorrect or incomplete, and it can limit or eliminate access to data.

Preserving Data Integrity

Companies make decisions based on data; if that data is compromised or incorrect, the damage can be considerable. Organizations routinely make data-driven business decisions, and without data integrity those decisions can significantly affect the company's goals. The threats above highlight the data security side of preserving data integrity. Minimize the risk to your organization by using the following checklist; a short sketch of the first two items follows it:

Validate Input

Require input validation whenever your data set is supplied by any source, known or unknown (an end user, another application, a malicious actor, or any number of other sources). The data should be validated and verified to ensure correct input.

Validate Data

Verifying that data processes haven't been corrupted is highly critical. Identify the key specifications and attributes that matter to your organization before you validate the data.

Eliminate Duplicate Data

Sensitive data from a secure database can easily end up in a document, spreadsheet, email, or shared folder where employees can see it without proper access. It is therefore sensible to clean up stray data and remove duplicates.

Data Backup

Alongside removing duplicates, data backups are a critical process for data security. Permanent loss of data can be avoided by backing up all necessary information, and that goes a long way, especially as organizations may be attacked by ransomware.

Access Control

Another vital data security practice is access control. Individuals in an organization with wrong intent can harm the data; implementing a model where only users who need access get it is a successful form of access control. Sensitive servers should be isolated and bolted to the floor, with only individuals holding an access key allowed to use them.

Keep an Audit Trail

In case of a data breach, an audit trail helps you track down the source. It serves as breadcrumbs to locate and pinpoint the individual responsible and the origin of the breach.
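As a minimal illustration of the first two checklist items, here is a hedged Python sketch; the field names, formats, and rules are invented for this example:

```python
import hashlib
import re

def validate_record(record: dict) -> list:
    """Validate one input record before it enters the database.
    The fields and rules here are illustrative, not prescriptive."""
    errors = []
    if not re.fullmatch(r"\d{10}", record.get("phone", "")):
        errors.append("phone must be exactly 10 digits")
    year = record.get("year", 0)
    if not 1900 <= year <= 2100:
        errors.append("date outside the acceptable range")
    return errors

def file_checksum(path: str) -> str:
    """SHA-256 digest of a file. Compute it before a transfer or backup and
    compare it afterwards; a mismatch signals a transfer error or tampering."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            digest.update(chunk)
    return digest.hexdigest()

# A record with a malformed phone number fails validation.
print(validate_record({"phone": "555-1234", "year": 2021}))
# -> ['phone must be exactly 10 digits']
```

Checksums of this kind are also a simple way to verify that a backup copy still matches the original before it is ever needed.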
Conclusion

Not long ago, collecting data was the difficult part; that is no longer the case. With the amount of data being collected these days, we must maintain its integrity so that organizations can confidently make data-driven decisions and take the company forward in the right direction.

Frequently Asked Questions

What are integrity rules?

Precise data integrity rules are short statements about constraints that need to be applied, or actions that need to be taken, on data when it enters the data resource or while it resides there. Precise data integrity rules do not state or enforce accuracy, precision, scale, or resolution.

What is a data integrity example?

Data integrity is the overall accuracy, completeness, and consistency of data. A few examples where data integrity is compromised:
• When a user tries to enter a date outside an acceptable range
• When a user tries to enter a phone number in the wrong format
• When a bug in an application attempts to delete the wrong record

What are the principles of data integrity?

The principles of data integrity are attributable, legible, contemporaneous, original, and accurate (ALCOA). These simple principles need to be part of the data life cycle, good documentation practice (GDP), and data integrity initiatives.

Read More

