SparkR: Transforming R into a tool for big data analytics

May 23, 2017

This white paper introduces SparkR, a package for the R statistical programming language that enables programmers and data scientists to access large-scale in-memory data processing. The R runtime is single-threaded and can therefore normally only run on a single computer, processing data sets that fit within that machine’s memory. By providing a bridge to Spark’s distributed computation engine, SparkR enables large R jobs to run across multiple cores within a single machine or across nodes in massively parallel clusters, with access to all the memory in the cluster.

Spotlight

Intelligent Solutions, Inc.

Intelligent Solutions, Inc. (ISI) is a professional services firm dedicated to assessing, planning, and guiding business intelligence and customer analytics efforts. ISI offers consulting, education, and literature based on two conceptual architectures: Corporate Information Factory, an industry icon used by hundreds of organizations to plan, architect, and deliver their BI capabilities, and Customer Life Cycle, a framework for using technology to support CRM and other customer-facing initiatives. ISI's continued success at leveraging these architectures and maximizing their value for its clients has earned it a reputation as a leader and expert in this field. Intelligent Solutions' mission is simple: help our clients help themselves. ISI's offerings are designed to help clients plan, design, and execute their BI and CRM initiatives, using their own resources wherever possible. ISI's services are based on the three primary keys to success: architecture, strategy, and edu

OTHER ARTICLES

What Is The Value Of A Big Data Project

Article | April 7, 2020

According to software vendors that run big data projects, the answer is clear: more data means more options. Add a bit of machine learning (ML) to be told what to do, and revenue will thrive. In practice it is not that simple, so a checklist can come in handy before starting a big data project.

Make sure that the insights gained through machine learning are actionable. Gaining insights is always good, but it is even better if you can act on the new knowledge.

A shopping basket analysis, for example, shows which products are sold together. What should a company do with that information? It could place the two products in opposite corners of the shop, so that customers walk through all areas and find additional products to buy. It could place both products next to each other so that each boosts the sales of the other. Or it could discount one product to attract more customers. Because every action has unknown side effects, each company has to decide for itself which action makes sense in its case.
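As a minimal sketch of the shopping basket analysis mentioned above, the following Python snippet counts how often pairs of products appear in the same basket. The baskets and the helper name `pair_counts` are hypothetical and invented for illustration; a real project would work with far more data and probably a dedicated association-rule tool.

```python
from collections import Counter
from itertools import combinations


def pair_counts(baskets):
    """Count how often each unordered pair of products shares a basket."""
    counts = Counter()
    for basket in baskets:
        # Deduplicate and sort so ("bread", "milk") and ("milk", "bread")
        # are counted as the same pair.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return counts


# Hypothetical baskets, invented for illustration only.
baskets = [
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["milk", "cereal"],
    ["bread", "butter", "cereal"],
]

counts = pair_counts(baskets)
top_pair, top_count = counts.most_common(1)[0]
print(top_pair, top_count)
```

With this sample data the most frequent pair is bread and butter, which share three baskets. The count only says that the two products sell together; whether to separate them, shelve them side by side, or discount one of them remains the business decision discussed above.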


Can you really trust Amazon Product Recommendation?

Article | January 28, 2021

Since the internet became popular, the way we purchase things has evolved from a simple process into a more complicated one. Unlike traditional shopping, online shopping does not let you experience a product first-hand before buying it. There are also more options and variants of a single product than ever before, which makes deciding harder. To avoid a bad investment, the consumer has to rely heavily on reviews posted by customers who already use the product. However, sorting through the relevant reviews for different products across multiple eCommerce platforms and then comparing them can be too much work. To address this problem, Amazon performs sentiment analysis on product review data, using artificial intelligence to come up with the products that are most likely to be ideal for the customer.

A consumer wants to see only relevant and useful reviews when deciding on a product. A rating system is a good way to gauge the quality and efficiency of a product, but ratings can be biased, so they cannot provide complete information about it. Detailed textual reviews are necessary to improve the consumer experience and to help consumers make informed choices. The consumer experience is also vital for understanding customer behavior and increasing sales.

Amazon has come up with a distinctive way to make things easier for its customers. It does not promote products based on what customers with a similar search history looked at. Instead, it recommends products that are similar to the product a user is searching for, guiding the customer by the correlation between products. To understand this concept better, we must understand how Amazon's recommendation algorithm has evolved over time.
The history of Amazon's recommendation algorithm

Before Amazon started applying machine learning sentiment analysis to customer product reviews, it relied on collaborative filtering to make recommendations. Collaborative filtering is the most widely used way to recommend products online. Earlier systems used user-based collaborative filtering, which was unsuitable because it left many factors unaccounted for. Researchers at Amazon came up with a better approach that depends on the correlation between products instead of the similarities between customers.

In user-based collaborative filtering, a customer is shown recommendations based on the purchase history of people with a similar search history. In item-to-item collaborative filtering, a customer is shown products similar to those in their own recent purchase history. For example, a person who bought a mobile phone will be shown accessories for that phone. Amazon's Personalization team found that using purchase history at the product level provides better recommendations.

This way of filtering also offers a computational advantage. User-based collaborative filtering requires analyzing many users with similar shopping histories. The process is time-consuming because there are several demographic factors to consider, such as location, gender, and age. A customer's shopping history can also change within a day, so to keep the data relevant you would have to update the index that stores shopping histories daily. Item-to-item collaborative filtering is easier to maintain because only a tiny subset of the website's customers purchase any specific product. Computing a list of the individuals who bought a particular item is much easier than analyzing all the site's customers for similar shopping histories. However, there is a proper science behind calculating the relatedness of products.
You cannot merely count the number of times two items were bought together, as that would not produce accurate recommendations. Amazon's researchers use a relatedness metric instead. For a person who purchased item X, item Y counts as related, and therefore as an accurate recommendation, only if purchasers of item X are more likely to buy item Y.

Conclusion

To give a customer a good recommendation, you must show products that have a high chance of being relevant. There are countless products on Amazon's marketplace, and the customer will not go through many of them to figure out the best one. Eventually, a customer faced with thousands of options becomes frustrated and tries a different platform. Amazon therefore has to develop a distinctive and efficient way of recommending products that works better than its competitors' methods. User-based collaborative filtering worked fine until competition increased. As the number of product listings in the marketplace has grown, it is no longer possible to rely on algorithms that merely worked in the past, and there are more filters and factors to consider than before. Item-to-item collaborative filtering is much more efficient because it automatically narrows the candidates to products that are likely to be purchased, which limits the factors that must be analyzed to produce useful recommendations. Amazon has grown into the biggest marketplace in the industry because customers trust and rely on its service. It frequently makes changes to follow recent trends and to provide the best customer experience possible.
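The point made earlier in this article, that raw co-purchase counts are not enough, can be illustrated with a small Python sketch. The article does not give Amazon's actual relatedness formula, so this example substitutes a common stand-in, often called lift: the share of X buyers who also bought Y, divided by the share of all customers who bought Y. The baseline of "all customers", the function name `relatedness`, and the sample orders are assumptions made for illustration only.

```python
def relatedness(orders, x, y):
    """Lift of y given x: P(bought y | bought x) / P(bought y).

    A value above 1.0 means buyers of x purchase y more often than
    customers overall. This is an assumed stand-in metric, not
    Amazon's published formula.
    """
    total = len(orders)
    bought_x = [items for items in orders.values() if x in items]
    bought_y = sum(1 for items in orders.values() if y in items)
    if not bought_x or bought_y == 0:
        return 0.0
    both = sum(1 for items in bought_x if y in items)
    p_y_given_x = both / len(bought_x)
    p_y = bought_y / total
    return p_y_given_x / p_y


# Hypothetical purchase histories keyed by customer id.
orders = {
    "c1": {"phone", "case", "bag"},
    "c2": {"phone", "case", "bag"},
    "c3": {"phone", "bag"},
    "c4": {"book", "bag"},
    "c5": {"book", "bag"},
    "c6": {"lamp", "bag"},
}

print(relatedness(orders, "phone", "case"))
print(relatedness(orders, "phone", "bag"))
```

In this sample, phone and bag were bought together three times and phone and case only twice, yet the bag scores 1.0 because every customer buys it, while the case scores 2.0 because phone buyers purchase it twice as often as customers overall. A plain count would rank the bag first; the ratio ranks the case first.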


Choosing External Data Sources: 4 Characteristics to Look For

Article | May 12, 2021

Decision-makers at consumer brands are finally realizing the full transformative potential of external data - but they’re also realizing how difficult it is to source. Forrester reports that 87% of decision-makers in data and analytics have implemented or are planning initiatives to source more external data. And those initiatives are growing outside of the IT team; 29% of those surveyed say that IT has primary ownership of data sourcing, down from 37% in 2016. To support these projects, organizations are increasingly turning to a new specialist: the data hunter, who identifies and vets external data sources. It’s a lot of work to build external data-focused teams, and many leaders are realizing that external data is difficult to scale as the source list grows. Perhaps that’s why 66% of those decision-makers surveyed by Forrester report that they’re using or planning to use external service providers for data, analytics, and insights.


The case for hybrid artificial intelligence

Article | March 4, 2020

Deep learning, the main innovation that has renewed interest in artificial intelligence in recent years, has helped solve many critical problems in computer vision, natural language processing, and speech recognition. However, as deep learning matures and moves from the peak of hype to the trough of disillusionment, it is becoming clear that it is missing some fundamental components.

