EFFECTIVE STRATEGIES TO DEMOCRATIZE DATA SCIENCE IN YOUR ORGANIZATION

analyticsindiamag.com | April 13, 2020

Chip Conley, head of global hospitality and strategy at Airbnb, believed that the experience of staying in an Airbnb is the soul of its customer strategy. He held that data analytics is the key to understanding the customer's voice and connecting it to Airbnb's product line, so the company could envisage the experience customers seek when they select Airbnb.

In 2011, when Airbnb expanded out of San Francisco to 22 other locations internationally, it faced challenges with team mix, collaboration, and business transformation. Airbnb survived the challenge by building a data-driven culture and empowering its employees to work with data. Instead of building spoke teams, democratizing data practices helped them move faster in the decision-making process.

Today, no industry is untouched by digital disruption, and the global economy perceives the business of data as one of its emerging sectors. What empowers Amazon to venture into healthcare? Data. It is data that enables Netflix and Uber to transform their businesses and build new markets. Per Forbes, the data economy is expected to grow to 35 zettabytes, and to analyze this huge heap, organizations need analysts, if not scientists. The skills to analyze data and extract meaningful insights are still niche in the industry. Most organizations relegate data and analytics responsibilities to a central team. There is nothing wrong with this arrangement, but the challenge is how to scale it and stay sustainable in the long run.

The other major challenge is teaming and communication. Data scientists and business teams spend days in pointless discussions trying to dig through a problem. The lack of data literacy, and of the ability to paint the business context, creates friction between teams. This hurts the business's decision velocity and its ability to react on the fly to internal or external factors.

Spotlight

It’s exciting to think about all the cool new frameworks and data sources out there, and the promise they hold. Spark and MapReduce, Kafka and NiFi – they all have their place in the ecosystem, but we can’t forget about legacy systems, which often hold a wealth of historical and customer data. Present-day initiatives must find ways to integrate big iron with big data while looking toward the future.


Other News
BUSINESS INTELLIGENCE

BigID launches Data Insights Studio to close the gap between insight and action

BigID | April 08, 2022

BigID, the leading data intelligence platform that enables organizations to know their enterprise data and take action for privacy, security, and governance, today launched Data Insights Studio, a new capability that provides rich, insightful reporting and analytics about the state of data across the entire organization. Data Insights Studio gives privacy, security, and governance teams the power to create rich, insightful, and actionable reporting best suited to their organization, and to easily monitor relevant metrics to better assess the progress of their data initiatives. It seeks to close the gap between insight and action so that teams have the speed to make the best decisions about their data.

Data Insights Studio enables organizations to know when and where to take action on their data through accurate reporting and analytics. Capabilities include:

- Driving proactive executive decision-making with actionable insights about data security, privacy, and governance initiatives.
- Actively monitoring trends, metrics, and important KPIs over time, with support for forecasting.
- Empowering users through self-service, configurable reporting, alleviating the burden on IT admins.
- Centralizing data intelligence holistically for organizations with global data footprints.

"Data Insights Studio brings a whole new meaning to data intelligence. We're giving customers the power to visualize their data like never before. We strongly believe that providing easy and customizable reporting of how metadata changes over time and across multiple sites, as well as an ability to analyze trends and monitor critical KPIs, is the key to facilitating and accelerating better data management and value on data across the organization." – Nimrod Vax, Co-Founder and Head of Product at BigID

Reporting can take on various forms and provides immense value in illustrating the direction and progress of initiatives. Unfortunately, developing proper executive reporting that addresses the needs of an organization can be nuanced and cumbersome. IT teams are often required to manually create reports and analytics to fit those needs, a method that not only fails to scale well but also lacks the speed and accuracy needed to drive the right decisions. Data Insights Studio aims to close the gap between insight and action so that IT teams have the agility to act on their data with certainty.

About BigID
BigID's data intelligence platform enables organizations to know their enterprise data and take action for privacy, protection, and perspective. Customers deploy BigID to proactively discover, manage, protect, and get more value from their regulated, sensitive, and personal data across their data landscape. BigID has been recognized for its data intelligence innovation as a 2019 World Economic Forum Technology Pioneer, named to the 2021 Forbes Cloud 100 and the 2021 Inc. 5000 (#19 fastest-growing company and #1 in Security), a Business Insider 2020 AI Startup to Watch, and an RSA Innovation Sandbox winner.

Read More

BIG DATA MANAGEMENT

New Release of Talend Trust Score Enables Data Teams to Establish a Foundation for Data Health

Talend | May 09, 2022

Talend, a global leader in data integration and management, announced today at the Gartner Data & Analytics Summit in London the latest version of Talend Data Fabric. In its Spring '22 release, Talend adds advanced capabilities to Talend Trust Score™, including aggregation and historical views into the health of any dataset. These new features will help businesses analyze combined data quality metrics to evaluate data trust at macro and micro levels — across all datasets, groups of datasets, or individual datasets.

According to a 2021 survey of global executives, 78% say they face challenges in using their data, and more than a third say they simply aren't using it to make decisions. In fact, Gartner recently reported that inconsistent business outcomes due to unreliable data and poor data quality are responsible for an average of $12.8M per year in losses for organizations. As the first advanced trust score available in the industry, Talend Trust Score helps businesses assess the quality of their datasets. It intelligently evaluates and scores data in Talend customer environments by using crawlers that automatically scan datasets in on-premises and cloud data warehouses such as Snowflake, AWS, Microsoft Azure, or Google. Businesses can also identify quality issues with incoming data from third-party or source systems and remedy them immediately, before there is a negative impact.

Talend's Spring '22 release enables businesses to see trends, measure data trust over time, and identify data drift issues to ensure reliable information is used to drive optimal business outcomes. New features include:

- Talend Trust Score by Groups: provides more targeted insights into the health of data assets instantly with trust score grouping via metadata. Users can now filter any group of datasets, or individual datasets, and see an aggregate trust score that serves as an enhanced "single-pane-of-glass" view into data health. This provides a fully tailored view of the datasets relevant to each user, for an at-a-glance view of actionable data quality metrics.
- Talend Trust Score Trending: provides a temporal view of the health of datasets. Customers can now see trends, measure the effectiveness of data programs on an ongoing basis, and surface issues that are not visible in snapshots of quality, such as data drift. Customers may scan datasets at intervals such as hourly, daily, or weekly, providing a more accurate assessment of dataset quality at any given time.

In addition to the Talend Trust Score updates, Spring '22 accelerates productivity with collaborative workflows that serve as a conduit between users at different technical levels. Talend expands its centralized repository with Data Quality Rules in Talend Studio, a step to ensure these simple-to-configure rules are available for reuse across the Talend ecosystem, on any data, no matter its location or format.

"Talend continues to raise the bar on innovation with our customers in mind. Our advancements to Talend Trust Score will help businesses understand the ongoing quality of their data and feel confident in the decisions they are making. This new product release is another step toward helping our customers leverage healthy data to achieve powerful business outcomes." – Jamie Fiorda, vice president, product marketing, Talend

About Talend
Talend, a leader in data integration and data management, is changing the way the world makes decisions. Talend Data Fabric is the only platform that seamlessly combines an extensive range of data integration and governance capabilities to actively manage the health of corporate information. This unified approach is unique and essential to delivering complete, clean, and uncompromised data in real time to all employees. It has made it possible to create innovations like the Talend Trust Score™, an industry-first assessment that instantly quantifies the reliability of any dataset.
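To make the aggregation idea concrete, here is a minimal sketch of how per-dataset quality metrics could roll up into a group-level score of the kind described above. The function names, metric names, and 0–5 scale are illustrative assumptions, not Talend's actual scoring algorithm.

```python
# Illustrative sketch (NOT Talend's algorithm): roll per-dataset quality
# metrics up into an aggregate "trust score" for a group of datasets.
from statistics import mean

def dataset_score(metrics: dict) -> float:
    """Average the quality dimensions (validity, completeness, ...) of one
    dataset and scale the result to a 0-5 score."""
    return round(5 * mean(metrics.values()), 2)

def group_score(datasets: dict) -> float:
    """Aggregate trust score across a group of datasets -- the
    'single-pane-of-glass' view described above."""
    return round(mean(dataset_score(m) for m in datasets.values()), 2)

# Hypothetical crawler output: one metrics dict per scanned dataset.
datasets = {
    "orders":    {"validity": 0.98, "completeness": 0.95, "uniqueness": 1.0},
    "customers": {"validity": 0.90, "completeness": 0.80, "uniqueness": 0.97},
}
print(group_score(datasets))
```

Tracking `group_score` at each scan interval would give the kind of trend line that surfaces gradual data drift invisible in a single snapshot.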

Read More

BIG DATA MANAGEMENT

Komprise Automates Unstructured Data Discovery with Smart Data Workflows

Komprise | May 20, 2022

Komprise, the leader in analytics-driven unstructured data management and mobility, today announced Komprise Smart Data Workflows, a systematic process to discover relevant file and object data across cloud, edge, and on-premises datacenters and feed that data, in native format, to AI and machine learning (ML) tools and data lakes. Industry analysts predict that at least 80% of the world's data will be unstructured by 2025. This data is critical for AI- and ML-driven applications and insights, yet much of it is locked away in disparate data storage silos. This creates an unstructured data blind spot, resulting in billions of dollars in missed big data opportunities.

Komprise has expanded Deep Analytics Actions to include copy and confine operations based on Deep Analytics queries, added the ability to execute external functions (such as natural language processing) via API, and expanded global tagging and search to support these workflows. Komprise Smart Data Workflows allow you to define and execute a process with as many of these steps as needed, in any sequence, including external functions at the edge, in the datacenter, or in the cloud. Together, the Komprise Global File Index and Smart Data Workflows reduce the time it takes to find, enrich, and move the right unstructured data by up to 80%.

"Komprise has delivered a rapid way to visualize our petabytes of instrument data and then automate processes such as tiering and deletion for optimal savings," says Jay Smestad, senior director of information technology at PacBio. "Now, the ability to automate workflows so we can further define this data at a more granular level and then feed it into analytics tools to help meet our scientists' needs is a game changer."

Komprise Smart Data Workflows are relevant across many sectors. Here's an example from the pharmaceutical industry:

1) Search: Define and execute a custom query across on-prem, edge, and cloud data silos to find all data for Project X with Komprise Deep Analytics and the Komprise Global File Index.
2) Execute & Enrich: Execute an external function on Project X data to look for a specific DNA sequence for a mutation and tag such data as "Mutation XYZ".
3) Cull & Mobilize: Move only Project X data tagged with "Mutation XYZ" to the cloud, using Komprise Deep Analytics Actions, for central processing.
4) Manage Data Lifecycle: Once the analysis is complete, move the data to a lower storage tier for cost savings.

Other Smart Data Workflow use cases include:

- Legal Divestiture: Find and tag all files related to a divestiture project, move sensitive data to an object-locked storage bucket, and move the rest to a writable bucket.
- Autonomous Vehicles: Find crash test data related to abrupt stopping of a specific vehicle model and copy it to the cloud for further analysis. Execute an external function to identify and tag data with Reason = Abrupt Stop, and move only the relevant data to the cloud data lakehouse to reduce the time and cost of moving and analyzing unrelated data.

"Whether it's massive volumes of genomics data, surveillance data, IoT, GDPR, or user shares across the enterprise, Komprise Smart Data Workflows orchestrate the information lifecycle of this data in the cloud to efficiently find, enrich, and move the data you need for analytics projects," said Kumar Goswami, CEO of Komprise. "We are excited to move to this next phase of our product journey, making it much easier to manage and mobilize massive volumes of unstructured data for cost reduction, compliance, and business value."

About Komprise
Komprise is a provider of unstructured data management and mobility software that frees enterprises to easily analyze, mobilize, and monetize the right file and object data across clouds without shackling data to any vendor. With Komprise Intelligent Data Management, you can cut 70% of enterprise storage, backup, and cloud costs while making data easily available to cloud-based data lakes and analytics tools.
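The four pharmaceutical workflow steps above can be sketched as a small pipeline. This is a hypothetical illustration of the search → enrich → mobilize → tier pattern; the function names and file-record data model are invented for the sketch and are not the Komprise API.

```python
# Hypothetical sketch of the 4-step workflow pattern described above.
# Function names and data model are illustrative, not the Komprise API.

def search(index, query):
    """Step 1 (Search): query a global file index across silos."""
    return [f for f in index if query(f)]

def enrich(files, classify):
    """Step 2 (Execute & Enrich): run an external function, tag matches."""
    for f in files:
        tag = classify(f)
        if tag:
            f.setdefault("tags", []).append(tag)
    return files

def mobilize(files, tag, target):
    """Step 3 (Cull & Mobilize): move only tagged files to the target."""
    moved = [f for f in files if tag in f.get("tags", [])]
    for f in moved:
        f["location"] = target
    return moved

def tier_down(files, tier="archive"):
    """Step 4 (Manage Data Lifecycle): shift analyzed data to a cheap tier."""
    for f in files:
        f["tier"] = tier
    return files

index = [
    {"name": "seq_001.bam", "project": "X", "tags": []},
    {"name": "seq_002.bam", "project": "Y", "tags": []},
]
hits = search(index, lambda f: f["project"] == "X")
enrich(hits, lambda f: "Mutation XYZ" if f["name"] == "seq_001.bam" else None)
moved = mobilize(hits, "Mutation XYZ", "s3://datalake/project-x")
tier_down(moved)
print(len(moved))  # 1 -- only the tagged Project X file is moved
```

The point of the pattern is the culling in step 3: only data that survives both the query and the enrichment tag ever leaves its silo, which is where the claimed time and cost savings come from.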

Read More

BIG DATA MANAGEMENT

Sesame Software Provides Adaptive Data Management Strategies for the Modern Data Lakehouse

Sesame Software | February 25, 2022

Sesame Software, the innovative leader in Enterprise Data Management and creator of Relational Junction, today announced its Lakehouse Platform, providing a simplified data architecture by eliminating the data silos that traditionally separate analytics and data science. The Relational Junction Lakehouse Platform combines the best elements of data lakes and data warehouses, delivering the data management and performance typically found in data warehouses with the low-cost, flexible object stores offered by data lakes.

"A streamlined method to access data is difficult to achieve with a separate cloud data warehouse and data lake," says Rick Banister, founder and CEO of Sesame Software. "A data lakehouse offers the best of both worlds: by replacing data silos with a single home for structured, semi-structured, and unstructured data, Relational Junction provides a solid and scalable lakehouse foundation."

Relational Junction key benefits:

- Provides the performance and governance required to support all types of data workloads
- Hyper-threaded technology delivers massive scale and speed
- Runs operations on one simplified architecture, avoiding complex, redundant systems
- Supports advanced analytics at a lower total cost of ownership

Relational Junction: The Lakehouse Foundation
Relational Junction delivers reliability, security, and scalability on your data lake, making it the top data ingest platform for handling vast amounts of data from a wide variety of SaaS applications and databases. With its plug-in architecture, new data sources can be added with configuration-only deployment. The platform's scalable architecture continuously evolves with organization-spanning data needs, giving users easy access to their data and the ability to use it however they see fit. Request a demo to learn more about Relational Junction's Data Lakehouse Platform.

About Sesame Software
Sesame Software is the Enterprise Data Management leader, delivering data rapidly for enhanced reporting and analytics. Sesame Software's patented Relational Junction data platform offers superior solutions for data warehousing, integration, backup, and compliance to fit your business needs. Quickly connect to SaaS, on-premises, and cloud applications for accelerated insights.

Read More
