As artificial intelligence continues to become globally adopted, it’s vital that emerging technologies are built on data that’s accurate, diverse, free of bias, and ethically sourced.
MEDIA 7: Could you please take us through your professional journey?
WENDY GONZALEZ: Of course! I have spent the last two decades gaining managerial and technology leadership experience through my work for companies including Ernst and Young, Capgemini, and General Communications Inc. I also co-founded Cycle30, where we were developing SaaS solutions for Internet of Things (IoT) devices, prior to joining Sama six years ago. What drew me to Sama was not only our best-in-class training data solutions that enable businesses to build innovative AI models, but also our strong social impact mission. After serving as President and COO of Sama since 2018, I stepped into the role of Interim CEO when my friend, Sama’s CEO and founder, Leila Janah, passed away in January of 2020. This past December, I was officially named CEO.
M7: For the second time, Sama made the Inc. 5000 List of America’s Fastest-Growing Private Companies! What was the strategy behind achieving such prestigious recognition in the industry?
WG: I think our recognition on the Inc. 5000 list comes down to two factors. First, our team at Sama is one of the hardest-working teams in the industry. Each day, we’re diligently working to ensure that we are providing the highest-quality training data for the world’s leading companies such as Walmart, Google, and NVIDIA. The second factor is our unending dedication to innovation. At Sama, we are passionate about building a work environment that fosters experimentation. That’s why we host an annual ‘Innovation Week,’ where we provide employees with the resources they need to work on new projects and try new things. Last year, over 50% of all Innovation Week projects were incorporated into our roadmap and implemented within the following two quarters. Our MicroModel™ technology was actually born from one of these projects, and today helps our clients get higher-quality training data in half the time. I am also excited to share that in addition to being recognized on the Inc. 5000 list as one of America’s fastest-growing private companies for the second year in a row, we’ve also been honored with inclusion in Fast Company’s World Changing Ideas and Forbes AI 50 lists. Additionally, we recently completed our three-year Randomized Controlled Trial with MIT, which validated our social impact model that has helped over 55,000 people lift themselves out of poverty through our skills training and AI work.
M7: What is Sama’s MicroModel technology and how does it automate data annotation?
WG: Our MicroModel™ technology is a powerful combination of machine learning-assisted annotation and ethical human validation that consistently produces data that’s between 94% and 98% accurate. First, our Machine Learning Assisted Annotation (MAA) powered by MicroModels™ draws from a library of models trained on specific use cases to expedite the labeling process by generating high-quality pre-labeled annotations. Next, our skilled annotators validate those labels instead of annotating from scratch, maximizing efficiency and producing higher-quality training data in half the time. While our models are pre-trained on specific use cases, our human-in-the-loop validation helps us improve our models over time. Through the feedback from our skilled annotators, our models constantly learn to make better predictions. The end result for our clients is a significantly shorter, higher-quality path to production.
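Editor's note: the workflow described above (a model proposes labels, annotators validate them, and corrections feed back into the model) can be illustrated with a minimal Python sketch. This is a generic human-in-the-loop pattern and not Sama's actual implementation; the names `Annotation`, `annotate_with_validation`, `toy_pre_label`, and `toy_validator` are hypothetical, and the toy functions merely stand in for a trained model and a human reviewer.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Annotation:
    item: str
    label: str
    source: str  # "model" if the pre-label was accepted, "human" if it was corrected


def annotate_with_validation(
    items: List[str],
    pre_label: Callable[[str], str],
    validate: Callable[[str, str], str],
) -> Tuple[List[Annotation], List[Tuple[str, str]]]:
    """Pre-label each item, let a validator accept or correct the label,
    and collect the corrections as feedback for later model improvement."""
    annotations: List[Annotation] = []
    corrections: List[Tuple[str, str]] = []
    for item in items:
        proposed = pre_label(item)       # machine-generated pre-label
        final = validate(item, proposed)  # validator returns the label to keep
        if final == proposed:
            annotations.append(Annotation(item, final, "model"))
        else:
            annotations.append(Annotation(item, final, "human"))
            corrections.append((item, final))  # feedback signal for retraining
    return annotations, corrections


# Hypothetical stand-ins for a pre-trained model and a human annotator.
def toy_pre_label(item: str) -> str:
    return "stop_sign" if "sign" in item else "unknown"


def toy_validator(item: str, proposed: str) -> str:
    return "pedestrian" if "person" in item else proposed


items = ["image_sign_01", "image_person_02"]
annotations, corrections = annotate_with_validation(items, toy_pre_label, toy_validator)
```

In this sketch the validator only touches labels that need correcting, which is where the time saving over annotating from scratch would come from; the `corrections` list is the kind of feedback that could be used to retrain the pre-labeling model.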
M7: What do you see as the most noticeable change right now happening in the workforce, encouraged by the rise of digital technologies?
WG: From helping doctors diagnose cancer to providing traders with more accurate predictions of the stock market, AI can be an incredibly powerful tool in boosting efficiencies across industries, but only if it’s built on training data that’s accurate, diverse, bias-free, and ethically sourced. The truth is that 8 out of 10 machine learning (ML) projects fail, and 96% run into data quality and labeling problems, which can produce inaccurate algorithms and may result in ethical, legal, and safety issues. At Sama, we understand the importance of having accurate, diverse, and bias-free training data to fuel this next generation of AI technology. When it comes to developing AI models, there’s no room to cut corners. We make producing accurate data our number one priority and are constantly innovating to give our customers the highest-quality results, so they can bring their AI models to production efficiently without sacrificing quality.
M7: What do you believe are the top three challenges in the industry in the post-COVID-19 era?
WG: There are a number of challenges in the AI industry in the COVID-19 era. The first is keeping up with the demand for AI technology. During the pandemic, we saw a dramatic shift as companies quickly adopted digital technology in order to boost efficiencies, cut costs, and keep up with competitors. This rapid digital transformation has in turn caused the artificial intelligence market to skyrocket. In 2020, the industry was estimated to be worth over $62 billion, and that number is expected to jump to more than $93 billion this year. As AI becomes incorporated into every industry, government organization, and virtually every aspect of our lives, companies are under extreme pressure to bring their AI models to production faster to keep up with competitors.
The second challenge is ensuring emerging technologies are built on a foundation of accurate data. Data quality plays an instrumental role in a machine learning model’s performance. Quality does not just mean the difference between your AI project succeeding or failing; in some cases it means the difference between life and death. Imagine, for example, that a self-driving car was built on data that didn’t recognize people, just stop signs. Creating quality data is one of the most expensive and time-consuming parts of building an AI model, but it’s also the most crucial. At Sama, we understand that accurate, quality data is important at every step of the machine learning process. The definition of “quality,” however, varies from project to project. That’s why we always start by working with our clients to define what quality means for them. Through this process, we are able to anticipate and address errors before the annotation process even begins, saving our clients time and money.
The third challenge is knowing what to automate and what not to automate. As we continue to boost efficiencies, it’s important to recognize that technology isn’t always the most effective way to streamline production. What makes our MicroModel™ technology so effective at consistently producing data that’s between 94% and 98% accurate is our human-in-the-loop validation. Our skilled annotators can recognize errors and inconsistencies that would otherwise have led to inaccurate data if left to automation alone. At Sama, we recognize the importance of having a high-performing and innovative team, which is why we’ve been so efficient at providing our customers with the highest quality data.
M7: Being a business leader, how do you prepare for an AI-centric world?
WG: In many ways, an AI-centric world is already here. As artificial intelligence continues to become globally adopted, it’s vital that emerging technologies are built on data that’s accurate, diverse, free of bias, and ethically sourced. As a leader in AI, I’m preparing for an AI-centric world by doing what we do best at Sama, which is building a high-performing team that develops and scales innovative, impactful technology for some of the world’s most ambitious AI companies while continuing to advocate for an ethical AI supply chain.