Computer Vision Versus Other ML Projects

During this webinar, industry experts discussed how computer vision projects differ from other machine learning projects within an enterprise setting. Below is a short transcript, beginning at 8:54 of the webinar.

Jesse Shanahan, Booz Allen Hamilton: So what is unique about a computer vision project versus other machine learning projects within an enterprise setting? Vladimir, I’m going to pitch this one to you because you haven’t spoken much and I want to give you a chance. 

Vladimir Iglovikov, Lyft: There are some similarities between them. But from what I’ve seen in companies that I have worked for, including Lyft right now, you need slightly different expertise. For many traditional machine learning models, you need some data scientists, research scientists, and people who are maybe not great software developers.

When you talk about deploying in computer vision, you need people who are more skilled in writing production-quality code and building scalable systems. So even when you’re hiring, you are looking for slightly different people. Another thing is that computer vision is very data-intensive. If you’re talking about retail or maybe some other domains, the datasets can be big but not that huge. When you’re talking about high-definition computer vision problems, you’re working with large datasets, something like terabytes, tens or hundreds of terabytes of data, and you need to process this efficiently, so you need to build a whole ecosystem. In this setting, you don’t care about the research and the latest state of the art anymore. You care about MLOps and the corresponding disciplines.

So basically they’re similar but really different. Yeah, there’s overlap in terms of mathematics and maybe in terms of training for both of these disciplines, but I will say they differ in what they are trying to leverage.

Jesse Shanahan, Booz Allen Hamilton: Okay, and what about anyone else? Has anyone had a different experience or do you find that Vladimir’s explanation is what you found as well?

Abhishek Singh, Apple: Just to add to what Vladimir said about data, I also feel that in the enterprise setting, the issue of adapting to a specific domain becomes important. It might not always be possible to just reuse existing datasets that were collected for some other purpose to solve your specific niche problem. So that’s where investment in data is required, and that’s where people like, you know, Daeil and Alyssa come into the picture. You really have to collect data for your specific use case, which may not be general purpose. So that might become a challenge for doing a lot of these things in the enterprise.

Alyssa Simpson Rochwerger, Appen: I think one of the things that builds on the importance of data is that computer vision is different from other data types in machine learning. As humans, we trust our eyes and we trust what we see as truth, and many people rely upon that visual input very heavily when navigating the world. That is, I think, different from other forms of machine learning because, and this is not true for everyone, many people rely less heavily on what they hear or what they read.

With visual processing, everyone thinks they’re a computer vision expert because everyone thinks they can trust what they see. Those are generalizations, but unfortunately in computer vision, you can’t always trust what you see, and you can manipulate the data, as has been shown over and over again. Many of us have virtual backgrounds right now, and it’s not quite true that I’m sitting in front of the Golden Gate Bridge; that’s a fallacy.

I think more so than perhaps in other data formats, there are nuances with reliability, trust, and accuracy in computer vision that can be really hard to specify, articulate, or document. With other formats, not everyone thinks they’re such an expert. In natural language processing, I may not be a linguist, or I may not specialize in the English language, or I may have a few more insecurities around my ability to process language, whereas I think humans have fewer insecurities around their ability to process visual information. And so it can be a higher bar or a bigger challenge for computer vision experts or professionals to get widespread business input and agreement around what the goals or the accuracy objectives are for a particular computer vision initiative or project.

Daeil Kim, AI.Reverie: What I’d like to add that is also interesting is, you brought up NLP, Alyssa, and I think there’s something interesting there with computer vision, because cameras are not all created equal. To Abhishek’s point about how you have to target the domain, that might be affected just by the camera sensor itself. And so that limits the ability to easily take one dataset and reuse it. I wish Microsoft, say, could solve all the visual problems in the world, but unfortunately they can’t, because cameras come in all sizes and sensor types and all that stuff. So that does actually add complexity, especially to our work. I’m sure you’ve seen that as well. There’s a lot of stuff where you get images where it’s like, well, I have no idea where this is coming from.

Alyssa Simpson Rochwerger, Appen: It’s also that the language we use to describe what we see with our eyes is different. An example I use a lot is the concept of the color green. In the United States, there is a visual representation of the color green that is different from how that same word is used in Japan to describe a visual outcome. So this is one example of how humans can have different takeaways from looking at the same stimuli. In machine vision, it’s your job to codify and document all this kind of stuff and make decisions based on visual input, and even two humans standing next to each other looking at the same piece of information are going to draw different conclusions from the same visual input based on their worldview and context.


Learn more and watch the full video on YouTube:
