A recent report from the venture capital firm LDV Capital estimates that there will be 44.4 billion cameras in the world by 2022. The report finds that cameras will be embedded in far more than just smartphones, and that images (an increasingly dominant form of communication in our online society) will continue to grow in importance in the media and marketing sectors.
The report predicts that “most of the pictures captured in the years to come will never be seen by a human eye,” referring to a projected 216 percent growth in the number of cameras by 2022, from today’s 14 billion to 44.4 billion cameras. These cameras, which the report defines as anything with a unique lens and sensor used for image capture, are expected to be built into all kinds of devices from smartphones and smart homes to cars, drones, robots, and boats. Tomorrow’s cameras will go far beyond today’s smartphones, GoPros, and DSLRs (which currently have a measly 1.6 percent of the camera market).
Why the boost? Smaller, more powerful sensors; thinner, more versatile lenses; and smarter software to connect it all across different types of devices. On top of that, advances in data-processing chips, superfast 5G broadband networks, and edge computing will combine to allow rapid, accurate processing of larger volumes of higher-quality visual data.
It might seem obvious that smartphones would be the number one driver of camera growth now and in the future. According to Forrester Research, mobile phone subscriptions are expected to top 5.5 billion worldwide by 2022, with 90 percent of those tied to smartphones. But it’s not just the number of people who will have smartphones, it’s also the number and types of cameras, and the ability of smartphones to capture and process increasingly rich visual data, that will make a difference.
Robots are here
The dirty little secret about AI robots is that they’re already here. Just slap Google Home or Amazon Alexa into a human-shaped version of a Roomba and, presto, you’ve got an AI robot, as demonstrated by consumer electronics companies such as LG, whose Hub and Airport Guide robots have basic object detection capabilities in addition to full-fledged voice recognition. Mayfield Robotics’ Kuri robot uses facial recognition to identify users and then starts recording video, sort of like having a staff videographer documenting your entire life at home. Robotic vacuum pioneer iRobot already builds cameras, sensors, and spatial mapping software into its Alexa-compatible Roomba robovacs, data it is aiming to monetize by enabling, say, retailers to recommend household products based on knowledge of your home. Meanwhile, companies like Trax are developing computer vision-enabled robots that roam store aisles, capturing images of stock to ensure that shelves stay full and inventory counts stay current.
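Trax hasn’t published the details of its pipeline, but the shelf-monitoring step reduces to comparing what a vision model detects in a shelf photo against the expected stock plan. A minimal sketch, with hard-coded detections standing in for the image-recognition output (all product names and the threshold are illustrative, not from Trax):

```python
# Expected facings per product on one shelf (a simple planogram).
planogram = {"cereal_a": 6, "cereal_b": 4, "oatmeal": 5}

# Stand-in for the output of an image-recognition model run on a
# photo the robot captured while roaming the aisle.
detected = {"cereal_a": 6, "cereal_b": 1, "oatmeal": 0}

def restock_alerts(planogram, detected, threshold=0.5):
    """Flag products whose detected facings fall below a fraction
    of the planogram count (undetected products count as zero)."""
    alerts = []
    for product, expected in planogram.items():
        found = detected.get(product, 0)
        if found < expected * threshold:
            alerts.append((product, found, expected))
    return alerts

alerts = restock_alerts(planogram, detected)
print(alerts)  # cereal_b and oatmeal are running low
```

The hard part in practice is the detection itself; once per-product counts exist, the restocking logic is this simple.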
At the mall and the movies
Most public spaces already have cameras, mainly for security purposes. But the addition of computer vision, and facial recognition in particular, has made them far more powerful. Walmart is experimenting with checkout-line facial recognition cameras that can tell if a customer is agitated (or maybe just confused at the self-checkout kiosk) and then send store employees to help or offer onscreen advice.
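Walmart hasn’t disclosed how its system works, but the dispatch side of such a setup can be sketched simply, assuming a hypothetical upstream expression classifier that emits a per-lane agitation score:

```python
# Hypothetical per-lane agitation scores from an upstream facial-
# expression classifier (0.0 = calm, 1.0 = highly agitated).
# The numbers are made up for illustration.
lane_scores = {1: 0.12, 2: 0.85, 3: 0.40, 4: 0.91}

def lanes_needing_help(scores, threshold=0.7):
    """Return lanes whose agitation score crosses the threshold,
    worst first, so staff can be dispatched in priority order."""
    flagged = [lane for lane, s in scores.items() if s >= threshold]
    return sorted(flagged, key=lambda lane: -scores[lane])

print(lanes_needing_help(lane_scores))  # → [4, 2]
```

All of the interesting work lives in the classifier; the store-floor logic on top of it is just thresholding and prioritizing.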
Meanwhile, going to the movie theater used to be a one-way street: audiences watched the film and it didn’t watch back. But in our increasingly quantified and data-driven world, studios want to use technology to measure how audiences react to entertainment. To get more insight into viewer reactions to films, Disney recently set up four infrared cameras in a 400-seat theater and captured the facial expressions of about 24,000 people over the course of 150 screenings. It then used these images to train a neural network to identify different facial expressions, eventually correlating these predictions to different parts of a movie to see how audiences reacted: in other words, an AI-driven test screening that can inform future edits of a film.
These are just some of the ways that the explosive growth and ubiquity of cameras across a variety of devices (the Internet of Eyes, as the LDV report calls it) will unfold. This kind of data capture and analysis is still in its infancy, and no doubt additional business and marketing connections will continue to be made.
In the sports arena
IBM’s work during the US Open offers an example of the impact of AI and computer vision on the media creation side: IBM used its cognitive highlighting capability to look for and analyze numerous data points in video of each match, then served up the most engaging moments online and via the US Open app for fans.
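IBM’s internal scoring isn’t public, but the ranking step in any highlight system of this kind amounts to combining several excitement signals per clip and surfacing the top performers. A sketch with invented clip names and signal values:

```python
# Stand-in per-clip signals a highlight ranker might combine:
# (crowd_noise, player_gestures, commentator_excitement), each 0..1.
# Clip names and numbers are illustrative only.
clips = {
    "set_point_rally": (0.9, 0.8, 0.95),
    "routine_serve": (0.2, 0.1, 0.15),
    "net_cord_winner": (0.7, 0.9, 0.6),
}

def top_highlights(clips, k=2):
    """Rank clips by the average of their excitement signals and
    return the k most engaging ones."""
    ranked = sorted(clips, key=lambda c: -sum(clips[c]) / len(clips[c]))
    return ranked[:k]

print(top_highlights(clips))  # → ['set_point_rally', 'net_cord_winner']
```

A production system would weight and calibrate the signals rather than average them equally, but the serve-the-best-moments logic is the same.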
On the analytics side, GumGum Sports just unveiled data showing substantial sponsorship value from non-team-owned social media accounts, gathered using image recognition to look for the presence of brand logos in pictures shared on social media.
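GumGum Sports’ published figures don’t include its valuation formula, but a common shape for this kind of estimate is CPM-style: impressions per detected post, discounted by how visible the logo actually is. A minimal sketch with invented detections and an illustrative rate:

```python
# Stand-in detections for one brand's logo across social posts:
# (post_impressions, logo_visibility_score 0..1), as an image-
# recognition pipeline might report. All numbers are illustrative.
detections = [
    (120_000, 0.9),   # big fan-account post, logo front and center
    (45_000, 0.4),    # logo partially obscured
    (300_000, 0.7),   # athlete post, logo clearly visible
]

def media_value(detections, rate_per_thousand=5.0):
    """CPM-style valuation: impressions/1000 * rate, discounted
    by how visible the logo is in each image."""
    return sum(imps / 1000 * rate_per_thousand * vis
               for imps, vis in detections)

print(round(media_value(detections), 2))  # → 1680.0
```

The image-recognition step supplies the detections; the dollar figure on top is straightforward arithmetic, which is why the value of non-team-owned accounts was invisible until logos could be found in images at scale.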
What does this mean for media?
The LDV report suggests that AI and computer vision-powered marketing and ad technologies will become the norm: images are increasingly important to advertiser analytics, and their growth as a language of their own on the web is indisputable.
Ken Weiner is Chief Technology Officer at GumGum, a computer vision company. With over fifteen years of experience in digital technology, Ken leads his engineering and product teams in building a world-class computer vision platform for marketers.