Last month, I delved into the subject of Deep Learning and its application in video surveillance, particularly analytics (www.securityinfowatch.com/12221990). This is now becoming feasible through components called Graphics Processing Units (GPUs), which enable pixel-level processing and analysis.
As luck would have it, since writing that piece, I found myself with a little time to plow through the stack of unread publications that had been accumulating in my office. I stumbled onto something quite interesting in the Dec. 2015 issue of IEEE Spectrum, authored by Christopher Posch, Ryad Benosman and Ralph Etienne-Cummings (read it at http://bit.ly/IEEE_Spectrum1115), dealing with an approach to video processing rooted in neuromorphic engineering, the practice of building electronic signal-processing systems inspired by biological ones. No, we’re not talking about viruses; interestingly, this line of research dates back to the late 1980s, when it was proposed by Carver Mead, widely credited as a pioneer of VLSI design.
Researchers have been applying advanced electronic technologies to make image sensing and processing behave more like the human visual system, which handles optical information with remarkable efficiency. You might think this is another space shot, and so did I, until I realized that the pieces these researchers are putting together are within our grasp. Indeed, a company in France called Pixium Vision is working on approaches to electronically restore sight to those who have lost it.
Video Processing and the Human Eye
Let’s look at where we stand today with video surveillance technology. Individual image sensors have reached 12 MP. Wide dynamic range has gotten better as a result of the CMOS technology used in today’s image sensors, which enables light measurement and processing at the pixel level. Roughly keeping pace with these higher-resolution sensors has been a succession of video compression algorithms that make transmission of reasonable-frame-rate video more efficient and feasible.
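To put those numbers in perspective, here is a quick back-of-the-envelope calculation (a Python sketch of my own, assuming 8-bit YUV 4:2:0 sampling at 12 bits per pixel) of what a 12 MP sensor produces at 30 fps before any compression:

```python
# Back-of-the-envelope: raw data rate of a 12 MP sensor at 30 fps.
# Assumes 8-bit YUV 4:2:0 sampling, i.e. 12 bits per pixel on average.
pixels = 12_000_000
bits_per_pixel = 12
fps = 30
raw_bps = pixels * bits_per_pixel * fps
print(f"Uncompressed video: {raw_bps / 1e9:.1f} Gb/s per camera")  # about 4.3 Gb/s
```

At roughly 4 Gb/s of raw data per camera, it is easy to see why ever-better compression has been essential.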
Post-processing of the video through continually better video analytics has made video more usable and created an environment where security system operators can focus on events presented to them, rather than relying on eyeballing monitors for hours on end. As I detailed in my last column, the massively parallel processing capability offered by products like NVIDIA’s GPUs allows artificial intelligence techniques, specifically “deep learning,” to take video analytics to the next level through a continual process of self-learning…all at a blazing 30 frames per second.
We can fool the eye at 24 fps and above, as our brain will interpolate what it needs to; however, our eyes don’t look at the world with a 30 fps sample rate. Video compression (other than MJPEG) relies heavily on identifying what is changing in a scene from frame to frame, making predictions about motion (motion vectors) and encoding the error between actual and predicted pixels. The technique uses blocks of pixels to perform its calculations; as block sizes increase, less bandwidth is used, but image quality suffers.
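For readers who want to see the mechanics, here is a minimal sketch of that block-matching idea (illustrative Python of my own, not the code of any particular codec; the block size, search range and function names are assumptions): find where each block of the current frame best lines up in the previous frame, then transmit only the motion vector and the prediction error.

```python
import numpy as np

def best_motion_vector(prev, cur, y, x, block=16, search=8):
    """Exhaustive block matching: find where a block of the current frame
    best lines up in the previous frame (lowest sum of absolute differences)."""
    target = cur[y:y + block, x:x + block].astype(np.int32)
    best_sad, best_mv = None, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            yy, xx = y + dy, x + dx
            if yy < 0 or xx < 0 or yy + block > prev.shape[0] or xx + block > prev.shape[1]:
                continue
            candidate = prev[yy:yy + block, xx:xx + block].astype(np.int32)
            sad = np.abs(target - candidate).sum()
            if best_sad is None or sad < best_sad:
                best_sad, best_mv = sad, (dy, dx)
    return best_mv

# Simulate simple global motion: the "current" frame is the previous frame
# shifted down two rows and right three columns.
prev = np.random.randint(0, 256, (64, 64), dtype=np.uint8)
cur = np.roll(prev, shift=(2, 3), axis=(0, 1))
dy, dx = best_motion_vector(prev, cur, 16, 16)
# An encoder transmits the motion vector plus the (ideally small) prediction
# error for the block, rather than the raw pixels themselves.
residual = cur[16:32, 16:32].astype(np.int32) - prev[16 + dy:32 + dy, 16 + dx:32 + dx].astype(np.int32)
print("motion vector:", (dy, dx), "| residual energy:", int(np.abs(residual).sum()))
```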
The neuromorphic researchers have gotten down to analyzing changes at the individual pixel level. By measuring the change in light received at each pixel relative to a set threshold, they determine which pixels are “in play.” Those pixels are sampled at a significantly higher rate than the others, perhaps as high as several hundred thousand times per second, while pixels where nothing is changing are de-emphasized.
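A minimal software simulation of that idea (again, illustrative Python of my own, not the researchers’ sensor design; the threshold value and the log-intensity comparison are assumptions) might look like this: each pixel reports an “event” only when its light level has moved past a threshold since the last time it reported.

```python
import numpy as np

def events_from_frame(ref_log, frame, threshold=0.15):
    """Minimal simulation of an event-based sensor: each pixel fires only when
    its log intensity has moved by more than a threshold since it last fired.
    Returns the updated per-pixel reference levels and a list of
    (row, col, polarity) events."""
    cur_log = np.log1p(frame.astype(np.float64))
    diff = cur_log - ref_log
    rows, cols = np.where(np.abs(diff) >= threshold)
    events = [(int(r), int(c), 1 if diff[r, c] > 0 else -1) for r, c in zip(rows, cols)]
    # Only the pixels that fired update their reference; quiet pixels keep
    # comparing against the last value at which they reported.
    ref_log[rows, cols] = cur_log[rows, cols]
    return ref_log, events

# A static background with one small bright spot appearing: nearly all of the
# output comes from the handful of pixels that actually changed.
frame0 = np.full((120, 160), 40, dtype=np.uint8)
ref = np.log1p(frame0.astype(np.float64))
frame1 = frame0.copy()
frame1[60:64, 80:84] = 200
ref, ev = events_from_frame(ref, frame1)
print(len(ev), "events out of", frame1.size, "pixels")  # 16 events vs. 19,200 pixels
```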
CMOS sensors give us the ability to do this, providing a continuous, rich data set to the image-processing engine, which can then far more effectively ignore what is not changing and better capture how the changing pixels vary over time. The GPU chip sets behind that engine, composed of many cores working in parallel, are optimized for graphics processing and capable of self-learning.
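One simple way to hand such an event stream to a parallel processing engine is to fold it into per-pixel maps of what changed, and when (a continuation of the sketch above, using a structure of my own rather than any published algorithm):

```python
import numpy as np

def accumulate_events(shape, events, timestamps):
    """Fold a sparse event stream into two dense maps: the most recent event
    time and polarity at each pixel. Untouched pixels stay at zero, so
    downstream processing can skip them entirely."""
    last_time = np.zeros(shape, dtype=np.float64)
    last_polarity = np.zeros(shape, dtype=np.int8)
    for (r, c, pol), t in zip(events, timestamps):
        last_time[r, c] = t
        last_polarity[r, c] = pol
    return last_time, last_polarity

# Reusing the events (ev) and frame from the previous sketch, with
# microsecond-scale timestamps attached to each event:
ts = [i * 1e-6 for i in range(len(ev))]
t_map, p_map = accumulate_events(frame1.shape, ev, ts)
print(np.count_nonzero(p_map), "active pixels carry all of the new information")
```

A GPU’s many parallel cores can then work on just those active pixels, rather than revisiting every pixel of every frame.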
This is the exciting part and the essence of what is to come. As the authors of the article state, “To fully unlock the potential of eyelike vision sensors, you need to abandon the whole notion of a video frame…That can be a little hard to get your head around, but as soon as you do that, you become liberated, and the subsequent processing you can do to the data can resolve things that you would otherwise easily miss.”
The Impact on Video Surveillance
In the world of video surveillance, this technology can make facial and object recognition much more effective, akin to the brain thinking, “I think I’ve seen that before.” Behavioral analytics and analysis get better because the data they mine is richer and more relevant. Correlation of certain actions may lead to the prediction of certain types of events, such as those which cause mass casualties. Correlation of images from multiple cameras would enable better people and object tracking, along the lines of Qognify’s Suspect Search, across a larger set of devices and with improved accuracy.
So, how close are we to real-life applications of this technology? Closer than you may think. In addition to NVIDIA’s work, Google, IBM, Qualcomm and others have active chip developments in this area. By 2018, Qualcomm plans to extend the neuromorphic capabilities of its “Zeroth” platform for cognitive computing and machine learning to other embedded applications such as wearables and drones, according to EE Times.
Recent research by Markets and Markets projects that the global neuromorphic chip market will grow at more than 26 percent annually to reach nearly $5 billion by 2022, with image recognition applications accounting for nearly 60 percent of the total.
Consumer applications will be the early drivers of the technology, followed by a steady progression into big data and commercial and industrial applications — hasn’t that been the case for most other impactful technologies used in video surveillance? Ten years from now, expect video surveillance products to look and perform a lot differently.
Ray Coulombe is Founder and Managing Director of SecuritySpecifiers and RepsForSecurity.com. Reach him at [email protected], through LinkedIn at www.linkedin.com/in/raycoulombe, or on Twitter, @RayCoulombe.