Researchers at MIT and the University of Minnesota have developed a new system that promises to balance speed with accuracy when analyzing video surveillance footage.
According to Christopher Amato, a postdoctoral associate at MIT, the system, dubbed "Biologically Inspired Scene Estimation," or BISE, combines several image-processing techniques to perform scene estimation, intrusion detection or object recognition more efficiently.
"Essentially, it’s trying to balance off this time and quality trade off of figuring out what (the user is) seeing in the images using different cameras by using these different methods and combining them in different ways," Amato explained. "For instance, you might have some sort of queries or objectives that you’re trying to reach in terms of you’re trying to figure out what’s in the scene. You want to know that there are five pieces of luggage and two humans or you want to know who the specific person is that enters into the scene or you want to be alerted if somebody on your watch list enters. In order to do these things, you have to do different types of processing and these processes could take different amounts of time."
Surveillance operators can already use tools such as facial recognition software for these tasks, Amato said, but that processing is time-consuming and unnecessary in a scene that contains no people. The BISE system uses a hierarchical approach that can mix and match these different processes by running simpler analyses first and adapting based on those results.
"The idea of our system is to try and do this in more real-time, so that we can balance off this time and quality trade off so you get results very quickly. But if you want results very quickly, then you have to deal with a certain amount of uncertainty with what you’re going to get as an output," he said. "The system that we developed explicitly considers the uncertainty that it has in its current estimation of what’s going on in the scene, as well as the amount of time it has taken already to try and figure out what to do next to try and best balance off this quality and time trade off by combining these different methods together. In that sense, it can be as high-quality or as quick as you want it to be and then it can combine any sort of methods that people develop."
The analysis methods incorporated into BISE so far do not include any commercial video analytics platforms; however, Amato said the methods the researchers have used are the basis for many commercial offerings.
"But, it is very easy in our approach to take some commercial methods as well and we could place them into our system and we just add them to our learning phase," he added. "Commercial systems tend to mostly be interested in trying to get the best solution they can and they don’t tend to balance off this uncertainty and time trade off in the same way that we’re trying to do."
While the BISE system was developed to handle any kind of sensory data, Amato said the most likely use of the technology will be surveillance at critical infrastructure sites such as airports and border crossings.
Although the BISE system has so far been only a research application, Amato said that he and his colleagues are currently in the process of pursuing patents for the technology.
"At his point we’re looking for additional funding and opportunities to make it into a more commercial-ready system and then go from there," Amato said.