I was filling out some forms online recently when I was asked to prove that I am not a robot. The test was to pick out the road signs from a bunch of random images, which is presumably easy for a human but difficult for a computer. Turns out this may no longer be true…
The advance is that instead of running a classifier thousands of times on a single image to do detection, the researchers trained a single network to do all of the detection at once. The sample classifier image below shows the computer drawing thousands of candidate boxes around possible detections and then running through its library to decide whether each of those boxes really is a detection.
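To get a feel for why "thousands of times" is not an exaggeration, here is a rough sketch that counts how many classifier runs a plain sliding-window detector would need on one image. The image size, window sizes, and stride are illustrative assumptions of mine, not numbers from the talk or from YOLO itself, and the function names are made up for this example.

```python
def sliding_window_count(img_w, img_h, win, stride):
    """Count the square windows of side `win` that fit in the image
    when the window is moved `stride` pixels at a time."""
    if win > img_w or win > img_h:
        return 0
    cols = (img_w - win) // stride + 1
    rows = (img_h - win) // stride + 1
    return cols * rows


def total_classifier_runs(img_w, img_h, window_sizes, stride):
    """One classifier run per window, summed over every window size."""
    return sum(sliding_window_count(img_w, img_h, w, stride)
               for w in window_sizes)


# Assumed setup: a 640x480 frame, three window sizes, 16-pixel stride.
runs = total_classifier_runs(640, 480, window_sizes=[64, 128, 256], stride=16)
print(f"sliding-window classifier runs per image: {runs}")
print("single-network passes per image: 1")
```

Even with only three window sizes, the count is already in the thousands for a single frame, which is the work a single-pass network avoids.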
This system is called “You Only Look Once”, or YOLO, and is available in Darknet, an open source neural network framework written in C. Given the speed and power of this technology, I wonder whether it will make other security camera software providers obsolete.
One drawback I did notice in the TED talk was that when he panned his camera across the audience, only the front rows were picked out, which suggests that objects need to be a certain size before they can be classified with confidence. The people sitting at the back all went unclassified. Nevertheless, it is still amazing how smart this system is. The applications could be widespread, ranging from security systems to military technology. I have included the link to the open source YOLO if anyone is interested. You can read more about how the technology works there too.
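My guess at what is happening with the back rows is that their detections score below whatever confidence cutoff the demo uses, so they never get drawn. Here is a minimal sketch of that kind of filtering. The `Detection` class, the `keep_confident` helper, the 0.5 threshold, and the two sample detections are all hypothetical values I made up for illustration, not output from YOLO or part of Darknet.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    label: str
    confidence: float  # score between 0 and 1
    box: tuple         # (x, y, width, height) in pixels


def keep_confident(detections, threshold=0.5):
    """Keep only detections whose score reaches the threshold."""
    return [d for d in detections if d.confidence >= threshold]


# Made-up example values: a large front-row box and a tiny back-row box.
detections = [
    Detection("person", 0.91, (40, 200, 180, 260)),
    Detection("person", 0.22, (300, 90, 18, 26)),
]

kept = keep_confident(detections)
for d in kept:
    print(d.label, d.confidence, d.box)
```

With these assumed scores, only the large front-row box survives the cutoff, which matches what I saw in the talk.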