Teaching Machines to See

This is an example of SegNet in action: the separate components of the road scene are all labelled in real time. Credit: Alex Kendall

New smartphone-based system could accelerate development of driverless cars.

Two newly developed systems for driverless cars can identify a user’s location and orientation in places where GPS does not function, and can identify the various components of a road scene in real time using a regular camera or smartphone, performing the same job as sensors costing tens of thousands of pounds.

Although the systems cannot currently control a driverless car, the ability to make a machine ‘see’ and accurately identify where it is and what it’s looking at is a vital part of developing autonomous vehicles and robotics. The first system, SegNet, can take an image of a street scene it hasn’t seen before and classify it, sorting objects into 12 different categories – such as roads, street signs, pedestrians, buildings and cyclists – in real time. It can deal with light, shadow and night-time environments, and currently labels more than 90% of pixels correctly. Previous systems using expensive laser- or radar-based sensors have not been able to reach this level of accuracy while operating in real time.
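To make the idea of pixel-wise labelling concrete, here is a minimal sketch in Python of running an off-the-shelf encoder-decoder segmentation network over a single street photo. The network (a torchvision model), the file name and the pre-processing are stand-ins for illustration, not SegNet’s released code:

    # Sketch of per-pixel street-scene labelling with a stand-in segmentation
    # network (not SegNet itself); 'street_scene.jpg' is a placeholder file name.
    import torch
    from torchvision import transforms
    from torchvision.models.segmentation import deeplabv3_resnet50
    from PIL import Image

    model = deeplabv3_resnet50(weights='DEFAULT')   # stand-in encoder-decoder network
    model.eval()

    preprocess = transforms.Compose([
        transforms.Resize((360, 480)),              # roughly road-scene camera resolution
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406],
                             std=[0.229, 0.224, 0.225]),
    ])

    image = Image.open('street_scene.jpg').convert('RGB')
    batch = preprocess(image).unsqueeze(0)          # shape: (1, 3, H, W)

    with torch.no_grad():
        scores = model(batch)['out']                # (1, num_classes, H, W) per-pixel class scores
    labels = scores.argmax(dim=1)                   # (1, H, W): one class index for every pixel

Every pixel ends up with a class index, which is the same kind of dense labelling SegNet produces for its 12 road-scene categories.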

For the driverless cars currently in development, radar- and laser-based sensors are expensive – in fact, they often cost more than the car itself. In contrast with these sensors, which recognise objects through a mixture of radar and LIDAR (a remote sensing technology), SegNet learns by example – it was ‘trained’ by an industrious group of Cambridge undergraduate students, who manually labelled every pixel in each of 5000 images, with each image taking about 30 minutes to complete. Once the labelling was finished, the researchers took two days to ‘train’ the system before it was put into action.
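This ‘training by example’ amounts to supervised learning: the network’s per-pixel predictions are compared against the hand-labelled images and its weights are adjusted to reduce the disagreement. A toy sketch of one such training step is shown below; the tiny network and random tensors are stand-ins for the real architecture and the 5000 labelled images:

    # Toy sketch of one supervised training step for a pixel-wise classifier
    # (a simplified stand-in, not the authors' SegNet implementation).
    import torch
    import torch.nn as nn

    num_classes = 12
    # Stand-in network: a real system uses a much deeper encoder-decoder.
    model = nn.Sequential(
        nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
        nn.Conv2d(32, num_classes, kernel_size=1),  # per-pixel class scores
    )
    criterion = nn.CrossEntropyLoss()               # per-pixel classification loss
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

    # Stand-ins for one small batch of hand-labelled training data:
    images = torch.rand(4, 3, 360, 480)             # four RGB road-scene images
    label_maps = torch.randint(0, num_classes, (4, 360, 480))  # a class index per pixel

    optimizer.zero_grad()
    scores = model(images)                          # (4, 12, 360, 480)
    loss = criterion(scores, label_maps)            # averaged over every labelled pixel
    loss.backward()                                 # propagate the labelling error back
    optimizer.step()                                # nudge the weights to label better

Repeated over the full set of labelled images, a loop like this is essentially the two-day ‘training’ stage the researchers describe.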

SegNet was primarily trained in highway and urban environments, so it still has some learning to do for rural, snowy or desert environments – although it has performed well in initial tests for these environments. The system is not yet at the point where it can be used to control a car or truck, but it could be used as a warning system, similar to the anti-collision technologies currently available on some passenger cars.

There are three key technological questions that must be answered to design autonomous vehicles: where am I, what’s around me, and what do I do next? SegNet addresses the second question, while a separate but complementary system answers the first by using images to determine both precise location and orientation.

The localisation system designed by Kendall and Cipolla runs on a similar architecture to SegNet, and is able to localise a user and determine their orientation from a single colour image in a busy urban scene. The system is far more accurate than GPS and works in places where GPS does not, such as indoors, in tunnels, or in cities where a reliable GPS signal is not available. The localisation system uses the geometry of a scene to learn its precise location, and is able to determine, for example, whether it is looking at the east or west side of a building, even if the two sides appear identical.
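The localisation system is described as estimating both position and orientation directly from one colour image. A minimal sketch of that idea, using a stand-in backbone rather than the authors’ actual network, might look like this:

    # Sketch of regressing camera position and orientation from a single image
    # (a stand-in architecture for illustration, not the authors' localisation network).
    import torch
    import torch.nn as nn
    from torchvision import models

    class PoseRegressor(nn.Module):
        def __init__(self):
            super().__init__()
            backbone = models.resnet18(weights=None)   # untrained stand-in feature extractor
            backbone.fc = nn.Identity()                # keep the 512-d image features
            self.backbone = backbone
            self.position = nn.Linear(512, 3)          # x, y, z location
            self.orientation = nn.Linear(512, 4)       # rotation as a quaternion

        def forward(self, image):
            features = self.backbone(image)
            xyz = self.position(features)
            quat = self.orientation(features)
            return xyz, quat / quat.norm(dim=1, keepdim=True)  # unit quaternion = valid rotation

    model = PoseRegressor().eval()
    with torch.no_grad():
        xyz, quat = model(torch.rand(1, 3, 224, 224))  # one colour image -> location + orientation

Trained on images of a particular area, a network of this shape learns to map what a scene looks like to where the camera must have been, which is broadly how such a system can tell the east side of a building from the west even when the two look alike.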

“Work in the field of artificial intelligence and robotics has really taken off in the past few years,” said Kendall. “But what’s cool about our group is that we’ve developed technology that uses deep learning to determine where you are and what’s around you — this is the first time this has been done using deep learning.”

“In the short term, we’re more likely to see this sort of system on a domestic robot — such as a robotic vacuum cleaner, for instance,” said Cipolla. “It will take time before drivers can fully trust an autonomous car.”

Source: http://www.cam.ac.uk/research/news/teaching-machines-to-see-new-smartphone-based-system-could-accelerate-development-of-driverless-cars