Of the methods currently available for determining position, the camera is the most promising in this context.
“Camera technology is comparatively cheap and provides access to a lot of information,” Fredrik Kahl continues. “There are several possible applications for the technology, of which self-driving cars and unmanned vehicles are probably the most prominent. Research is also underway in areas such as smart camera technology for mobile phone navigation apps, for industrial production processes and for flexible systems for inspecting various environments.”
One example, where the technology is currently being demonstrated in a supermarket setting before being transferred to other applications, is the research project ‘Semantic Mapping and Visual Navigation for Smart Robots’, funded by the Swedish Foundation for Strategic Research. The project is headed by Fredrik Kahl and involves researchers from Chalmers and Lund University, among others.
“Semantic mapping means training the system to recognise named physical objects in pictures and link them to a geographical position,” Fredrik Kahl says. “In the supermarket setting, the system first learns what selected products look like and what they are called; it should then be able to register where these products are located on the shelves in the store. To accomplish this, various subsystems in machine learning, computer vision and robotics need to work together.”
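The core of a semantic map, linking a recognised object to the place where it was observed, can be illustrated with a small data structure. The sketch below is a hypothetical Python illustration; the `register_observation` helper, the product names and the shelf coordinates are all invented for the example and are not part of the actual project:

```python
# Hypothetical semantic map for the supermarket demo: each recognised product
# name is linked to the shelf positions where it was observed.
semantic_map = {}

def register_observation(product, aisle, shelf):
    """Record where a recognised product was seen in the store."""
    semantic_map.setdefault(product, []).append((aisle, shelf))

# Invented observations, as if reported by the recognition subsystem.
register_observation("oat milk", aisle=3, shelf=2)
register_observation("crispbread", aisle=5, shelf=1)

print(sorted(semantic_map))  # ['crispbread', 'oat milk']
```

In a real system the positions would come from the robot's localisation, but the principle is the same: recognition supplies the name, localisation supplies the place, and the map ties them together.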
This technology will be tested in a supermarket in Stockholm, where a drone will fly along the shelves to identify which products are in stock and how many of each. One challenge is that products on the shelves can occlude one another.
There are significantly more difficulties to overcome when the technology is moved outdoors and incorporated into a self-driving car. There, factors such as weather, daylight and time of year also need to be considered.
“A picture taken on a beautiful summer day differs quite a lot from a picture taken at the same place on a wintry evening in January,” says Fredrik Kahl. “Without leaves on bushes and trees, the view can change completely, and other objects appear in the picture instead. Fog, snow and rain, in turn, blur the recognisable landmarks.”
Therefore, to build the visual localisation system, you need access to many photos of the same geographical location taken under different conditions.
The researchers put labels, or annotations, on the images for the types of objects they want the system to recognise, such as ‘road’, ‘pavement’ and ‘building’. Often, subdivisions are needed for the annotations to be useful; ‘vegetation’, for example, is too broad a label. Annotations are necessary, but the work is time-consuming, and it is therefore important to find a balanced number of classes.
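The idea of a balanced label set can be sketched as a simple class table. This is a hypothetical Python illustration, not the researchers' actual annotation scheme; the class names and numeric ids are invented for the example:

```python
# Hypothetical annotation classes: the too-broad 'vegetation' is split into
# finer labels, while the total stays small enough that manual annotation
# remains feasible.
LABELS = {
    0: "road",
    1: "pavement",
    2: "building",
    3: "tree",   # subdivision of 'vegetation'
    4: "bush",   # subdivision of 'vegetation'
}

def label_name(class_id):
    """Return the human-readable label for a per-pixel class id."""
    return LABELS.get(class_id, "unlabelled")

print(label_name(1))   # pavement
print(label_name(99))  # unlabelled
```

Each annotated image is then stored as a grid of such class ids, one per pixel, which is what the learning system trains against.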
Machine learning in artificial neural networks is used to train the system, bit by bit improving the ability of the self-driving car or robot to recognise its surroundings and orient itself.
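The "bit by bit" improvement works by repeatedly comparing the system's predictions with labelled examples and nudging its parameters to reduce the error. As a toy illustration (not the project's actual networks), here is that loop in miniature, fitting a single weight `w` so that `w * x` matches the target `2 * x`:

```python
# Toy supervised learning: labelled (input, target) pairs, a parameter that
# starts uninformed, and many small gradient corrections.
data = [(0.5, 1.0), (1.0, 2.0), (1.5, 3.0), (2.0, 4.0)]  # targets are 2*x

w = 0.0    # model parameter
lr = 0.1   # learning rate: size of each correction step

for _ in range(100):           # repeated passes over the labelled data
    for x, y in data:
        pred = w * x
        grad = 2 * (pred - y) * x  # gradient of the squared error
        w -= lr * grad             # small step that reduces the error

print(round(w, 3))  # converges close to 2.0
```

A neural network does the same thing with millions of parameters instead of one, but the principle of gradual correction from labelled examples is identical.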
“As our algorithms become more accurate and the three-dimensional map is built up, fewer images will be needed for the system to locate itself,” Fredrik Kahl says. “A lot of tricky problems still remain to be solved, but that is what makes this field so exciting and fun to work in.”
Website for testing localisation accuracy
The Chalmers researchers have launched a website that so far contains more than 100 000 collected images. On the website, like-minded research teams can compare and test the accuracy of their algorithms by downloading images, performing their calculations and then uploading the results to be evaluated and ranked on a leaderboard.
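A common way to score such submissions is to measure how far each estimated camera position lies from the ground truth and rank teams by their average error. The sketch below shows this kind of evaluation in Python; the metric, team names and coordinates are invented for illustration and are not the website's actual scoring scheme:

```python
import math

def mean_position_error(estimates, ground_truth):
    """Mean Euclidean distance between estimated and true 3D positions."""
    errors = [math.dist(e, g) for e, g in zip(estimates, ground_truth)]
    return sum(errors) / len(errors)

# Invented ground-truth positions and two hypothetical submissions.
truth = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
submissions = {
    "team_a": [(0.1, 0.0, 0.0), (10.2, 0.0, 0.0)],
    "team_b": [(1.0, 1.0, 0.0), (9.0, 1.0, 0.0)],
}

# Rank teams from lowest (best) to highest mean error.
ranking = sorted(submissions, key=lambda t: mean_position_error(submissions[t], truth))
print(ranking)  # ['team_a', 'team_b']
```

Real benchmarks often also score orientation error and report the fraction of images localised within several distance thresholds, but the ranking principle is the same.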
Text: Yvonne Jonsson
Photo of Fredrik Kahl: Malin Ulfvarson
More about the research and researchers
The research team behind the research and the film "Localization using semantics": Måns Larsson, Lars Hammarstrand, Erik Stenborg, Carl Toft, Torsten Sattler and Fredrik Kahl
For more information contact
Fredrik Kahl, Professor of Computer Vision and Image Analysis at the Department of Electrical Engineering, Chalmers University of Technology