Artificial orientation built with smart imaging

Where am I? The question is relevant not only to humans but also to self-driving cars, which must be certain of their own position. Researchers at Chalmers University of Technology are developing smart algorithms for so-called visual localisation, based on machine learning from large collections of photographs.
“Visual localisation means that a robot or car should be able to determine its current position using camera images that are compared to a map of the surroundings”, says Fredrik Kahl, professor of computer vision and image analysis at the department of Electrical Engineering at Chalmers. “It is about identifying distinct features and comparing them to already known characteristics of the surroundings, which are positioned on a three-dimensional map.”
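
To make the idea concrete, here is a minimal sketch of such a matching-and-pose pipeline built from off-the-shelf OpenCV components. The map format and all names (map_points_3d, map_descriptors, K) are illustrative assumptions, not the Chalmers group's actual implementation.

```python
import cv2
import numpy as np

def localise(query_image, map_points_3d, map_descriptors, K):
    """Estimate the camera pose of query_image against a 3D feature map.

    map_points_3d:   (M, 3) array of known 3D points in the map.
    map_descriptors: (M, 32) ORB descriptors of those points.
    K:               3x3 camera intrinsics matrix.
    """
    # 1. Detect distinct features in the query image and describe them.
    orb = cv2.ORB_create()
    keypoints, descriptors = orb.detectAndCompute(query_image, None)

    # 2. Match query descriptors against the known map descriptors.
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(descriptors, map_descriptors)

    # 3. Build 2D-3D correspondences from the matches.
    pts_2d = np.float32([keypoints[m.queryIdx].pt for m in matches])
    pts_3d = np.float32([map_points_3d[m.trainIdx] for m in matches])

    # 4. Solve for camera position and orientation with RANSAC-robust PnP.
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(pts_3d, pts_2d, K, None)
    return (rvec, tvec) if ok else None
```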

Of the methods currently available for determining position, cameras are the most promising in this context.

“Camera technology is comparatively cheap and provides access to a lot of information”, Fredrik Kahl continues. “There are several possible applications for the technology, of which self-driving cars and unmanned vehicles are probably the most prominent. Research is also underway in areas such as smart camera technology for mobile phone navigation apps, for industrial production processes and for flexible systems for inspecting various environments.”

One example, where the technology is now being demonstrated in a supermarket setting for later transfer to other applications, is the research project ‘Semantic Mapping and Visual Navigation for Smart Robots’, funded by the Swedish Foundation for Strategic Research. The project is headed by Fredrik Kahl and involves researchers from, among others, Chalmers and Lund University.

“Semantic mapping means training the system to recognise named physical objects in pictures and link them to a geographical position”, Fredrik Kahl says. “In the supermarket setting, the system first learns what selected products look like and what they are called; then it should be able to register where these products are located on the shelves in the store. To accomplish the task, various subsystems in machine learning, computer vision and robotics need to work together.”
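
As a hedged illustration of that loop, the sketch below links detections from any trained object detector (here injected as a callable, since the project's actual detector is not described) to the camera's current position, building up a map of where each product sits:

```python
from collections import defaultdict

def build_semantic_map(frames, detect):
    """Build a product-name -> positions map from a camera stream.

    frames: iterable of (image, camera_position) pairs.
    detect: callable, image -> list of (product_name, bbox, score);
            a stand-in for whatever trained detector is available.
    """
    semantic_map = defaultdict(list)
    for image, camera_position in frames:
        for product_name, bbox, score in detect(image):
            if score > 0.8:  # keep confident detections only (threshold is arbitrary)
                semantic_map[product_name].append(camera_position)
    return semantic_map
```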

The technology will be tested in a supermarket in Stockholm, where a drone will fly along the shelves to identify which products are in stock and how many of each kind. One challenge is that products on the shelves can occlude each other.

There are significantly more difficulties to overcome when moving the technology outdoors and incorporating it into a self-driving car. Factors such as weather, daylight and time of year then also need to be considered.

“A picture taken on a beautiful summer day differs quite a lot from a picture taken at the same place on a wintry evening in January”, says Fredrik Kahl. “Without leaves on bushes and trees, the view can change completely, and other objects appear in the picture instead. Fog, snow and rain, in turn, blur the recognisable landmarks.”

Therefore, in order to build the visual localisation system, you need access to many photos taken from the same geographical location under different external conditions.

The researchers put labels, annotations, on the images for the various types of objects they want the system to recognise, such as ‘road’, ‘pavement’ and ‘building’. Often, subdivisions are needed for the annotations to be useful; ‘vegetation’, for example, is too broad a label. Annotations are necessary, but the work is time-consuming, so it is important to find a balanced number of classes.
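
The trade-off can be pictured as a class list for per-pixel annotation. The classes below are hypothetical apart from those named in the article; splitting a broad class like ‘vegetation’ improves usefulness, but every extra class multiplies the labelling work:

```python
# Hypothetical segmentation taxonomy; only 'road', 'pavement' and
# 'building' come from the article itself.
SEGMENTATION_CLASSES = {
    0: "road",
    1: "pavement",
    2: "building",
    3: "tree",   # subdivision of the too-broad 'vegetation'
    4: "bush",   # subdivision of the too-broad 'vegetation'
    5: "grass",  # subdivision of the too-broad 'vegetation'
}
```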

Machine learning in artificial neural networks is used to train the system, gradually improving the ability of the self-driving car or robot to recognise its surroundings and orient itself.
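
A minimal sketch of what such a training step can look like, here framed as per-pixel semantic segmentation in PyTorch; the model, data loader and hyperparameters are assumptions for illustration, not the group's actual setup:

```python
import torch
import torch.nn as nn

def train(model, loader, num_epochs=10, lr=1e-3):
    """Train a segmentation network.

    model:  maps images (N, 3, H, W) to per-pixel class scores (N, C, H, W).
    loader: yields (images, labels) with labels (N, H, W) holding class ids.
    """
    optimiser = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()  # per-pixel classification loss
    for epoch in range(num_epochs):
        for images, labels in loader:
            optimiser.zero_grad()
            loss = loss_fn(model(images), labels)
            loss.backward()   # improve the network bit by bit
            optimiser.step()
```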

“As our algorithms become more accurate and the three-dimensional map is built up, fewer images will be needed for the system to locate itself”, Fredrik Kahl says. “A lot of tricky problems remain to be solved, but that is what makes this field so exciting and fun to work in.”

Website for testing localisation accuracy
The Chalmers researchers have launched a website that so far contains more than 100,000 collected images. There, like-minded research teams can compare and test the accuracy of their algorithms by downloading the images, running their computations and then uploading the results to have them evaluated and ranked on a leaderboard.
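
Leaderboards of this kind typically score a submission by comparing each estimated camera pose against ground truth. The sketch below shows the two standard error measures (a hedged illustration, not the site's actual evaluation code):

```python
import numpy as np

def pose_errors(R_est, t_est, R_gt, t_gt):
    """Position error (metres) and rotation error (degrees) of one pose.

    R_est, R_gt: 3x3 rotation matrices; t_est, t_gt: camera positions.
    """
    position_error = np.linalg.norm(t_est - t_gt)
    # Angle of the relative rotation R_est^T @ R_gt, via the trace formula.
    cos_angle = (np.trace(R_est.T @ R_gt) - 1.0) / 2.0
    rotation_error = np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0)))
    return position_error, rotation_error
```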

Text: Yvonne Jonsson
Photo of Fredrik Kahl: Malin Ulfvarson

More about the research and researchers
The research team behind the research and the film "Localization using semantics": Måns Larsson, Lars Hammarstrand, Erik Stenborg, Carl Toft, Torsten Sattler and Fredrik Kahl
The research team behind the project Semantic Mapping and Visual Navigation for Smart Robots, and the film from the supermarket: Patrik Persson, Marcus Greiff, Sebastian Hanér, Olof Enqvist and Fredrik Kahl

For more information contact
Fredrik Kahl, professor of computer vision and image analysis at the department of Electrical Engineering at Chalmers University of Technology


Published: Tue 24 Sep 2019. Modified: Wed 25 Sep 2019