Image-Based Localization

Image-Based Localization using Structure-from-Motion Models

Estimating the position and orientation of a camera from an image it has taken is an important step in many applications, such as tourist navigation, robotics, augmented reality and incremental Structure-from-Motion reconstruction. To do so, we have to find correspondences between structures seen in the image and a 3D representation of the scene. Due to recent advances in the field of Structure-from-Motion, it is now possible to reconstruct large scenes, up to the level of an entire city, in very little time. We can use these results to enable image-based localization of a camera (and its user) on a large scale. However, when processing such large models, the computation of correspondences between points in the image and points in the model quickly becomes the bottleneck of the localization pipeline. It is therefore extremely important to develop methods that handle such large environments effectively and efficiently and that scale well to even larger scenes.

Publications

Fast Image-Based Localization using Direct 2D-to-3D Matching, ICCV'11

Towards Fast Image-Based Localization on a City-Scale, Real-World Scene Analysis '11

Image Retrieval for Image-Based Localization Revisited, BMVC'12

Improving Image-Based Localization by Active Correspondence Search, ECCV'12

Scalable 6-DOF Localization on Mobile Devices, ECCV'14

Source Code

ICCV'11 & ECCV'12

We release the source code for the localization methods presented in the ICCV'11 and ECCV'12 papers. You can find the current version of the source code, including BOTH approaches, here.

Notice that we only support Linux-based systems and do not plan to port the method to Mac OS X or Windows.

The source code is released under the GNU GPL version 3. For commercial use, please contact Torsten Sattler. Please see the README.txt file accompanying the source code for more details.

The generic visual vocabulary used in the paper can be found here. The visual words are stored in a text file, with the first 128 floating point values describing the first word and so on.
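The layout described above can be read with a few lines of code. The following is only a minimal sketch, not part of the released source code: it assumes the values are separated by whitespace and that the file has no header line, neither of which is stated here. The function names `parse_visual_words` and `load_visual_words` are hypothetical.

```python
from typing import Iterable, List

WORD_DIM = 128  # values per visual word, as described above


def parse_visual_words(tokens: Iterable[str], dim: int = WORD_DIM) -> List[List[float]]:
    """Group a flat sequence of number strings into visual words of `dim` values each."""
    values = [float(t) for t in tokens]
    if len(values) % dim != 0:
        raise ValueError(
            f"expected a multiple of {dim} values, got {len(values)}"
        )
    # Values 0..dim-1 form the first word, dim..2*dim-1 the second, and so on.
    return [values[i:i + dim] for i in range(0, len(values), dim)]


def load_visual_words(path: str, dim: int = WORD_DIM) -> List[List[float]]:
    """Read a vocabulary text file.

    Assumption: values are whitespace-separated and there is no header.
    """
    with open(path, "r") as f:
        return parse_visual_words(f.read().split(), dim)
```

Calling `load_visual_words` with the path of the downloaded vocabulary file would then return one list of 128 floats per visual word. The check on the total number of values is a cheap way to notice a truncated download or a wrong assumption about the file layout.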

History


2011-10-08:	Initial release.
2012-02-29:	Updated README.txt file to warn about API-breaking changes in newer versions of the FLANN library.
2012-10-04:	Source code now includes the active correspondence search framework proposed in the ECCV paper. Furthermore, we fixed an error in both methods that previously prevented the approaches from working correctly for datasets with more than 33 million descriptors. Additionally, some minor performance improvements were made.
2012-10-04:	Modified source code to be able to handle the binary .info file contained in the Aachen dataset (see below).
2013-03-15:	Bugfix to handle the case where a connected component of 3D points contains fewer points than the number of nearest neighbors we are looking for during active search.

ECCV'14

You can download an iOS demo project for our local pose tracker from the paper Scalable 6-DOF Localization on Mobile Devices, ECCV'14 here.

Notice that we only support iOS and do not plan to port the method to Android or any other mobile operating system.

The source code is released under the GNU GPL version 3. For commercial use, please contact Sven Middelberg. Please see the README.txt file accompanying the source code for more details.

History


2014-10-10:	Initial release.

Datasets

You can obtain the Aachen dataset from the paper Image Retrieval for Image-Based Localization Revisited, BMVC'12 by contacting Torsten Sattler. The dataset is made available only for research purposes and is not intended for commercial use.

The dataset consists of 4479 images taken with multiple cameras (3GB), 369 query images taken with the camera of a mobile phone together with their SIFT descriptors (490MB), and the actual reconstruction computed by Bundler (300MB). Distances in the reconstruction are given in meters and there exists a transformation mapping coordinates in the reconstruction to GPS coordinates. Notice that the images contained in this dataset are not the original images used for the reconstruction but have been downsampled (at most 1600x1600) to reduce the file size. If you are interested in obtaining the original images (13GB), please contact Torsten Sattler.

Furthermore, you can obtain the binary .info file (1.1GB), which contains the SIFT descriptors of the 3D points and is needed by the localization methods from Fast Image-Based Localization using Direct 2D-to-3D Matching, ICCV'11 and Improving Image-Based Localization by Active Correspondence Search, ECCV'12. For information about the .info file, please contact Torsten Sattler or look at the function "load_from_binary" provided with the source code of the localization method (under src/sfm/parse_bundler.cc, lines 306-415, with the format parameter set to 1).

Some of the datasets used for experimental evaluation have kindly been provided by Noah Snavely. If you are interested in obtaining them, please visit his home page.

Feedback

We are grateful for feedback! Please send suggestions, patches, bug reports, questions and comments to us.