Implementation of Monocular Visual Odometry Pipeline

This project is from the course “Vision Algorithms for Mobile Robotics” at ETH Zurich in Autumn 2016.

The goal of the project was to implement a monocular visual odometry pipeline.

Jihwan Youn, Dongho Kang, and Jaeyoung Lim contributed to the project. The course was taught by Prof. Davide Scaramuzza.


Our visual odometry pipeline (VO pipeline) consists of initialization, continuous operation, and bundle adjustment, as shown in Figure 1. A loop closure module was also implemented but is excluded from the experimental results.

Figure 1: Overview of the visual odometry pipeline.

The initialization module outputs the initial pose and landmarks from two manually selected images, I0 and I1. The continuous operation module then processes each new frame, using the previously estimated pose and landmarks to estimate the current pose.

Additionally, the VO pipeline performs window-based bundle adjustment, which refines the poses and the landmarks by nonlinear minimization of the reprojection error.
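To make the quantity being minimized concrete, the following is a minimal sketch of a reprojection cost over a window of frames. It is written in plain Python for illustration and is not the project's MATLAB code. The names project and reprojection_cost, the observation layout, and the distortion-free pinhole model are all assumptions made here. A bundle adjustment step would pass such a cost (or its residuals) to a nonlinear least-squares solver, which is omitted.

```python
def project(K, R, t, X):
    """Project a 3-D world point X to pixel coordinates (pinhole, no distortion).

    K is a 3x3 intrinsic matrix, R a 3x3 world-to-camera rotation,
    t a world-to-camera translation; all are plain nested lists.
    """
    # Transform the point into the camera frame: Xc = R * X + t
    Xc = [sum(R[i][j] * X[j] for j in range(3)) + t[i] for i in range(3)]
    if Xc[2] <= 0:
        raise ValueError("point is behind the camera")
    # Perspective division followed by the intrinsics (zero skew assumed)
    u = K[0][0] * Xc[0] / Xc[2] + K[0][2]
    v = K[1][1] * Xc[1] / Xc[2] + K[1][2]
    return (u, v)


def reprojection_cost(K, poses, landmarks, observations):
    """Sum of squared pixel residuals over all observations in the window.

    poses:        list of (R, t) pairs, one per frame in the window
    landmarks:    list of 3-D points
    observations: list of (pose_index, landmark_index, (u, v)) tuples
                  (a hypothetical layout chosen for this sketch)
    """
    cost = 0.0
    for pose_idx, lm_idx, (u_obs, v_obs) in observations:
        R, t = poses[pose_idx]
        u, v = project(K, R, t, landmarks[lm_idx])
        cost += (u - u_obs) ** 2 + (v - v_obs) ** 2
    return cost
```

In a window-based scheme only the poses and landmarks of the most recent frames would enter poses and landmarks, which keeps the optimization problem small as the sequence grows.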

A loop closure feature was added by storing the landmark history. However, because the place recognition function is not robust enough, loop closure is not properly demonstrated in the demo videos.


The code was implemented in MATLAB and tested with MATLAB R2016b. Four datasets were used to assess the performance of our VO pipeline.

KITTI dataset

The KITTI dataset shows significant scale drift without bundle adjustment. With bundle adjustment, the VO pipeline is robust enough to run to the end of the dataset.
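Scale drift is possible because a single camera cannot observe absolute scale: scaling the camera translation and all landmarks by the same factor leaves every image measurement unchanged. The sketch below illustrates this with a rotation-free pinhole camera in plain Python. The function names and all numeric values are assumptions chosen for illustration only.

```python
def project(f, cx, cy, cam_center, X):
    """Pinhole projection of world point X for an unrotated camera at cam_center."""
    x, y, z = (X[i] - cam_center[i] for i in range(3))
    return (f * x / z + cx, f * y / z + cy)


def pixel_shift_under_scaling(s):
    """Largest change in pixel coordinates when the whole scene is scaled by s."""
    f, cx, cy = 500.0, 320.0, 240.0          # hypothetical intrinsics
    cam_center = [0.5, 0.0, 0.0]             # hypothetical camera position
    landmarks = [[1.0, -0.5, 4.0], [-2.0, 1.0, 10.0], [0.3, 0.2, 2.5]]

    worst = 0.0
    for X in landmarks:
        u0, v0 = project(f, cx, cy, cam_center, X)
        # Scale the camera position and the landmark by the same factor
        u1, v1 = project(f, cx, cy,
                         [s * c for c in cam_center],
                         [s * c for c in X])
        worst = max(worst, abs(u1 - u0), abs(v1 - v0))
    return worst
```

Calling pixel_shift_under_scaling with any positive factor returns a value that is zero up to floating-point rounding, so the images alone cannot distinguish between the scaled scenes. Small errors in the estimated scale can therefore accumulate from frame to frame unless something constrains them.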

MALAGA dataset

The MALAGA dataset shows scale drift even with bundle adjustment. This is thought to be caused by the large variation in landmark distance from the camera.

PARKING dataset

The PARKING dataset shows robust performance even though the camera faces at a right angle to the direction of travel. Because the sequence contains no loop, error still accumulates even with bundle adjustment.

SEOUL dataset

A custom dataset was constructed to verify that the visual odometry pipeline is robust enough to work in various environments. The custom dataset is a sequence of 360 images taken with an iPhone 6s. The camera had a fixed focal length while the images were taken, and the images were downsampled by a factor of 0.2 to reduce the computational cost of calibrating the camera and rectifying the images. The images were filmed in a narrow alley in Seoul, Republic of Korea, on December 27th, 2016.
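Downsampling changes the camera intrinsics, so the calibration must correspond to the image size that the pipeline actually uses. As a minimal sketch in plain Python, assuming a calibration made at full resolution and a simple convention in which pixel coordinates scale linearly with the image, the intrinsics scale by the same factor as the image. The function name and the example numbers are hypothetical and are not the calibration of the phone used here. The half-pixel offset that some pixel-center conventions introduce is ignored.

```python
def scale_intrinsics(fx, fy, cx, cy, s):
    """Return pinhole intrinsics for an image resized by factor s.

    Focal lengths and principal point are expressed in pixels, so they
    scale with the image. Any half-pixel shift caused by the resizing
    convention is ignored in this sketch.
    """
    if s <= 0:
        raise ValueError("scale factor must be positive")
    return (fx * s, fy * s, cx * s, cy * s)


# Hypothetical full-resolution calibration (values chosen for illustration)
fx_small, fy_small, cx_small, cy_small = scale_intrinsics(
    1500.0, 1500.0, 960.0, 540.0, 0.2)
```

Calibrating directly on the downsampled images, as described above, avoids this conversion entirely.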

The VO pipeline estimated the trajectory successfully in this narrow alley despite large variations in landmark distance and changes in illumination, which indicates that the pipeline is robust.


We successfully implemented a fully working and robust monocular visual odometry pipeline, along with the following additional features:

  • An appealing visualization that shows keypoint tracking information, camera heading, and trajectory in each frame. A full landmark and trajectory visualization was also implemented.
  • Many ideas for improving the performance of the VO pipeline.
  • Refinement of the estimated pose by minimizing the reprojection error.
  • Verification of the VO pipeline on a custom recorded dataset.
  • Full bundle adjustment (motion and structure).
  • A loop closure module using place recognition, implemented but not fully working.