VLASE: Vehicle Localization by Aggregating Semantic Edges

We propose VLASE, a framework to use semantic edge features from images to achieve on-road localization. Semantic edge features denote edge contours that separate pairs of distinct objects such as building-sky, road-sidewalk, and building-ground. While prior work has shown promising results by utilizing the boundary between prominent classes such as sky and building using skylines, we generalize this to consider 19 semantic classes. We extract semantic edge features using CASENet architecture and utilize VLAD framework to perform image retrieval. We achieve improvement over state of-the-art localization algorithms such as SIFT-VLAD and its deep variant NetVLAD. Ablation study shows the importance of different semantic classes, and our unified approach achieves better performance compared to individual prominent features such as skylines. We also introduce SLC Marathon dataset, a challenging dataset covering most of Salt Lake City with sufficient lighting variations.