Geo-Distinctive Visual Element Matching for Location Estimation of Images

Xinchao Li; Martha Larson; Alan Hanjalic

doi:10.1109/TMM.2017.2763323

Corresponding author: Xinchao Li

Research output: Contribution to journal › Article › Scientific › peer-review

Abstract

We propose an image representation and matching approach that substantially improves visual-based location estimation for images. The main novelty of the approach, called distinctive visual element matching (DVEM), is its use of representations that are specific to the query image whose location is being predicted. These representations are based on visual element clouds, which robustly capture the connection between the query and visual evidence from candidate locations. We then maximize the influence of visual elements that are geo-distinctive because they do not occur in images taken at many other locations. We carry out experiments and analysis for both geo-constrained and geo-unconstrained location estimation cases using two large-scale, publicly available datasets: the San Francisco Landmark dataset with 1.06 million street-view images and the MediaEval'15 Placing Task dataset with 5.6 million geo-tagged images from Flickr. We present examples that illustrate the highly transparent mechanics of the approach, which are based on commonsense observations about the visual patterns in image collections. Our results show that the proposed method delivers a considerable performance improvement compared to the state-of-the-art.

Original language	English
Pages (from-to)	1179-1194
Number of pages	16
Journal	IEEE Transactions on Multimedia
Volume	20
Issue number	5
DOIs	https://doi.org/10.1109/TMM.2017.2763323
Publication status	Published - 2018

Bibliographical note

Accepted author manuscript

Keywords

Geo-location Estimation
information retrieval
large scale image retrieval

Access to Document

10.1109/TMM.2017.2763323

08068212 (Accepted author manuscript, 1.92 MB)

Cite this

@article{e0f1ae2733a141d4937b1139c597d659,
  title = "Geo-Distinctive Visual Element Matching for Location Estimation of Images",
  abstract = "We propose an image representation and matching approach that substantially improves visual-based location estimation for images. The main novelty of the approach, called distinctive visual element matching (DVEM), is its use of representations that are specific to the query image whose location is being predicted. These representations are based on visual element clouds, which robustly capture the connection between the query and visual evidence from candidate locations. We then maximize the influence of visual elements that are geo-distinctive because they do not occur in images taken at many other locations. We carry out experiments and analysis for both geo-constrained and geo-unconstrained location estimation cases using two large-scale, publicly available datasets: the San Francisco Landmark dataset with 1.06 million street-view images and the MediaEval'15 Placing Task dataset with 5.6 million geo-tagged images from Flickr. We present examples that illustrate the highly transparent mechanics of the approach, which are based on commonsense observations about the visual patterns in image collections. Our results show that the proposed method delivers a considerable performance improvement compared to the state-of-the-art.",
  keywords = "Geo-location Estimation, information retrieval, large scale image retrieval",
  author = "Xinchao Li and Martha Larson and Alan Hanjalic",
  note = "Accepted author manuscript",
  year = "2018",
  doi = "10.1109/TMM.2017.2763323",
  language = "English",
  volume = "20",
  pages = "1179--1194",
  journal = "IEEE Transactions on Multimedia",
  issn = "1520-9210",
  publisher = "IEEE",
  number = "5",
}

TY  - JOUR
T1  - Geo-Distinctive Visual Element Matching for Location Estimation of Images
AU  - Li, Xinchao
AU  - Larson, Martha
AU  - Hanjalic, Alan
N1  - Accepted author manuscript
PY  - 2018
Y1  - 2018
AB  - We propose an image representation and matching approach that substantially improves visual-based location estimation for images. The main novelty of the approach, called distinctive visual element matching (DVEM), is its use of representations that are specific to the query image whose location is being predicted. These representations are based on visual element clouds, which robustly capture the connection between the query and visual evidence from candidate locations. We then maximize the influence of visual elements that are geo-distinctive because they do not occur in images taken at many other locations. We carry out experiments and analysis for both geo-constrained and geo-unconstrained location estimation cases using two large-scale, publicly available datasets: the San Francisco Landmark dataset with 1.06 million street-view images and the MediaEval'15 Placing Task dataset with 5.6 million geo-tagged images from Flickr. We present examples that illustrate the highly transparent mechanics of the approach, which are based on commonsense observations about the visual patterns in image collections. Our results show that the proposed method delivers a considerable performance improvement compared to the state-of-the-art.
KW  - Geo-location Estimation
KW  - information retrieval
KW  - large scale image retrieval
UR  - http://www.scopus.com/inward/record.url?scp=85046057665&partnerID=8YFLogxK
U2  - 10.1109/TMM.2017.2763323
DO  - 10.1109/TMM.2017.2763323
M3  - Article
SN  - 1520-9210
VL  - 20
SP  - 1179
EP  - 1194
JO  - IEEE Transactions on Multimedia
JF  - IEEE Transactions on Multimedia
IS  - 5
ER  - 
