Sub-document timestamping of web documents

Y. Zhao, C. Hauff

Research output: Chapter in Book/Conference proceedings/Edited volume, Conference contribution, Scientific, peer-review

6 Citations (Scopus)

Abstract

Knowledge about a (Web) document's creation time has been shown to be an important factor in various temporal information retrieval settings. Commonly, it is assumed that such documents were created at a single point in time. While this assumption may hold for news articles and similar document types, it is a clear oversimplification for general Web documents. In this paper, we investigate to what extent (i) this simplifying assumption is violated for a corpus of Web documents, and (ii) it is possible to accurately estimate the creation time of individual Web documents' components (so-called sub-documents).
Original language: English
Title of host publication: Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2015
Editors: R Baeza-Yates, M Lalmas, A Moffat, B Ribeiro-Neto
Place of Publication: New York, NY, USA
Publisher: Association for Computing Machinery (ACM)
Pages: 1023-1026
Number of pages: 4
ISBN (Print): 978-1-4503-3621-5
Publication status: Published - 2015
Event: SIGIR 2015, Santiago, Chile - New York
Duration: 9 Aug 2015 - 13 Aug 2015

Publication series

Name
Publisher: ACM

Conference

Conference: SIGIR 2015, Santiago, Chile
Period: 9/08/15 - 13/08/15

Keywords

  • timestamping
  • sub-documents
  • Web-archiving
