Title: XPath-wrapper induction for data extraction
Authors: Nam-Khanh, Tran; Kim-Cuong, Pham; Quang-Thuy, Ha
Keywords: Amount of information;Data extraction;Human being;Structured information;Template-based;User query;Wrapper induction
Issue Date: 2010
Publisher: H. : ĐHQGHN
Abstract: The Web contains an enormous amount of information that is formatted for human readers, which makes it difficult for computers to extract relevant content from various sources. This paper presents an XPath-wrapper induction algorithm that leverages user queries and template-based sites to extract structured information. Our experiments show an average accuracy of 94%. © 2010 IEEE.
Description: Proceedings - 2010 International Conference on Asian Language Processing, IALP 2010, Article number 5681601, Pages 150-153
ISBN: 978-076954288-1
Appears in Collections: ĐHQGHN articles in Scopus

Files in This Item:

  • File : 24.pdf
  • Size : 399.85 kB
  • Format : Adobe PDF