WO2014049310A3 - Method and apparatuses for interactive searching of electronic documents

Publication number
WO2014049310A3
Authority
WO
WIPO (PCT)
Prior art keywords
query
terms
electronic document
controlled automatic
interactive
Application number
PCT/GB2013/000369
Other languages
French (fr)
Other versions
WO2014049310A2 (en)
Inventor
Pavel LOSKOT
Original Assignee
Swansea University
Priority date
2012-09-27
Filing date
2013-09-04
Publication date
2014-05-15
Application filed by Swansea University
Publication of WO2014049310A2
Publication of WO2014049310A3

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/34 Browsing; Visualisation therefor
    • G06F16/345 Summarisation for human users

Abstract

A method for interactive extraction of information from, and interactive searching of, an electronic document. A query is formulated to extract specific information from a given part of the electronic document while preserving the quality and accuracy of its technical and scientific information. The process involves: query-controlled automatic segmentation of the given part of the electronic document into a plurality of terms represented as a weighted directed graph; query-controlled automatic classification of each term by associating it with a type and a feature vector; query-controlled relevance scoring of each term; query-controlled automatic selection of a subset of the terms; and automated composition of the system output, which is guaranteed to have at least a minimum level of coherence, for presentation to the user.
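
The five stages named in the abstract can be illustrated with a minimal Python sketch. It is not the claimed method: the abstract does not say how terms are delimited, typed, scored or checked for coherence, so every name and rule below is an assumption chosen for illustration (`Query`, `Term`, sentence-level terms, word-overlap edge weights, keyword-hit scoring, and a coherence measure that is only computed, not enforced).

```python
import re
from dataclasses import dataclass


@dataclass
class Query:
    # Hypothetical query object; the abstract does not define its fields.
    keywords: frozenset                # lowercase words the user is looking for
    max_terms: int = 2                 # size of the selected subset
    delimiter: str = r"(?<=[.!?])\s+"  # assumed rule for cutting text into terms


@dataclass
class Term:
    index: int            # position within the document part
    text: str
    kind: str = ""        # assumed "type" label
    features: tuple = ()  # assumed feature vector
    score: float = 0.0    # relevance to the query


def words(text):
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def segment(part, query):
    """Cut the part into terms and link them in a weighted directed graph."""
    pieces = [p.strip() for p in re.split(query.delimiter, part) if p.strip()]
    terms = [Term(i, p) for i, p in enumerate(pieces)]
    graph = {}  # (from_index, to_index) -> weight; edges follow reading order
    for a in terms:
        for b in terms[a.index + 1:]:
            wa, wb = words(a.text), words(b.text)
            union = wa | wb
            weight = len(wa & wb) / len(union) if union else 0.0
            if weight > 0:
                graph[(a.index, b.index)] = weight
    return terms, graph


def classify_and_score(terms, query):
    """Attach a type, a feature vector and a relevance score to each term."""
    for t in terms:
        w = words(t.text)
        hits = len(w & query.keywords)
        t.kind = "numeric" if re.search(r"\d", t.text) else "text"
        t.features = (len(w), hits)
        t.score = hits / len(query.keywords) if query.keywords else 0.0


def select(terms, query):
    """Keep the highest-scoring terms, ignoring those with no keyword hit."""
    ranked = sorted(terms, key=lambda t: (-t.score, t.index))
    return [t for t in ranked[:query.max_terms] if t.score > 0]


def compose(selected, graph):
    """Join the selection in reading order; report mean edge weight as coherence."""
    ordered = sorted(selected, key=lambda t: t.index)
    pairs = list(zip(ordered, ordered[1:]))
    if pairs:
        total = sum(graph.get((a.index, b.index), 0.0) for a, b in pairs)
        coherence = total / len(pairs)
    else:
        coherence = 1.0
    return " ".join(t.text for t in ordered), coherence


def run(part, query):
    terms, graph = segment(part, query)
    classify_and_score(terms, query)
    return compose(select(terms, query), graph)


sample = ("The antenna gain is 12 dB. The weather was mild. "
          "Gain of the antenna drops at 5 GHz.")
text, coherence = run(sample, Query(keywords=frozenset({"antenna", "gain"})))
print(text)
print(round(coherence, 3))
```

A system along the lines of the abstract would have to enforce the minimum coherence level, for example by rejecting or extending a selection whose coherence falls below a threshold; this sketch only measures it.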

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
GB1217334.0 2012-09-27
GBGB1217334.0A GB201217334D0 (en) 2012-09-27 2012-09-27 System and method for data extraction and storage

Publications (2)

Publication Number Publication Date
WO2014049310A2 (en) 2014-04-03
WO2014049310A3 (en) 2014-05-15

Family

ID=47225325

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/GB2013/000369 WO2014049310A2 (en) 2012-09-27 2013-09-04 Method and apparatuses for interactive searching of electronic documents

Country Status (2)

Country Link
GB (1) GB201217334D0 (en)
WO (1) WO2014049310A2 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107291780B (en) * 2016-04-12 2021-05-28 腾讯科技(深圳)有限公司 User comment information display method and device
IT201600103594A1 (en) * 2016-10-14 2018-04-14 Sws Eng S P A PROCEDURE AND SYSTEM FOR CALCULATING THE RISK LEVEL IN THE PROXIMITY OF THE EXCAVATION FRONT OF A UNDERGROUND WORK
US9996527B1 (en) 2017-03-30 2018-06-12 International Business Machines Corporation Supporting interactive text mining process with natural language and dialog

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7013309B2 (en) 2000-12-18 2006-03-14 Siemens Corporate Research Method and apparatus for extracting anchorable information units from complex PDF documents
CN1629838A (en) 2003-12-17 2005-06-22 国际商业机器公司 Method, apparatus and system for processing, browsing and information extracting of electronic document
US7386789B2 (en) 2004-02-27 2008-06-10 Hewlett-Packard Development Company, L.P. Method for determining logical components of a document
US7590647B2 (en) 2005-05-27 2009-09-15 Rage Frameworks, Inc Method for extracting, interpreting and standardizing tabular data from unstructured documents
US7469251B2 (en) 2005-06-07 2008-12-23 Microsoft Corporation Extraction of information from documents
US20120151310A1 (en) 2010-12-13 2012-06-14 El-Kalliny Ahmed M Method and system for identifying and delivering contextually-relevant information to end users of a data network
GB2487600A (en) 2011-01-31 2012-08-01 Keywordlogic Ltd System for extracting data from an electronic document
AU2012327239B8 (en) 2011-10-14 2015-10-29 Oath Inc. Method and apparatus for automatically summarizing the contents of electronic documents

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ALBERTO H F LAENDER ET AL: "A Brief Survey of Web Data Extraction Tools", SIGMOD RECORD, June 2002 (2002-06-01), pages 84 - 93, XP055109980, Retrieved from the Internet <URL:http://dl.acm.org/citation.cfm?id=565137> [retrieved on 20140326] *

Also Published As

Publication number Publication date
GB201217334D0 (en) 2012-11-14
WO2014049310A2 (en) 2014-04-03

Similar Documents

Publication Publication Date Title
WO2014183956A3 (en) Social media content analysis and output
WO2010014185A3 (en) Federated community search
WO2013001535A3 (en) System, method and data structure for fast loading, storing and access to huge data sets in real time
WO2011054002A3 (en) Content-based image search
WO2013188504A3 (en) Multilingual mixed search method and system
WO2011090882A3 (en) Extraction and publication of reusable organizational knowledge
MX2015008723A (en) Data base query translation system.
WO2016029018A3 (en) Executing constant time relational queries against structured and semi-structured data
GB2549875A (en) Automated content classification/filtering
MX341505B (en) Context-based ranking of search results.
WO2012134972A3 (en) Systems and methods for paragraph-based document searching
MX368777B (en) System and method for automatic product matching.
WO2014043200A3 (en) Dynamic data acquisition method and system
WO2011146276A3 (en) Television related searching
WO2012170318A3 (en) Presenting images as search results
WO2013089668A3 (en) Content-based automatic input protocol selection
GB2542304A (en) Methods, systems, and media for searching for video content
GB201203858D0 (en) Automated processing of documents
WO2014004545A3 (en) Pushing business objects
GB201203233D0 (en) Method and device for a meta data fragment from a metadata component associated with multimedia data
WO2013188886A3 (en) Method and system for parallel batch processing of data sets using gaussian process with batch upper confidence bound
WO2014049310A3 (en) Method and apparatuses for interactive searching of electronic documents
IL223381B (en) Automatic summarising of media content
GB2494573A (en) Assessing and adapting component parameters
MX2013013345A (en) System and method for automatic wrapper induction using target strings.

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 13776533; Country of ref document: EP; Kind code of ref document: A2)
122 EP: PCT application non-entry in European phase (Ref document number: 13776533; Country of ref document: EP; Kind code of ref document: A2)