CN103415850A - Structured document management device and structured document retrieval method - Google Patents

Structured document management device and structured document retrieval method

Info

Publication number
CN103415850A
Authority
CN
China
Prior art keywords
title
degree
document
structured document
vocabulary
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2012800029691A
Other languages
English (en)
Chinese (zh)
Inventor
国分智晴
真锅俊彦
仲野亘
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Toshiba Corp
Toshiba Digital Solutions Corp
Original Assignee
Toshiba Corp
Toshiba Solutions Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2012-03-14
Filing date
2012-07-20
Publication date
2013-11-27
Application filed by Toshiba Corp, Toshiba Solutions Corp
Publication of CN103415850A

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/80 Information retrieval; Database structures therefor; File system structures therefor of semi-structured data, e.g. markup language structured data such as SGML, XML or HTML
    • G06F16/83 Querying
    • G06F16/835 Query processing
    • G06F16/8373 Query execution

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
CN2012800029691A 2012-03-14 2012-07-20 Structured document management device and structured document retrieval method Pending CN103415850A (zh)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
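JP2012057240A JP5417471B2 (ja) 2012-03-14 2012-03-14 Structured document management device and structured document retrieval method
JP2012-057240 2012-03-14
PCT/JP2012/068505 WO2013136545A1 (ja) 2012-03-14 2012-07-20 Structured document management device and structured document retrieval method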

Publications (1)

Publication Number Publication Date
CN103415850A (zh) 2013-11-27

Family

ID=49160504

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2012800029691A Pending CN103415850A (zh) 2012-03-14 2012-07-20 Structured document management device and structured document retrieval method

Country Status (4)

Country Link
US (1) US20130268554A1 (en)
JP (1) JP5417471B2 (en)
CN (1) CN103415850A (en)
WO (1) WO2013136545A1 (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
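CN105912585A (zh) * 2016-04-01 2016-08-31 Le Holdings (Beijing) Co., Ltd. Mail search method and device
CN106407330A (zh) * 2016-09-04 2017-02-15 Le Holdings (Beijing) Co., Ltd. Email display method and device
CN107391535A (zh) * 2017-04-20 2017-11-24 Alibaba Group Holding Ltd. Method and device for searching for documents in a document application
CN108108387A (zh) * 2016-11-23 2018-06-01 Google LLC Template-based structured document classification and extraction
CN113204579A (zh) * 2021-04-29 2021-08-03 Beijing Kingsoft Digital Entertainment Co., Ltd. Content association method, system, device, electronic equipment and storage medium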

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10157175B2 (en) * 2013-03-15 2018-12-18 International Business Machines Corporation Business intelligence data models with concept identification using language-specific clues
US10698924B2 (en) 2014-05-22 2020-06-30 International Business Machines Corporation Generating partitioned hierarchical groups based on data sets for business intelligence data models
US10002179B2 (en) 2015-01-30 2018-06-19 International Business Machines Corporation Detection and creation of appropriate row concept during automated model generation
US9984116B2 (en) 2015-08-28 2018-05-29 International Business Machines Corporation Automated management of natural language queries in enterprise business intelligence analytics
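JP6710007B1 (ja) * 2019-04-26 2020-06-17 Arithmer Inc. Dialogue management server, dialogue management method, and program
CN110175322A (zh) * 2019-05-22 2019-08-27 Beijing Ultrapower Software Co., Ltd. Document structuring method and device
CN110688842B (zh) * 2019-10-14 2023-06-09 Dingfu Intelligent Technology Co., Ltd. Method, device and server for analyzing document title levels
US11663215B2 (en) 2020-08-12 2023-05-30 International Business Machines Corporation Selectively targeting content section for cognitive analytics and search
CN113408660B (zh) * 2021-07-15 2024-05-24 Beijing Baidu Netcom Science and Technology Co., Ltd. Book clustering method, device, equipment and storage medium
CN116894176A (zh) * 2023-07-27 2023-10-17 State Grid Jiangsu Electric Power Co., Ltd. Economic and Technological Research Institute Index extraction optimization method and system for power transmission and transformation engineering design documents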

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101014076A (zh) * 2006-01-31 2007-08-08 Fuji Xerox Co., Ltd. Document management system and method, and document destruction management system and method
US20090292698A1 (en) * 2002-01-25 2009-11-26 Martin Remy Method for extracting a compact representation of the topical content of an electronic text
US20100017390A1 (en) * 2008-07-16 2010-01-21 Kabushiki Kaisha Toshiba Apparatus, method and program product for presenting next search keyword

Family Cites Families (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6385602B1 (en) * 1998-11-03 2002-05-07 E-Centives, Inc. Presentation of search results using dynamic categorization
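JP2003242175A (ja) * 2002-02-15 2003-08-29 Ricoh Co Ltd Document retrieval system, document retrieval method, program according to the method, and storage medium storing the program
JP3999093B2 (ja) * 2002-09-30 2007-10-31 Toshiba Corp Structured document retrieval method and structured document retrieval system
US20060150076A1 (en) * 2004-12-30 2006-07-06 Microsoft Corporation Methods and apparatus for the evaluation of aspects of a web page
JP2006195667A (ja) * 2005-01-12 2006-07-27 Toshiba Corp Structured document retrieval device, structured document retrieval method, and structured document retrieval program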
US7546294B2 (en) * 2005-03-31 2009-06-09 Microsoft Corporation Automated relevance tuning
US20070150473A1 (en) * 2005-12-22 2007-06-28 Microsoft Corporation Search By Document Type And Relevance
US7779370B2 (en) * 2006-06-30 2010-08-17 Google Inc. User interface for mobile devices
JP2008146209A (ja) * 2006-12-07 2008-06-26 Just Syst Corp Document retrieval device, document retrieval method, and document retrieval program
US9218414B2 (en) * 2007-02-06 2015-12-22 Dmitri Soubbotin System, method, and user interface for a search engine based on multi-document summarization
US20090055386A1 (en) * 2007-08-24 2009-02-26 Boss Gregory J System and Method for Enhanced In-Document Searching for Text Applications in a Data Processing System
US8538989B1 (en) * 2008-02-08 2013-09-17 Google Inc. Assigning weights to parts of a document
GB2472250A (en) * 2009-07-31 2011-02-02 Stephen Timothy Morris Method for determining document relevance
US8209361B2 (en) * 2010-01-19 2012-06-26 Oracle International Corporation Techniques for efficient and scalable processing of complex sets of XML schemas
US8140512B2 (en) * 2010-04-12 2012-03-20 Ancestry.Com Operations Inc. Consolidated information retrieval results
US8504567B2 (en) * 2010-08-23 2013-08-06 Yahoo! Inc. Automatically constructing titles

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
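CN105912585A (zh) * 2016-04-01 2016-08-31 Le Holdings (Beijing) Co., Ltd. Mail search method and device
CN106407330A (zh) * 2016-09-04 2017-02-15 Le Holdings (Beijing) Co., Ltd. Email display method and device
CN108108387A (zh) * 2016-11-23 2018-06-01 Google LLC Template-based structured document classification and extraction
CN107391535A (zh) * 2017-04-20 2017-11-24 Alibaba Group Holding Ltd. Method and device for searching for documents in a document application
CN113204579A (zh) * 2021-04-29 2021-08-03 Beijing Kingsoft Digital Entertainment Co., Ltd. Content association method, system, device, electronic equipment and storage medium
CN113204579B (zh) * 2021-04-29 2024-06-07 Beijing Kingsoft Digital Entertainment Co., Ltd. Content association method, system, device, electronic equipment and storage medium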

Also Published As

Publication number Publication date
US20130268554A1 (en) 2013-10-10
JP2013191046A (ja) 2013-09-26
JP5417471B2 (ja) 2014-02-12
WO2013136545A1 (ja) 2013-09-19

Similar Documents

Publication Publication Date Title
CN103415850A (zh) Structured document management device and structured document retrieval method
JP5721818B2 (ja) Use of model information groups in search
US9910932B2 (en) System and method for completing a user query and for providing a query response
US20150234927A1 (en) Application search method, apparatus, and terminal
KR20120112663A (ko) Search suggestion clustering and presentation
CN101073080A (zh) Recommending search engine keywords
US20100083102A1 (en) Online Content Editing of Dynamic Websites
CN103136228A (zh) Image search method and image search device
CN102341800A (zh) Retrieval processing method and device
US20130151936A1 (en) Page preview using contextual template metadata and labeling
JP2015525929A (ja) Weight-based stemming for improving search quality
JP2004341753A (ja) Search support device, search support method, and program
US20150339387A1 (en) Method of and system for furnishing a user of a client device with a network resource
US10078686B2 (en) Combination filter for search query suggestions
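JP6664599B2 (ja) Ambiguity evaluation device, ambiguity evaluation method, and ambiguity evaluation program
JP2009037501A (ja) Information retrieval device, information retrieval method, and program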
US20160217181A1 (en) Annotating Query Suggestions With Descriptions
CN101836209B (zh) System and method for managing information maps
US9773035B1 (en) System and method for an annotation search index
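RU2589856C2 (ru) Method of processing a target message, method of processing a new target message, and server (versions)
JP2019133367A (ja) Sales support device and method
JP7314089B2 (ja) Search support system and search support method
JP4808181B2 (ja) Web page information processing device, web page information processing method, and web page information processing program
JP2009169651A (ja) Document search system
KR20040100857A (ko) Method for creating a database in a search system, and search system including the created database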

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20131127