CN102193929B - Method and apparatus for searching by using word information entropies - Google Patents
Method and apparatus for searching by using word information entropies
- Publication number
- CN102193929B
- Authority
- CN
- China
- Prior art keywords
- word
- information entropy
- searching request
- word information
- search
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/903—Querying
- G06F16/90335—Query processing
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3344—Query execution using natural language analysis
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/31—Indexing; Data structures therefor; Storage structures
- G06F16/316—Indexing structures
- G06F16/319—Inverted lists
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/332—Query formulation
- G06F16/3325—Reformulation based on results of preceding query
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
- G06F16/353—Clustering; Classification into predefined classes
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
- G06F16/355—Creation or modification of classes or clusters
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/951—Indexing; Web crawling techniques
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Mathematical Physics (AREA)
- Artificial Intelligence (AREA)
- Software Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Priority Applications (7)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN2010101205640A CN102193929B (zh) | 2010-03-08 | 2010-03-08 | Method and apparatus for searching by using word information entropies |
| US12/932,643 US8566303B2 (en) | 2010-03-08 | 2011-03-01 | Determining word information entropies |
| JP2012557039A JP5450842B2 (ja) | 2010-03-08 | 2011-03-02 | Determining word information entropies |
| PCT/US2011/000401 WO2011112238A1 (en) | 2010-03-08 | 2011-03-02 | Determining word information entropies |
| EP11753707.6A EP2545439A4 (en) | 2010-03-08 | 2011-03-02 | Determining word information entropies |
| HK12100205.7A HK1159813B (en) | | 2012-01-09 | Method and apparatus for searching by using word information entropies |
| US14/024,431 US9342627B2 (en) | 2010-03-08 | 2013-09-11 | Determining word information entropies |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN2010101205640A CN102193929B (zh) | 2010-03-08 | 2010-03-08 | Method and apparatus for searching by using word information entropies |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN102193929A (zh) | 2011-09-21 |
| CN102193929B (zh) | 2013-03-13 |
Family
ID=44532194
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN2010101205640A Expired - Fee Related CN102193929B (zh) | Method and apparatus for searching by using word information entropies | 2010-03-08 | 2010-03-08 |
Country Status (5)
| Country | Link |
|---|---|
| US (2) | US8566303B2 (en) |
| EP (1) | EP2545439A4 (en) |
| JP (1) | JP5450842B2 (en) |
| CN (1) | CN102193929B (en) |
| WO (1) | WO2011112238A1 (en) |
Families Citing this family (21)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US8938466B2 (en) * | 2010-01-15 | 2015-01-20 | Lexisnexis, A Division Of Reed Elsevier Inc. | Systems and methods for ranking documents |
| US9110986B2 (en) * | 2011-01-31 | 2015-08-18 | Vexigo, Ltd. | System and method for using a combination of semantic and statistical processing of input strings or other data content |
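| CN103116572B (zh) * | 2013-02-02 | 2015-10-21 | Shenzhen Institutes of Advanced Technology | Method and device for identifying the production period of literary works |
| CN103106192B (zh) * | 2013-02-02 | 2016-02-03 | Shenzhen Institutes of Advanced Technology | Method and device for identifying the author of literary works |
| CN103678274A (zh) * | 2013-04-15 | 2014-03-26 | Nanjing University of Posts and Telecommunications | Text classification feature extraction method based on improved mutual information and entropy |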
| US20140350919A1 (en) * | 2013-05-27 | 2014-11-27 | Tencent Technology (Shenzhen) Company Limited | Method and apparatus for word counting |
| CN104009970A (zh) * | 2013-09-17 | 2014-08-27 | Ningbo Public Information Industry Co., Ltd. | Network information collection method |
| US10042936B1 (en) * | 2014-07-11 | 2018-08-07 | Google Llc | Frequency-based content analysis |
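| CN104199832B (zh) * | 2014-08-01 | 2017-08-22 | Xi'an University of Technology | Information-entropy-based method for discovering abnormal-transaction communities in financial networks |
| CN105224695B (zh) * | 2015-11-12 | 2018-04-20 | Central South University | Information-entropy-based text feature quantification method and device, and text classification method and device |
| CN106649868B (zh) * | 2016-12-30 | 2019-03-26 | Capital Normal University | Question-answer matching method and device |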
| US10607604B2 (en) * | 2017-10-27 | 2020-03-31 | International Business Machines Corporation | Method for re-aligning corpus and improving the consistency |
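| CN108256070B (zh) * | 2018-01-17 | 2022-07-15 | Beijing Baidu Netcom Science and Technology Co., Ltd. | Method and apparatus for generating information |
| CN108664470B (zh) * | 2018-05-04 | 2022-06-17 | Wuhan Douyu Network Technology Co., Ltd. | Method for measuring the information content of video titles, readable storage medium, and electronic device |
| CN110750986B (zh) * | 2018-07-04 | 2023-10-10 | Potevio Information Technology Co., Ltd. | Neural network word segmentation system based on minimum information entropy, and training method |
| JP6948425B2 (ja) * | 2020-03-19 | 2021-10-13 | Yahoo Japan Corporation | Determination device, determination method, and determination program |
| CN112765975B (zh) * | 2020-12-25 | 2023-08-04 | Beijing Baidu Netcom Science and Technology Co., Ltd. | Word segmentation ambiguity processing method, apparatus, device, and medium |
| JP7045515B1 (ja) * | 2021-07-19 | 2022-03-31 | Yahoo Japan Corporation | Information processing device, information processing method, and information processing program |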
| US12153619B2 (en) * | 2022-09-20 | 2024-11-26 | Adobe Inc. | Generative prompt expansion for image generation |
| US12314309B2 (en) * | 2022-09-23 | 2025-05-27 | Adobe Inc. | Zero-shot entity-aware nearest neighbors retrieval |
| CN115858478B (zh) * | 2023-02-24 | 2023-05-12 | Shandong Zhonglian Hanyuan Education Technology Co., Ltd. | Fast data compression method for an interactive smart teaching platform |
Family Cites Families (15)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| FR2825814B1 (fr) * | 2001-06-07 | 2003-09-19 | Commissariat Energie Atomique | Method for automatically creating an image database that can be queried by its semantic content |
| US6836777B2 (en) * | 2001-11-15 | 2004-12-28 | Ncr Corporation | System and method for constructing generic analytical database applications |
| US6941297B2 (en) * | 2002-07-31 | 2005-09-06 | International Business Machines Corporation | Automatic query refinement |
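| CN1629833A (zh) * | 2003-12-17 | 2005-06-22 | International Business Machines Corporation | Method and apparatus for implementing question-and-answer functions and computer-aided writing |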
| US20080077570A1 (en) * | 2004-10-25 | 2008-03-27 | Infovell, Inc. | Full Text Query and Search Systems and Method of Use |
| JP2006343925A (ja) * | 2005-06-08 | 2006-12-21 | Fuji Xerox Co Ltd | Related-word dictionary creation device, related-word dictionary creation method, and computer program |
| US20070250501A1 (en) * | 2005-09-27 | 2007-10-25 | Grubb Michael L | Search result delivery engine |
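| CN101535945A (zh) * | 2006-04-25 | 2009-09-16 | Infovell, Inc. | Full text query and search systems and methods of use |
| CN101122909B (zh) * | 2006-08-10 | 2010-06-16 | Hitachi, Ltd. | Text information retrieval device and text information retrieval method |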
| US7392250B1 (en) * | 2007-10-22 | 2008-06-24 | International Business Machines Corporation | Discovering interestingness in faceted search |
| US7860885B2 (en) * | 2007-12-05 | 2010-12-28 | Palo Alto Research Center Incorporated | Inbound content filtering via automated inference detection |
| US7877389B2 (en) * | 2007-12-14 | 2011-01-25 | Yahoo, Inc. | Segmentation of search topics in query logs |
| US8190541B2 (en) * | 2008-02-25 | 2012-05-29 | Atigeo Llc | Determining relevant information for domains of interest |
| CN101510221B (zh) * | 2009-02-17 | 2012-05-30 | Peking University | Query statement analysis method and system for information retrieval |
| US9928296B2 (en) * | 2010-12-16 | 2018-03-27 | Microsoft Technology Licensing, Llc | Search lexicon expansion |
2010
- 2010-03-08: CN application CN2010101205640A, patent CN102193929B (zh), not active (Expired - Fee Related)
2011
- 2011-03-01: US application US12/932,643, patent US8566303B2 (en), active
- 2011-03-02: WO application PCT/US2011/000401, publication WO2011112238A1 (en), not active (Ceased)
- 2011-03-02: EP application EP11753707.6A, publication EP2545439A4 (en), not active (Withdrawn)
- 2011-03-02: JP application JP2012557039A, patent JP5450842B2 (ja), not active (Expired - Fee Related)
2013
- 2013-09-11: US application US14/024,431, patent US9342627B2 (en), active
Also Published As
| Publication number | Publication date |
|---|---|
| EP2545439A1 (en) | 2013-01-16 |
| JP2013522720A (ja) | 2013-06-13 |
| US8566303B2 (en) | 2013-10-22 |
| JP5450842B2 (ja) | 2014-03-26 |
| US20110219004A1 (en) | 2011-09-08 |
| WO2011112238A1 (en) | 2011-09-15 |
| CN102193929A (zh) | 2011-09-21 |
| US20140074884A1 (en) | 2014-03-13 |
| US9342627B2 (en) | 2016-05-17 |
| EP2545439A4 (en) | 2017-03-08 |
| HK1159813A1 (en) | 2012-08-03 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
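| CN102193929B (zh) | | Method and apparatus for searching by using word information entropies |
| US10546005B2 (en) | | Perspective data analysis and management |
| US8990241B2 (en) | | System and method for recommending queries related to trending topics based on a received query |
| CN102760138B (zh) | | Method and device for classifying user network behavior, and corresponding search method and device |
| WO2022127543A1 (zh) | | Advertisement information processing method, apparatus, device, and storage medium |
| CN110390052B (zh) | | Search recommendation method, CTR prediction model training method, apparatus, and device |
| CN104102639B (zh) | | Promotion triggering method and device based on text classification |
| CN106796578A (zh) | | Knowledge automation system |
| CN103020049A (zh) | | Search method and search system |
| CN105183733A (zh) | | Method and device for matching text information and pushing business objects |
| CN114330329A (zh) | | Business content search method, apparatus, electronic device, and storage medium |
| US10346496B2 (en) | | Information category obtaining method and apparatus |
| CN104978332B (zh) | | Method and device for generating user-generated content label data, and related methods and devices |
| JP2012533819A (ja) | | Method and system for document indexing and data querying |
| CN103744887B (zh) | | Method, apparatus, and computer device for person search |
| CN103365904A (zh) | | Advertisement information search method and system |
| CN111191111A (zh) | | Content recommendation method, apparatus, and storage medium |
| CN115017200B (zh) | | Search result ranking method, apparatus, electronic device, and storage medium |
| US10055478B2 (en) | | Perspective data analysis and management |
| CN104077707A (zh) | | Method and device for optimizing promotion presentation mode |
| CN103942232A (zh) | | Method and device for mining intent |
| CN103186650B (zh) | | Search method and device |
| CN104252487A (zh) | | Method and device for generating entry information |
| CN111694929B (zh) | | Search method based on data graph, intelligent terminal, and readable storage medium |
| CN107133321B (zh) | | Method and device for analyzing search characteristics of pages |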
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | C06 | Publication | |
| | PB01 | Publication | |
| | C10 | Entry into substantive examination | |
| | SE01 | Entry into force of request for substantive examination | |
| | REG | Reference to a national code | Ref country code: HK; Ref legal event code: DE; Ref document number: 1159813; Country of ref document: HK |
| | C14 | Grant of patent or utility model | |
| | GR01 | Patent grant | |
| | REG | Reference to a national code | Ref country code: HK; Ref legal event code: GR; Ref document number: 1159813; Country of ref document: HK |
| | CF01 | Termination of patent right due to non-payment of annual fee | |
| | CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20130313 |