CN104102738A - Entity library expansion method and device - Google Patents

Entity library expansion method and device

Info

Publication number
CN104102738A
Authority
CN
China
Prior art keywords: entity word, entity, storehouse, word, field
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201410364026.4A
Other languages
Chinese (zh)
Other versions
CN104102738B (en)
Inventor
梁爽
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Baidu Online Network Technology Beijing Co Ltd
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201410364026.4A
Publication of CN104102738A
Application granted
Publication of CN104102738B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/288 Entity relationship models

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Transfer Between Computers (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiment of the invention discloses an entity library expansion method and device. The method comprises the following steps: obtaining structural data from a repository; identifying entity words from the field content of a preset meaning field of the structural data; screening the entity words according to a preset rule; and if the screened entity words do not exist in the entity library, adding the entity words into the entity library to expand the entity library. The accuracy of the entity words in the entity library can be improved.

Description

A method and device for expanding an entity library
Technical field
The present invention relates to the field of Internet information processing technology, and in particular to a method and device for expanding an entity library.
Background
With the development of communication technology and networks, people increasingly search for knowledge and information over the Internet. Content providers offer platforms on the Internet where all users can browse, create, and improve content on an equal footing.
Encyclopedia websites such as Baidu Baike (Baidupedia), Wikipedia, and Hudong Baike (Interactive Encyclopedia) let Internet users find the comprehensive, accurate, and objective definitional information they want, and let other users query and browse similar topics, so as to provide the corresponding knowledge or reference. The entry is the basic unit into which the content of an encyclopedia website is divided. An entry has one or more single topics and sets forth knowledge about a thing, a person, or a combination with a particular theme. An encyclopedia website contains an enormous number of entries. These entries can greatly improve the accuracy and coverage of retrieval, and they make it easier to extract structured data from web pages, so that vertical search can obtain more accurate information.
With the wide dissemination of information and the continuous growth of the content people exchange, new terms emerge endlessly. Discovering valuable entries on a large scale and expanding the entity library of an encyclopedia website is an important goal of encyclopedia products. The common implementation starts from existing data, uses text segmentation to analyze which entity words may exist in the text, determines which of those entity words already exist in the encyclopedia entity library and which do not, and adds the missing entity words to the library. However, this scheme suffers from inaccurate text segmentation and inaccurate attribute recognition.
Summary of the invention
In view of this, the embodiments of the present invention provide a method and device for expanding an entity library, to overcome the inaccurate text segmentation and attribute recognition found in existing approaches to expanding an encyclopedia entity library.
In a first aspect, an embodiment of the present invention provides a method for expanding an entity library, comprising:
obtaining structured data from a resource library;
identifying entity words from the field content of a preset meaning field of the structured data;
screening the entity words according to preset rules;
if a screened entity word does not appear in the entity library, adding the entity word to the entity library so as to expand the entity library.
In a second aspect, an embodiment of the present invention further provides a device for expanding an entity library, comprising:
a structured data recognition unit, configured to obtain structured data from a resource library;
an entity word recognition unit, configured to identify entity words from the field content of a preset meaning field of the structured data;
an entity word screening unit, configured to screen the entity words according to preset rules;
an entity word adding unit, configured to add a screened entity word to the entity library if it does not appear in the entity library, so as to expand the entity library.
In the technical solution of the embodiments of the present invention, structured data is obtained from a resource library, entity words are identified from the field content of preset meaning fields, and, after screening, the entity words that do not appear in the entity library are added to it so as to expand the entity library. Because a preset meaning field of structured data has already segmented the text content accurately and corresponds to a definite meaning, the probability of effectively obtaining entity words from it is higher, which improves the accuracy of the entity words added to the entity library.
Brief description of the drawings
To illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings used in describing the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention; a person of ordinary skill in the art can obtain other drawings from the content of the embodiments and these drawings without creative effort.
Fig. 1 is a flowchart of the method for expanding an entity library according to Embodiment 1 of the present invention;
Fig. 2 is a screenshot of a first example table contained in an example entry in Baidu Baike;
Fig. 3 is a screenshot of a second example table contained in an example entry in Baidu Baike;
Fig. 4 is a flowchart of the method for expanding an entity library according to Embodiment 2 of the present invention;
Fig. 5 is a structural block diagram of the device for expanding an entity library according to Embodiment 3 of the present invention.
Detailed description of the embodiments
To make the technical problem solved by the present invention, the technical solution adopted, and the technical effect achieved clearer, the technical solutions of the embodiments of the present invention are described in further detail below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art from the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
The technical solution of the present invention is further explained below with reference to the drawings and through specific embodiments.
Embodiment 1
Fig. 1 is a flowchart of the method for expanding an entity library provided by Embodiment 1 of the present invention. This embodiment is applicable to expanding an entity library by using the structured data in a resource library. The term "entity word" in this embodiment refers generally to nouns and pronouns, and may further refer to nouns and pronouns that satisfy preset conditions. An entity library is a database that stores the information related to each entity word, through which users can obtain data related to an entity word. For example, in an encyclopedia, an entity word is the subject name of an entry; the entry is the basic unit into which the content of the encyclopedia website is divided, and an entry comprises the entity word, an explanation of the entity word, and information related to the entity word. Entity libraries of other categories, such as a music entity library or a commodity entity library, can likewise use music titles, commodity names, and so on as entity words and store the related details of each entity word, such as the background of a piece of music or the place of origin of a commodity.
The method of this embodiment can be performed by a device for expanding an entity library that is configured in a server. As shown in Fig. 1, the method for expanding an entity library described in this embodiment comprises:
S101: Obtain structured data from a resource library.
Structured data is data stored in at least one preset meaning field. It can usually be expressed logically as a two-dimensional table, and all data in a relational database is structured data. In a file, structured data includes tables, charts, forms, and data of similar structure. The data in a preset meaning field all satisfy the preset meaning of that field and share a common property, for example all being names or all being addresses. Structured storage has therefore already divided the data by preset meaning field, and the data carry certain attribute features.
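As a minimal sketch of the two-dimensional table view described above, the following Python fragment models one table as a list of field names plus rows, and reads out one column. The table layout, the field names, and the cell values are placeholders assumed for illustration; the patent does not prescribe any particular representation.

```python
# Hypothetical table resembling a filmography table in an encyclopedia entry.
# Field names and cell values are illustrative placeholders, not real data.
table = {
    "fields": ["year", "title", "director", "co-star"],
    "rows": [
        ["2001", "Film A", "Director X", "Actor P; Actor Q"],
        ["2003", "Film B", "Director Y", "Actor R"],
    ],
}


def column(table, field):
    """Return every cell stored under one field (one column of the table)."""
    idx = table["fields"].index(field)
    return [row[idx] for row in table["rows"]]


directors = column(table, "director")
```

Every value returned by `column(table, "director")` shares the meaning of that field, which is the property the method relies on.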
The resource library referred to in this embodiment can be a data source of any form, such as a database, a file package, a web page resource library, or an electronic document, as long as structured data can be obtained from it and the entity words to be added to the entity library can be mined from that structured data.
Because the purpose of this embodiment is to expand the entity library, the content of the resource library used is preferably highly relevant to the content of that entity library. The more other entity words appear in the data that describes an entity word in the entity library, the stronger the association, and the better suited that data is as a tool for expansion. For example, to expand an encyclopedia entity library, an encyclopedia resource library is preferably used as the resource library. Taking a singer as an example, the data describing the entity word "Liu Dehua" mentions many entity words associated with this singer, such as other celebrities, songs, and films, so searching the structured data related to existing entity words and screening out entity words for expansion has a higher success rate.
S102: Identify entity words from the field content of a preset meaning field of the structured data.
Because structured data can be expressed logically as a two-dimensional table, the field content of a given field (that is, a column of the structured data) generally belongs to the same category. When the entity library needs to be expanded, this embodiment can, according to the category of entity words to be added, set conditions on the fields or enumerate the fields that correspond to the expansion target, select from the obtained structured data the preset meaning fields that match the expansion target, obtain the field content of the selected fields, and identify entity words from that content. If entity words cannot be identified directly from the content of some field, the content can first be split, and entity word recognition performed afterwards.
For example, if the target is to add entity words in the person category, a condition can be set that tests whether a field name contains a character or word such as "person", "member", "people", or "performer". Alternatively, the field names that correspond to the expansion target, such as "figure", "director", "co-star", and "singer", can be enumerated. Taking enumerated field names as an example, the three fields "figure", "director", and "co-star" can be selected as preset meaning fields from the structured data table "Films performed in" of the encyclopedia entry "Liu Dehua", as shown in Fig. 2. The "singer" field can likewise be selected as a preset meaning field from the "Works created for others" table of the same entry, as shown in Fig. 3.
Entity words can be identified directly from the field content of the "figure", "director", and "singer" fields, whereas the content extracted from the "co-star" field must first be split at its delimiters before entity words can be identified.
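The two selection styles in S102 (a condition on the field name, or an explicit enumeration) and the splitting of multi-valued cells can be sketched as follows. The function names, the keyword list, and the set of separator characters are assumptions made for this illustration; the source only says that such fields are split before recognition.

```python
import re

# Assumed separators for multi-valued cells; the source does not list them.
SEPARATORS = re.compile(r"[;,、\n]")


def select_fields(field_names, keywords=(), enumerated=()):
    """Select preset meaning fields by keyword condition or by enumeration."""
    return [
        name for name in field_names
        if name in enumerated or any(k in name for k in keywords)
    ]


def recognize(cell):
    """Split a cell into candidate entity words; a single value yields one."""
    return [part.strip() for part in SEPARATORS.split(cell) if part.strip()]


fields = ["year", "title", "director", "co-star"]
by_enumeration = select_fields(fields, enumerated=("director", "co-star"))
by_condition = select_fields(["lead actor", "title"], keywords=("actor",))
```

Here `recognize("Actor P; Actor Q")` yields two candidates, while `recognize("Director X")` yields the single value unchanged.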
S103: Screen the entity words according to preset rules.
The preset rules can be set according to the expansion target of the entity library. For example, entity words whose character count exceeds a preset threshold are filtered out, entity words on a blacklist are filtered out, and/or entity words of a preset type (for example, those containing sequence numbers, times, or special symbols) are filtered out.
It should be noted that the preset rules may include screening rules that apply to the field content of all preset meaning fields, and may also include separate screening rules for the field content of each individual preset meaning field.
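The screening rules of S103 can be sketched as a single predicate. The threshold value, the blacklist entries, and the regular expression standing in for "preset types" (pure sequence numbers, year-month style dates, and a few special symbols) are all assumptions chosen for this example, not values given in the source.

```python
import re

MAX_LEN = 12                     # assumed character-count threshold
BLACKLIST = {"unknown", "none"}  # assumed blacklist entries
# Assumed "preset types": all-digit strings, YYYY-MM(-DD) dates,
# or any string containing one of a few special symbols.
PRESET_TYPE = re.compile(r"^\d+$|^\d{4}-\d{1,2}(-\d{1,2})?$|[#@*?]")


def keep(word):
    """Return True if the candidate survives every screening rule."""
    if len(word) > MAX_LEN:
        return False
    if word.lower() in BLACKLIST:
        return False
    if PRESET_TYPE.search(word):
        return False
    return True


def screen(words):
    """Apply the screening rules to a list of candidate entity words."""
    return [w for w in words if keep(w)]
```

Per-field rules, which the text also allows, could be modelled by keeping one such predicate per field name.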
S104: If a screened entity word does not appear in the entity library, add the entity word to the entity library so as to expand the entity library.
To avoid adding duplicate entity words, after operation S103 yields the screened entity words, it is also necessary to determine whether each entity word already appears in the entity library. Only entity words that do not yet appear in the entity library are added to it.
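A minimal sketch of S104, assuming the entity library can be viewed as a set of entity words (a real library would store related data for each word, which is omitted here). The function name `expand` is hypothetical.

```python
def expand(library, candidates):
    """Add candidates that are missing from the library; return what was added."""
    added = []
    for word in candidates:
        if word not in library:
            library.add(word)
            added.append(word)
    return added


library = {"Director X"}
added = expand(library, ["Director X", "Actor P"])
```

Because the membership test runs against the library as it grows, a candidate that occurs twice in the input is also added only once.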
In the technical solution of this embodiment, structured data is obtained from a resource library, entity words are identified from the field content of preset meaning fields, and, after screening, the entity words that do not appear in the entity library are added to it. This eliminates entity word ambiguity and narrows the range of structured data that must be examined. Because a preset meaning field of structured data has already segmented the text content accurately and corresponds to a definite meaning, the probability of effectively obtaining entity words from it is higher, which improves the accuracy and efficiency of entity word recognition and of expanding the entity library.
Embodiment 2
Fig. 4 is a flowchart of the method for expanding an entity library according to Embodiment 2 of the present invention. Taking the expansion of an encyclopedia entity library with structured data from an encyclopedia resource library as an example, this embodiment discloses a method for expanding an entity library. As shown in Fig. 4, the method described in this embodiment comprises:
S401: Obtain structured data from the encyclopedia entity library.
Preferably, the resource library can be the encyclopedia entity library itself, so that entity words mined from within the library are used to expand it.
In general, to make search and data management easier, the existing entity words in an encyclopedia entity library are classified, for example into categories such as song, film, person, nature, culture, geography, history, life, society, art, economy, science and technology, and sports, and some categories have further, deeper subcategories. Therefore, to improve the hit rate, the operation of obtaining structured data from the resource library preferably obtains structured data only from the categories of the encyclopedia entity library that are associated with the category of the entity words to be added. For example, if entity words in the film category need to be added and the categories associated with the film category are the film category and the person category, structured data only needs to be obtained from the film and person categories of the encyclopedia entity library. This narrows the search range for structured data and thereby improves the efficiency of expanding the entity library.
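Restricting S401 to associated categories can be sketched as a lookup followed by a filter. The `ASSOCIATED` mapping, the entry dictionaries, and their `category` key are assumptions made for illustration; only the film/person association comes from the example in the text.

```python
# Assumed mapping from a target category to its associated source categories.
ASSOCIATED = {"film": {"film", "person"}}


def entries_for_target(entries, target_category):
    """Keep only entries whose category is associated with the target."""
    wanted = ASSOCIATED.get(target_category, {target_category})
    return [e for e in entries if e["category"] in wanted]


entries = [
    {"name": "Entry 1", "category": "film"},
    {"name": "Entry 2", "category": "geography"},
    {"name": "Entry 3", "category": "person"},
]
selected = [e["name"] for e in entries_for_target(entries, "film")]
```

When no association is configured for a target category, the sketch falls back to that category alone.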
S402: Obtain the preset meaning fields of the structured data.
When the encyclopedia entity library needs to be expanded, this embodiment can, according to the category of entity words to be added, set conditions on the fields or enumerate the fields that correspond to the expansion target, and select from the obtained structured data the preset meaning fields that match the expansion target, for example fields such as time or address. The field content of the selected fields is then obtained, and entity words are identified from that content.
S403: Obtain the field content of the preset meaning fields of the structured data.
If entity words cannot be identified directly from the content of some field, the content can first be split, and entity word recognition performed afterwards.
S404: Filter out the field content that has an internal link.
The internal link referred to in this embodiment is a link within the entity library. If related data exists in the library for an entity word, then when that entity word appears in the related data of another entity word, an internal link is created for it so that users can easily find its related data. For example, in an encyclopedia entity library, each entry adds internal links to the existing entries it mentions, so that users can reach the page and category of the related entries through those links. For example, in the "figure" column of the structured data table "Films performed in" of the encyclopedia entry "Liu Dehua" (Fig. 2), some field content carries an internal link and some does not (the circled content in Fig. 2). Content that carries an internal link already appears among the entity words of the encyclopedia and does not need to be added. Therefore, to improve efficiency, such content can be filtered out after the field content is obtained and before entity word recognition is performed.
For example, person-category entity words are identified from the three preset meaning fields "figure", "director", and "co-star" of the structured data table "Films performed in" of the encyclopedia entry "Liu Dehua" (Fig. 2). After the field content is obtained, the content that has an internal link is filtered out, leaving only the content without an internal link (the circled content in Fig. 2). Likewise, song-category entity words are identified from the "song title" column of the "Works created for others" table of the same entry (Fig. 3); after the content with internal links is filtered out, only the circled content without internal links remains (Fig. 3). Filtering out linked field content in advance narrows the scope of entity word recognition and thus improves efficiency.
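The internal-link filter of S404 can be sketched as below, assuming each cell has been parsed into its display text plus the target of its internal link, or `None` when there is no link. That cell shape and the link paths are hypothetical; the source does not say how links are represented.

```python
def unlinked_cells(cells):
    """Keep the text of cells that carry no internal link."""
    return [cell["text"] for cell in cells if not cell.get("link")]


# Illustrative column: one cell already links to an existing entry.
director_column = [
    {"text": "Director X", "link": "/entry/director-x"},
    {"text": "Director Y", "link": None},
    {"text": "Director Z"},
]
remaining = unlinked_cells(director_column)
```

Only the unlinked cells go on to entity word recognition, which is where the efficiency gain described above comes from.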
S405: Identify entity words from the filtered field content.
S406: Screen the entity words according to preset rules.
S407: Deduplicate the entity words.
It should be noted that this operation can be performed after screening or before screening. Deduplicating the identified entity words further reduces the number of entity words handled in operation S408 and also avoids adding the same entity word more than once.
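Deduplication in S407 can be sketched with an order-preserving idiom. Preserving the first-seen order is a choice made for this example, since the source does not state whether order matters.

```python
def deduplicate(words):
    """Drop repeated entity words while keeping first-seen order."""
    # dict keys are unique and keep insertion order (Python 3.7+).
    return list(dict.fromkeys(words))


unique = deduplicate(["Actor P", "Actor Q", "Actor P"])
```

Because the step only removes repeats, it gives the same final set of entity words whether it runs before or after screening, which matches the note above.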
S408: If an entity word does not appear among the entity words of the encyclopedia, add the entity word to the encyclopedia entity library.
Taking the expansion of an encyclopedia entity library with structured data from an encyclopedia resource library as an example, this embodiment adds, on the basis of Embodiment 1, an operation that filters out field content having an internal link and an operation that deduplicates the entity words, which can further improve the efficiency of expanding the entity library.
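Operations S401 to S408 can be strung together in one short function. This is a sketch under several assumptions rather than the patented implementation: tables are dictionaries with `fields` and `rows`, each cell is a dictionary with `text` and an optional `link`, the library is a set of words, cells are single-valued, and screening is reduced to a length threshold plus a blacklist.

```python
def expand_library(library, tables, target_fields, max_len=12,
                   blacklist=frozenset()):
    """Run a simplified S401-S408 pass and return the newly added words."""
    candidates = []
    for table in tables:                               # S401: structured data
        for idx, field in enumerate(table["fields"]):
            if field not in target_fields:             # S402: preset fields
                continue
            for row in table["rows"]:
                cell = row[idx]                        # S403: field content
                if cell.get("link"):                   # S404: skip linked cells
                    continue
                candidates.append(cell["text"].strip())  # S405: recognize
    screened = [w for w in candidates                  # S406: screen
                if w and len(w) <= max_len and w not in blacklist]
    unique = list(dict.fromkeys(screened))             # S407: deduplicate
    added = [w for w in unique if w not in library]    # S408: add if missing
    library.update(added)
    return added
```

A call such as `expand_library(lib, tables, {"director", "co-star"})` mutates `lib` in place and returns the added words in the order in which they were first encountered.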
Embodiment 3
Fig. 5 is a structural block diagram of the device for expanding an entity library according to Embodiment 3 of the present invention. As shown in Fig. 5, the device described in this embodiment comprises:
a structured data recognition unit 501, configured to obtain structured data from a resource library;
an entity word recognition unit 502, configured to identify entity words from the field content of a preset meaning field of the structured data;
an entity word screening unit 503, configured to screen the entity words according to preset rules;
an entity word adding unit 504, configured to add a screened entity word to the entity library if it does not appear in the entity library, so as to expand the entity library.
Further, the resource library is an encyclopedia resource library.
Further, the entity word recognition unit 502 is specifically configured to:
obtain the field content of the preset meaning field of the structured data;
if the field content has no internal link in the resource library, identify entity words from the field content.
Further, the entity word screening unit 503 is specifically configured to:
filter out entity words that satisfy at least one of the following: entity words whose character count exceeds a preset threshold, entity words on a blacklist, entity words containing a preset symbol, and entity words of a preset type.
Further, the entity word screening unit 503 is also configured to deduplicate the entity words before they are added to the entity library.
The device for expanding an entity library provided by this embodiment can perform the methods for expanding an entity library provided by Embodiment 1 and Embodiment 2 of the present invention, and has the functional modules and beneficial effects corresponding to those methods.
All or part of the technical solutions provided by the above embodiments can be implemented in software, with the software program stored in a readable storage medium such as a computer hard disk, an optical disc, or a floppy disk.
Note that the above are only the preferred embodiments of the present invention and the technical principles applied. Those skilled in the art will understand that the present invention is not limited to the specific embodiments described here, and that various obvious changes, readjustments, and substitutions can be made without departing from the scope of protection of the present invention. Therefore, although the present invention has been described in detail through the above embodiments, it is not limited to them; it may include further equivalent embodiments without departing from the inventive concept, and its scope is determined by the scope of the appended claims.

Claims (10)

1. A method for expanding an entity library, characterized by comprising:
obtaining structured data from a resource library;
identifying entity words from the field content of a preset meaning field of the structured data;
screening the entity words according to preset rules;
if a screened entity word does not appear in the entity library, adding the entity word to the entity library so as to expand the entity library.
2. The method according to claim 1, characterized in that the resource library is an encyclopedia resource library.
3. The method according to claim 2, characterized in that the operation of identifying entity words from the field content of the preset meaning field of the structured data specifically comprises:
obtaining the field content of the preset meaning field of the structured data;
if the field content has no internal link in the resource library, identifying entity words from the field content.
4. The method according to claim 1, characterized in that the operation of screening the entity words according to preset rules specifically comprises:
filtering out entity words that satisfy at least one of the following: entity words whose character count exceeds a preset threshold, entity words on a blacklist, entity words containing a preset symbol, and entity words of a preset type.
5. The method according to claim 1, characterized in that, before the operation of adding the entity word to the entity library, the method further comprises: deduplicating the entity words.
6. A device for expanding an entity library, characterized by comprising:
a structured data recognition unit, configured to obtain structured data from a resource library;
an entity word recognition unit, configured to identify entity words from the field content of a preset meaning field of the structured data;
an entity word screening unit, configured to screen the entity words according to preset rules;
an entity word adding unit, configured to add a screened entity word to the entity library if it does not appear in the entity library, so as to expand the entity library.
7. The device according to claim 6, characterized in that the resource library is an encyclopedia resource library.
8. The device according to claim 7, characterized in that the entity word recognition unit is specifically configured to:
obtain the field content of the preset meaning field of the structured data;
if the field content has no internal link in the resource library, identify entity words from the field content.
9. The device according to claim 6, characterized in that the entity word screening unit is specifically configured to filter out entity words that satisfy at least one of the following: entity words whose character count exceeds a preset threshold, entity words on a blacklist, entity words containing a preset symbol, and entity words of a preset type.
10. The device according to claim 6, characterized in that the entity word screening unit is further configured to deduplicate the entity words before the operation of adding the entity word to the entity library.
CN201410364026.4A 2014-07-28 2014-07-28 A kind of method and device for expanding entity storehouse Active CN104102738B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410364026.4A CN104102738B (en) 2014-07-28 2014-07-28 A kind of method and device for expanding entity storehouse

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410364026.4A CN104102738B (en) 2014-07-28 2014-07-28 A kind of method and device for expanding entity storehouse

Publications (2)

Publication Number Publication Date
CN104102738A true CN104102738A (en) 2014-10-15
CN104102738B CN104102738B (en) 2018-04-27

Family

ID=51670891

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410364026.4A Active CN104102738B (en) 2014-07-28 2014-07-28 A kind of method and device for expanding entity storehouse

Country Status (1)

Country Link
CN (1) CN104102738B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106168947A (en) * 2016-07-01 2016-11-30 北京奇虎科技有限公司 A kind of related entities method for digging and system
CN110309355A (en) * 2018-06-15 2019-10-08 腾讯科技(深圳)有限公司 Generation method, device, equipment and the storage medium of content tab

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1607527A (en) * 2003-08-07 2005-04-20 索尼株式会社 Setting user preferences for an electronic program guide
US7698293B2 (en) * 2005-01-28 2010-04-13 Microsoft Corporation System and methods for capturing structure of data models using entity patterns
CN101901235A (en) * 2009-05-27 2010-12-01 国际商业机器公司 Method and system for document processing
CN102495892A (en) * 2011-12-09 2012-06-13 北京大学 Webpage information extraction method
US20120221324A1 (en) * 2011-02-28 2012-08-30 Hitachi, Ltd. Document Processing Apparatus
CN103106189A (en) * 2011-11-11 2013-05-15 北京百度网讯科技有限公司 Method and device for excavating synonymous attribute words
CN103425660A (en) * 2012-05-15 2013-12-04 北京百度网讯科技有限公司 Method and device for acquiring entries
CN103440287A (en) * 2013-08-14 2013-12-11 广东工业大学 Web question-answering retrieval system based on product information structuring

Also Published As

Publication number Publication date
CN104102738B (en) 2018-04-27

Similar Documents

Publication Publication Date Title
JP5575902B2 (en) Information retrieval based on query semantic patterns
Szomszor et al. Semantic modelling of user interests based on cross-folksonomy analysis
TWI652584B (en) Method and device for matching text information and pushing business objects
JP5332477B2 (en) Automatic generation of term hierarchy
Reinanda et al. Mining, ranking and recommending entity aspects
US20110264651A1 (en) Large scale entity-specific resource classification
US20120158703A1 (en) Search lexicon expansion
JP2017508214A (en) Provide search recommendations
CN101872351A (en) Method, device for identifying synonyms, and method and device for searching by using same
JP2009093650A (en) Selection of tag for document by paragraph analysis of document
CN105488068B (en) Method and device for searching music and building an index, and search result judgment method
US20170109358A1 (en) Method and system of determining enterprise content specific taxonomies and surrogate tags
CN110969022B (en) Semantic determining method and related equipment
TW201923629A (en) Data processing method and apparatus
CN103838798A (en) Page classification system and method
Lipczak et al. The impact of resource title on tags in collaborative tagging systems
Yao et al. Evolutionary taxonomy construction from dynamic tag space
Faralli et al. Automatic acquisition of a taxonomy of microblogs users’ interests
Ming et al. Prototype hierarchy based clustering for the categorization and navigation of web collections
US20110119261A1 (en) Searching using semantic keys
Böhm et al. Latent topics in graph-structured data
CN107977420A (en) Abstract extraction method, apparatus and readable storage medium for an evolved document
CN106294358A (en) Information search method and system
CN105653546A (en) Method and system for searching target theme
CN111984786A (en) Intelligent whistle blowing early warning method based on news information and server

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant