CN105335510A - Text data efficient searching method - Google Patents


Info

Publication number
CN105335510A
Authority
CN
China
Prior art keywords
concept
document
user
word
mapping
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201510725603.2A
Other languages
Chinese (zh)
Inventor
李垚霖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu Boruide Science & Technology Co Ltd
Original Assignee
Chengdu Boruide Science & Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chengdu Boruide Science & Technology Co Ltd
Priority to CN201510725603.2A
Publication of CN105335510A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis

Abstract

The invention provides a text data efficient searching method. The method comprises the steps of performing concept description on an entity, and building an entity knowledge base; carrying out a semantic analysis on a text document according to the knowledge base; calculating a similarity value of a user search term with an entity concept, sorting search results on the basis of the calculated similarity value and returning the same to a user. According to the text data efficient searching method provided by the invention, the defects of traditional data search can be overcome, and the data search efficiency is improved in the aspects of information recall ratio and precision ratio.

Description

Efficient text data search method
Technical field
The present invention relates to natural language processing, and in particular to an efficient text data search method.
Background
With the rapid development of Internet technology, society has entered the information age. Against the background of big data, particularly in the financial field, the number of text documents on the network keeps growing. These documents also exhibit increasingly complex characteristics, exposing a number of problems that urgently need solving. Traditional search engines for the financial domain, however, operate at the syntactic level of character-string matching and lack semantic-level analysis, processing and understanding of how information is represented; the result is information that is abundant but knowledge that is poor. Traditional data retrieval therefore struggles to meet the increasingly demanding needs of financial users.
Summary of the invention
To solve the above problems of the prior art, the present invention proposes an efficient text data search method, comprising:
Describing the concepts of an entity and building an entity knowledge base;
Performing semantic analysis on text documents based on the knowledge base;
Calculating the similarity value between the user's search terms and the entity concepts; and
Sorting the search results by the calculated similarity value and returning them to the user.
Preferably, performing semantic analysis on a text document comprises: annotating the semantics of the document; extracting document features and performing text mapping; obtaining entity concepts from the entity vocabulary; establishing the semantic feature domain of the document; completing the automatic annotation of the documents in the document library; and annotating and indexing the non-semantic features of the documents, thereby generating a document index library and a metadata library. The index library is built from the document annotation information, and the documents that meet the user's needs are retrieved on the basis of the index library;
Sorting the search results by similarity value comprises: taking the entity dictionary generated from the entity as the basis, performing word segmentation on the user's search input and dividing the user's query into an entity concept set and a non-entity concept set; expanding each of the two sets by similarity value to obtain two candidate search sets; obtaining the sorted search set; and finally sorting the search results by their similarity value to the search request and pushing the results to the user.
Preferably, the text mapping comprises the following steps:
First, the entity concept is described as $F = (U, T, J, Y)$, where $U = \{u_1, u_2, \ldots, u_{|U|}\}$ is the set of users who use words to manage text documents, each user being identified by a unique ID; $T = \{t_1, t_2, \ldots, t_{|T|}\}$ is the set of words used by the users, each word being an arbitrary character string; $J = \{i_1, i_2, \ldots, i_{|J|}\}$ is the set of all domain-related text documents, whose content depends on the type of the user tag set; a user tag set consists of three elements (user, word, document) and is described by $(U, T, J)$; $Y \subseteq U \times T \times J$ is a ternary relation in which an element $(u, t, i)$ states that user $u$ collected text document $i$ and marked it with word $t$; $F(u, i) = \{t \in T \mid (u, t, i) \in Y\}$ is the group of words that user $u$ uses to define text document $i$, where $u \in U$ and $i \in J$. The entity is built as a two-tuple $BO = (C, R)$, where $C = \{c_1, c_2, \ldots, c_{|C|}\}$ is the concept set; a concept is expressed as $c = (id, syn, phase, kind)$, where id is the unique identifier of the concept, syn is its synonym set (TongYiCi CiLin), phase is a phrase describing the concept, and kind is the part of speech under which the concept is classified; $R = \{r_1, r_2, \ldots, r_{|R|}\}$ is the set of relations between concepts. syn is defined as a synonym set $S$; each entry $w \in S$ is expressed as a two-tuple $(w, fq_c(w))$, where $fq_c(w)$ is the frequency of occurrence of $w$;
In the text mapping stage, one of the following text mapping methods is used:
Direct mapping: each word is mapped to concepts in the entity, expressed as $TC: T \to 2^C$, defined for all $t \in T$; a word $t$ maps to a concept in $C$ when $t$ is an entry in that concept's synonym set syn, which gives the direct mapping from words to concepts;
Partial mapping: when a word cannot be mapped directly, the phrase is shortened step by step, from beginning to end, until a single word remains; based on grammar, the shortening starts from the left of the phrase and tests at which stage the shortened phrase can be mapped, and the result is then refined from the right;
Document mapping: first, a matrix describing the mapping strength between words and concepts is set up, $DC: [t_i, c_j]_{m \times n}$, where $m = |T|$ is the number of words and $n = |C|$ is the number of concepts; an initial matrix is produced during mapping, whose mapping strength is the word frequency of the associated syn entry:
$$DC[t_i, c_j] = \begin{cases} fq_{c_j}(t_i), & c_j \in TC(t_i) \\ 0, & \text{otherwise} \end{cases}$$
After the mapping is finished, the values of the matrix $DC$ represent the mapping strength between $t_i$ and $c_j$ in the dictionary.
Compared with the prior art, the present invention has the following advantages:
The present invention proposes an efficient text data search method that makes up for the deficiencies of traditional data retrieval and improves retrieval efficiency in terms of both recall and precision.
Brief description of the drawings
Fig. 1 is a flowchart of the efficient text data search method according to an embodiment of the present invention.
Detailed description
A detailed description of one or more embodiments of the present invention is provided below, together with the accompanying drawing that illustrates the principle of the invention. The invention is described in connection with these embodiments but is not limited to any embodiment. The scope of the invention is defined only by the claims, and the invention covers many alternatives, modifications and equivalents. Many specific details are set forth in the following description to provide a thorough understanding of the invention. These details are provided for illustrative purposes, and the invention may also be practiced according to the claims without some or all of them.
One aspect of the present invention provides an efficient text data search method. Fig. 1 is a flowchart of the method according to an embodiment of the present invention. The invention uses an entity to establish semantic relations between words and thereby achieves semantic retrieval: the rich descriptive power and strong logical reasoning capability of semantics are used to describe text documents accurately, yielding a search method capable of semantic-level analysis. Semantic retrieval differs from traditional keyword-matching retrieval in that it analyzes both the text documents and the search request submitted by the user at the semantic level of information understanding. It attaches semantic components to the search conditions, the organization of information and the search results, which can improve retrieval precision.
Entity-based semantic data retrieval gives documents a semantic description. The semantic annotation of document objects is completed through the entity knowledge base; the semantics of the document objects and the semantic information of the user's search terms are then analyzed; the search terms can at the same time be semantically expanded through the entity; and the desired search results are finally obtained. The specific retrieval process is:
Step 1: build and describe entity concepts. The entity concepts are described and the entity knowledge base is built.
Step 2: extract document features and perform text mapping. Using the entity concept descriptions and the construction and management of the knowledge base, the acquired documents are semantically annotated and text-mapped, and their semantic meaning is analyzed.
Step 3: formulate the entity concept expansion and semantic search expansion strategy. On the basis of the entity concept descriptions, the semantic information of the user's search request is analyzed and the user's search terms are semantically expanded; the retrieval system then searches with the generated candidate search term sets.
Step 4: calculate entity concept similarity values. Relying on the entity concept structure graph, the semantic distance between entity concepts, the depth of entity concept nodes and the overall semantic similarity value are calculated, in support of sorting the search results.
Step 5: sort the search results of the retrieval system. Based on the associated similarity calculation rules, the user's original search terms are compared with the results returned by the retrieval system in terms of similarity value, and the results are sorted by similarity value and fed back to the user.
The entity and its classification system are the core of the semantic representation of text documents and guide how documents are described. Domain knowledge also serves as the basis for search expansion and for sorting search results. The construction and maintenance of domain knowledge, like the construction and maintenance of inference rules, therefore depend on domain knowledge management. The semantics of a document are analyzed through annotation: with the help of document feature extraction, entity concepts are obtained from the entity vocabulary, the semantic feature domain of the document is established, the documents in the document library are annotated automatically, and the non-semantic features of the documents are annotated and indexed, thereby generating the document index library and the metadata library. The index library is built from the document annotation information, and on this basis the documents that meet the user's needs are retrieved. Search expansion and result sorting take the entity dictionary generated from the entity as the basis: the user's search input is segmented into words and the query is divided into an entity concept set and a non-entity concept set. The two sets are then each expanded by similarity value to obtain two candidate search sets, and the sorted search set is obtained with the associated similarity sorting algorithm. Finally, the search request is submitted to the index library and the search library, the search results are sorted by their similarity value to the search request, and the results are pushed to the user.
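The query-splitting step described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `split_query`, the whitespace tokenizer and the sample dictionary are hypothetical, and a real system for Chinese text would need a proper word segmenter in place of `str.split`.

```python
from typing import List, Set, Tuple

def split_query(query: str, entity_dictionary: Set[str]) -> Tuple[List[str], List[str]]:
    """Split a query into (entity concepts, non-entity keywords).

    Assumption: tokens are separated by whitespace and the entity
    dictionary holds lower-case entries.
    """
    concepts: List[str] = []
    keywords: List[str] = []
    for token in query.lower().split():
        if token in entity_dictionary:
            concepts.append(token)   # token names a known entity concept
        else:
            keywords.append(token)   # plain keyword, not an entity concept
    return concepts, keywords

# Hypothetical dictionary generated from the entity.
entity_dictionary = {"bond", "yield"}
concepts, keywords = split_query("Bond yield forecast", entity_dictionary)
```

The two returned lists correspond to the entity concept set and the non-entity concept set that are expanded separately in the later steps.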
The present invention describes the entity concept as $F = (U, T, J, Y)$. Here $U = \{u_1, u_2, \ldots, u_{|U|}\}$ is the set of users who use words to manage text documents; each user is identified by a unique ID. $T = \{t_1, t_2, \ldots, t_{|T|}\}$ is the set of words used by the users. A word can be an arbitrary character string (a single word or a phrase) and is expressed as a sequence of terms, $t = \{term_1, term_2, \ldots, term_m\}$, $t \in T$; that is, each word is mapped to a group of terms, and a term can be any word. $J = \{i_1, i_2, \ldots, i_{|J|}\}$ is the set of text documents, covering all domain-related documents; its content depends on the type of the user tag set, which consists of three elements (user, word, document) and is described by $(U, T, J)$. $Y \subseteq U \times T \times J$ is a ternary relation in which an element $(u, t, i)$ states that user $u$ collected text document $i$ and marked it with word $t$. $F(u, i) = \{t \in T \mid (u, t, i) \in Y\}$ is the group of words that user $u$ uses to define text document $i$, where $u \in U$ and $i \in J$.
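As a minimal sketch of the tuple F = (U, T, J, Y), assume Y is stored as a Python set of (user, word, document) triples. The helper name `words_for` and all sample users, words and document IDs are made up for illustration and do not come from the patent.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]  # (user u, word t, document i)

def words_for(Y: Set[Triple], u: str, i: str) -> Set[str]:
    """F(u, i): all words t such that (u, t, i) is in Y."""
    return {t for (uu, t, ii) in Y if uu == u and ii == i}

# Hypothetical sample relation Y.
Y: Set[Triple] = {
    ("u1", "bond", "doc1"),
    ("u1", "yield", "doc1"),
    ("u2", "bond", "doc1"),
    ("u1", "equity", "doc2"),
}

# U, T and J can be recovered as projections of Y in this sketch.
U = {u for (u, _, _) in Y}
T = {t for (_, t, _) in Y}
J = {i for (_, _, i) in Y}
```

Storing only Y and deriving U, T and J from it is a simplification; a real system would likely keep the three sets separately so that users or documents without any annotation still exist.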
To capture the meaning of the users' words and the relations between them, the entity is built and described as a two-tuple $BO = (C, R)$, where $C = \{c_1, c_2, \ldots, c_{|C|}\}$ is the concept set. A concept is expressed as $c = (id, syn, phase, kind)$, where id is the unique identifier of the concept, syn is its synonym set (TongYiCi CiLin), containing the synonymous terms of the concept, phase is a phrase describing the concept, and kind is the part of speech under which the concept is classified; $R = \{r_1, r_2, \ldots, r_{|R|}\}$ is the set of relations between concepts. syn is defined as a synonym set $S$; each entry $w \in S$ is expressed as a two-tuple $(w, fq_c(w))$, where $fq_c(w)$ is the frequency of occurrence of $w$.
A word may map to one or more concepts, or only part of a word may map to one or more concepts. The present invention uses the following text mapping methods.
Direct mapping: the mapping from words to concepts maps each word to concepts in the entity and can be expressed as $TC: T \to 2^C$, defined for all $t \in T$. A word $t$ maps to a concept when $t$ is an entry in that concept's syn; this is the direct mapping from words to concepts.
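A minimal sketch of the concept tuple and of the direct mapping TC, assuming syn is held as a dictionary from entry to frequency. The class name `Concept`, the function `direct_map` and the sample concepts are hypothetical; only the field names id, syn, phase and kind follow the text.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set

@dataclass
class Concept:
    id: str                                            # unique identifier
    syn: Dict[str, int] = field(default_factory=dict)  # entry w -> fq_c(w)
    phase: str = ""                                    # phrase describing the concept
    kind: str = ""                                     # part of speech

def direct_map(word: str, concepts: List[Concept]) -> Set[str]:
    """TC(t): ids of all concepts whose syn contains the word."""
    return {c.id for c in concepts if word in c.syn}

# Hypothetical concept set C.
concepts = [
    Concept("c1", {"bond": 5, "debenture": 2}, "fixed-income security", "noun"),
    Concept("c2", {"bond": 1, "tie": 3}, "a connection between things", "noun"),
]
```

Because TC maps into the power set of C, an ambiguous word such as "bond" in this sample returns more than one concept id, and an unknown word returns the empty set.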
Partial mapping: when a word cannot be mapped directly, a partial mapping can be carried out over the phrase from beginning to end as follows. Step 1: shorten the phrase step by step until a single word remains. Step 2: based on grammar, start from the left of the phrase and test at which stage the shortened phrase can be mapped, then refine the result from the right.
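The text leaves the exact shortening procedure open, so the following is one possible reading rather than the patented procedure: drop leading terms one at a time, and for each remaining suffix drop trailing terms until a sub-phrase is found in the synonym index. The function name `partial_map`, the index layout and the sample data are assumptions, and the grammar-based checks mentioned in the text are not modelled.

```python
from typing import Dict, List, Optional, Set

def partial_map(phrase: str, syn_index: Dict[str, Set[str]]) -> Optional[str]:
    """Return the first sub-phrase that maps to a concept, or None.

    Assumed scan order: shorten from the left (outer loop), then
    refine from the right (inner loop).
    """
    terms: List[str] = phrase.split()
    for start in range(len(terms)):               # shorten from the left
        for end in range(len(terms), start, -1):  # refine from the right
            candidate = " ".join(terms[start:end])
            if candidate in syn_index:
                return candidate
    return None

# Hypothetical index: syn entry -> ids of concepts containing it.
syn_index = {"government bond": {"c1"}, "bond": {"c1", "c2"}}
mapped = partial_map("long term government bond yield", syn_index)
```

With this scan order the multi-word entry "government bond" is found before the single word "bond", which keeps the more specific match.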
Document mapping: first, a matrix describing the mapping strength between words and concepts is set up, $DC: [t_i, c_j]_{m \times n}$, where $m = |T|$ is the number of words and $n = |C|$ is the number of concepts. An initial matrix is produced during mapping, whose mapping strength is the word frequency of the associated syn entry:
$$DC[t_i, c_j] = \begin{cases} fq_{c_j}(t_i), & c_j \in TC(t_i) \\ 0, & \text{otherwise} \end{cases}$$
After the mapping is finished, the values of the matrix $DC$ represent the mapping strength between $t_i$ and $c_j$ in the dictionary.
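The matrix DC can be sketched with plain nested lists. The function name `build_dc`, the layout of `syn` (concept id to entry frequencies) and the sample values are assumptions made for illustration.

```python
from typing import Dict, List

def build_dc(words: List[str],
             concept_ids: List[str],
             syn: Dict[str, Dict[str, int]]) -> List[List[int]]:
    """DC[i][j] = fq_{c_j}(t_i) if t_i is an entry of syn(c_j), else 0."""
    return [[syn[c].get(t, 0) for c in concept_ids] for t in words]

# Hypothetical words T, concepts C and synonym frequencies.
words = ["bond", "tie", "stock"]
concept_ids = ["c1", "c2"]
syn = {
    "c1": {"bond": 5, "debenture": 2},
    "c2": {"bond": 1, "tie": 3},
}
DC = build_dc(words, concept_ids, syn)  # m = 3 rows, n = 2 columns
```

Rows follow the order of `words` and columns the order of `concept_ids`, so `DC[0][1]` is the strength of mapping "bond" to concept "c2".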
Further, the entity concept expansion of the present invention proceeds as follows.
Step 1: expand each entity concept.
Let $e(c) = \{x \mid sim(x, c) > p \ \text{and}\ sim(x, y) < sim(x, c)\ \text{for every other user entity concept}\ y \neq c\}$ be the expansion set of entity concept $c$, where $sim(\cdot, \cdot)$ is the similarity function of two entity concepts and $p$ is a preset similarity threshold. Semantically expanding an entity concept $C_1$ yields $e(C_1) = \{C_{11}, C_{12}, \ldots, C_{1i}\}$, where the set is either empty or each element $C_{1k}$ satisfies $sim(C_{1k}, C_1) > p$ and $sim(C_{1k}, C_m) < sim(C_{1k}, C_1)$ for every other user entity concept $C_m$.
That is, a single entity concept is expanded on the basis of the associated similarity calculation: the entity concepts selected are those whose similarity value exceeds the given threshold $p$ and whose similarity to each of the other user entity concepts is smaller than their similarity to the current entity concept.
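The selection rule described above (similarity above the threshold p, and closer to the current concept than to any other concept in the user's query) can be sketched as a set comprehension. The function name `expand`, the lookup-table similarity function and all sample values are hypothetical.

```python
from typing import Callable, Dict, FrozenSet, Iterable, Set

def expand(c: str,
           query: Iterable[str],
           candidates: Iterable[str],
           sim: Callable[[str, str], float],
           p: float) -> Set[str]:
    """e(c): candidates x with sim(x, c) > p that are closer to c
    than to every other concept y in the query."""
    others = [y for y in query if y != c]
    return {
        x for x in candidates
        if sim(x, c) > p and all(sim(x, y) < sim(x, c) for y in others)
    }

# Hypothetical symmetric similarity table; unknown pairs score 0.0.
_TABLE: Dict[FrozenSet[str], float] = {
    frozenset(("loan", "credit")): 0.8,
    frozenset(("loan", "debt")): 0.7,
    frozenset(("bond", "debt")): 0.9,
    frozenset(("bond", "credit")): 0.3,
}

def sim(a: str, b: str) -> float:
    return 1.0 if a == b else _TABLE.get(frozenset((a, b)), 0.0)

query = ["loan", "bond"]
candidates = ["credit", "debt", "equity"]
```

In this sample "debt" passes the threshold for "loan" but is closer to "bond", so it ends up only in the expansion set of "bond".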
Step 2: build the entity concept search sets. Each element of the user's entity concept set is expanded into an expansion set e; one entity concept is then chosen from each e to build an entity concept search set, whose members are the concepts chosen from the individual expansion sets. An entity concept search set is described as:
$$f_c = \{f_1, f_2, \ldots, f_n\}$$
where $f_1$ is chosen from $e(C_1)$ and $f_n$ is chosen from $e(C_n)$. The collection of all entity concept search sets can be described as $FC(C) = \{(F_1, F_2, \ldots, F_n) \mid F_1 \in e(C_1), \ldots, F_n \in e(C_n)\}$.
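Since FC(C) picks one element from each expansion set, it is a Cartesian product, which `itertools.product` from the Python standard library computes directly. The function name `search_sets` and the sample expansion sets are assumptions for illustration.

```python
from itertools import product
from typing import Dict, List, Tuple

def search_sets(C: List[str], e: Dict[str, List[str]]) -> List[Tuple[str, ...]]:
    """FC(C): every tuple (F_1, ..., F_n) with F_k taken from e(C_k)."""
    return list(product(*(e[c] for c in C)))

# Hypothetical user concepts and their expansion sets.
C = ["loan", "bond"]
e = {
    "loan": ["loan", "credit"],
    "bond": ["bond", "debt"],
}
FC = search_sets(C, e)
```

The number of search sets is the product of the expansion set sizes, so it grows quickly with the number of query concepts; if any expansion set is empty, the product is empty as well.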
The similarity value between each entity concept set $f_c$ and the unexpanded entity concept set $C$ entered by the user can be calculated with the following formula:
$$sim_{sem}(f_c, C) = \frac{\sum_{i=1}^{\max(|f_c|, |C|)} sim(f_i, c_i) + \theta}{\max(|f_c|, |C|) + \theta}$$
where $\theta$ is a tuning parameter.
Let $n$ be the number of elements in the unexpanded entity concept set $C$ entered by the user; then $sim_{sem}(f_c, C)$ can be written as:
$$sim_{sem}(f_c, C) = \frac{\sum_{i=1}^{n} sim(f_i, c_i) + \theta}{n + \theta}$$
Multiple $\theta$ are allowed to be present in each entity concept expansion set.
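A small numerical sketch of the second form of the formula, assuming f_c and C have the same length n and are aligned position by position. The function name `sim_sem`, the toy pairwise similarity and the value of θ are illustrative choices only.

```python
from typing import Callable, Sequence

def sim_sem(f_c: Sequence[str],
            C: Sequence[str],
            sim: Callable[[str, str], float],
            theta: float) -> float:
    """(sum_i sim(f_i, c_i) + theta) / (n + theta), with n = |C|.

    Assumption: f_c[i] was chosen from the expansion set of C[i].
    """
    n = len(C)
    total = sum(sim(f, c) for f, c in zip(f_c, C))
    return (total + theta) / (n + theta)

# Toy pairwise similarity: 1.0 for identical concepts, 0.5 otherwise.
def toy_sim(a: str, b: str) -> float:
    return 1.0 if a == b else 0.5

score = sim_sem(("loan", "debt"), ("loan", "bond"), toy_sim, theta=1.0)
```

With these toy values the score is (1.0 + 0.5 + 1.0) / (2 + 1.0); an unexpanded query, where every f_i equals c_i, scores exactly 1.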
For keyword expansion, since keywords are not entity concepts, the collection of all expansion sets is the power set of the original keyword set, denoted $P(K)$, whose elements are themselves sets. If $p$ is an element of the power set $P(K)$, the similarity value between $p$ and $K$ is calculated as:
$$sim_{key}(p, K) = \frac{|p| + \theta}{|K| + \theta}$$
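The power set P(K) and the keyword similarity can be sketched with `itertools.combinations`. The function names `power_set` and `sim_key`, the sample keywords and the value of θ are assumptions for illustration.

```python
from itertools import chain, combinations
from typing import FrozenSet, Iterable, List, Set

def power_set(K: Iterable[str]) -> List[FrozenSet[str]]:
    """P(K): all subsets of K, from the empty set up to K itself."""
    items = sorted(K)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return [frozenset(s) for s in subsets]

def sim_key(p: FrozenSet[str], K: Set[str], theta: float) -> float:
    """(|p| + theta) / (|K| + theta)."""
    return (len(p) + theta) / (len(K) + theta)

# Hypothetical keyword set.
K = {"forecast", "monthly", "report"}
P = power_set(K)
```

The power set has 2^|K| elements, so enumerating it is practical only for short queries; subsets that keep more of the original keywords score closer to 1.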
An entity describes the concepts of a specific domain and their relations, and these form an entity concept structure graph from which entity concept similarity values can be calculated. The graph can be regarded as a directed acyclic graph with a root node: entity concepts are represented by nodes and the relations between concepts by directed edges. The graph has the hierarchical structure of a tree, together with directed edges and multiple inheritance. The specific steps are as follows:
Step 1: calculate the entity concept semantic distance.
In the entity concept graph, the semantic distance between two entity concepts is the number of directed edges connecting the two concept nodes, denoted $d(C_a, C_b)$. Semantic distance and semantic similarity are related as follows: the larger the semantic distance between two entity concepts, the smaller their similarity value.
Step 2: calculate the depth of the common parent node of the entity concepts.
Based on the hierarchical structure of the entity concepts, the concept nodes are organized top-down and classified from general to specific. The deeper the level of the nearest common parent node of two entity concepts, the finer the classification, the more semantic information the two concepts inherit from that parent node and the more semantic information they share; that is, the greater the similarity value between them. $depth(parent(C_a, C_b))$ denotes the depth of the nearest common parent node of two concepts, where $parent(C_a, C_b)$ is that nearest common parent node.
Step 3: calculate the semantic overlap. The semantic overlap between entity concepts could be calculated from the number of identical parent nodes the two concepts share. However, taking both semantic distance and semantic overlap into account risks double counting, because the semantic distance already implies the semantic overlap information. The entity concept similarity value is therefore calculated from the semantic distance and the depth of the common parent node in the entity concept structure graph. For two entity concepts $C_a$ and $C_b$, the semantic similarity value is the weighted, normalized combination of the contributions of their semantic distance and the depth of their common parent node, calculated as follows:
$$sim(C_a, C_b) = \alpha \cdot \frac{k}{d(C_a, C_b) + k} + \beta \cdot \frac{depth(parent(C_a, C_b))}{maxdepth}$$
where $\alpha$ is the weight of the semantic distance and $\beta$ is the weight of the common parent node, with $\alpha + \beta = 1$; the tuning parameter $k$ adjusts the similarity contribution determined by the semantic distance; and $maxdepth$ is the maximum depth of the entity concept tree.
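Steps 1 to 3 can be sketched on a small concept graph. Several choices here are assumptions the text does not fix: the graph is stored as a node-to-parents dictionary in which every node appears as a key, the semantic distance is the shortest path ignoring edge direction, depth is the longest parent chain to the root (root depth 0), and the nearest common parent is the deepest shared ancestor, a node counting as its own ancestor. All names and sample data are hypothetical.

```python
from collections import deque
from typing import Dict, List, Set

Parents = Dict[str, List[str]]  # node -> direct parents; the root maps to []

def ancestors(node: str, parents: Parents) -> Set[str]:
    """The node itself plus everything reachable via parent edges."""
    seen, stack = {node}, [node]
    while stack:
        for p in parents[stack.pop()]:
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen

def depth(node: str, parents: Parents) -> int:
    """Longest parent chain from the node up to the root (root = 0)."""
    if not parents[node]:
        return 0
    return 1 + max(depth(p, parents) for p in parents[node])

def distance(a: str, b: str, parents: Parents) -> int:
    """d(C_a, C_b): fewest edges between a and b, ignoring direction."""
    adj: Dict[str, Set[str]] = {n: set(ps) for n, ps in parents.items()}
    for n, ps in parents.items():
        for p in ps:
            adj[p].add(n)
    queue, dist = deque([a]), {a: 0}
    while queue:
        cur = queue.popleft()
        if cur == b:
            return dist[cur]
        for nxt in adj[cur]:
            if nxt not in dist:
                dist[nxt] = dist[cur] + 1
                queue.append(nxt)
    raise ValueError("nodes are not connected")

def concept_sim(a: str, b: str, parents: Parents,
                alpha: float = 0.5, k: float = 1.0) -> float:
    """alpha * k / (d + k) + beta * depth(parent) / maxdepth, beta = 1 - alpha."""
    beta = 1.0 - alpha
    common = ancestors(a, parents) & ancestors(b, parents)
    parent_depth = max(depth(n, parents) for n in common)
    max_depth = max(depth(n, parents) for n in parents)  # assumed > 0
    return alpha * k / (distance(a, b, parents) + k) + beta * parent_depth / max_depth

# Hypothetical concept hierarchy.
parents: Parents = {
    "asset": [],
    "security": ["asset"],
    "cash": ["asset"],
    "bond": ["security"],
    "stock": ["security"],
}
```

In this sample "bond" and "stock" are two edges apart and share the parent "security" at depth 1, while "bond" and "cash" are three edges apart and share only the root, so the first pair scores higher.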
Step 4: calculate the overall entity concept similarity value.
Semantically expanding the entity concepts in the user's search term set generates the collection of semantically expanded search sets, denoted $FC(C)$; expanding the keywords that are not entity concepts generates the power set of the keyword set, denoted $P(K)$. Take one element from $FC(C)$, denoted $f_c$, which is an expanded concept set, and one element from $P(K)$, denoted $p$, which is an expanded keyword set; together they form a search request submitted to the retrieval system, expressed as $(f_c, p)$. Let the user's search term set be $(C, K)$; the similarity value between the user's search term set and the search results is then obtained by calculating the similarity value between $(C, K)$ and $(f_c, p)$. Based on the expanded keyword set similarity value, the expanded entity concept set similarity value and the classification concept set similarity value, the overall similarity value can be calculated as follows:
$$SIM(f_c, p, C, K) = \lambda_1 \times sim_{sem}(f_c, C) + \lambda_2 \times sim_{key}(p, K)$$
where $\lambda_1$ and $\lambda_2$ are tuning parameters, $\lambda_1$ is the proportion of the entity concept set similarity value in the overall similarity value, $\lambda_2$ is the proportion of the keyword set similarity value in the overall similarity value, and $\lambda_1 + \lambda_2 = 1$.
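The weighted combination and the final sort can be sketched as follows, assuming the two partial similarity values have already been computed for each expanded request. The function names, the candidate labels, the sample scores and the default weight are illustrative only.

```python
from typing import List, Tuple

Candidate = Tuple[str, float, float]  # (label, sim_sem value, sim_key value)

def overall_sim(sem: float, key: float, lam1: float = 0.6) -> float:
    """SIM = lam1 * sim_sem + lam2 * sim_key, with lam2 = 1 - lam1."""
    lam2 = 1.0 - lam1
    return lam1 * sem + lam2 * key

def rank(candidates: List[Candidate], lam1: float = 0.6) -> List[str]:
    """Labels of the candidates, highest overall similarity first."""
    ordered = sorted(
        candidates,
        key=lambda c: overall_sim(c[1], c[2], lam1),
        reverse=True,
    )
    return [label for label, _, _ in ordered]

# Hypothetical expanded requests with precomputed partial similarities.
candidates: List[Candidate] = [
    ("request_a", 0.5, 0.5),
    ("request_b", 0.9, 0.2),
    ("request_c", 0.8, 0.9),
]
ranking = rank(candidates)
```

Raising `lam1` favours requests whose expanded concepts stay close to the user's concepts; lowering it favours requests that keep more of the original keywords.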
In summary, the present invention proposes an efficient text data search method that makes up for the deficiencies of traditional data retrieval and improves retrieval efficiency in terms of both recall and precision.
Obviously, those skilled in the art should understand that the modules or steps of the present invention described above can be implemented with a general-purpose computing system. They can be concentrated on a single computing system or distributed over a network of multiple computing systems. Optionally, they can be implemented as program code executable by a computing system, so that they can be stored in a storage system and executed by the computing system. The present invention is thus not limited to any specific combination of hardware and software.
It should be understood that the above embodiments of the present invention are only intended to illustrate or explain the principles of the invention by way of example and do not limit it. Therefore, any modification, equivalent replacement, improvement and the like made without departing from the spirit and scope of the present invention shall fall within its scope of protection. In addition, the appended claims are intended to cover all changes and modifications that fall within the scope and boundaries of the claims, or equivalents of such scope and boundaries.

Claims (3)

1. An efficient text data search method, characterized by comprising:
Describing the concepts of an entity and building an entity knowledge base;
Performing semantic analysis on text documents based on the knowledge base;
Calculating the similarity value between the user's search terms and the entity concepts; and
Sorting the search results by the calculated similarity value and returning them to the user.
2. The method according to claim 1, characterized in that performing semantic analysis on a text document comprises: annotating the semantics of the document; extracting document features and performing text mapping; obtaining entity concepts from the entity vocabulary; establishing the semantic feature domain of the document; completing the automatic annotation of the documents in the document library; and annotating and indexing the non-semantic features of the documents, thereby generating a document index library and a metadata library, wherein the index library is built from the document annotation information and the documents that meet the user's needs are retrieved on the basis of the index library;
Sorting the search results by similarity value comprises: taking the entity dictionary generated from the entity as the basis, performing word segmentation on the user's search input and dividing the user's query into an entity concept set and a non-entity concept set; expanding each of the two sets by similarity value to obtain two candidate search sets; obtaining the sorted search set; and finally sorting the search results by their similarity value to the search request and pushing the results to the user.
3. The method according to claim 2, characterized in that the text mapping comprises the following steps:
First, the entity concept is described as $F = (U, T, J, Y)$, where $U = \{u_1, u_2, \ldots, u_{|U|}\}$ is the set of users who use words to manage text documents, each user being identified by a unique ID; $T = \{t_1, t_2, \ldots, t_{|T|}\}$ is the set of words used by the users, each word being an arbitrary character string; $J = \{i_1, i_2, \ldots, i_{|J|}\}$ is the set of all domain-related text documents, whose content depends on the type of the user tag set; a user tag set consists of three elements (user, word, document) and is described by $(U, T, J)$; $Y \subseteq U \times T \times J$ is a ternary relation in which an element $(u, t, i)$ states that user $u$ collected text document $i$ and marked it with word $t$; $F(u, i) = \{t \in T \mid (u, t, i) \in Y\}$ is the group of words that user $u$ uses to define text document $i$, where $u \in U$ and $i \in J$. The entity is built as a two-tuple $BO = (C, R)$, where $C = \{c_1, c_2, \ldots, c_{|C|}\}$ is the concept set; a concept is expressed as $c = (id, syn, phase, kind)$, where id is the unique identifier of the concept, syn is its synonym set (TongYiCi CiLin), phase is a phrase describing the concept, and kind is the part of speech under which the concept is classified; $R = \{r_1, r_2, \ldots, r_{|R|}\}$ is the set of relations between concepts. syn is defined as a synonym set $S$; each entry $w \in S$ is expressed as a two-tuple $(w, fq_c(w))$, where $fq_c(w)$ is the frequency of occurrence of $w$;
In the text mapping stage, one of the following text mapping methods is used:
Direct mapping: each word is mapped to concepts in the entity, expressed as $TC: T \to 2^C$, defined for all $t \in T$; a word $t$ maps to a concept in $C$ when $t$ is an entry in that concept's synonym set syn, which gives the direct mapping from words to concepts;
Partial mapping: when a word cannot be mapped directly, the phrase is shortened step by step, from beginning to end, until a single word remains; based on grammar, the shortening starts from the left of the phrase and tests at which stage the shortened phrase can be mapped, and the result is then refined from the right;
Document mapping: first, a matrix describing the mapping strength between words and concepts is set up, $DC: [t_i, c_j]_{m \times n}$, where $m = |T|$ is the number of words and $n = |C|$ is the number of concepts; an initial matrix is produced during mapping, whose mapping strength is the word frequency of the associated syn entry:
$$DC[t_i, c_j] = \begin{cases} fq_{c_j}(t_i), & c_j \in TC(t_i) \\ 0, & \text{otherwise} \end{cases}$$
After the mapping is finished, the values of the matrix $DC$ represent the mapping strength between $t_i$ and $c_j$ in the dictionary.
CN201510725603.2A 2015-10-30 2015-10-30 Text data efficient searching method Pending CN105335510A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510725603.2A CN105335510A (en) 2015-10-30 2015-10-30 Text data efficient searching method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510725603.2A CN105335510A (en) 2015-10-30 2015-10-30 Text data efficient searching method

Publications (1)

Publication Number Publication Date
CN105335510A (en) 2016-02-17

Family

ID=55286037

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510725603.2A Pending CN105335510A (en) 2015-10-30 2015-10-30 Text data efficient searching method

Country Status (1)

Country Link
CN (1) CN105335510A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105701234A (en) * 2016-02-19 2016-06-22 浪潮通用软件有限公司 Achieving method based on C# full-text retrieval
CN106951411A (en) * 2017-03-24 2017-07-14 福州大学 The quick multi-key word Semantic Ranking searching method of data-privacy is protected in a kind of cloud computing
CN107203532A (en) * 2016-03-16 2017-09-26 阿里巴巴集团控股有限公司 Construction method, the implementation method of search and the device of directory system
CN107704453A (en) * 2017-10-23 2018-02-16 深圳市前海众兴电子商务有限公司 A kind of word semantic analysis, word semantic analysis terminal and storage medium
CN109885653A (en) * 2019-01-30 2019-06-14 南京邮电大学 Text searching method
CN110457435A (en) * 2019-07-26 2019-11-15 南京邮电大学 A kind of patent novelty analysis system and its analysis method

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8024329B1 (en) * 2006-06-01 2011-09-20 Monster Worldwide, Inc. Using inverted indexes for contextual personalized information retrieval
CN101169780A (en) * 2006-10-25 2008-04-30 华为技术有限公司 Semantic ontology retrieval system and method
CN103678576A (en) * 2013-12-11 2014-03-26 华中师范大学 Full-text retrieval system based on dynamic semantic analysis

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
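杨建林 (Yang Jianlin): "基于本体的文本信息检索研究" (Research on ontology-based text information retrieval), 《情报理论与实践》 (Information Studies: Theory & Application) *
赵彦锋等 (Zhao Yanfeng et al.): "基于本体的语义信息检索模型研究" (Research on an ontology-based semantic information retrieval model), 《软件工程师》 (Software Engineer) *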

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105701234A (en) * 2016-02-19 2016-06-22 浪潮通用软件有限公司 Achieving method based on C# full-text retrieval
CN107203532A (en) * 2016-03-16 2017-09-26 阿里巴巴集团控股有限公司 Construction method, the implementation method of search and the device of directory system
CN107203532B (en) * 2016-03-16 2021-03-16 阿里巴巴集团控股有限公司 Index system construction method, search realization method and device
CN106951411A (en) * 2017-03-24 2017-07-14 福州大学 The quick multi-key word Semantic Ranking searching method of data-privacy is protected in a kind of cloud computing
CN106951411B (en) * 2017-03-24 2019-10-15 福州大学 The quick multi-key word Semantic Ranking searching method of data-privacy is protected in a kind of cloud computing
CN107704453A (en) * 2017-10-23 2018-02-16 深圳市前海众兴电子商务有限公司 A kind of word semantic analysis, word semantic analysis terminal and storage medium
CN107704453B (en) * 2017-10-23 2021-10-08 深圳市前海众兴科研有限公司 Character semantic analysis method, character semantic analysis terminal and storage medium
CN109885653A (en) * 2019-01-30 2019-06-14 南京邮电大学 Text searching method
CN109885653B (en) * 2019-01-30 2022-10-04 南京邮电大学 Text retrieval method
CN110457435A (en) * 2019-07-26 2019-11-15 南京邮电大学 A kind of patent novelty analysis system and its analysis method

Similar Documents

Publication Publication Date Title
CN101305366B (en) Method and system for extracting and visualizing graph-structured relations from unstructured text
CN104239513B (en) A kind of semantic retrieving method of domain-oriented data
CN105335510A (en) Text data efficient searching method
CN108563773B (en) Knowledge graph-based legal provision accurate search ordering method
CN105045875B (en) Personalized search and device
CN103473283A (en) Method for matching textual cases
CN105160046A (en) Text-based data retrieval method
CN106446162A (en) Orient field self body intelligence library article search method
CN106407208A (en) Establishment method and system for city management ontology knowledge base
CN111737400A (en) Knowledge reasoning-based big data service tag expansion method and system
CN108664599A (en) Intelligent answer method, apparatus, intelligent answer server and storage medium
CN104484380A (en) Personalized search method and personalized search device
US20210350125A1 (en) System for searching natural language documents
CN110377751A (en) Courseware intelligent generation method, device, computer equipment and storage medium
CN116127084A (en) Knowledge graph-based micro-grid scheduling strategy intelligent retrieval system and method
CN112036178A (en) Distribution network entity related semantic search method
Sadr et al. Unified topic-based semantic models: A study in computing the semantic relatedness of geographic terms
CN110659357A (en) Geographic knowledge question-answering system based on ontology semantic similarity
Sharaff et al. Analysing fuzzy based approach for extractive text summarization
Cobos et al. Clustering of web search results based on an Iterative Fuzzy C-means Algorithm and Bayesian Information Criterion
CN117010373A (en) Recommendation method for category and group to which asset management data of power equipment belong
Kraft et al. Textual information retrieval with user profiles using fuzzy clustering and inferencing
CN116561264A (en) Knowledge graph-based intelligent question-answering system construction method
CN116401338A (en) Design feature extraction and attention mechanism based on data asset intelligent retrieval input and output requirements and method thereof
CN116049376A (en) Method, device and system for retrieving and replying information and creating knowledge

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20160217
