CN103049442A - Method and device for identifying abbreviation-full name conversion of mobile phone network retrieval words - Google Patents

Method and device for identifying abbreviation-full name conversion of mobile phone network retrieval words

Info

Publication number
CN103049442A
Authority
CN
China
Prior art keywords
full name
abbreviation
phone network
string
cell phone
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN 201110307206
Other languages
Chinese (zh)
Inventor
卢玉成
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Priority to CN 201110307206
Publication of CN103049442A
Legal status: Pending

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method and a device for identifying the abbreviation-full name conversion of mobile phone network retrieval terms. The method comprises: decomposing the input abbreviation into a string of single characters; searching a full name database for full name strings that contain every character of that string, and reporting that there is no matching full name if none is found; scoring each candidate full name string found according to a relevance formula; and outputting the candidate full name string with the maximum score as the full name corresponding to the abbreviation. The method takes both accuracy and processing speed into account.

Description

Method and device for identifying abbreviation-full name conversion of mobile phone network retrieval terms
Technical field
The present invention relates to the field of data retrieval, and in particular to a method and a device for identifying the abbreviation-full name conversion of mobile phone network retrieval terms.
Background art
In everyday speech and writing, people habitually refer to an entity that has a long name by an abbreviation, for example using "北工大" to refer to "Beijing University of Technology" (北京工业大学).
As mobile Internet access becomes more widespread, remote query functions are used more and more. Unlike a computer, however, a mobile phone does not have a large screen that is convenient to read and easy to type on, so users prefer to obtain the information they need by querying with abbreviated terms. A method and a device that identify the full name behind an abbreviation used as a mobile phone network retrieval term are therefore needed.
A full name (F) is the complete designation of an entity or object. An abbreviation (A) is the designation obtained by compressing and simplifying the full name for the sake of brevity. If F and A stand in a full name/abbreviation relation, F is called the full name of A and A the abbreviation of F. The abbreviation processing problem is: given an abbreviation A, determine its full name.
Abbreviation processing has become a basic and key problem in applications such as natural language processing and information retrieval. Natural language processing is an important topic in computer science and artificial intelligence; it studies the theories and methods that allow people and computers to communicate effectively in natural language. With the widespread use of computers and the Internet, the volume of natural language text available to computers has grown enormously, and demand is rising quickly for applications such as large-scale text mining, information extraction, cross-language information processing and human-computer interaction. The object of natural language processing has accordingly shifted from small-scale restricted language to large-scale real text, and this research will have a far-reaching influence on people's lives. Information retrieval studies how to obtain the required information quickly and accurately from a large and disordered body of information. After many years of development the technology is fairly mature, and new retrieval techniques are moving toward intelligent, mobile, diversified and personalized services.
Methods for solving the abbreviation processing problem for network retrieval terms fall into two broad classes. The first is pattern-based: it mainly uses linguistics and natural language processing techniques, extracts relation patterns through lexical and syntactic analysis, and then obtains full name/abbreviation relations by pattern matching; its accuracy, however, rarely reaches a practical level. The second is statistics-based: it relies mainly on corpora and statistical language models and obtains full name/abbreviation relations by calculating the degree of association between concepts; its accuracy is high, but it cannot cope with very large-scale acquisition.
Other methods for handling the full name/abbreviation problem are slow and are hard to apply in real-time systems such as search engines.
Summary of the invention
In view of the problems in the prior art, an object of the present invention is to provide a method for identifying the abbreviation-full name conversion of mobile phone network retrieval terms, so as to solve the technical problem that prior-art methods cannot achieve both accuracy and processing speed.
Another object of the present invention is to provide a device for identifying the abbreviation-full name conversion of mobile phone network retrieval terms.
To achieve the above objects, the technical solution of the present invention is as follows:
A method for identifying the abbreviation-full name conversion of mobile phone network retrieval terms comprises the steps of: decomposing the input abbreviation into a string of single characters; searching a full name database for full name strings that contain every character of the single-character string and, if no such full name string is found, outputting that there is no matching full name; and scoring each candidate full name string found according to a relevance formula, and outputting the candidate full name string with the maximum score as the full name corresponding to the abbreviation.
A device for identifying the abbreviation-full name conversion of mobile phone network retrieval terms comprises a full name database, an input unit, a matching unit, a scoring unit, a comparator and an output unit. The input unit receives an input abbreviation. The matching unit decomposes the abbreviation received by the input unit into a string of single characters and searches the full name database for full name strings that contain every character of that string. The scoring unit scores each candidate full name string found according to a relevance formula. The comparator compares the scores produced by the scoring unit and selects the maximum score. The output unit outputs that there is no matching full name if no full name string is found; otherwise it outputs the full name string with the maximum score as the full name corresponding to the abbreviation.
The beneficial effect of the present invention is as follows. The method first receives an abbreviation A as input, then finds the candidate full names F1, ..., Fn of A in a full name database, and finally selects the best full name Fi (or several) according to specific judging rules as the full name of A. The method is both accurate and fast: in a test on a full name database containing 2101 entries (the names of regular institutions of higher education nationwide), accuracy reached 97%.
Description of drawings
Fig. 1 is a flowchart of the method for identifying the abbreviation-full name conversion of mobile phone network retrieval terms according to an embodiment of the invention.
Fig. 2 is a schematic diagram of the device for identifying the abbreviation-full name conversion of mobile phone network retrieval terms according to an embodiment of the invention.
Embodiment
Exemplary embodiments that embody the features and advantages of the present invention are described in detail below. It should be understood that the invention may vary across different embodiments without departing from its scope, and that the description and drawings are illustrative in nature and are not intended to limit the invention.
The main steps of the method of the present invention are: first receive an abbreviation A as input; then find the candidate full names F1, ..., Fn of A in a full name database; finally, according to a judging rule, select the best full name Fi and output it as the full name of A.
The method and device of the embodiment of the invention are described in detail below.
The method of the embodiment requires a full name database (a database containing the possible full names, in one or more domains, that correspond to abbreviated retrieval terms; FDB for short). In a given FDB, full names are stored in three columns in the format shown in Table 1.
Table 1
ID | Full name | Word segmentation
... | ... | ...
11022 | Beijing University of Technology (北京工业大学) | Beijing/polytechnical university/
... | ... | ...
21365 | Beijing Institute of Technology (北京理工大学) | Beijing Institute of Technology/
... | ... | ...
48271 | Beijing Technology and Business University (北京工商大学) | Beijing/industry and commerce/university/
... | ... | ...
For a given full name database, all full names in the FDB are first segmented into words. The segmentation tool may be the Chinese lexical analysis system ICTCLAS, for example a version of ICTCLAS that provides word segmentation without the named-entity function. The FDB after segmentation is stored as shown in the third column of Table 1.
After segmentation, the IDF value of each segmented term is calculated over the FDB using the TF/IDF formula commonly used in information extraction, as follows.
Suppose the FDB contains N full names in total and that, for a segmented term t, N_t of those full names contain t. The IDF score of t is then:
$$\mathrm{IDF}(t) = \log \frac{N}{N_t}$$
For example, from the test library shown in Table 2, the following values are obtained:
IDF("Beijing") = 3.718343
IDF("industry") = 5.165262
IDF("science and engineering") = 5.165262
IDF("industry and commerce") = 5.347583
IDF("university") = 2.298310
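The IDF calculation above can be sketched in Python as follows. This is a minimal illustration rather than the patent's implementation: the four-entry database and its Chinese segmentation are made up for the example, the function name `compute_idf` is hypothetical, and the natural logarithm is an assumption (it is consistent with the score 7.650169 reported for a single-term full name in the 2101-entry test, since ln 2101 ≈ 7.6502).

```python
import math
from collections import Counter

def compute_idf(segmented_names):
    """Return {term: log(N / N_t)} for a list of segmented full names."""
    n = len(segmented_names)
    doc_freq = Counter()
    for tokens in segmented_names:
        # Count each term once per full name, however often it occurs in it.
        doc_freq.update(set(tokens))
    return {term: math.log(n / count) for term, count in doc_freq.items()}

# Toy database (assumed segmentation, not the patent's test library).
toy_fdb = [
    ["北京", "工业", "大学"],
    ["北京", "理工", "大学"],
    ["北京", "工商", "大学"],
    ["清华", "大学"],
]
idf = compute_idf(toy_fdb)
```

A term that occurs in every full name, such as "大学" here, gets an IDF of zero, while a term that occurs only once gets the maximum value log N.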
Once the IDF value of every segmented term has been calculated, let the candidate full names be F_1, ..., F_n and let the segmented terms of F_i be W_i1, ..., W_ij, ..., W_ik. The relevance between abbreviation A and candidate full name F_i is then:
$$\mathrm{relevance}(A, F_i) = \sum_{j=1}^{k} \mathrm{IDF}(W_{ij})$$ (Formula 1)
For example, for A = "北工大", calculation gives:
relevance("北工大", "Beijing University of Technology") = 8.372779,
relevance("北工大", "Beijing Institute of Technology") = 7.650169,
relevance("北工大", "Beijing Technology and Business University") = 11.364237.
Formula 1 is biased toward candidates F_i that contain more segmented terms. For example, if "mathematics institute of Beijing University of Technology" also appears in the FDB, Formula 1 selects it rather than "Beijing University of Technology" as the full name of the abbreviation "北工大". Preferably, therefore, Formula 1 is adjusted to remove this bias by dividing its right-hand side by the number of segmented terms in F_i. Denoting that number by S_i gives the improved Formula 2:
$$\mathrm{relevance}(A, F_i) = \frac{1}{S_i} \sum_{j=1}^{k} \mathrm{IDF}(W_{ij})$$ (Formula 2)
Among the candidate full names that match a given abbreviation, those with fewer characters are often the better ones. Because a full name with fewer characters does not necessarily have fewer segmented terms, the right-hand side of Formula 2 is further divided by the number of characters |F_i| of F_i, giving the improved Formula 3:
$$\mathrm{relevance}(A, F_i) = \frac{1}{S_i \times |F_i|} \sum_{j=1}^{k} \mathrm{IDF}(W_{ij})$$ (Formula 3)
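Formulas 1 to 3 can be sketched as three small Python functions. The function names are hypothetical, and the Chinese strings are an assumed back-translation of "Beijing Technology and Business University" and its three-term segmentation from Table 1; the IDF values are the ones listed in the text.

```python
def relevance_formula1(tokens, idf):
    """Formula 1: sum of the IDF values of the segmented terms of F_i."""
    return sum(idf[t] for t in tokens)

def relevance_formula2(tokens, idf):
    """Formula 2: Formula 1 divided by the number of segmented terms S_i."""
    return relevance_formula1(tokens, idf) / len(tokens)

def relevance_formula3(full_name, tokens, idf):
    """Formula 3: Formula 2 further divided by the character count |F_i|."""
    return relevance_formula1(tokens, idf) / (len(tokens) * len(full_name))

# IDF values from the text; the Chinese terms are assumed back-translations.
idf = {"北京": 3.718343, "工商": 5.347583, "大学": 2.298310}
full_name = "北京工商大学"
tokens = ["北京", "工商", "大学"]

score1 = relevance_formula1(tokens, idf)
score2 = relevance_formula2(tokens, idf)
score3 = relevance_formula3(full_name, tokens, idf)
```

With these inputs the Formula 1 sum is 3.718343 + 5.347583 + 2.298310 = 11.364236, which agrees to rounding with the value 11.364237 reported above for this candidate. None of the three formulas uses the abbreviation A itself, which is why it does not appear as a parameter.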
In addition, when people shorten a full name, they usually keep its key part in the abbreviation. For example, in the segmentation "Beijing/post and telecommunications/university/" of "Beijing University of Post & Telecommunication", the IDF values of the three segmented terms are 3.718343, 5.452944 and 2.298310 respectively. The IDF value of "post and telecommunications" is the highest, so this term is the most representative and distinctive part of the full name. The abbreviation is therefore more likely to contain a character that represents this term, such as "邮" (post).
To express this relation in the algorithm, a penalty function can be defined for full name segmented terms that do not appear in the abbreviation, as shown in Formula 4.
[Figure BSA00000589375200053: definition of penalty(A, W_ij)] (Formula 4)
Formula 3 can then be improved to:
$$\mathrm{relevance}(A, F_i) = \frac{\sum_{j=1}^{k} \mathrm{IDF}(W_{ij})}{S_i \times |F_i| \times \left(1 + \sum_{j=1}^{k} \mathrm{penalty}(A, W_{ij})\right)}$$ (Formula 5)
Finally, how the characters of the abbreviation are distributed within candidate full name F_i is also an important factor in the quality of the full name. Specifically, the characters of an abbreviation are usually spread across the better full name rather than clustered together. For example, "China Agricultural University" (中国农业大学) is abbreviated "中农" rather than "国农", and "Beijing Institute of International Relations" is abbreviated "国关". The method of the present invention therefore introduces the divergence of the abbreviation's characters within candidate full name F_i:
$$\mathrm{divergence}(A, F_i) = \log \frac{|A| + 1}{\sum_{j=1}^{|A|-1} \mathrm{adjacency}(A, j, F_i) + 1}$$ (Formula 6)
Here |A| is the length of abbreviation A, and the adjacency function adjacency(A, j, F_i) determines whether the substring A_j A_{j+1} of A occurs in F_i.
[Figure BSA00000589375200062: definition of adjacency(A, j, F_i)]
(Formula 7)
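The divergence of Formula 6 can be sketched as follows. Because the image that defines Formula 7 is missing, the sketch assumes that adjacency returns 1 when the two-character substring occurs in the full name and 0 otherwise; the natural logarithm and the function names are likewise assumptions.

```python
import math

def adjacency(abbr, j, full_name):
    """Assumed Formula 7: 1 if abbr[j] and abbr[j + 1] occur side by side in full_name.

    j is 0-based here, whereas the formula counts from 1.
    """
    return 1 if abbr[j:j + 2] in full_name else 0

def divergence(abbr, full_name):
    """Formula 6: log((|A| + 1) / (number of adjacent pairs + 1))."""
    adjacent_pairs = sum(adjacency(abbr, j, full_name) for j in range(len(abbr) - 1))
    return math.log((len(abbr) + 1) / (adjacent_pairs + 1))

spread = divergence("中农", "中国农业大学")     # characters are not adjacent in the full name
clustered = divergence("国农", "中国农业大学")  # characters are adjacent in the full name
```

Under these assumptions the abbreviation whose characters are spread out scores log 3 and the clustered one scores log 1.5, so the spread-out abbreviation is preferred, as the text describes.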
Formula 5 is then improved to:
$$\mathrm{relevance}(A, F_i) = \mathrm{divergence}(A, F_i) \times \frac{\sum_{j=1}^{k} \mathrm{IDF}(W_{ij})}{S_i \times |F_i| \times \left(1 + \sum_{j=1}^{k} \mathrm{penalty}(A, W_{ij})\right)}$$ (Formula 8)
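How the factors of Formula 8 fit together can be sketched as below. The penalty and divergence functions are passed in as parameters because their exact definitions (Formulas 4 and 7) are given only as images in the source; `example_penalty` and `no_divergence` are hypothetical stand-ins used purely to exercise the code, not the patent's definitions.

```python
def relevance_formula8(abbr, full_name, tokens, idf, penalty, divergence):
    """Formula 8: divergence * sum(IDF) / (S_i * |F_i| * (1 + sum(penalty)))."""
    idf_sum = sum(idf[t] for t in tokens)
    penalty_sum = sum(penalty(abbr, t) for t in tokens)
    denominator = len(tokens) * len(full_name) * (1 + penalty_sum)
    return divergence(abbr, full_name) * idf_sum / denominator

def example_penalty(abbr, term):
    # Hypothetical stand-in for Formula 4: penalise a term none of whose
    # characters appears in the abbreviation.
    return 0.0 if any(ch in abbr for ch in term) else 1.0

def no_divergence(abbr, full_name):
    # Hypothetical stand-in that switches the divergence factor off.
    return 1.0

idf = {"北京": 3.718343, "工商": 5.347583, "大学": 2.298310}
name, tokens = "北京工商大学", ["北京", "工商", "大学"]

baseline = relevance_formula8("北工大", name, tokens, idf, example_penalty, no_divergence)
penalised = relevance_formula8("北大", name, tokens, idf, example_penalty, no_divergence)
```

With a zero penalty and a divergence of 1 the expression reduces to Formula 3, which gives a quick consistency check; with this stand-in penalty, an abbreviation that omits every character of "工商" halves the score.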
Based on the above, the method of the embodiment receives the abbreviation A of an entity and, through the above calculations, finds in the FDB the one or more full names that best match A and outputs them. The specific steps are as follows:
Step 1: Decompose the input abbreviation A into a string of single characters, denoted [A].
For example, for A = "北工大", decomposition gives [A] = "北", "工", "大".
Step 2: Find in the FDB the full name strings that contain every character in [A].
Step 3: If none is found, output that there is no matching full name.
Step 4: Otherwise, denote the candidate full names of A found in the FDB by F_1, ..., F_n.
For example, for A = "北工大", the FDB yields F_1 = "Beijing University of Technology", F_2 = "Beijing Institute of Technology" and F_3 = "Beijing Technology and Business University".
Step 5: Score F_1, ..., F_n according to Formula 3, Formula 5 or Formula 8, calculating relevance(A, F_1), relevance(A, F_2), ..., relevance(A, F_n).
For example, for A = "北工大", relevance("北工大", "Beijing University of Technology") = 0.967261, relevance("北工大", "Beijing Institute of Technology") = 0.883782 and relevance("北工大", "Beijing Technology and Business University") = 0.875232.
Step 6: Output the candidate full name with the maximum score as the full name of A.
For example, for A = "北工大", relevance("北工大", "Beijing University of Technology") = 0.967261 is the maximum score, so the full name output for A is "Beijing University of Technology".
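Steps 1 to 6 can be sketched as a single function. The scoring function is supplied by the caller, so any of Formulas 3, 5 or 8 could be plugged in; the function name `expand_abbreviation`, the three-entry database (an assumed back-translation of Table 1) and the toy score that simply prefers shorter names are all illustrative assumptions.

```python
def expand_abbreviation(abbr, fdb, score):
    """Return the best-scoring full name for abbr, or None if nothing matches."""
    chars = list(abbr)                                   # Step 1: single characters
    candidates = [name for name in fdb
                  if all(ch in name for ch in chars)]    # Step 2: contains every character
    if not candidates:
        return None                                      # Step 3: no matching full name
    # Steps 4-6: score every candidate and keep the one with the maximum score.
    return max(candidates, key=lambda name: score(abbr, name))

fdb = ["北京工业大学", "北京理工大学", "北京工商大学"]

def toy_score(abbr, name):
    # Hypothetical placeholder score: shorter full names score higher.
    return 1.0 / len(name)

match = expand_abbreviation("北理工", fdb, toy_score)
other = expand_abbreviation("北商", fdb, toy_score)
missing = expand_abbreviation("清华", fdb, toy_score)
```

Returning None for the no-match case leaves it to the caller to render the "no matching full name" message.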
The device of the embodiment for identifying the abbreviation-full name conversion of mobile phone network retrieval terms comprises a full name database, an input unit, a matching unit, a scoring unit, a comparator and an output unit.
The input unit receives an input abbreviation.
The matching unit decomposes the abbreviation received by the input unit into a string of single characters and searches the full name database for full name strings that contain every character of that string; if no such full name string is found, the output unit outputs that there is no matching full name.
The scoring unit scores each candidate full name string found according to a relevance formula, for example Formula 3, Formula 5 or Formula 8.
The comparator compares the scores produced by the scoring unit and selects the maximum score.
The output unit outputs the full name string with the maximum score as the full name corresponding to the abbreviation.
A user interface can be designed in which the user enters an abbreviation and views the full name query results, thereby implementing the method and device of the present invention. The whole system can be written in PHP and deployed in a Linux + Apache 2.0.61 + PHP 5.2.5 environment so that users can access it through a web page.
The data structures used by the conversion identification method and device of the present invention are as follows:
(1) The full name database FDB stores the collected full names of universities and related information. In addition, to speed up queries, some intermediate results of calculating the relevance relevance(A, F_i) between abbreviation A and candidate full name F_i need to be cached. In relevance Formula 8, the factor
$$\frac{1}{S_i \times |F_i|} \sum_{j=1}^{k} \mathrm{IDF}(W_{ij})$$
does not depend on the abbreviation query A, so it can be calculated in advance for each university full name and cached. During online query processing, only penalty(A, W_ij) and divergence(A, F_i) then need to be calculated and combined with the cached result according to Formula 8 to obtain the relevance relevance(A, F_i) between the abbreviation query and the candidate full name. The relation schema RS of the FDB data is as follows:
(formula 9)
(2) The segmented-term IDF database WIDF stores the IDF values of all segmented terms that occur in the segmented full names of the full name database. Besides being used to calculate the cached values in the FDB, these values are also used to calculate penalty(A, W_ij) during online queries. Because the time complexity of obtaining these data is
[Figure BSA00000589375200081: time complexity expression]
and the process includes the comparatively time-consuming full name segmentation operation, these data also need to be cached. The relation schema of the WIDF data is as follows:
WIDF (WID, segmented-term key, IDF value) (Formula 10)
(3) A single-character inverted list over the full name database is built so that, when the database is large, candidate full names F_i can be located quickly from the Chinese characters contained in the abbreviation query. An inverted list, also called an inverted index, is an index data structure that, in full-text search, stores the mapping from a word to the positions where it occurs in a set of documents, and it supports fast full-text search over large volumes of data. For example, for the full name database FDB shown in Table 1, the inverted index in Table 2 can be built:
Table 2
Key | Full name ID
北 | {11022,21365,48271}
大 | {11022,21365,48271}
工 | {11022,21365,48271}
京 | {11022,21365,48271}
理 | {21365}
商 | {48271}
学 | {11022,21365,48271}
业 | {11022}
... | ...
Given the query "北理工", the inverted index quickly retrieves the full name entries in the FDB that contain all three characters, as the candidate set for further processing:
{11022,21365,48271}∩{21365}∩{11022,21365,48271}={21365}
Using the inverted index avoids a full table scan of the FDB at query time and, when the FDB is very large, effectively improves the efficiency of candidate generation.
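The cached factor from (1) and the inverted list from (3) can be sketched together as follows. This is an illustration under stated assumptions: the function names are hypothetical, the Chinese full names are an assumed back-translation of Table 1, and each name is segmented into three terms for simplicity (Table 1 segments two of them differently) so that the five IDF values listed in the text can be reused.

```python
from collections import defaultdict

def static_factor(full_name, tokens, idf):
    """Query-independent part of Formula 8: sum(IDF) / (S_i * |F_i|)."""
    return sum(idf[t] for t in tokens) / (len(tokens) * len(full_name))

def build_inverted_index(fdb):
    """Map every single character to the set of full name IDs that contain it."""
    index = defaultdict(set)
    for name_id, (full_name, _tokens) in fdb.items():
        for ch in full_name:
            index[ch].add(name_id)
    return index

def find_candidates(abbr, index):
    """Intersect the posting sets of all characters in the abbreviation."""
    postings = [index.get(ch, set()) for ch in abbr]
    return set.intersection(*postings) if postings else set()

idf = {"北京": 3.718343, "工业": 5.165262, "理工": 5.165262,
       "工商": 5.347583, "大学": 2.298310}
fdb = {
    11022: ("北京工业大学", ["北京", "工业", "大学"]),
    21365: ("北京理工大学", ["北京", "理工", "大学"]),
    48271: ("北京工商大学", ["北京", "工商", "大学"]),
}

# Built once offline, then reused by every online query.
cache = {name_id: static_factor(name, tokens, idf)
         for name_id, (name, tokens) in fdb.items()}
index = build_inverted_index(fdb)
candidates = find_candidates("北理工", index)
```

The intersection for "北理工" reproduces the set calculation shown above. At query time only the penalty and divergence terms would remain to be calculated for the IDs in `candidates`, with the rest read from `cache`.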
The system can be event-driven. On startup, the system checks the validity of the required data stores and reports their state.
The user selects system functions as needed. The system receives the user's request events and calls the corresponding modules to respond. For example, the user can request that the various cached data be generated or updated. If the cached data are valid, the user can enter an abbreviation to query the corresponding full name.
The query result lists the candidate full names retrieved, their word segmentation, and the relevance calculated by each of the algorithms being compared. The results are sorted by the relevance obtained from Formula 8.
To verify the accuracy of the conversion identification method of the present invention, the full names of regular institutions of higher education nationwide, shown in Table 3, were selected as experimental data. These data are the statistics as of May 2011.
100 abbreviations were randomly selected as test data (see Table 3) and converted by the method of the present invention; the results are shown in the third column of Table 3.
Table 3
[Figures BSA00000589375200091, BSA00000589375200111, BSA00000589375200121: Table 3]
As Table 3 shows, in the test on the 2101 names of regular institutions of higher education nationwide, accuracy reached 97%; only the results for item 25 (Chinese University of Science and Technology), item 98 (sea is large) and item 100 (river is large) deviate to some degree.
The conversion identification method of the present invention is general and can achieve similar results on different full name databases.
In summary, the conversion identification method and device of the present invention first receive an abbreviation A as input, then find the candidate full names F1, ..., Fn of A in a full name database, and finally, according to a judging rule, select the best full name Fi and output it as the full name of A. The method is both accurate and fast.
Those skilled in the art will recognize that changes and refinements made without departing from the scope and spirit of the invention as disclosed in the appended claims all fall within the protection scope of the claims.

Claims (8)

1. A method for identifying the abbreviation-full name conversion of mobile phone network retrieval terms, comprising the steps of:
decomposing the input abbreviation into a string of single characters;
searching a full name database for full name strings that contain every character of the single-character string and, if no such full name string is found, outputting that there is no matching full name;
scoring each candidate full name string found according to a relevance formula, and outputting the candidate full name string with the maximum score as the full name corresponding to the abbreviation.
2. The method for identifying the abbreviation-full name conversion of mobile phone network retrieval terms as claimed in claim 1, characterized in that the relevance formula is:
$$\mathrm{relevance}(A, F_i) = \mathrm{divergence}(A, F_i) \times \frac{\sum_{j=1}^{k} \mathrm{IDF}(W_{ij})}{S_i \times |F_i| \times \left(1 + \sum_{j=1}^{k} \mathrm{penalty}(A, W_{ij})\right)}$$
where A is the input abbreviation, F_i is the candidate full name string, S_i is the number of segmented terms in F_i, and W_ij is a segmented term of F_i;
penalty(A, W_ij) is the penalty function for full name segmented terms that do not appear in abbreviation A, and
[Figure FSA00000589375100012: definition of penalty(A, W_ij)]
divergence(A, F_i) is the divergence of the abbreviation's characters within candidate full name F_i:
$$\mathrm{divergence}(A, F_i) = \log \frac{|A| + 1}{\sum_{j=1}^{|A|-1} \mathrm{adjacency}(A, j, F_i) + 1}$$
where |A| is the length of abbreviation A, and the adjacency function adjacency(A, j, F_i) determines whether the substring A_j A_{j+1} of A occurs in F_i, and
[Figure FSA00000589375100014: definition of adjacency(A, j, F_i)]
3. The method for identifying the abbreviation-full name conversion of mobile phone network retrieval terms as claimed in claim 2, characterized by further comprising the step of calculating the relevance in advance for each full name and caching the result.
4. The method for identifying the abbreviation-full name conversion of mobile phone network retrieval terms as claimed in claim 3, characterized by further comprising the step of building a single-character inverted list of the full name database.
5. A device for identifying the abbreviation-full name conversion of mobile phone network retrieval terms, comprising a full name database, an input unit, a matching unit, a scoring unit, a comparator and an output unit, wherein:
the input unit receives an input abbreviation;
the matching unit decomposes the abbreviation received by the input unit into a string of single characters, and searches the full name database for full name strings that contain every character of the single-character string;
the scoring unit scores each candidate full name string found according to a relevance formula;
the comparator compares the scores produced by the scoring unit and selects the maximum score;
the output unit outputs that there is no matching full name if no full name string is found, and otherwise outputs the full name string with the maximum score as the full name corresponding to the abbreviation.
6. The device for identifying the abbreviation-full name conversion of mobile phone network retrieval terms as claimed in claim 5, characterized in that the relevance formula is:
$$\mathrm{relevance}(A, F_i) = \mathrm{divergence}(A, F_i) \times \frac{\sum_{j=1}^{k} \mathrm{IDF}(W_{ij})}{S_i \times |F_i| \times \left(1 + \sum_{j=1}^{k} \mathrm{penalty}(A, W_{ij})\right)}$$
where A is the input abbreviation, F_i is the candidate full name string, S_i is the number of segmented terms in F_i, and W_ij is a segmented term of F_i;
penalty(A, W_ij) is the penalty function for full name segmented terms that do not appear in abbreviation A, and
[Figure FSA00000589375100022: definition of penalty(A, W_ij)]
divergence(A, F_i) is the divergence of the abbreviation's characters within candidate full name F_i:
$$\mathrm{divergence}(A, F_i) = \log \frac{|A| + 1}{\sum_{j=1}^{|A|-1} \mathrm{adjacency}(A, j, F_i) + 1}$$
where |A| is the length of abbreviation A, and the adjacency function adjacency(A, j, F_i) determines whether the substring A_j A_{j+1} of A occurs in F_i, and
[Figure FSA00000589375100024: definition of adjacency(A, j, F_i)]
7. The device for identifying the abbreviation-full name conversion of mobile phone network retrieval terms as claimed in claim 6, characterized in that the device further comprises a cache unit that calculates the relevance in advance for each full name and caches the result.
8. The device for identifying the abbreviation-full name conversion of mobile phone network retrieval terms as claimed in claim 7, characterized in that the device further comprises an inverted-list unit that builds a single-character inverted list of the full name database.
CN 201110307206 2011-10-12 2011-10-12 Method and device for identifying abbreviation-full name conversion of mobile phone network retrieval words Pending CN103049442A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 201110307206 CN103049442A (en) 2011-10-12 2011-10-12 Method and device for identifying abbreviation-full name conversion of mobile phone network retrieval words

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN 201110307206 CN103049442A (en) 2011-10-12 2011-10-12 Method and device for identifying abbreviation-full name conversion of mobile phone network retrieval words

Publications (1)

Publication Number Publication Date
CN103049442A true CN103049442A (en) 2013-04-17

Family

ID=48062086

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 201110307206 Pending CN103049442A (en) 2011-10-12 2011-10-12 Method and device for identifying abbreviation-full name conversion of mobile phone network retrieval words

Country Status (1)

Country Link
CN (1) CN103049442A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110377887A (en) * 2019-07-19 2019-10-25 出门问问(苏州)信息科技有限公司 Entity abbreviation method for transformation, readable storage medium storing program for executing and electronic equipment
CN110728150A (en) * 2019-10-08 2020-01-24 支付宝(杭州)信息技术有限公司 Named entity screening method, device, equipment and readable medium
CN111782975A (en) * 2020-06-28 2020-10-16 北京百度网讯科技有限公司 Retrieval method and device and electronic equipment

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110377887A (en) * 2019-07-19 2019-10-25 出门问问(苏州)信息科技有限公司 Entity abbreviation method for transformation, readable storage medium storing program for executing and electronic equipment
CN110728150A (en) * 2019-10-08 2020-01-24 支付宝(杭州)信息技术有限公司 Named entity screening method, device, equipment and readable medium
CN110728150B (en) * 2019-10-08 2023-06-20 支付宝(杭州)信息技术有限公司 Named entity screening method, named entity screening device, named entity screening equipment and readable medium
CN111782975A (en) * 2020-06-28 2020-10-16 北京百度网讯科技有限公司 Retrieval method and device and electronic equipment

Similar Documents

Publication Publication Date Title
CN106777274B (en) A kind of Chinese tour field knowledge mapping construction method and system
CN104391942B (en) Short essay eigen extended method based on semantic collection of illustrative plates
CN107220237A (en) A kind of method of business entity's Relation extraction based on convolutional neural networks
CN104679885B (en) A kind of user's search string organization names recognition method based on semantic feature model
CN103678576A (en) Full-text retrieval system based on dynamic semantic analysis
CN101299217B (en) Method, apparatus and system for processing map information
CN102693279B (en) Method, device and system for fast calculating comment similarity
CN104866593A (en) Database searching method based on knowledge graph
CN110781670B (en) Chinese place name semantic disambiguation method based on encyclopedic knowledge base and word vectors
CN103488724A (en) Book-oriented reading field knowledge map construction method
CN104484380A (en) Personalized search method and personalized search device
CN104408148A (en) Field encyclopedia establishment system based on general encyclopedia websites
CN104199965A (en) Semantic information retrieval method
CN103246670A (en) Microblog sorting, searching, display method and system
CN112749265B (en) Intelligent question-answering system based on multiple information sources
CN110362678A (en) A kind of method and apparatus automatically extracting Chinese text keyword
CN106202294A (en) The related news computational methods merged based on key word and topic model and device
CN103106287A (en) Processing method and processing system for retrieving sentences by user
CN104715063A (en) Search ranking method and search ranking device
CN103116573A (en) Field dictionary automatic extension method based on vocabulary annotation
CN114090861A (en) Education field search engine construction method based on knowledge graph
CN104881399A (en) Event identification method and system based on probability soft logic PSL
CN101923556A (en) Method and device for searching webpages according to sentence serial numbers
CN102567392A (en) Control method for interest subject excavation based on time window
CN113269477B (en) Scientific research project query scoring model training method, query method and device

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20130417