CN103823857A - Space information searching method based on natural language processing - Google Patents
- Publication number
- CN103823857A
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/31—Indexing; Data structures therefor; Storage structures
- G06F16/313—Selection or weighting of terms for indexing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3344—Query execution using natural language analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/951—Indexing; Web crawling techniques
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- Computational Linguistics (AREA)
- Software Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Machine Translation (AREA)
Abstract
The invention discloses a spatial information retrieval method based on natural language processing. The method comprises the following steps: (1) performing word segmentation on an index document and changing the weights of the words obtained by segmentation, to obtain a weighted index document; (2) receiving a query statement input by a user, performing word segmentation on the query statement, and changing the weights of the words obtained by segmentation, to obtain a weighted query statement; and (3) searching for the weighted query statement in the weighted index document. The method uses natural language processing tools and applies word segmentation and named entity recognition to the field of spatial information retrieval, thereby improving retrieval results.
Description
Technical field
The present invention relates to retrieval technology and natural language processing technology, and in particular to a spatial information retrieval method based on natural language processing.
Background art
Natural language processing is an important direction in the field of artificial intelligence. It studies the theories and methods that enable people and computers to communicate in natural language, and it is a discipline that combines computer science, mathematics, and linguistics. In the 1990s the field of natural language understanding and processing changed considerably: systems were required to handle real, large-scale text and to extract useful information from natural language text. Driven by these requirements, large-scale corpora and information-rich large-scale dictionaries were developed, which greatly facilitated low-level applications such as word segmentation and part-of-speech tagging.
A search engine is a system that uses specific computer programs to gather information from the Internet according to a certain strategy, organizes and processes that information, provides a retrieval service to users, and displays the information relevant to a user's search. Search engines include full-text indexes, directory indexes, meta search engines, vertical search engines, aggregated search engines, portal search engines, free link lists, and so on.
The work of a modern search engine can be divided into three stages: collection, preprocessing, and query. For retrieval in a vertical domain the collection stage is comparatively simple and usually only requires normalizing the metadata into a uniform format. The preprocessing stage, also called the index construction stage, is the most complex stage of a search engine, and most ranking algorithms are applied here. The search engine first cleans the data to be indexed, performing operations such as word segmentation and stop-word removal. It then performs the most important step, building the inverted index. Each entry of the inverted index maps a word to the frequency and positions at which that word occurs in the documents, which amounts to building a dictionary over all the data so that the relevant documents can be located quickly from a word. The query stage is when the search engine is actually in use, and all interaction with users takes place in this stage. The search engine cleans the user's input in the same way, using word segmentation and stop-word removal, then substitutes the terms to be retrieved into the inverted index and the scoring formula, and returns the results after ranking.
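To make the inverted index concrete, the following is a minimal sketch that assumes the documents have already been segmented into words. The function name `build_inverted_index` and the sample documents are illustrative and do not come from the patent.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to {doc_id: [positions]} for pre-segmented documents."""
    index = defaultdict(dict)
    for doc_id, tokens in docs.items():
        for pos, term in enumerate(tokens):
            # Record every position; the frequency is the length of the list.
            index[term].setdefault(doc_id, []).append(pos)
    return index


# Hypothetical, already segmented documents.
docs = {
    "d1": ["west", "lake", "hangzhou", "map"],
    "d2": ["hangzhou", "road", "network", "hangzhou"],
}
index = build_inverted_index(docs)
print(index["hangzhou"])
```

Looking up a term returns the documents that contain it together with the positions of each occurrence, which is the information the scoring step needs.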
Natural language technology and retrieval meet at many points, all of which are widely used in academia and industry, including word segmentation, keyword extraction, and semantic retrieval.
Summary of the invention
The invention provides a spatial information retrieval optimization method based on natural language processing. Its purpose is to improve the effectiveness of spatial information retrieval by using natural language processing algorithms.
A spatial information retrieval method based on natural language processing comprises:
Step 1: perform word segmentation on the index document and change the weight of each word obtained by segmentation, to obtain a weighted index document;
Step 2: the user inputs a query statement; perform word segmentation on the query statement and change the weight of each word obtained by segmentation, to obtain a weighted query statement;
Step 3: retrieve the weighted query statement in the weighted index document.
Here, the index document is the text stored in advance on the retrieval platform, and the query statement is the text that the user enters when searching. During retrieval, the query statement entered by the user is matched against the index documents, and the matching text is output as the retrieval result. By changing the weights of the words in the index document and the query statement, the weights of words that express spatial information are increased, which improves retrieval accuracy.
In Step 1, a global linear model is used to segment the index document, and in Step 2, a global linear model is used to segment the query statement.
The global linear model models the target sequence on the basis of the observation sequence and solves the sequence labeling problem. It takes both discriminative and generative modeling into consideration, accounts for the transition probabilities between neighboring labels, and performs global parameter optimization and decoding over the whole sequence.
The global linear model is built as follows:
Step 1-1: annotate the corpus so that each character in the annotated corpus has a corresponding label;
Step 1-2: train a model using preset feature templates and the annotated corpus, to obtain the global linear model.
For the rule-based machine learning, the invention uses a large number of word segmentation samples for geospatial data. These samples contain natural language sentences about spatial information that have already been segmented into words. The sample sentences come partly from open-source sample libraries and partly from sentences about spatial geographic information that were annotated manually. Together they form the corpus. Annotating the corpus facilitates the subsequent word segmentation.
In Step 1-2, the model is trained as follows:
Step 1-21: apply the feature templates to the annotated corpus to generate a feature list for each character;
Step 1-22: extract the features in each feature list and build the model from the features and their weights, where every weight is initialized to 0;
Step 1-23: use the model to predict every character in the annotated corpus, and handle the prediction for each character as follows:
If the prediction is correct, move on to the next character;
If the prediction is wrong, update the feature weights with an online update algorithm to obtain a new model, and predict the character again with the new model, until the prediction is correct or the number of weight updates exceeds a preset value.
The features represent the part of speech of a word; the feature templates include the part of speech of the current word and that of the previous word. Many prediction methods are available. For example, the Viterbi algorithm can be used for prediction, and the error between the predicted value and the actual value of a character is compared with a threshold to decide whether the character was predicted correctly.
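Step 1-21 can be illustrated with a small sketch. The patent text does not list its concrete templates, so the character-context templates below (current, previous, and next character, plus adjacent pairs) are an assumption: they are a common choice for character-level labeling and serve only as a stand-in. The names `char_features` and `feature_lists` are hypothetical.

```python
def char_features(chars, i):
    """Apply a small, assumed feature template to the character at position i."""
    prev_c = chars[i - 1] if i > 0 else "<BOS>"
    next_c = chars[i + 1] if i < len(chars) - 1 else "<EOS>"
    cur = chars[i]
    return [
        f"C0={cur}",                # current character
        f"C-1={prev_c}",            # previous character
        f"C+1={next_c}",            # next character
        f"C-1C0={prev_c}{cur}",     # previous + current
        f"C0C+1={cur}{next_c}",     # current + next
    ]


def feature_lists(sentence):
    """Generate one feature list per character, as in Step 1-21."""
    chars = list(sentence)
    return [char_features(chars, i) for i in range(len(chars))]


for feats in feature_lists("现代化"):
    print(feats)
```

Each feature string later becomes a key in the weight table, paired with a label.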
In Step 1 and Step 2, word segmentation is performed as follows:
Step a: input the text into the global linear model; the model applies the feature templates to the text and computes the feature list of the text from the weights;
Step b: use a dynamic programming algorithm to obtain all possible label combinations from the feature list, and use backtracking to find the optimal label combination;
Step c: divide the text into words according to the optimal label combination.
Here, the text in Steps a to c is the index document of Step 1 or the query statement of Step 2.
Because each character corresponds to one label, the optimal label combination represents the most likely word boundaries in the text, so the text is divided into words (segmented) according to it.
The dynamic programming algorithm is the Viterbi algorithm.
The Viterbi algorithm finds the best labeling with respect to the whole context and therefore yields better segmentation results.
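Steps a and b can be sketched as a standard Viterbi decoder over the four labels B, M, E, and S. The sketch assumes that the linear model has already produced a per-position score for each label (`emit`) and a score for each label transition (`trans`); both names and the sample scores are illustrative and not taken from the patent.

```python
TAGS = ["B", "M", "E", "S"]


def viterbi(emit, trans):
    """Return the highest-scoring label sequence.

    emit:  list of {label: score}, one dict per character
    trans: {(previous_label, label): score}; missing entries count as 0
    """
    if not emit:
        return []
    best = [{t: emit[0].get(t, 0.0) for t in TAGS}]
    back = [{}]
    for i in range(1, len(emit)):
        best.append({})
        back.append({})
        for cur in TAGS:
            # Best previous label for reaching `cur` at position i.
            prev = max(TAGS, key=lambda p: best[i - 1][p] + trans.get((p, cur), 0.0))
            best[i][cur] = (best[i - 1][prev] + trans.get((prev, cur), 0.0)
                            + emit[i].get(cur, 0.0))
            back[i][cur] = prev
    # Backtracking: start from the best final label and follow the pointers.
    path = [max(TAGS, key=lambda t: best[-1][t])]
    for i in range(len(emit) - 1, 0, -1):
        path.append(back[i][path[-1]])
    return path[::-1]


emit = [{"B": 1.0}, {"S": 0.6, "E": 0.5}]
print(viterbi(emit, {}))                     # local scores only
print(viterbi(emit, {("B", "S"): -5.0}))     # penalize the invalid B -> S move
```

The second call shows why the whole context matters: once the B to S transition is penalized, the decoder prefers E for the second character even though S has the higher local score.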
In Step 1 and Step 2, keyword extraction is used to change the word weights, increasing the weights of keywords.
Here, a keyword is a word that contains spatial information.
The TextRank algorithm is used for keyword extraction.
The TextRank algorithm uses a graph propagation model similar to Google's PageRank and extracts keywords well.
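As a rough illustration of TextRank-style keyword scoring, the sketch below builds an undirected co-occurrence graph over segmented words and runs a PageRank-like iteration on it. The window size, damping factor, iteration count, function name, and sample words are all assumptions chosen for the example.

```python
from collections import defaultdict


def textrank(words, window=2, damping=0.85, iterations=50):
    """Score words by iterating PageRank on an undirected co-occurrence graph."""
    neighbors = defaultdict(set)
    for i, w in enumerate(words):
        # Link each word to the words that follow it within the window.
        for j in range(i + 1, min(i + window + 1, len(words))):
            if words[j] != w:
                neighbors[w].add(words[j])
                neighbors[words[j]].add(w)
    scores = {w: 1.0 for w in neighbors}
    for _ in range(iterations):
        new_scores = {}
        for w in neighbors:
            # Each neighbor shares its score equally among its own neighbors.
            rank = sum(scores[u] / len(neighbors[u]) for u in neighbors[w])
            new_scores[w] = (1 - damping) + damping * rank
        scores = new_scores
    return scores


words = ["hangzhou", "road", "map", "hangzhou", "river", "map", "hangzhou", "bridge"]
scores = textrank(words, window=1)
print(sorted(scores, key=scores.get, reverse=True))
```

Words that co-occur with many other words accumulate the highest scores and are the natural candidates for a weight increase.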
In Step 1 and Step 2, a named entity recognition method is used to change the weight of each word after segmentation, increasing the weights of spatial information nouns in the text. The text is the index document in Step 1 and the query statement in Step 2.
Using named entity recognition to identify the nouns that express spatial information makes the retrieval results more concentrated in the spatial information domain, which improves retrieval efficiency.
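The weighting step can be sketched as follows. A real system would obtain the spatial nouns from a trained named entity recognizer; here a hand-written set of place names stands in for the recognizer's output, and the boost factor of 2.0 is an arbitrary assumption. The function name `boost_spatial_terms` is hypothetical.

```python
def boost_spatial_terms(tokens, spatial_entities, boost=2.0, base=1.0):
    """Return (token, weight) pairs; recognized spatial nouns get a higher weight."""
    return [
        (tok, base * boost if tok in spatial_entities else base)
        for tok in tokens
    ]


# Stand-in for the output of a named entity recognizer (assumption).
spatial_entities = {"hangzhou", "west lake"}

query_tokens = ["hangzhou", "road", "map"]
print(boost_spatial_terms(query_tokens, spatial_entities))
```

The same function would be applied to the segmented index document and to the segmented query statement, so that both sides carry comparable weights.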
The method of the invention uses natural language processing tools and applies word segmentation and named entity recognition to the field of spatial information retrieval, which improves retrieval results.
Brief description of the drawings
Fig. 1 is a schematic diagram of word segmentation using the Viterbi algorithm in one embodiment of the invention;
Fig. 2 is a schematic diagram of the result of Chinese word segmentation in the current embodiment of the invention;
Fig. 3 is a flowchart of the method of the invention.
Detailed description of the embodiments
Specific embodiments of the invention are described below with reference to the drawings. It should be noted that the embodiments described here are for illustration only and do not limit the invention.
As shown in Fig. 3, the steps of the embodiment of the invention are as follows:
Step 1: perform word segmentation on the index document and change the weight of each word obtained by segmentation, to obtain a weighted index document;
Step 2: the user inputs a query statement; perform word segmentation on the query statement and change the weight of each word obtained by segmentation, to obtain a weighted query statement;
Here, both the segmentation of the index document in Step 1 and the segmentation of the query statement in Step 2 are performed with the global linear model.
The global linear model is built as follows:
Step 1-1: annotate the corpus so that each character in the annotated corpus has a corresponding label;
Step 1-2: train a model using preset feature templates and the annotated corpus, to obtain the global linear model. The model is trained as follows:
Step 1-21: apply the feature templates to the annotated corpus to generate a feature list for each character. Taking a single Chinese character as an example,
Step 1-22: extract the features in each feature list and build the model from the features and their weights, where every weight is initialized to 0;
Step 1-23: use the model to predict each character in the annotated corpus:
If the prediction is correct, move on to the next character;
If the prediction is wrong, update the feature weights with an online update algorithm to obtain a new model, and repeat Step 1-23 until the prediction is correct or the number of weight updates exceeds a preset value.
In this embodiment, the Viterbi algorithm is used to predict the characters, and whether a prediction is accurate is judged from the error between the predicted value and the sample value of the character. If the prediction is wrong, that is, the predicted label differs from the actual label, the parameters are inadequate for predicting this character and must be updated. The specific update algorithm is the Online Passive-Aggressive algorithm.
The algorithm ends when the error of an iteration falls below the set threshold or the set number of iterations is exceeded.
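The training loop of Steps 1-22 and 1-23 can be sketched with a Passive-Aggressive update. This is a deliberately simplified version: it classifies each character independently with a multiclass Passive-Aggressive update rather than decoding whole sentences with the Viterbi algorithm, it assumes binary features with no duplicates in a feature list, and all names (`score`, `predict`, `pa_update`, `train`, `max_updates`) are hypothetical.

```python
from collections import defaultdict

TAGS = ["B", "M", "E", "S"]


def score(weights, features, tag):
    """Linear score: sum of the weights of all (feature, label) pairs."""
    return sum(weights[(f, tag)] for f in features)


def predict(weights, features):
    return max(TAGS, key=lambda t: score(weights, features, t))


def pa_update(weights, features, gold, predicted):
    """Move the weights just far enough to give the gold label a margin of 1."""
    margin = score(weights, features, gold) - score(weights, features, predicted)
    loss = max(0.0, 1.0 - margin)
    # Squared norm of the feature difference is 2 * len(features) for binary features.
    tau = loss / (2.0 * len(features))
    for f in features:
        weights[(f, gold)] += tau
        weights[(f, predicted)] -= tau


def train(samples, max_updates=10):
    weights = defaultdict(float)          # every weight starts at 0
    for features, gold in samples:
        for _ in range(max_updates):      # stop after a preset number of updates
            predicted = predict(weights, features)
            if predicted == gold:
                break                     # correct: go on to the next character
            pa_update(weights, features, gold, predicted)
    return weights


samples = [(["C0=现"], "B"), (["C0=代"], "M"), (["C0=化"], "E"), (["C0=的"], "S")]
weights = train(samples)
print(predict(weights, ["C0=代"]))
```

The update is "passive" when the prediction is already correct and "aggressive" otherwise, changing the weights by the smallest amount that fixes the current example.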
After training, the resulting global linear model can be used for prediction. There are several possible prediction methods; a commonly used one is dynamic programming. As shown in Fig. 2, a dynamic programming algorithm infers the label of the current state from the label of the previous state, and backtracking is finally used to find the optimal path, which is returned.
The index document in Step 1 and the query statement in Step 2 are segmented as follows:
Step a: input the text into the global linear model; the model applies the feature templates to the text and computes the feature list of the text from the weights.
Step b: use a dynamic programming algorithm to obtain all possible label combinations from the feature list, and use backtracking to find the optimal label combination.
In the current embodiment of the invention, the dynamic programming algorithm is the Viterbi algorithm. Fig. 1 is a schematic diagram of using the Viterbi algorithm to select the optimal label combination, that is, a schematic diagram of the label-based segmentation method. Taking Chinese word segmentation as an example, Fig. 2 shows an annotated sentence in which each character (including punctuation marks) corresponds to one label. The annotated corpus has only four possible labels: S marks a single-character word, B the beginning of a word, M the middle of a word, and E the end of a word. In the example above, the sentence is divided into:
| modernization | battleship | upper |, | or not exist | technology | simple | | post.
In the sentence, the single-character particle (left blank by the translation, between "simple" and "post") forms a word by itself, so it is labeled S. "Modernization" (现代化) consists of three characters: the first character (现) is labeled B, the beginning of the word; the second character (代) is labeled M, the middle of the word, meaning that the word has not yet ended; and the third character (化) is labeled E, the end of the word.
Step c: divide the text into words according to the optimal label combination.
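Step c amounts to cutting the character sequence after every E or S label. The following sketch shows this conversion; the function name `tags_to_words` is hypothetical, and the four-character input is an illustrative fragment rather than the full sentence of Fig. 2.

```python
def tags_to_words(chars, tags):
    """Split a character sequence into words using B/M/E/S labels."""
    words, current = [], ""
    for ch, tag in zip(chars, tags):
        current += ch
        if tag in ("E", "S"):      # a word ends at E (multi-character) or S (single)
            words.append(current)
            current = ""
    if current:                    # tolerate a sequence that ends without E or S
        words.append(current)
    return words


print(tags_to_words("现代化的", ["B", "M", "E", "S"]))
```

The three characters labeled B, M, E are joined into one word, and the character labeled S becomes a word of its own.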
After segmentation, the weight of each word is changed so that the subsequent retrieval takes the word weights into account, which improves retrieval efficiency and accuracy. One way of changing the word weights is keyword extraction with the TextRank algorithm. In this embodiment, named entity recognition is used to change the weights: words that express spatial information in the segmented text are given increased weight, which makes the retrieval more targeted to the professional domain.
Step 3: retrieve the weighted query statement in the weighted index document.
After the index document and the query statement have been weighted, a document and a query with higher similarity obtain a higher score at retrieval time and therefore rank higher in the search results. The similarity is computed as follows:
$$\mathrm{sim}(d, q) = \cos(\vec{d}, \vec{q}) = \frac{\vec{d} \cdot \vec{q}}{|\vec{d}| \times |\vec{q}|}$$
Here $\vec{d}$ represents the index document and $\vec{q}$ represents the query statement. The similarity between the two is computed with the cosine formula, and the weight information is already contained in $\vec{d}$ and $\vec{q}$. Increasing the weights of keywords gives index documents with high similarity a higher score, so that they rank higher in the retrieval results, which improves retrieval accuracy.
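The effect of the weights on the cosine score can be checked with a small sketch. Documents and queries are represented as dictionaries from word to weight; the words and weights below are invented for illustration, with the spatial term "hangzhou" given a weight of 2.0.

```python
import math


def cosine(d, q):
    """Cosine similarity between two sparse {word: weight} vectors."""
    dot = sum(w * q.get(term, 0.0) for term, w in d.items())
    norm_d = math.sqrt(sum(w * w for w in d.values()))
    norm_q = math.sqrt(sum(w * w for w in q.values()))
    if norm_d == 0 or norm_q == 0:
        return 0.0
    return dot / (norm_d * norm_q)


# Weighted vectors: the spatial noun carries a boosted weight.
query = {"hangzhou": 2.0, "map": 1.0}
doc1 = {"hangzhou": 2.0, "road": 1.0}   # shares the spatial term with the query
doc2 = {"map": 1.0, "road": 1.0}        # shares only the ordinary term

print(cosine(doc1, query), cosine(doc2, query))
```

With all weights set to 1.0 the two documents would tie at 0.5; with the boosted spatial term, the document that shares the place name scores 0.8 and ranks first.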
The invention combines word segmentation and named entity recognition and applies natural language processing to retrieval in the spatial geographic information domain, effectively improving the results of spatial geographic information retrieval.
Claims (9)
1. A spatial information retrieval method based on natural language processing, characterized by comprising:
Step 1: performing word segmentation on an index document and changing the weight of each word obtained by segmentation, to obtain a weighted index document;
Step 2: receiving a query statement input by a user, performing word segmentation on the query statement, and changing the weight of each word obtained by segmentation, to obtain a weighted query statement;
Step 3: retrieving the weighted query statement in the weighted index document.
2. The spatial information retrieval method based on natural language processing according to claim 1, characterized in that, in Step 1, a global linear model is used to segment the index document, and in Step 2, a global linear model is used to segment the query statement.
3. The spatial information retrieval method based on natural language processing according to claim 2, characterized in that the global linear model is built as follows:
Step 1-1: annotating a corpus so that each character in the annotated corpus has a corresponding label;
Step 1-2: training a model using preset feature templates and the annotated corpus, to obtain the global linear model.
4. The spatial information retrieval method based on natural language processing according to claim 3, characterized in that, in Step 1-2, the model is trained as follows:
Step 1-21: applying the feature templates to the annotated corpus to generate a feature list for each character;
Step 1-22: extracting the features in each feature list and building the model from the features and their weights, where every weight is initialized to 0;
Step 1-23: using the model to predict every character in the annotated corpus, and handling the prediction for each character as follows:
if the prediction is correct, moving on to the next character;
if the prediction is wrong, updating the feature weights with an online update algorithm to obtain a new model, and predicting the character again with the new model, until the prediction is correct or the number of weight updates exceeds a preset value.
5. The spatial information retrieval method based on natural language processing according to claim 4, characterized in that, in Step 1 and Step 2, word segmentation is performed as follows:
Step a: inputting the text into the global linear model, the global linear model applying the feature templates to the text and computing the feature list of the text from the weights;
Step b: using a dynamic programming algorithm to obtain all possible label combinations from the feature list, and using backtracking to find the optimal label combination;
Step c: dividing the text into words according to the optimal label combination;
wherein the text in Steps a to c is the index document of Step 1 or the query statement of Step 2.
6. The spatial information retrieval method based on natural language processing according to claim 5, characterized in that, in Step b, the dynamic programming algorithm is the Viterbi algorithm.
7. The spatial information retrieval method based on natural language processing according to claim 1, characterized in that, in Step 1 and Step 2, keyword extraction is used to change the word weights, increasing the weights of keywords.
8. The spatial information retrieval method based on natural language processing according to claim 7, characterized in that the TextRank algorithm is used for keyword extraction.
9. The spatial information retrieval method based on natural language processing according to claim 1, characterized in that, in Step 1 and Step 2, a named entity recognition method is used to change the weight of each word after segmentation, increasing the weights of spatial information nouns in the text, the text being the index document in Step 1 and the query statement in Step 2.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410059272.9A CN103823857B (en) | 2014-02-21 | 2014-02-21 | Space information searching method based on natural language processing |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410059272.9A CN103823857B (en) | 2014-02-21 | 2014-02-21 | Space information searching method based on natural language processing |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103823857A (en) | 2014-05-28 |
CN103823857B (en) | 2017-02-01 |
Family
ID=50758921
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410059272.9A Active CN103823857B (en) | 2014-02-21 | 2014-02-21 | Space information searching method based on natural language processing |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103823857B (en) |
Also Published As
Publication number | Publication date |
---|---|
CN103823857B (en) | 2017-02-01 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant |