CN108959269A - Automatic sentence ordering method and device - Google Patents

Automatic sentence ordering method and device

Info

Publication number
CN108959269A
Authority
CN
China
Prior art keywords
sentence
intersection
term vector
ordering method
word
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810839470.5A
Other languages
Chinese (zh)
Other versions
CN108959269B (en)
Inventor
刘杰
骆力明
周建设
史金生
袁克柔
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
North China University of Technology
Original Assignee
Capital Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Capital Normal University
Priority to CN201810839470.5A
Publication of CN108959269A
Application granted
Publication of CN108959269B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

The present invention provides an automatic sentence ordering method and device. The method comprises: performing sentence-segmentation preprocessing on a document set to obtain a sentence set; training on the sentence set to obtain a word-vector dictionary, and clustering the word vectors in combination with a preset Chinese thesaurus; based on a conditional entropy algorithm and the word-vector clustering result, obtaining the proximity between sentences in the sentence set; and ordering the sentences in the sentence set using a Markov random walk model. Through semantic analysis, the invention can automatically evaluate the logical coherence of the sentences in a text, improving evaluation efficiency and reducing evaluation error; it also reduces the influence of data sparsity and improves the efficiency with which sentence ordering results are generated.

Description

Automatic sentence ordering method and device
Technical field
The present invention relates to the field of computer technology, and in particular to an automatic sentence ordering method and device.
Background art
With the rapid development of Internet technology, research on the automatic scoring of Chinese essays has gradually emerged. Such scoring is of great significance for improving scoring efficiency, fundamentally eliminating inconsistency in essay evaluation, and controlling scoring error. Because the logic of the Chinese language is highly complex, existing research evaluates essays mostly from the angles of vocabulary use, grammar, expression, essay length, use of connectives, use of rhetorical devices, and topical consistency, and does not address the reasonableness of an essay's internal logic. In essay evaluation, however, logical reasonableness is an equally important indicator of language expression ability. When the logic between the sentences of a text is reasonable, the sentences are organized in a reasonable order, and such a text is highly readable.
In the prior art, research on sentence ordering appears mainly in the field of automatic text summarization. The sentence ordering task in that field chiefly organizes either a set of manually written summary sentences whose order has been shuffled, or a set of machine-selected candidate summary sentences, into a reasonable and readable summary. Existing research falls roughly into the following categories. First, determining sentence order from temporal information in the sentences: ordering is based on the time at which sentences occur in the corpus; in a news corpus, for example, the temporal information inside each sentence is extracted and then used to assist a sorting algorithm. Second, determining sentence order from implicit relationships between the sentences of a document collection: this approach mines the logical relationships implied between sentences from the transfer of in-sentence entities across sentences, the continuation state of event labels, topic shifts, and the like. Third, mining the natural order of sentences from a large-scale corpus: this approach computes the proximity between adjacent sentences at the lexical level and estimates the conditional probability that two sentences form a preceding/following pair, yielding an ordering result.
These approaches nevertheless have problems. For the first and second categories, methods that use temporal information, succession relationships between sentences, sentence topics, and the like are quite limited, because they cannot order the sentences of a text that lacks such specific information; moreover, because machines understand natural language poorly, identifying topic words, time words, and implicit times, and mining implicit connectives, is itself difficult. For the third category, computing word collocations between sentence pairs from a large-scale corpus involves a large parameter space and is prone to data sparsity, which hampers the subsequent proximity calculation.
Summary of the invention
To address the problems in the prior art, the present invention provides an automatic sentence ordering method, comprising:
(1) performing sentence-segmentation preprocessing on a document set to obtain a sentence set;
(2) training on the sentence set to obtain a word-vector dictionary, and clustering the word vectors in combination with a preset Chinese thesaurus;
(3) based on a conditional entropy algorithm and the word-vector clustering result, computing the logical collocation information of words between sentence pairs in the sentence set, thereby obtaining the proximity between sentences in the sentence set.
Further, the calculation formula of the conditional entropy algorithm is as follows:
$$H(S_m \mid S_{m-1}) = -\sum_{w_i \in S_{m-1}} \sum_{w_j \in S_m} p(w_i, w_j) \log p(w_j \mid w_i)$$
where H(S_m | S_{m-1}) is the conditional entropy between two adjacent sentences in the sentence set; S_m and S_{m-1} are two adjacent sentences; m is the index of a sentence in the sentence set and is a positive integer with 2 ≤ m ≤ n; n is the total number of sentences in the sentence set; w_i is a word occurring in S_{m-1} and w_j is a word occurring in S_m, where i and j are positive integers; p(w_i, w_j) is the probability that w_i and w_j occur together; and p(w_j | w_i) is the conditional probability.
Further, the sentences in the sentence set may be ordered using a neural-network-based algorithm that recursively derives global information from the whole graph to determine the importance of any node.
Further, the neural network algorithm is based on a Markov random walk model.
Further, the word vectors are clustered into 500-1500 classes.
Further, the preset Chinese thesaurus contains more than 7000 classes of synonyms.
Further, the automatic sentence ordering method also includes a step of evaluating the sentence ordering result, in which the ordering result is scored using ROUGE-L.
Further, the threshold of the ROUGE-L score is set to 0.6: the true sentence order of the document is compared with the order produced by the automatic sentence ordering method, and if the ROUGE-L score is greater than or equal to the threshold, the two orderings are considered similar.
Further, the sentence set is divided into a set of sentence blocks, each containing 2-3 sentences;
first, based on the conditional entropy algorithm and the word-vector clustering result, the logical collocation information of words between adjacent sentence blocks in the sentence-block set is computed, thereby obtaining the proximity between sentence blocks;
then, based on the conditional entropy algorithm and the word-vector clustering result, the logical collocation information of words between sentence pairs within each sentence block is computed, thereby obtaining the proximity between the sentences within each sentence block.
The present invention also provides a device for automatic sentence ordering, comprising:
a document preprocessing module, configured to segment a document set into sentences to obtain the sentence set corresponding to the document set;
a word-vector clustering module, configured to train on the sentence set to obtain a word-vector dictionary and to cluster the word vectors in combination with a preset Chinese thesaurus;
a proximity computing module, configured to compute, based on a conditional entropy algorithm and the word-vector clustering result, the logical collocation information of words between sentence pairs in the sentence set, thereby obtaining the proximity between sentences in the sentence set;
an ordering result generation module, configured to order the sentences using a Markov random walk model according to the computed sentence proximity, to obtain an ordering result.
The beneficial effects of the invention are as follows:
(1) The semantic analysis of the invention enables automatic evaluation of the logical coherence of the sentences in a text, improving evaluation efficiency and reducing evaluation error.
(2) The invention uses an unsupervised method and generalizes well to both large and small corpora.
(3) The invention orders sentences using a Markov random walk model, which is algorithmically efficient and yields more reliable ordering results.
(4) The invention groups words semantically by clustering word vectors, which reduces the influence of data sparsity and improves computational efficiency.
(5) Combining a Chinese thesaurus reduces the inaccuracy of automatic clustering and improves the sentence ordering results.
(6) Splitting a paragraph into sentence blocks yields a more reasonable automatic sentence ordering.
Brief description of the drawings
To explain the specific embodiments of the present invention or the technical solutions of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below show some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of the automatic sentence ordering method of the present invention;
Fig. 2 is a structural schematic diagram of the automatic sentence ordering device of the present invention.
Detailed description of the embodiments
The technical solutions of the present invention are described clearly and completely below with reference to the drawings. The described embodiments are some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the protection scope of the present invention.
In the description of the present invention, it should be noted that the same or similar reference numerals denote the same or similar elements, or elements having the same or similar functions, throughout; a sentence pair refers to two adjacent sentences in the sentence set.
Fig. 1 is a flowchart of the automatic sentence ordering method according to one embodiment of the present invention. The method comprises the following steps:
(1) A corpus of 16329 primary and secondary school essays about people is obtained from essay websites on the Internet, together with 109404 essays of other categories, giving a document set of 125733 documents in total; sentence-segmentation preprocessing is applied to the document set to obtain the sentence set.
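As a non-limiting illustration, the sentence-segmentation preprocessing of step (1) could be sketched as follows. The set of sentence-ending punctuation marks and the names `split_sentences` and `build_sentence_set` are assumptions made for illustration; the disclosure does not specify them.

```python
import re

# Assumed delimiter set: Chinese and ASCII sentence-ending marks,
# optionally followed by closing quotes or brackets.
_SENT_END = re.compile(r'([。！？!?…]+[”’』」）]*)')


def split_sentences(text):
    """Split one document into sentences, keeping the terminal punctuation."""
    # With one capturing group, re.split alternates: body, delimiter, body, ...
    parts = _SENT_END.split(text)
    sentences = []
    for i in range(0, len(parts) - 1, 2):
        sentence = (parts[i] + parts[i + 1]).strip()
        if sentence:
            sentences.append(sentence)
    tail = parts[-1].strip()  # trailing text without terminal punctuation
    if tail:
        sentences.append(tail)
    return sentences


def build_sentence_set(documents):
    """Map each document of the document set to its list of sentences."""
    return [split_sentences(doc) for doc in documents]
```

Keeping the punctuation attached to each sentence is a design choice of this sketch; a later tokenization step may discard it.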
(2) A word-vector dictionary is obtained by training on the sentence set, and the word vectors are clustered in combination with a preset Chinese thesaurus. The word vectors are preferably trained with word2vec, giving a word-vector dictionary of 79770 words in total. The preset Chinese thesaurus contains more than 7000 classes of synonyms and is more preferably the "Harbin Institute of Technology Information Retrieval Research Lab Chinese Thesaurus (Extended Edition)", which covers 11769 classes of synonyms in total. Preferably, the word vectors are clustered into 500-1500 classes, more preferably 1500 classes.
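The disclosure does not state how the thesaurus is combined with the clustering. One possible reading, offered only as a sketch, is to cluster the trained word vectors first and then pull every thesaurus synonym class into the cluster that holds most of its members. The plain k-means routine, the majority-vote rule, and the names `kmeans` and `apply_thesaurus` are all assumptions; the word vectors are taken as already trained (for example by word2vec) and supplied as a dictionary.

```python
import random
from collections import Counter


def kmeans(vectors, k, iters=20, seed=0):
    """Plain k-means over {word: vector}; returns {word: cluster_id}."""
    rng = random.Random(seed)
    words = sorted(vectors)
    centers = [list(vectors[w]) for w in rng.sample(words, k)]
    assign = {}
    for _ in range(iters):
        # Assignment step: nearest centre by squared Euclidean distance.
        for w in words:
            v = vectors[w]
            assign[w] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(v, centers[c])),
            )
        # Update step: each centre becomes the mean of its members.
        for c in range(k):
            members = [vectors[w] for w in words if assign[w] == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign


def apply_thesaurus(assign, synonym_classes):
    """Assumed merge rule: move each synonym class to its majority cluster."""
    merged = dict(assign)
    for synonyms in synonym_classes:
        known = [w for w in synonyms if w in assign]
        if len(known) < 2:
            continue  # nothing to reconcile
        majority = Counter(assign[w] for w in known).most_common(1)[0][0]
        for w in known:
            merged[w] = majority
    return merged
```

In this sketch `k` would be set in the 500-1500 range stated above; with a vocabulary of tens of thousands of words an optimized clustering library would be more practical than this pure-Python loop.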
(3) Based on the conditional entropy algorithm and the word-vector clustering result, the logical collocation information of words between sentence pairs in the sentence set is computed, thereby obtaining the proximity between sentences in the sentence set. The calculation formula of the conditional entropy algorithm is as follows:
$$H(S_m \mid S_{m-1}) = -\sum_{w_i \in S_{m-1}} \sum_{w_j \in S_m} p(w_i, w_j) \log p(w_j \mid w_i)$$
where H(S_m | S_{m-1}) is the conditional entropy between two adjacent sentences in the sentence set; S_m and S_{m-1} are two adjacent sentences; m is the index of a sentence in the sentence set and is a positive integer with 2 ≤ m ≤ n; n is the total number of sentences in the sentence set; w_i is a word occurring in S_{m-1} and w_j is a word occurring in S_m, where i and j are positive integers; p(w_i, w_j) is the probability that w_i and w_j occur together; and p(w_j | w_i) is the conditional probability.
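By way of a hedged sketch, the proximity of step (3) could be computed as below, taking the conditional entropy in its standard form, H(S_m | S_{m-1}) = -Σ p(w_i, w_j) log p(w_j | w_i), with words replaced by their cluster identifiers. Estimating the probabilities from co-occurrence counts over adjacent sentence pairs, skipping unseen pairs instead of smoothing, and the names `train_pair_counts` and `conditional_entropy` are assumptions not taken from the disclosure.

```python
import math
from collections import Counter


def train_pair_counts(docs, cluster_of):
    """Count cluster co-occurrences over adjacent sentence pairs.

    docs: list of documents, each a list of tokenized sentences (word lists).
    cluster_of: {word: cluster_id} from the clustering step.
    """
    pair, left = Counter(), Counter()
    for sentences in docs:
        for prev, cur in zip(sentences, sentences[1:]):
            for wi in prev:
                for wj in cur:
                    ci, cj = cluster_of.get(wi), cluster_of.get(wj)
                    if ci is None or cj is None:
                        continue  # out-of-vocabulary word
                    pair[ci, cj] += 1
                    left[ci] += 1
    return pair, left


def conditional_entropy(prev, cur, cluster_of, pair, left):
    """H(cur | prev) summed over the cluster pairs of the two sentences."""
    total = sum(pair.values())
    h = 0.0
    for wi in prev:
        for wj in cur:
            key = (cluster_of.get(wi), cluster_of.get(wj))
            n = pair.get(key, 0)
            if n == 0:
                continue  # unseen pair contributes nothing (assumed, no smoothing)
            p_joint = n / total          # p(w_i, w_j)
            p_cond = n / left[key[0]]    # p(w_j | w_i)
            h -= p_joint * math.log(p_cond)
    return h
```

Counting at the cluster level instead of the word level is what shrinks the parameter space: with at most 1500 clusters there are far fewer distinct pairs to estimate than with a vocabulary of 79770 words.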
(4) After the proximity between the sentences in the sentence set has been determined, the sentences in the sentence set are ordered using a neural-network-based algorithm that recursively derives global information from the whole graph to determine the importance of any node. Preferably, a Markov random walk model is selected to order the sentences, and the calculation method is as follows:
The random walk matrix corresponds to an ergodic Markov chain, in which any two states can reach each other through repeated transitions. The Markov random walk model defines a graph G = (V, E), where V is the vertex set, that is, the set of sentences to be ordered, and E is the edge set, that is, the proximity of any two sentences in the set of sentences to be ordered, whose value is the probability of the transition v_i → v_j computed with the conditional entropy formula, where i and j are positive integers denoting the indices of sentences in the sentence set. For m sentences to be ordered, the transition matrix M = [M_{i,j}]_{m×m} is obtained.
Based on the matrix M, the score SentScore(v_i) of a sentence in the sentence set can be obtained from the scores of the other sentences; the calculation formula is as follows:
The graph G = (V, E) is computed according to the above until convergence, and the sentence with the highest score is placed first; the remaining sentences then form a new graph G' and the operation is repeated until the set V of sentences to be ordered is empty. The order in which the sentences are selected is the final ordering result.
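The iterate, select, and rebuild procedure of step (4) could be sketched as follows. Because the SentScore formula is not reproduced here, this sketch substitutes a PageRank-style update, SentScore(v_i) = μ Σ_j SentScore(v_j)·P[j][i] + (1 − μ)/n over the row-normalized matrix P; the update rule, the damping factor μ = 0.85, the uniform treatment of rows with no outgoing weight, and the names `sent_scores` and `order_sentences` are all assumptions.

```python
def sent_scores(W, mu=0.85, tol=1e-10, max_iter=1000):
    """Iterate an assumed PageRank-style score over proximity matrix W."""
    n = len(W)
    if n == 1:
        return [1.0]
    # Row-normalize; a row with no outgoing weight is spread uniformly (assumed).
    P = []
    for i in range(n):
        row = [W[i][j] if j != i else 0.0 for j in range(n)]
        total = sum(row)
        if total > 0:
            P.append([x / total for x in row])
        else:
            P.append([0.0 if j == i else 1.0 / (n - 1) for j in range(n)])
    scores = [1.0 / n] * n
    for _ in range(max_iter):
        new = [
            (1 - mu) / n + mu * sum(scores[j] * P[j][i] for j in range(n))
            for i in range(n)
        ]
        converged = max(abs(a - b) for a, b in zip(new, scores)) < tol
        scores = new
        if converged:
            break
    return scores


def order_sentences(W):
    """Repeatedly pick the top-scoring sentence, then rescore the remainder."""
    remaining = list(range(len(W)))
    order = []
    while remaining:
        sub = [[W[i][j] for j in remaining] for i in remaining]
        scores = sent_scores(sub)
        best = max(range(len(remaining)), key=scores.__getitem__)
        order.append(remaining.pop(best))
    return order
```

Here `W[i][j]` would hold the conditional-entropy value computed for the pair (v_i, v_j). Rebuilding the graph after each selection costs one converged random walk per sentence, which is acceptable for the short documents considered.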
(5) After the ordering result of the sentence set is obtained, the ordering result is evaluated by scoring it with ROUGE-L, which scores similarity in terms of the longest common subsequence (LCS). The calculation formulas are as follows:
$$\mathrm{LCS} = \mathrm{lcs}(\text{stand\_order}, \text{sorted\_order})$$
$$R = \frac{\mathrm{LCS}}{\mathrm{len}(\text{stand\_order})}, \qquad P = \frac{\mathrm{LCS}}{\mathrm{len}(\text{sorted\_order})}$$
where LCS is the length of the longest common subsequence of the ordering result of the automatic sentence ordering method (sorted_order) and the true sentence order of the document (stand_order); len(sorted_order) is the length of the ordering result of the automatic sentence ordering method, and len(stand_order) is the length of the true sentence order of the document, the two lengths being equal; R denotes recall, P denotes precision, and score(ROUGE-L) is the ROUGE-L score. Because the two lengths are equal, R and P coincide, and after simplification the ROUGE-L score is determined by the ratio of the length of the longest common subsequence to the length of the ordering result.
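A minimal sketch of this scoring, assuming that recall and precision are combined with the balanced F-measure (the combination used in the disclosure is not reproduced here, and any F-measure gives the same value when the two lengths are equal). The function names are illustrative.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two sequences."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            if x == y:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def rouge_l(stand_order, sorted_order):
    """ROUGE-L between the true order and the predicted order."""
    if not stand_order or not sorted_order:
        return 0.0
    lcs = lcs_length(stand_order, sorted_order)
    recall = lcs / len(stand_order)
    precision = lcs / len(sorted_order)
    if recall + precision == 0:
        return 0.0
    # Balanced F-measure (assumed); equals lcs / length when lengths match.
    return 2 * recall * precision / (recall + precision)
```

For example, if the true order is sentences 0 to 4 and the method swaps only the first two, the longest common subsequence has length 4 and the score is 4/5 = 0.8.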
Preferably, the threshold of the ROUGE-L score is set to 0.6: the true sentence order of the document is compared with the order produced by the automatic sentence ordering method, and if the ROUGE-L score is greater than or equal to the threshold, the two orderings are considered similar and the ordering produced by the method is regarded as consistent and acceptable. The proportion of acceptable orderings obtained by the automatic sentence ordering method is then counted.
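The acceptable ordering ratio described above can be illustrated with a short sketch. The function name `acceptable_ratio` and the convention of returning 0.0 for an empty list are assumptions; the scores are taken to be per-document ROUGE-L values.

```python
def acceptable_ratio(scores, threshold=0.6):
    """Fraction of documents whose ROUGE-L score reaches the threshold."""
    if not scores:
        return 0.0
    accepted = sum(1 for s in scores if s >= threshold)
    return accepted / len(scores)
```

A score exactly equal to 0.6 counts as acceptable, matching the "greater than or equal to" condition.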
Further, analysis of the experimental results shows that when the documents in the document set contain few sentences, the automatic sentence ordering method produces more acceptable ordering results, but as the number of sentences per document increases, the acceptable ordering ratio gradually declines. An optimization of the automatic sentence ordering method is therefore proposed, specifically:
First, the sentence set is divided into a set of sentence blocks, each containing 2-3 sentences.
Second, based on the conditional entropy algorithm and the word-vector clustering result, the logical collocation information of words between adjacent sentence blocks in the sentence-block set is computed, thereby obtaining the proximity between sentence blocks; the ordering result of the sentence blocks is then obtained using the Markov random walk model.
Third, based on the conditional entropy algorithm and the word-vector clustering result, the logical collocation information of words between sentence pairs within each sentence block is computed, thereby obtaining the proximity between the sentences within each sentence block; the ordering result of the sentences within each sentence block is then obtained using the Markov random walk model.
Finally, the ordering result of the sentence blocks and the ordering results of the sentences within the sentence blocks are combined to obtain the final order of the sentences of the document.
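The two-level procedure above could be sketched as follows. How sentences are assigned to blocks is not specified, so this sketch simply chunks the input sequence; the ordering routine is passed in as `order_fn`, a hypothetical callable that takes a list of token lists and returns a permutation of their indices (for instance a proximity computation followed by the random walk ordering). The names `split_blocks` and `order_by_blocks` are illustrative.

```python
def split_blocks(sentences, size=3):
    """Chunk sentences into blocks of 2-3, avoiding a trailing 1-sentence block."""
    blocks = [sentences[i:i + size] for i in range(0, len(sentences), size)]
    if len(blocks) > 1 and len(blocks[-1]) == 1:
        # Borrow the last sentence of the previous block.
        blocks[-1].insert(0, blocks[-2].pop())
    return blocks


def order_by_blocks(sentences, order_fn, size=3):
    """Order blocks first, then the sentences inside each block."""
    blocks = split_blocks(list(sentences), size)
    # Stage 1: each block is viewed as one merged token list.
    merged = [[tok for sent in block for tok in sent] for block in blocks]
    block_order = order_fn(merged)
    # Stage 2: order within each block and concatenate in block order.
    result = []
    for b in block_order:
        inner = order_fn(blocks[b])
        result.extend(blocks[b][k] for k in inner)
    return result
```

Each call to `order_fn` then sees at most a handful of items, which is the regime in which the method is reported above to perform best.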
Experiments verify that, after the automatic sentence ordering method adopts these optimization steps, the decline in the acceptable ordering ratio as the number of sentences per document grows is slowed, which demonstrates that the optimization strategy adopted for the automatic sentence ordering method is feasible.
In addition, referring to Fig. 2, the present invention also provides a device for automatic sentence ordering, comprising:
a document preprocessing module 100, configured to segment a document set into sentences to obtain the sentence set corresponding to the document set;
a word-vector clustering module 200, configured to train on the sentence set to obtain a word-vector dictionary and to cluster the word vectors in combination with a preset Chinese thesaurus;
a proximity computing module 300, configured to compute, based on a conditional entropy algorithm and the word-vector clustering result, the logical collocation information of words between sentence pairs in the sentence set, thereby obtaining the proximity between sentences in the sentence set;
an ordering result generation module 400, configured to order the sentences using a Markov random walk model according to the computed sentence proximity, to obtain an ordering result.
Furthermore, although several embodiments and preferred embodiments of the present general inventive concept have been shown and described, those skilled in the art will appreciate that changes may be made to these embodiments without departing from the principles and spirit of the general inventive concept, the scope of which is defined by the claims and their equivalents.

Claims (10)

1. An automatic sentence ordering method, characterized by comprising:
(1) performing sentence-segmentation preprocessing on a document set to obtain a sentence set;
(2) training on the sentence set to obtain a word-vector dictionary, and clustering the word vectors in combination with a preset Chinese thesaurus;
(3) based on a conditional entropy algorithm and the word-vector clustering result, computing the logical collocation information of words between sentence pairs in the sentence set, thereby obtaining the proximity between sentences in the sentence set.
2. The automatic sentence ordering method according to claim 1, characterized in that the calculation formula of the conditional entropy algorithm is as follows:
$$H(S_m \mid S_{m-1}) = -\sum_{w_i \in S_{m-1}} \sum_{w_j \in S_m} p(w_i, w_j) \log p(w_j \mid w_i)$$
where H(S_m | S_{m-1}) is the conditional entropy between two adjacent sentences in the sentence set; S_m and S_{m-1} are two adjacent sentences; m is the index of a sentence in the sentence set and is a positive integer with 2 ≤ m ≤ n; n is the total number of sentences in the sentence set; w_i is a word occurring in S_{m-1} and w_j is a word occurring in S_m, where i and j are positive integers; p(w_i, w_j) is the probability that w_i and w_j occur together; and p(w_j | w_i) is the conditional probability.
3. The automatic sentence ordering method according to claim 1, characterized in that the sentences in the sentence set are ordered using a neural-network-based algorithm that recursively derives global information from the whole graph to determine the importance of any node.
4. The automatic sentence ordering method according to claim 3, characterized in that the neural network algorithm is based on a Markov random walk model.
5. The automatic sentence ordering method according to claim 1, characterized in that the word vectors are clustered into 500-1500 classes.
6. The automatic sentence ordering method according to claim 1, characterized in that the preset Chinese thesaurus contains more than 7000 classes of synonyms.
7. The automatic sentence ordering method according to claim 3, characterized in that the method further comprises a step of evaluating the sentence ordering result, in which the ordering result is scored using ROUGE-L.
8. The automatic sentence ordering method according to claim 7, characterized in that the threshold of the ROUGE-L score is set to 0.6: the true sentence order of the document is compared with the order produced by the automatic sentence ordering method, and if the ROUGE-L score is greater than or equal to the threshold, the two orderings are considered similar.
9. The automatic sentence ordering method according to any one of claims 1-4, characterized in that the sentence set is divided into a set of sentence blocks, each containing 2-3 sentences;
first, based on the conditional entropy algorithm and the word-vector clustering result, the logical collocation information of words between adjacent sentence blocks in the sentence-block set is computed, thereby obtaining the proximity between sentence blocks;
then, based on the conditional entropy algorithm and the word-vector clustering result, the logical collocation information of words between sentence pairs within each sentence block is computed, thereby obtaining the proximity between the sentences within each sentence block.
10. A device for automatic sentence ordering, characterized by comprising:
a document preprocessing module, configured to segment a document set into sentences to obtain the sentence set corresponding to the document set;
a word-vector clustering module, configured to train on the sentence set to obtain a word-vector dictionary and to cluster the word vectors in combination with a preset Chinese thesaurus;
a proximity computing module, configured to compute, based on a conditional entropy algorithm and the word-vector clustering result, the logical collocation information of words between sentence pairs in the sentence set, thereby obtaining the proximity between sentences in the sentence set;
an ordering result generation module, configured to order the sentences using a Markov random walk model according to the computed sentence proximity, to obtain an ordering result.
CN201810839470.5A 2018-07-27 2018-07-27 Automatic sentence ordering method and device Active CN108959269B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810839470.5A CN108959269B (en) 2018-07-27 2018-07-27 Automatic sentence ordering method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810839470.5A CN108959269B (en) 2018-07-27 2018-07-27 Automatic sentence ordering method and device

Publications (2)

Publication Number Publication Date
CN108959269A true CN108959269A (en) 2018-12-07
CN108959269B CN108959269B (en) 2019-07-05

Family

ID=64464076

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810839470.5A Active CN108959269B (en) 2018-07-27 2018-07-27 Automatic sentence ordering method and device

Country Status (1)

Country Link
CN (1) CN108959269B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113688606A (en) * 2021-07-30 2021-11-23 达观数据(苏州)有限公司 Method for automatically writing document report

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8510328B1 (en) * 2011-08-13 2013-08-13 Charles Malcolm Hatton Implementing symbolic word and synonym English language sentence processing on computers to improve user automation
CN105868175A (en) * 2015-12-03 2016-08-17 乐视网信息技术(北京)股份有限公司 Abstract generation method and device
CN106372208A (en) * 2016-09-05 2017-02-01 东南大学 Clustering method for topic views based on sentence similarity
CN107562717A (en) * 2017-07-24 2018-01-09 南京邮电大学 A kind of text key word abstracting method being combined based on Word2Vec with Term co-occurrence
CN108090049A (en) * 2018-01-17 2018-05-29 山东工商学院 Multi-document summary extraction method and system based on sentence vector

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
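Zhou Jianshe (周建设) et al.: "A study of the topic expressiveness of syntactic subjects" (句法主语的主题表现力研究), Applied Linguistics (《语言文字应用》) *
Xue Tao (薛涛) et al.: "Research on sentence ordering based on conditional entropy and contextual proximity" (基于条件熵和上下文邻近度的句子排序研究), Application Research of Computers (《计算机应用研究》) *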

Also Published As

Publication number Publication date
CN108959269B (en) 2019-07-05

Similar Documents

Publication Publication Date Title
CN110442760B (en) Synonym mining method and device for question-answer retrieval system
CN104391942B (en) Short essay eigen extended method based on semantic collection of illustrative plates
CN111767408B (en) Causal event map construction method based on multiple neural network integration
CN104794169B (en) A kind of subject terminology extraction method and system based on sequence labelling model
CN109408642A (en) A kind of domain entities relation on attributes abstracting method based on distance supervision
CN108681574B (en) Text abstract-based non-fact question-answer selection method and system
Zhang et al. Automatic synonym extraction using Word2Vec and spectral clustering
CN106776562A (en) A kind of keyword extracting method and extraction system
CN105975458B (en) A kind of Chinese long sentence similarity calculating method based on fine granularity dependence
CN109783806B (en) Text matching method utilizing semantic parsing structure
CN111694927B (en) Automatic document review method based on improved word shift distance algorithm
Turdakov Word sense disambiguation methods
CN112036178A (en) Distribution network entity related semantic search method
Sadr et al. Unified topic-based semantic models: A study in computing the semantic relatedness of geographic terms
CN113761890A (en) BERT context sensing-based multi-level semantic information retrieval method
Ruan et al. A research on sentence similarity for question answering system based on multi-feature fusion
CN113515939B (en) System and method for extracting key information of investigation report text
CN113963748A (en) Protein knowledge map vectorization method
CN103336803A (en) Method for generating name-embedded spring festival scrolls through computer
CN108959269B (en) A kind of sentence auto ordering method and device
Tian et al. Measuring the similarity of short texts by word similarity and tree kernels
Liu et al. Keyword extraction using PageRank on synonym networks
CN114969324A (en) Chinese news title classification method based on subject word feature expansion
Singh et al. An Insight into Word Sense Disambiguation Techniques
Abdelaali et al. Swarm optimization for Arabic word sense disambiguation based on English pre-trained word embeddings

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20220222

Address after: 100144 Beijing City, Shijingshan District Jin Yuan Zhuang Road No. 5

Patentee after: North China University of Technology

Address before: 100048 No. 105 West Third Ring Road North, Beijing, Haidian District

Patentee before: Capital Normal University