CN110516216A - A method for constructing a template library for automatic sports news writing - Google Patents

Info

Publication number
CN110516216A
Authority
CN
China
Prior art keywords
template
word
writing
sports news
construction method
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910404549.XA
Other languages
Chinese (zh)
Inventor
吕学强
张乐
董志安
孙少奇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Information Science and Technology University
Original Assignee
Beijing Information Science and Technology University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Information Science and Technology University
Priority to CN201910404549.XA
Publication of CN110516216A
Legal status: Pending

Abstract

The present invention relates to a method for constructing a template library for automatic sports news writing, comprising writing-template calculation and trigger-condition construction. Similarity calculation is first performed on the template data according to the predefined categories to find identical templates; a CRF is then used to identify the trigger conditions of the templates; finally, trigger condition-template pairs are formed. The writing-template calculation comprises a calculation based on cosine similarity and a calculation based on Word2Vec. The method achieves high precision, recall and F value, and realizes related-word expansion by using similarity in a vector space to express the semantic similarity of texts. It thereby provides strong support for automatic sports news writing, creates good conditions for accurate and efficient automatic writing, and meets the needs of practical application.

Description

A method for constructing a template library for automatic sports news writing
Technical field
The invention belongs to the technical field of automatic computer writing, and in particular relates to a method for constructing a template library for automatic sports news writing.
Background
Automatic sports news writing is a research hotspot in AI applications; realizing it with AI technology can greatly reduce the workload of sports journalists. The writing template library is one of the indispensable technical means for automatic sports news writing, and building it is one of the key steps of the writing process. In the prior art, methods for constructing writing template libraries are poorly designed: they cannot realize related-word expansion, and the precision, recall and F value they achieve are unsatisfactory. They therefore cannot provide strong support for automatic sports news writing, the resulting writing quality is poor, and the needs of practical application are not met. A method for constructing such a template library that overcomes these defects is urgently needed.
Summary of the invention
In view of the above problems in the prior art, the object of the present invention is to provide a method for constructing a template library for automatic sports news writing that avoids the above technical defects.
To achieve this object, the present invention provides the following technical solution:
A method for constructing a template library for automatic sports news writing, comprising: writing-template calculation and trigger-condition construction.
Further, the method comprises: first performing similarity calculation on the template data according to the predefined categories to find identical templates, then using a CRF to identify the trigger conditions of the templates, and finally forming trigger condition-template pairs.
Further, the writing-template calculation comprises a calculation based on cosine similarity, which comprises: calculating the similarity of two sentences using cosine similarity by segmenting the two sentences, listing all the words, calculating the word frequencies and writing out the word-frequency vectors; picturing the word-frequency vectors as two line segments in space that both start from the origin and point in different directions, forming an angle between them; and calculating the cosine of that angle.
Further, if the angle is 0, the directions are the same and the segments coincide, meaning that the texts represented by the two vectors are identical; if the angle is 90 degrees, the segments form a right angle and the directions are completely dissimilar; if the angle is 180 degrees, the directions are exactly opposite. The similarity of the vectors is judged by the size of the angle: the smaller the angle, the more similar the vectors.
Further, the writing-template calculation comprises a calculation based on Word2Vec, which comprises: using the Word2Vec tool to represent the words of a background corpus formally as vectors, reducing text processing to vector operations in a vector space, and expressing the semantic similarity of texts by similarity in the vector space, thereby realizing related-word expansion.
Further, the construction of writing-template trigger conditions based on CRF comprises:
For a given match description sentence $\mathrm{Text}_i$, the score difference between the visiting team and the home team is Diffsore, the writing template is $Y$, and the trigger condition is $X_i$:
$$\mathrm{Diffsore}_i = \mathrm{Text}_i \cdot \mathrm{Score}_1 - \mathrm{Text}_i \cdot \mathrm{Score}_2$$
$$Y = \mathrm{Diffsore}\Big(\sum_{i=1} X_i\Big);$$
The score difference of each text is calculated, and diffsore is sorted:
$$\mathrm{List} = \mathrm{dis}(\mathrm{diffsore});$$
List denotes the text set sorted by score difference. Data with the same score difference are merged to form a score-difference data set, and trigger conditions are extracted from the data in that set.
Further, the construction of writing-template trigger conditions based on CRF comprises: role labeling and feature template selection.
Further, role labeling comprises: defining a trigger condition as the condition of a fact described in an NBA match within a certain period of time, denoted CS; defining a trigger word as a word used to describe a CS, denoted CSword; each class of trigger condition contains many trigger words, i.e., [formula not reproduced in this text]
Further, the construction of writing-template trigger conditions based on CRF comprises: first performing word segmentation and part-of-speech tagging on each annotated sentence; then labeling the role data; and finally selecting word, part of speech and role as features and identifying the trigger words with a CRF.
Further, feature template selection comprises: selecting word, part of speech and role as features; using B, I, E and O as the label symbols of trigger words, where B denotes the beginning of a trigger word, I its middle, E its end and O a non-trigger word; and identifying the trigger words with single feature templates and compound feature templates respectively.
The method provided by the present invention extracts templates from news match reports and constructs feature templates and trigger conditions. It performs similarity calculation on the template data according to the predefined categories to find identical templates, then uses a CRF to identify the trigger conditions of the templates, and finally forms trigger condition-template pairs. It achieves high precision, recall and F value, and realizes related-word expansion by using similarity in a vector space to express the semantic similarity of texts. It thereby provides strong support for automatic sports news writing, creates good conditions for accurate and efficient automatic writing, and meets the needs of practical application.
Detailed description of the embodiments
To make the objects, technical solutions and advantages of the present invention clearer, the invention is further described below with reference to specific embodiments. It should be understood that the specific embodiments described here only explain the present invention and do not limit it. All other embodiments obtained by those of ordinary skill in the art on the basis of these embodiments without creative effort fall within the protection scope of the present invention.
A method for constructing a template library for automatic sports news writing first removes the entities contained in each sentence, such as player names, team names and match times; it then merges templates using cosine similarity to remove duplicates; finally it constructs a trigger condition for each template, completing the template library.
A large number of writing templates and materials are collected by analyzing existing sports match reports. The sports news is segmented with a word segmentation tool and the resulting words are searched; the words in Text_1 are found to also occur in Text_n. In other words, the wording templates used in the reports are relatively fixed and only the facts of the match differ, so extracting writing templates by analyzing existing news reports is effective and feasible.
According to the predefined classification system for sports match reports, the text description in any sports match report can be assigned to the corresponding category.
Analysis of existing writing sentences shows that a sentence mainly contains event entities, such as player names, team names, scores and match times, together with a description of the facts of the match. What remains after the event entities are removed is therefore the writing template, so the event entities must be replaced to construct it. The present invention replaces the entities in the writing sentences according to Table 1.1.
Table 1.1 Replacement symbols
Team names and player names are crawled from the web. Because the crawled player names are all full names, they are preprocessed: the words that occur for each numbered entry are collected into a list, and the templates are generated after traversing it.
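The entity-replacement step described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the placeholder symbols `<PLAYER>`, `<TEAM>`, `<SCORE>` and `<NUM>` are hypothetical (the contents of Table 1.1 did not survive in this text), and the name lists stand in for the crawled player and team names.

```python
import re

# Hypothetical placeholder symbols; the real ones were listed in Table 1.1.
PLAYER, TEAM, SCORE, NUM = "<PLAYER>", "<TEAM>", "<SCORE>", "<NUM>"


def extract_template(sentence, players, teams):
    """Replace event entities in a sentence with placeholder symbols."""
    # Replace longer names first so a short name never clobbers part of a longer one.
    for name in sorted(players, key=len, reverse=True):
        sentence = sentence.replace(name, PLAYER)
    for name in sorted(teams, key=len, reverse=True):
        sentence = sentence.replace(name, TEAM)
    # Scores such as 38-32 are replaced before any remaining bare numbers.
    sentence = re.sub(r"\d+-\d+", SCORE, sentence)
    sentence = re.sub(r"\d+", NUM, sentence)
    return sentence


template = extract_template(
    "Wade makes 2 of 2 free throws, and the Bulls keep the lead at 38-32",
    players=["Wade"],
    teams=["Bulls"],
)
print(template)
```

What is left after replacement is the fixed wording of the sentence, which is what the text above treats as the writing template.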
The method of the present invention comprises the steps of writing-template calculation and trigger-condition construction.
The writing-template calculation comprises a calculation based on cosine similarity and a calculation based on Word2Vec.
Writing-template calculation based on cosine similarity:
After extracting templates from more than 3,000 sentences in the 10 extracted classes, the inventors found a large amount of redundant data within each class of templates, which must be removed. The present invention uses cosine similarity to calculate the similarity of two sentences, as the following example illustrates.
Sentence A: This strawberry-flavored ice cream is especially delicious
Sentence B: This apple-flavored ice cream is more delicious
The basic idea is that the more similar the words used in the two sentences, the more similar their content should be. The degree of similarity can therefore be calculated starting from word frequency.
Step 1: segment the two sentences. The present invention uses the jieba word segmentation tool. (Here "of" renders the Chinese particle 的, which was dropped from the extracted text.)
Sentence A: this / strawberry / flavor / of / ice cream / especially / delicious
Sentence B: this / apple / flavor / of / ice cream / more / delicious
Step 2: list all the words.
this, strawberry, apple, flavor, of, ice cream, delicious, especially, more
Step 3: calculate the word frequencies.
Sentence A: this 1, strawberry 1, apple 0, flavor 1, of 1, ice cream 1, delicious 1, especially 1, more 0
Sentence B: this 1, strawberry 0, apple 1, flavor 1, of 1, ice cream 1, delicious 1, especially 0, more 1
Step 4: write out the word-frequency vectors.
Sentence A: (1, 1, 0, 1, 1, 1, 1, 1, 0)
Sentence B: (1, 0, 1, 1, 1, 1, 1, 0, 1)
The problem now becomes how to calculate the similarity of these two vectors. They can be pictured as two line segments in space, both starting from the origin ([0, 0, ...]) and pointing in different directions, with an angle between them. If the angle is 0 degrees, the directions are the same and the segments coincide, meaning that the texts represented by the two vectors are identical; if the angle is 90 degrees, the segments form a right angle and the directions are completely dissimilar; if the angle is 180 degrees, the directions are exactly opposite. The similarity of the vectors can therefore be judged by the size of the angle: the smaller the angle, the more similar the vectors.
Using the cosine formula, the similarity of sentence A and sentence B is calculated as follows:
$$\cos\theta = \frac{1\cdot1+1\cdot0+0\cdot1+1\cdot1+1\cdot1+1\cdot1+1\cdot1+1\cdot0+0\cdot1}{\sqrt{1^2+1^2+0^2+1^2+1^2+1^2+1^2+1^2+0^2}\,\sqrt{1^2+0^2+1^2+1^2+1^2+1^2+1^2+0^2+1^2}} = \frac{5}{\sqrt{7}\,\sqrt{7}} \approx 0.714$$
Since 0.714 is fairly close to 1, the two sentences are largely similar.
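The four steps of the worked example can be reproduced with a short script. This is a sketch under stated assumptions: the sentences are already segmented (the source uses jieba on the Chinese originals; the English tokens below are illustrative glosses), and the function name `cosine_similarity` is hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(tokens_a, tokens_b):
    """Cosine of the angle between the word-frequency vectors of two token lists."""
    vocab = sorted(set(tokens_a) | set(tokens_b))      # step 2: list all words
    freq_a, freq_b = Counter(tokens_a), Counter(tokens_b)  # step 3: word frequencies
    vec_a = [freq_a[w] for w in vocab]                 # step 4: frequency vectors
    vec_b = [freq_b[w] for w in vocab]
    dot = sum(x * y for x, y in zip(vec_a, vec_b))
    norm_a = math.sqrt(sum(x * x for x in vec_a))
    norm_b = math.sqrt(sum(y * y for y in vec_b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty sentence is treated as dissimilar to everything
    return dot / (norm_a * norm_b)


# Step 1 (segmentation) is assumed to have been done already.
sentence_a = ["this", "strawberry", "flavor", "of", "ice cream", "especially", "delicious"]
sentence_b = ["this", "apple", "flavor", "of", "ice cream", "more", "delicious"]
similarity = cosine_similarity(sentence_a, sentence_b)
print(round(similarity, 3))  # 0.714
```

The two sentences share five of their seven tokens, so the result is 5/7, matching the hand calculation.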
Writing-template calculation based on Word2Vec:
Analysis of the templates shows that many similar templates are still not matched by this calculation, for example "small climax" and "attack wave". These phrases differ in their characters and literal form but are semantically the same, so the present invention needs to normalize such semantically similar templates.
Word2Vec is a deep-learning tool, open-sourced by Google in 2013, that converts words into real-valued vectors. Using the idea of deep learning, it represents each word in a text as a K-dimensional vector obtained through training. The resulting word vectors can be used for many natural language processing tasks, such as clustering, part-of-speech analysis and finding synonyms. With words as features, Word2Vec maps each word into a K-dimensional vector space and represents the word by this vector, giving the text a deeper feature representation.
Word2Vec provides two training models, the continuous bag-of-words model (CBOW) and the Skip-gram model, both of which are trained with a shallow neural network. CBOW predicts the probability of the current word from its context, whereas Skip-gram predicts the probability of the context from the current word. The present invention is mainly based on the Skip-gram model and uses the Hierarchical Softmax method to optimize training. The word vector of the current word is used to predict the word vectors of the context within a specified window. Given training feature data w1, w2, w3, ..., wT, the objective function of the Skip-gram model is as follows [formula not reproduced in this text]:
where Jθ denotes the objective function, T is the total number of feature data items, and c is the parameter that determines the size of the context window. The larger c is, the more training data and training time are required, but the higher the accuracy that can be obtained.
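The objective function itself is missing from this text. For reference, the standard Skip-gram objective, which is consistent with the definitions of $J_\theta$, $T$ and $c$ given here, is the average log-probability of the context words within the window; whether the patent's formula was written in exactly this form is an assumption.

```latex
J_\theta = \frac{1}{T} \sum_{t=1}^{T} \; \sum_{\substack{-c \le j \le c \\ j \ne 0}} \log p\left(w_{t+j} \mid w_t\right)
```

Training maximizes $J_\theta$; a larger $c$ adds more $(w_t, w_{t+j})$ pairs per position, which is why the training cost grows with the window size.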
To improve computational efficiency, the present invention uses the Hierarchical Softmax algorithm. The algorithm represents the feature words with a Huffman binary tree: the T words of the output layer are the leaf nodes, and each word's frequency of occurrence is used as its weight for encoding, so that high-frequency words are assigned shorter paths and low-frequency words longer paths, and every word can be reached from the root node of the tree along a unique path. The function p(u | w) is therefore defined as follows [formula not reproduced in this text]:
where L(u) is the length of the path from the root node to node u; the two symbols whose glyphs are missing from this text denote, respectively, the vector corresponding to the j-th non-leaf node on the path from the root node to u and the code corresponding to the j-th node on that path; and v(w) denotes the word vector of w.
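The definition of p(u | w) is also missing from this text. A common way of writing the hierarchical softmax probability that matches the quantities described here is given below as a sketch. The symbols $\theta_{j}^{u}$ (node vector) and $d_{j}^{u} \in \{0, 1\}$ (Huffman code bit) are assumed names for the two missing glyphs, and the indexing assumes that $L(u)$ counts the nodes on the path from the root to $u$, root and leaf included.

```latex
p(u \mid w) = \prod_{j=2}^{L(u)} \sigma\!\left( \left(1 - 2 d_{j}^{u}\right) \, v(w)^{\top} \theta_{j-1}^{u} \right),
\qquad
\sigma(x) = \frac{1}{1 + e^{-x}}
```

Each factor is the probability of taking the branch coded by $d_{j}^{u}$ at one internal node, so evaluating $p(u \mid w)$ costs one sigmoid per node on the path rather than a normalization over all $T$ words.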
The objective function is then solved by gradient descent, producing the word-vector representation of each word.
Word2Vec word expansion:
The present invention proposes a word expansion algorithm based on Word2Vec. The method uses the Word2Vec tool to represent the words of a background corpus formally as vectors, reducing text processing to vector operations in a vector space. The semantic similarity of texts is expressed by similarity in the vector space, which realizes related-word expansion, strengthens the ability of keywords to indicate key sentences, and further improves the quality of key-sentence extraction.
To obtain a practical word-vector model, the present invention additionally crawled more than 70,000 NBA sports news texts and built a word-vector space model for each word from its semantic information: if some words frequently co-occur in the same sentence, they have a certain semantic relatedness.
In a big-data setting, the distance between two points in the vector space reflects the relatedness of the two corresponding words. The present invention therefore measures the relatedness between other words and the words in the keyword set by the cosine of the angle between their vectors (called cosine distance in the original): the larger the cosine, the more related the two words. A specific threshold is set, and highly related words are extracted to expand the keywords.
After the corpus vocabulary has been represented as vectors, a keyword is looked up in the word-vector file obtained from training. By calculating the cosine measure, the words that are semantically similar to the keyword can be output, either above a given threshold or in ranked order, yielding the keyword's related-word set. The expanded vocabulary is shown in Table 1.2.
Table 1.2 Word2Vec word expansion
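The expansion step (look up a keyword, score every other word by cosine, keep those above a threshold) can be sketched as below. The three-dimensional vectors, the threshold of 0.8 and the function names are illustrative assumptions; in practice the vectors would come from a Word2Vec model trained on the news corpus, and the source does not state the threshold it used.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def expand_keyword(keyword, vectors, threshold=0.8, top_n=5):
    """Return up to top_n (word, score) pairs whose cosine with the keyword
    reaches the threshold, highest score first."""
    target = vectors[keyword]
    scored = [(w, cosine(target, vec)) for w, vec in vectors.items() if w != keyword]
    related = [(w, s) for w, s in scored if s >= threshold]
    related.sort(key=lambda pair: pair[1], reverse=True)
    return related[:top_n]


# Toy vectors standing in for a trained word-vector file (assumed values).
toy_vectors = {
    "dunk": [0.9, 0.1, 0.0],
    "layup": [0.8, 0.2, 0.1],
    "timeout": [0.0, 0.1, 0.9],
}
related = expand_keyword("dunk", toy_vectors)
print(related)
```

With these toy values only "layup" passes the threshold, which mirrors the later observation that dunk and layup are treated as the same word.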
Construction of writing-template trigger conditions based on CRF:
Whether a template is activated depends entirely on whether the live text data trigger the corresponding condition, so a trigger condition must be constructed for each template. The present invention builds the templates mainly from two aspects: the score and the trigger condition.
For a given match description sentence $\mathrm{Text}_i$, the score difference between the visiting team and the home team is Diffsore, the writing template is $Y$, and the trigger condition is $X_i$:
$$\mathrm{Diffsore}_i = \mathrm{Text}_i \cdot \mathrm{Score}_1 - \mathrm{Text}_i \cdot \mathrm{Score}_2 \tag{1.4}$$
$$Y = \mathrm{Diffsore}\Big(\sum_{i=1} X_i\Big) \tag{1.5}$$
According to the score-difference formula, the score difference of each text is calculated, and diffsore is sorted:
$$\mathrm{List} = \mathrm{dis}(\mathrm{diffsore}) \tag{1.6}$$
List denotes the text set sorted by score difference. Data with the same score difference are merged to form a score-difference data set, and trigger conditions are extracted from the data in that set.
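Formulas (1.4) and (1.6) amount to computing a score difference per sentence, sorting, and merging sentences with equal differences. A minimal sketch follows, reading Score1 and Score2 as the two scores attached to a sentence (an interpretation, since the source does not say which team is which); the record layout and function name are hypothetical.

```python
from collections import defaultdict


def group_by_score_difference(records):
    """Group match descriptions by score difference, sorted by the difference.

    Each record is (text, score1, score2); the difference is score1 - score2.
    """
    groups = defaultdict(list)
    for text, score1, score2 in records:
        diffsore = score1 - score2          # formula (1.4)
        groups[diffsore].append(text)       # merge texts with equal difference
    return dict(sorted(groups.items()))     # formula (1.6): sort by difference


records = [
    ("Bulls keep the lead at 38-32", 38, 32),
    ("Hornets lead 64-57", 64, 57),
    ("Warriors lead 35-29", 35, 29),
]
score_groups = group_by_score_difference(records)
for diff, texts in score_groups.items():
    print(diff, texts)
```

Each group is then the unit from which trigger conditions are extracted.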
Role labeling:
Definition 1 (trigger condition): the condition of a fact described in an NBA match within a certain period of time, denoted CS.
Definition 2 (trigger word): a word used to describe a CS, denoted CSword.
Each Sent contains at least one trigger word CSword_i. Trigger words usually appear between a player or team name and the score, and they mainly express the facts and details of the match. Each class of template contains its own trigger conditions. CS mainly includes free throws, three-pointers, grabs (steals or rebounds), timeouts, buzzer-beaters, slam dunks and so on, and each class of trigger condition in turn contains many trigger words, i.e., [formula not reproduced in this text]. The present invention therefore needs to extract the trigger conditions in order to construct trigger conditions for the writing templates, as shown in the table below.
Table 1.3 Trigger word example
Original sentence: As this quarter is about to end, Wade makes two of two free throws, and the Bulls keep the lead at 38-32
Score difference: 6
Trigger words: this quarter is about to end; two free throws, two made; keep the lead
Trigger conditions: end of quarter, free throw, lead
Writing template: **, as this quarter is about to end, takes * free throws and makes *; the * team keeps the lead at **-**
The basic idea of CRF-based trigger word identification is as follows: first, perform word segmentation and part-of-speech tagging on each annotated sentence; second, label the role data; finally, select word, part of speech and role as features and identify the trigger words with a CRF.
According to the positional characteristics of trigger words, the present invention defines the following roles:
1. Player name: in a template, a player name usually appears before the trigger word, for example: Kaminsky twice hits a three-pointer
2. Team name: in a template, a team name usually appears before the trigger word, for example: the Hornets score four straight points
3. Score: in a template, a score usually appears before or after the trigger word, for example: the Wizards pull the score to 89-93; the Hornets lead 64-57
4. Punctuation: each clause separated by "," usually contains a trigger condition, for example: Kaminsky makes one of two free throws, the Hornets lead 64-57
5. Unrelated word: words in the corpus that are unrelated to any role are labeled N.
Table 1.4 Role table
Feature template selection:
Because a CRF can express long-range contextual dependencies and effectively fuse various related or unrelated information, the present invention selects word, part of speech and role as features. B, I, E and O are used as the label symbols of trigger words, where B denotes the beginning of a trigger word, I its middle, E its end and O a non-trigger word. Single feature templates and compound feature templates are each used to identify the trigger words, as shown in Table 1.5.
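The labeling scheme and the word, part-of-speech and role features can be sketched as follows. This illustrates the data preparation only, not the patent's CRF configuration: the role tags other than N, the part-of-speech tags, the choice to label a one-token trigger word as B, and the particular compound feature are all assumptions, since Tables 1.4 and 1.5 did not survive in this text.

```python
def bieo_labels(tokens, trigger_spans):
    """Label tokens with B/I/E/O given (start, end) trigger spans, end exclusive.

    A one-token trigger word is labeled B only (an assumption; the source
    defines no separate single-token label).
    """
    labels = ["O"] * len(tokens)
    for start, end in trigger_spans:
        labels[start] = "B"
        for k in range(start + 1, end - 1):
            labels[k] = "I"
        if end - start > 1:
            labels[end - 1] = "E"
    return labels


def token_features(words, pos_tags, roles, i):
    """Single features for token i and its neighbors, plus one compound feature."""
    feats = {"Word(i)": words[i], "Nominal(i)": pos_tags[i], "Role(i)": roles[i]}
    if i > 0:
        feats["Word(i-1)"] = words[i - 1]
        feats["Nominal(i-1)"] = pos_tags[i - 1]
        feats["Role(i-1)"] = roles[i - 1]
    if i < len(words) - 1:
        feats["Word(i+1)"] = words[i + 1]
        feats["Nominal(i+1)"] = pos_tags[i + 1]
        feats["Role(i+1)"] = roles[i + 1]
    # Hypothetical compound feature combining the three single features.
    feats["Word(i)/Nominal(i)/Role(i)"] = "/".join((words[i], pos_tags[i], roles[i]))
    return feats


words = ["Kaminsky", "makes", "a", "three-pointer", ",", "Hornets", "lead", "64-57"]
pos_tags = ["nr", "v", "m", "n", "w", "nt", "v", "m"]               # assumed tag set
roles = ["PLAYER", "N", "N", "N", "PUNC", "TEAM", "N", "SCORE"]     # assumed role tags
labels = bieo_labels(words, [(1, 4), (6, 7)])
features = [token_features(words, pos_tags, roles, i) for i in range(len(words))]
print(list(zip(words, labels)))
```

The per-token feature dictionaries and label sequence are the kind of input a CRF toolkit is trained on; which toolkit the inventors used is not stated.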
Table 1.5 CRF feature templates
Here Word denotes a word, Nominal its part of speech, and Role the role represented by the word;
Word(i) denotes the current word, Word(i+1) the first word to the right of Word(i), and Word(i-1) the first word to the left of Word(i);
Nominal(i) denotes the part of speech of the current word, Nominal(i+1) the part of speech of the first word to the right of Word(i), and Nominal(i-1) the part of speech of the first word to the left of Word(i);
Role(i) denotes the role of the current word, Role(i+1) the role of the first word to the right of Word(i), and Role(i-1) the role of the first word to the left of Word(i).
Experimental results and analysis
The experiment uses 5,053 annotated NBA writing-template sentences. Similarity calculation merges 2,079 of them; trigger words are then extracted from the data and writing templates are constructed. Precision, recall and F value are used to evaluate the experimental results.
Template similarity calculation results:
The present invention builds a word-vector model of NBA sports news with Word2Vec and uses cosine similarity to calculate the correlation between templates. The experimental results are shown in Table 1.6.
Table 1.6 Deduplication results
Of the 5,053 annotated NBA writing-template sentences, 1,819 are removed as duplicates when only cosine similarity is used; 2,104 are removed when the Word2Vec word-vector model of NBA sports news is built and combined with cosine similarity.
Taking the "lead" class of templates as an example, the sentences deduplicated using only cosine similarity are as follows:
Midway through the fourth quarter, Murray hits a three-pointer, and the Spurs lead by 38 points at 124-86.
Midway through the third quarter, Hayward hits a three-pointer, and the Jazz lead by 21 points at 74-53.
As the third quarter is about to end, Douglas Rodríguez hits a three-pointer, and the 76ers lead 85-84.
With the cosine similarity + word2vec method, the deduplicated sentences are as follows:
In the middle of this quarter, McGee scores on a fast-break dunk, and the Warriors lead 75-68.
In the middle of the second quarter, Durant scores on a fast-break layup, and the Warriors lead 35-30.
It can be seen that with the word2vec method, "dunk" and "layup" are judged to be the same word, which helps normalization.
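The effect described here, where semantically related words are treated as one before cosine similarity is applied, can be sketched as a greedy deduplication. The synonym map stands in for the related-word sets produced by the Word2Vec expansion, and the threshold of 0.9 and the placeholder tokens are illustrative assumptions rather than values from the source.

```python
import math
from collections import Counter


def cosine_similarity(tokens_a, tokens_b):
    """Cosine of the angle between the word-frequency vectors of two token lists."""
    freq_a, freq_b = Counter(tokens_a), Counter(tokens_b)
    dot = sum(freq_a[w] * freq_b[w] for w in freq_a)
    norm = math.sqrt(sum(c * c for c in freq_a.values())) * math.sqrt(
        sum(c * c for c in freq_b.values())
    )
    return dot / norm if norm else 0.0


def deduplicate(templates, synonyms, threshold=0.9):
    """Keep a template only if it is not too similar to one already kept."""
    kept, kept_norm = [], []
    for tokens in templates:
        norm = [synonyms.get(t, t) for t in tokens]  # map related words to one form
        if all(cosine_similarity(norm, other) < threshold for other in kept_norm):
            kept.append(tokens)
            kept_norm.append(norm)
    return kept


templates = [
    ["<PLAYER>", "fast-break", "dunk", ",", "<TEAM>", "lead", "<SCORE>"],
    ["<PLAYER>", "fast-break", "layup", ",", "<TEAM>", "lead", "<SCORE>"],
    ["<PLAYER>", "hits", "three-pointer", ",", "<TEAM>", "lead", "<SCORE>"],
]
with_expansion = deduplicate(templates, {"layup": "dunk"})
cosine_only = deduplicate(templates, {})
print(len(cosine_only), len(with_expansion))  # 3 2
```

With cosine similarity alone the dunk and layup templates score 6/7 and both survive; once the two words are mapped to one form they become identical and one is removed.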
Trigger word extraction results:
A CRF combines the characteristics of the maximum entropy model and the hidden Markov model. It makes fuller use of contextual information to achieve better results, and it retains all the advantages of the maximum entropy model. The present invention carried out comparative experiments with different selected features; the experimental results are shown in the table below.
Table 1.7 Experimental results
The experimental results in Table 1.7 show that the compound template outperforms the single template. The precision of the compound template is 89.04%, which is 5.5% higher than that of the single template. The recall of the compound template is 80.52%, 5.8% higher than that of the single template. The F value of the compound template is 84.57, 6.8% higher than that of the single template.
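The reported figures are mutually consistent if the F value is taken to be the harmonic mean of precision and recall (the usual F1 measure; the source only says "F value", so this is an assumption). A quick check:

```python
def f_value(precision, recall):
    """Harmonic mean of precision and recall (F1)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Compound-template precision and recall as reported above, in percent.
compound_f = f_value(89.04, 80.52)
print(round(compound_f, 2))  # 84.57
```

The result reproduces the F value of 84.57 given for the compound template.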
The present invention extracts templates from news match reports and constructs feature templates and trigger conditions. It first performs similarity calculation on the template data according to the predefined categories to find identical templates, then uses a CRF to identify the trigger conditions of the templates, and finally forms trigger condition-template pairs. It thereby provides strong support for automatic sports news writing and creates good conditions for accurate and efficient automatic writing.
The embodiments described above express only some implementations of the present invention. Their description is relatively specific and detailed, but it should not be construed as limiting the scope of the patent. It should be noted that those of ordinary skill in the art can make various modifications and improvements without departing from the inventive concept, and these all fall within the protection scope of the present invention. The scope of protection of this patent is therefore defined by the appended claims.

Claims (10)

1. A method for constructing a template library for automatic sports news writing, characterized by comprising: writing-template calculation and trigger-condition construction.
2. The method for constructing a template library for sports news writing according to claim 1, characterized in that the method comprises: first performing similarity calculation on the template data according to the predefined categories to find identical templates, then using a CRF to identify the trigger conditions of the templates, and finally forming trigger condition-template pairs.
3. The method for constructing a template library for sports news writing according to claim 1, characterized in that the writing-template calculation comprises a calculation based on cosine similarity, which comprises: calculating the similarity of two sentences using cosine similarity by segmenting the two sentences, listing all the words, calculating the word frequencies and writing out the word-frequency vectors; picturing the word-frequency vectors as two line segments in space that both start from the origin and point in different directions, forming an angle between them; and calculating the cosine of that angle.
4. The method for constructing a template library for sports news writing according to claims 1-3, characterized in that if the angle is 0, the directions are the same and the segments coincide, meaning that the texts represented by the two vectors are identical; if the angle is 90 degrees, the segments form a right angle and the directions are completely dissimilar; if the angle is 180 degrees, the directions are exactly opposite; the similarity of the vectors is judged by the size of the angle, and the smaller the angle, the more similar the vectors.
5. The method for constructing a template library for sports news writing according to claim 1, characterized in that the writing-template calculation comprises a calculation based on Word2Vec, which comprises: using the Word2Vec tool to represent the words of a background corpus formally as vectors, reducing text processing to vector operations in a vector space, and expressing the semantic similarity of texts by similarity in the vector space, thereby realizing related-word expansion.
6. The method for constructing a template library for sports news writing according to claim 1, characterized in that the construction of writing-template trigger conditions based on CRF comprises:
For a given match description sentence $\mathrm{Text}_i$, the score difference between the visiting team and the home team is Diffsore, the writing template is $Y$, and the trigger condition is $X_i$:
$$\mathrm{Diffsore}_i = \mathrm{Text}_i \cdot \mathrm{Score}_1 - \mathrm{Text}_i \cdot \mathrm{Score}_2$$
$$Y = \mathrm{Diffsore}\Big(\sum_{i=1} X_i\Big);$$
The score difference of each text is calculated, and diffsore is sorted:
$$\mathrm{List} = \mathrm{dis}(\mathrm{diffsore});$$
List denotes the text set sorted by score difference. Data with the same score difference are merged to form a score-difference data set, and trigger conditions are extracted from the data in that set.
7. sports news writing template base construction method according to claim 1, which is characterized in that the writing based on CRF The building of template trigger condition includes: role's label;Feature templates selection.
8. sports news writing template base construction method described in -7 according to claim 1, which is characterized in that role, which marks, includes: Define trigger condition are as follows: within a certain period of time, the condition of the fact that description is denoted as CS for NBA match;Defining trigger word is one Word used in field description CS, is denoted as CSword;Every one kind trigger condition includes many trigger words, i.e.,
9. sports news writing template base construction method according to claim 1, which is characterized in that the writing based on CRF The building of template trigger condition includes: to carry out participle and part-of-speech tagging to each sentence with mark first;Secondly diagonal chromatic number According to being labeled, finally selection word, part of speech, role are characterized, and are identified using CRF to trigger word.
10. The sports news writing template base construction method according to any one of claims 1-7, characterized in that the feature template selection includes: selecting word, part of speech, and role as features; using B, I, E, and O as the label symbols for trigger words, where B denotes the beginning of a trigger word, I its middle, E its end, and O a non-trigger word; and identifying trigger words using a single-feature template and a compound-feature template respectively.
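The B/I/E/O labeling and the two kinds of feature templates recited in claims 9 and 10 can be sketched as below. This is a hedged illustration, not the patent's implementation: the sample sentence, the part-of-speech and role tags, the inclusive span format, and the particular compound features are all assumptions, and labeling a one-token trigger as B is a choice made here because the claims do not cover that case. The resulting labels and feature dictionaries are the kind of input a CRF toolkit would consume.

```python
def bieo_labels(num_tokens, trigger_spans):
    """Label tokens with B/I/E/O given inclusive (start, end) trigger spans."""
    labels = ["O"] * num_tokens
    for start, end in trigger_spans:
        labels[start] = "B"
        for i in range(start + 1, end):
            labels[i] = "I"
        if end > start:
            labels[end] = "E"
        # Assumption: a one-token trigger keeps the single label B.
    return labels


def token_features(tokens, index):
    """Build single and compound features for the token at `index`."""
    word, pos, role = tokens[index]
    # Single-feature templates: one observation each.
    features = {"word": word, "pos": pos, "role": role}
    # Compound-feature templates (hypothetical choices): joined observations.
    features["word|pos"] = f"{word}|{pos}"
    features["pos|role"] = f"{pos}|{role}"
    if index > 0:
        features["prev_word|word"] = f"{tokens[index - 1][0]}|{word}"
    return features


# Hypothetical segmented sentence: (word, part of speech, role).
tokens = [
    ("Lakers", "NR", "TEAM"),
    ("lead", "VV", "CS"),
    ("by", "P", "CS"),
    ("five", "CD", "CS"),
    ("points", "NN", "OTHER"),
]

labels = bieo_labels(len(tokens), [(1, 3)])
features = [token_features(tokens, i) for i in range(len(tokens))]
print(list(zip([t[0] for t in tokens], labels)))
```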
CN201910404549.XA 2019-05-15 2019-05-15 A kind of automatic writing template base construction method of sports news Pending CN110516216A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910404549.XA CN110516216A (en) 2019-05-15 2019-05-15 A kind of automatic writing template base construction method of sports news

Publications (1)

Publication Number Publication Date
CN110516216A true CN110516216A (en) 2019-11-29

Family

ID=68622698

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910404549.XA Pending CN110516216A (en) 2019-05-15 2019-05-15 A kind of automatic writing template base construction method of sports news

Country Status (1)

Country Link
CN (1) CN110516216A (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111310444A (en) * 2020-01-16 2020-06-19 北京大学 Park landscape service identification method
CN112084448A (en) * 2020-08-31 2020-12-15 北京金堤征信服务有限公司 Similar information processing method and device
CN112269856A (en) * 2020-09-23 2021-01-26 咪咕文化科技有限公司 Text similarity calculation method and device, electronic equipment and storage medium
CN112765950A (en) * 2021-01-08 2021-05-07 首都师范大学 Template library generation method and system based on cosine similarity and storage medium
CN112765949A (en) * 2021-01-08 2021-05-07 首都师范大学 Method, system and storage medium for automatically generating event character live broadcast text
CN113254574A (en) * 2021-03-15 2021-08-13 河北地质大学 Method, device and system for auxiliary generation of customs official documents
CN113411623A (en) * 2021-06-15 2021-09-17 首都师范大学 Automatic news generation method and system based on difference-time function algorithm and computer readable storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050143971A1 (en) * 2003-10-27 2005-06-30 Jill Burstein Method and system for determining text coherence
CN105912526A (en) * 2016-04-15 2016-08-31 北京大学 Sports game live broadcasting text based sports news automatic constructing method and device
CN108549634A (en) * 2018-04-09 2018-09-18 北京信息科技大学 A kind of Chinese patent text similarity calculating method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
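CHEN Yujing et al.: "Research on Automatic Writing of NBA Game News" (NBA赛事新闻的自动写作研究), Acta Scientiarum Naturalium Universitatis Pekinensis (《北京大学学报(自然科学版)》) *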

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111310444A (en) * 2020-01-16 2020-06-19 北京大学 Park landscape service identification method
CN112084448A (en) * 2020-08-31 2020-12-15 北京金堤征信服务有限公司 Similar information processing method and device
CN112269856A (en) * 2020-09-23 2021-01-26 咪咕文化科技有限公司 Text similarity calculation method and device, electronic equipment and storage medium
CN112269856B (en) * 2020-09-23 2023-11-10 咪咕文化科技有限公司 Text similarity calculation method and device, electronic equipment and storage medium
CN112765950A (en) * 2021-01-08 2021-05-07 首都师范大学 Template library generation method and system based on cosine similarity and storage medium
CN112765949A (en) * 2021-01-08 2021-05-07 首都师范大学 Method, system and storage medium for automatically generating event character live broadcast text
CN113254574A (en) * 2021-03-15 2021-08-13 河北地质大学 Method, device and system for auxiliary generation of customs official documents
CN113411623A (en) * 2021-06-15 2021-09-17 首都师范大学 Automatic news generation method and system based on difference-time function algorithm and computer readable storage medium

Similar Documents

Publication Publication Date Title
CN110516216A (en) A kind of automatic writing template base construction method of sports news
CN106997382B (en) Innovative creative tag automatic labeling method and system based on big data
CN106919673B (en) Text mood analysis system based on deep learning
Richard et al. Temporal action detection using a statistical language model
CN105095204B (en) The acquisition methods and device of synonym
CN109753660B (en) LSTM-based winning bid web page named entity extraction method
CN110134946B (en) Machine reading understanding method for complex data
CN110825877A (en) Semantic similarity analysis method based on text clustering
CN103699625A (en) Method and device for retrieving based on keyword
CN110704621A (en) Text processing method and device, storage medium and electronic equipment
CN110222172B (en) Multi-source network public opinion theme mining method based on improved hierarchical clustering
CN116187163B (en) Construction method and system of pre-training model for patent document processing
CN112256939A (en) Text entity relation extraction method for chemical field
CN112966525B (en) Law field event extraction method based on pre-training model and convolutional neural network algorithm
CN113505200A (en) Sentence-level Chinese event detection method combining document key information
CN108763192B (en) Entity relation extraction method and device for text processing
CN108549636A (en) A kind of race written broadcasting live critical sentence abstracting method
CN105678244B (en) A kind of near video search method based on improved edit-distance
CN113033183A (en) Network new word discovery method and system based on statistics and similarity
CN112052665A (en) Remote monitoring event extraction method and application thereof
Gong et al. A semantic similarity language model to improve automatic image annotation
Hu et al. Retrieval-based language model adaptation for handwritten Chinese text recognition
WO2019064137A1 (en) Extraction of expression for natural language processing
Wang et al. Improving handwritten Chinese text recognition by unsupervised language model adaptation
CN109918632A (en) Document based on scene template writes householder method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20191129