CN107977196A - A kind of document creation method and server - Google Patents

Info

Publication number
CN107977196A
Authority
CN
China
Prior art keywords
data
behavior
description phrase
behavior expression
expression data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610920284.5A
Other languages
Chinese (zh)
Other versions
CN107977196B (en
Inventor
刘康
石卫国
蔡静
张雪娇
窦晓妍
张秋明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Beijing Co Ltd
Original Assignee
Tencent Technology Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Beijing Co Ltd
Priority to CN201610920284.5A (CN107977196B)
Priority to PCT/CN2017/101852 (WO2018072577A1)
Publication of CN107977196A
Application granted
Publication of CN107977196B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/20 Software design
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying
    • G06F16/9032 Query formulation
    • G06F16/90332 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/194 Calculation of difference between files

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Machine Translation (AREA)

Abstract

This application discloses a text generation method and a server. The method includes: obtaining real-time code data for a theme, the real-time code data being written in a machine language and carrying content data under the theme; identifying behavior expression data of at least one object from the real-time code data; determining at least one description phrase according to the behavior expression data of the at least one object; and generating the text of the theme according to the behavior expression data of the at least one object and the at least one description phrase. These technical solutions can improve the efficiency of text generation and the resource utilization of the server.

Description

A kind of document creation method and server
Technical field
This application relates to the technical field of data processing, and in particular to a text generation method and a server.
Background
At present, when media practitioners compile manuscripts and publish content in their daily work, some software can automatically generate texts such as reports, articles, and manuscripts. Such software mainly monitors the written live-commentary paragraphs in a text-and-image live broadcast system, stores them as data, grabs part of the paragraph text, and fills paragraphs and data into a preset fixed template, thereby piecing together a text such as a report.
This way of generating text is built on manually written live-commentary reports rather than being written automatically by software, and it depends on a third-party text-and-image live broadcast system. In addition, only whole paragraphs can be grabbed, and the template used to combine them is fixed, so the resulting reports feel heavily pieced together and mechanical. The generated reports are therefore poorly readable and convey limited information, cannot satisfy users' demand for detail, and reduce the resource utilization of the text generation device.
Summary of the invention
In view of this, the present invention provides a text generation method and a server that can improve the efficiency of text generation and the resource utilization of the server.
The technical solution of the present invention is realized as follows:
The present invention provides a text generation method, the method including:
obtaining real-time code data for a theme, the real-time code data being written in a machine language and carrying content data under the theme;
identifying behavior expression data of at least one object from the real-time code data;
determining at least one description phrase according to the behavior expression data of the at least one object; and
generating the text of the theme according to the behavior expression data of the at least one object and the at least one description phrase.
The present invention also provides a server, including:
an acquisition module, configured to obtain real-time code data for a theme, the real-time code data being written in a machine language and carrying content data under the theme;
an identification module, configured to identify behavior expression data of at least one object from the real-time code data obtained by the acquisition module;
a determining module, configured to determine at least one description phrase according to the behavior expression data of the at least one object obtained by the identification module; and
a generation module, configured to generate the text of the theme according to the behavior expression data of the at least one object obtained by the identification module and the at least one description phrase determined by the determining module.
Compared with the prior art, the method provided by the present invention removes the dependence on manually written live-commentary reports. All the technical details of a match and the related performance evaluation data can be turned into a vivid, human-like textual account of the match that is fast to produce, rich in information, and readable. This improves the efficiency of text generation, makes machine-written reports read like human-written ones, and improves the resource utilization of the server.
Brief description of the drawings
To explain the technical solutions in the embodiments of the present invention more clearly, the drawings needed for describing the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art can obtain other drawings from them without creative effort. In the drawings:
Fig. 1 is a structural diagram of the implementation environment involved in one embodiment of the present invention;
Fig. 2 is an exemplary flowchart of the text generation method according to one embodiment of the present invention;
Fig. 3 is an exemplary flowchart of determining description phrases according to one embodiment of the present invention;
Fig. 4 is a schematic diagram of text generated according to one embodiment of the present invention;
Fig. 5 is an exemplary flowchart of the text generation method according to another embodiment of the present invention;
Fig. 6 is a schematic diagram of text generated according to another embodiment of the present invention;
Fig. 7 is a schematic diagram of the displayed text according to one embodiment of the present invention;
Fig. 8 is a structural diagram of the server according to one embodiment of the present invention;
Fig. 9 is a structural diagram of the server according to another embodiment of the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the scope of protection of the present invention.
Fig. 1 is a structural diagram of the implementation environment involved in one embodiment of the present invention. As shown in Fig. 1, the text presentation system 100 includes a server 110 and a client 120. The server 110 in turn includes a code database 111, a corpus database 112, and a text generation processing unit 113. The code database 111 stores the code data for each theme and is updated in real time; the corpus database 112 stores the corpus, such as words and phrases, used when generating text.
In an embodiment of the present invention, the text generation processing unit 113 reads the real-time code data in the code database 111, identifies behavior expression data, determines description phrases, and generates text in combination with the corpus database 112.
The server 110 then sends the generated text to the client 120. The client 120, as the application of the media promoter, recommends and displays the text to users and provides a social platform for user interaction. The server 110 and the client 120 may be connected by wire or wirelessly.
Fig. 2 is an exemplary flowchart of the text generation method according to one embodiment of the present invention. The method is applied to a server. As shown in Fig. 2, the method may include the following steps:
Step 201: obtain real-time code data for a theme; the real-time code data is written in a machine language and carries content data under the theme.
In this step, the code database in the server obtains and stores the real-time code data for a theme. The code data is written in a machine language and carries the content data under the theme. The machine language is, for example, Java, PHP (Hypertext Preprocessor), ASP.NET, or Ruby; each machine language has its own set of programming rules.
The theme may be a sports event, in which case the code data includes content data such as the match details and results of each event and each athlete. The theme may also be a singing contest, in which case the code data includes content data such as the match details and results of each stage and each singer. The theme may also be a remote-controlled toy car race, in which case the code data includes content data such as the running speed, running time, and ranking of each toy car on each track.
Step 202: identify the behavior expression data of at least one object from the real-time code data.
This step converts machine-readable "code data" into "behavior expression data" that client users can read. Objects include persons, animals, or things, that is, the participating subjects of a theme. For each object, its behavior expression data includes one or more behaviors and the performance evaluation data corresponding to each behavior.
For example, when the theme is a sports event, the objects are athletes. The behavior expression data of each athlete covers the athlete's actions and behaviors in each match detail during the competition, and the performance evaluation data includes the score of each action, the judges' evaluation results, the overall score, and the award result. Taking a diving report as an example, diving is a scored competition, and the behavior expression data of each athlete includes technical details such as the board approach, platform run-up, take-off, height, degree of difficulty, aerial posture, coordination, and water entry (that is, multiple behaviors) together with the corresponding scores (that is, the performance evaluation data for each behavior).
When identifying, the server first sets, according to the writing rules of the code, the mapping relations between the real-time code data and objects, behaviors, and performance evaluation data, and then identifies the behaviors and performance evaluation data of each object from the real-time code data according to those mapping relations.
For example, taking Java as the machine language, the various fields in the real-time code data are mapped to objects, behaviors, and performance evaluation data according to the syntax rules of Java. For instance, the field "object" is mapped to "object", the field "action" is mapped to "behavior", and the field "score" is mapped to "performance evaluation data".
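The field mapping described above can be sketched as follows. This is a minimal illustration in Python (chosen for brevity, not the Java mentioned in the text). It assumes the real-time code data arrives as a list of records carrying the fields "object", "action", and "score" named above; the record layout, the sample values, and the helper name `identify_behavior_data` are hypothetical and not part of the disclosed method.

```python
from collections import defaultdict

# Mapping from code-data field names to the concepts used in this description.
# The three field names come from the example above; everything else is assumed.
FIELD_MAP = {"object": "object", "action": "behavior", "score": "evaluation"}


def identify_behavior_data(records):
    """Group (behavior, evaluation) pairs by object using FIELD_MAP."""
    result = defaultdict(list)
    for record in records:
        # Keep only mapped fields and rename them to the target concepts.
        mapped = {FIELD_MAP[k]: v for k, v in record.items() if k in FIELD_MAP}
        # Skip records that do not carry all three mapped fields.
        if {"object", "behavior", "evaluation"} <= set(mapped):
            result[mapped["object"]].append((mapped["behavior"], mapped["evaluation"]))
    return dict(result)


# Hypothetical sample records; the values are made up for illustration.
sample = [
    {"object": "athlete_a", "action": "round_1_dive", "score": 50.0},
    {"object": "athlete_a", "action": "round_2_dive", "score": 52.5},
    {"object": "athlete_b", "action": "round_1_dive", "score": 49.0, "venue": "pool"},
]
print(identify_behavior_data(sample))
```

Unmapped fields such as the invented "venue" are simply ignored, which mirrors the idea that only fields covered by the mapping relations contribute to behavior expression data.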
Table 1 lists behavior expression data identified according to an embodiment of the present invention. The theme is the women's synchronized 3 m springboard final at the Rio Olympics. From the real-time code data the server identifies two athlete objects, Wu Minxia and Shi Tingmao, aged 31 and 24 respectively; the behaviors are five rounds of dives, and the performance evaluation data includes the score and ranking of each round, the final total score, and the medal result.
Table 1. Identified behavior expression data
Step 203: determine at least one description phrase according to the behavior expression data of the at least one object.
This step associates "behavior expression data" with "description phrases". Specifically, there are three methods of determining description phrases:
Method one: compare the behavior expression data of multiple objects under the theme, and select from preset description phrases at least one description phrase that matches the comparison result.
This method is a horizontal comparison of multiple objects in the same event under the same theme. For example, the theme of Olympic diving includes 8 events in total: women's synchronized 3 m springboard, men's synchronized 3 m springboard, women's 3 m springboard, men's 3 m springboard, women's synchronized 10 m platform, men's synchronized 10 m platform, women's 10 m platform, and men's 10 m platform. For a given event, the data of multiple athletes are compared side by side, and description phrases are derived from the comparison result.
In one embodiment, the result of a pairwise comparison of the behavior expression data of multiple objects is greater than, equal to, or less than, and the server presets multiple description phrases for each result. Table 2 lists description phrases preset according to comparison with history performance data, according to one embodiment of the present invention.
Table 2. Description phrases preset according to comparison with history performance data
For example, in the men's 400 m freestyle final at the Rio Olympics, athlete Sun Yang's final result was 3 min 41.68 s and athlete Horton's was 3 min 41.55 s. Comparing the two, Sun Yang's result is slightly behind Horton's, so the server can determine the corresponding description phrases to be "slightly behind", "lost to the opponent", and "very regrettable".
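Method one can be sketched roughly as below, using the two times from the example. The outcome labels, the 0.5 s "close" margin, and every preset phrase other than the three quoted above are assumptions made for illustration; the source does not reproduce its preset phrase table.

```python
# Hypothetical preset phrases keyed by comparison outcome. Only the
# "slightly_behind" entries are taken from the example in the text.
PRESET_PHRASES = {
    "slightly_behind": ["slightly behind", "lost to the opponent", "very regrettable"],
    "far_behind": ["well behind"],
    "equal": ["level with the opponent"],
    "ahead": ["ahead of the opponent"],
}


def compare_times(own_seconds, rival_seconds, close_margin=0.5):
    """Classify a timed result against a rival's (a smaller time is better)."""
    gap = round(own_seconds - rival_seconds, 2)  # round away float noise
    if gap == 0:
        return "equal"
    if gap < 0:
        return "ahead"
    return "slightly_behind" if gap <= close_margin else "far_behind"


def phrases_for(own_seconds, rival_seconds):
    """Select the preset description phrases matching the comparison result."""
    return PRESET_PHRASES[compare_times(own_seconds, rival_seconds)]


sun_yang = 3 * 60 + 41.68  # 3 min 41.68 s
horton = 3 * 60 + 41.55    # 3 min 41.55 s
print(phrases_for(sun_yang, horton))
```

For scored events such as diving, where a larger value is better, the sign of the comparison would have to be flipped.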
Method two: for each object, compare the current real-time performance data with the history performance data longitudinally.
Specifically, Fig. 3 is an exemplary flowchart of determining description phrases according to one embodiment of the present invention. As shown in Fig. 3, the method includes the following steps:
Step 2031: for each object, obtain the history performance data of the object.
Taking a sports event as an example, history performance data includes an athlete's past performance, world ranking, event changes, and so on.
Step 2032: compare the behavior expression data of the object with its history performance data separately for each of multiple data types.
This step uses classified comparison: the data is divided into multiple data types according to its attributes, for example the venue of the event, the age of the object, the actions in each stage, the score, and the ranking.
Step 2033: from the multiple comparison results of the multiple data types, select the comparison results that have display value.
Considering whether the content of the final text has display value for users, the comparison results that have display value are selected from the multiple comparison results of the multiple data types. The selection may score each comparison result by display value, sort the scores, and pick from them multiple comparison results that have display value.
For example, for athlete Sun Yang, comparing the venues where he swam freestyle, such as the "Beijing Olympics" against the "Rio Olympics", has no reporting value, so this comparison result is considered to have no display value and is scored 0. By contrast, comparing Sun Yang's speed in each 50 m segment of a freestyle race reveals performance gaps; the larger the gap, the higher the score and the greater the reporting value, so this comparison result is selected as having display value.
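The score-then-sort selection in step 2033 might look like the following sketch. The scoring rule (venue comparisons score 0, numeric comparisons score by absolute gap), the `top_n` cut-off, and all sample numbers are assumptions; the source states only that results are scored, sorted, and selected. Putting seconds and rankings on one scale is a deliberate simplification.

```python
def display_value(data_type, current, past):
    """Hypothetical scoring rule: venue changes score 0; numeric gaps score by size."""
    if data_type == "venue":
        return 0.0
    return abs(current - past)


def screen_comparisons(comparisons, top_n=2):
    """Score each comparison, sort by score, and keep the top_n with score > 0."""
    scored = [(display_value(t, cur, past), t) for t, cur, past in comparisons]
    scored = [item for item in scored if item[0] > 0]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [t for _, t in scored[:top_n]]


# (data type, current value, historical value); all values are invented.
comparisons = [
    ("venue", "Rio Olympics", "Beijing Olympics"),
    ("split_0_50m", 26.1, 25.6),
    ("split_350_400m", 26.9, 28.4),
    ("ranking", 2, 1),
]
print(screen_comparisons(comparisons))
```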
Step 2034: select from the preset description phrases at least one description phrase that matches the comparison results that have display value.
As in method one, the comparison result between behavior expression data and history performance data can also be greater than, equal to, or less than, so at least one matching description phrase can likewise be selected from the preset description phrases.
Method three: for each object, compare the current real-time performance data with the performance expected data.
Specifically, for each object, obtain the performance expected data of the object; compare the behavior expression data of the object with the performance expected data, and select from the preset description phrases at least one description phrase that matches the comparison result.
Table 3 lists description phrases preset according to comparison with performance expected data, according to one embodiment of the present invention. For example, taking Sun Yang's result in the men's 400 m freestyle final: because he finished behind Horton, the description phrase determined for Sun Yang's silver medal, judged against the pre-match expectation, is "regrettable" rather than "beyond expectation".
Table 3. Description phrases preset according to comparison with performance expected data
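A minimal sketch of method three follows. It assumes that expectations and results can both be expressed as medal ranks; the rank table, the "as expected" phrase, and the pre-match expectation of gold for Sun Yang are illustrative assumptions, while "regrettable" and "beyond expectation" are the phrases used in the example above.

```python
# Medal ranks: a smaller number is a better result. Assumed for illustration.
MEDAL_RANK = {"gold": 1, "silver": 2, "bronze": 3, "none": 4}


def expectation_phrase(actual, expected):
    """Pick a description phrase by comparing the actual result with the expected one."""
    if MEDAL_RANK[actual] > MEDAL_RANK[expected]:
        return "regrettable"          # fell short of the expectation
    if MEDAL_RANK[actual] < MEDAL_RANK[expected]:
        return "beyond expectation"   # did better than expected
    return "as expected"              # hypothetical phrase for a match


# Assumed pre-match expectation of gold, actual result silver.
print(expectation_phrase("silver", "gold"))
```

The same silver medal yields a different phrase under a different expectation, which is the point of comparing against performance expected data rather than against a fixed scale.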
Step 204: generate the text of the theme according to the behavior expression data of the at least one object and the at least one description phrase.
This step expands scattered "behavior expression data" and "description phrases" into a complete "text". Specifically, a link word is selected for each description phrase from the preset corpus database; the behavior expression data of the at least one object, the link words, and the at least one description phrase are joined into at least one short sentence; the at least one short sentence is combined into at least one paragraph; and the at least one paragraph is joined to obtain the text.
Link words serve the functions of introduction, development, transition, and conclusion. They include transition words that connect context, conjunctions of tone, logical conjunctions, background introductions drawn from history performance data, and so on. For example, taking the result of the men's 400 m freestyle final, the description phrases determined for the object "Sun Yang" are "regrettably failed to defend the title", "has long carried his compatriots' high hopes of winning the title", and "lost to opponent Horton". For "regrettably failed to defend the title", the link word is determined to be the appositive "took second place"; for "has long carried his compatriots' high hopes of winning the title", the link word is determined to be the causal clause "as the first Chinese male swimmer to win an Olympic gold medal, and also the previous Olympic champion in this event"; for "lost to opponent Horton", the link word is determined to be the adversative "but in the end".
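The assembly order described in step 204 (short sentences, then paragraphs, then text) can be sketched as below. The function names and the simple space-joined word order are assumptions suited to English output; the source does not specify how the pieces are ordered within a sentence.

```python
def build_short_sentence(behavior_text, link_word, phrase):
    """Join rendered behavior data, a link word, and a description phrase."""
    parts = [behavior_text, link_word, phrase]
    return " ".join(p.strip() for p in parts if p.strip()) + "."


def build_text(paragraphs):
    """Combine short sentences into paragraphs, then join paragraphs into text."""
    return "\n\n".join(" ".join(sentences) for sentences in paragraphs)


first = build_short_sentence(
    "Sun Yang swam 3:41.68", "but in the end", "lost to opponent Horton"
)
second = build_short_sentence(
    "Sun Yang", "took second place and", "regrettably failed to defend the title"
)
print(build_text([[first, second]]))
```

A production version would need language-specific grammar handling; plain concatenation only works when the corpus entries are written to fit together.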
Fig. 4 is a schematic diagram of text generated according to one embodiment of the present invention. Under the theme of the men's 400 m freestyle final, the generated text describes multiple objects: the Australian athlete Horton, Sun Yang, Qiu Zi, Guy, Dwyer, and Detti. The generated text includes three paragraphs, and each paragraph includes the athletes' results and rankings and the description phrases, which are marked with underlines.
In the above embodiment, real-time code data for a theme is obtained, the behavior expression data of at least one object is identified from the real-time code data, at least one description phrase is determined according to the behavior expression data of the at least one object, and the text of the theme is generated according to the behavior expression data of the at least one object and the at least one description phrase. Because the method connects directly to the code database, it does not rely on a live broadcast system and removes the prior-art dependence on manually written live-commentary reports. All the technical details of a match and the related performance evaluation data can be turned into a vivid, human-like textual account of the match that is fast to produce, rich in information, and readable, so that machine-written reports read like human-written ones.
Based on the text generation method provided in the above embodiments, a robot can achieve human-like reporting purely through machine learning and algorithms. The report text produced by this technique has passed the Turing test (that is, if a computer can answer a series of questions posed by a human tester within 5 minutes, and more than 30% of its answers lead the tester to believe they came from a human, the computer passes the test), and the quality of the articles is indistinguishable from manual reporting.
In addition, consider the alternative of converting live television voice commentary into text by speech recognition and piecing a match report together from it. Owing to the technical limits of speech-to-text, the conversion error rate of that approach is quite high and it cannot be applied in batches. The text generation method provided in the above embodiments instead proceeds from real-time code data to behavior expression data, to description phrases, to expanded paragraphs, and to the final text, which ensures high-quality text; it can therefore be applied at scale and improves the resource utilization of the text generation device.
Fig. 5 is an exemplary flowchart of the text generation method according to another embodiment of the present invention. The method is applied to a server. As shown in Fig. 5, the method may include the following steps:
Step 501: obtain real-time code data for a theme; the real-time code data is written in a machine language and carries content data under the theme.
Step 502: identify the behavior expression data of at least one object from the real-time code data according to mapping relations; the behavior expression data includes one or more behaviors and the performance evaluation data corresponding to each behavior.
As described for step 202 above, the server first sets, according to the writing rules of the code, the mapping relations between the real-time code data and objects, behaviors, and performance evaluation data, and then performs the identification according to those mapping relations.
Fig. 6 is a schematic diagram of text generated according to another embodiment of the present invention. The interface 600 shown in Fig. 6 displays one generated text. Block 610 gives the title of the text, "First Olympic diving gold! Wu Minxia/Shi Tingmao live up to expectations and win the title"; block 620 gives the promoter of the text, "Tencent Sports"; and block 630 gives the body of the text, which includes all the behavior expression data given in Table 1 above, marked with underlines.
Step 503: determine at least one description phrase according to the behavior expression data of the at least one object, the history performance data of each object, and the performance expected data.
This step combines the three methods of determining description phrases given in step 203; details are not repeated here.
Step 504: based on the corpus database, generate the text of the theme according to the behavior expression data of the at least one object and the at least one description phrase.
As described for step 204, when generating the text, a link word is selected for each description phrase from the preset corpus database; the behavior expression data of the at least one object, the link words, and the at least one description phrase are joined into at least one short sentence; the at least one short sentence is combined into at least one paragraph; and the at least one paragraph is joined to obtain the text.
Considering that text can have a variety of styles, multiple types of paragraph template, together with a word limit for each paragraph template, are preset for combining the at least one short sentence into at least one paragraph. These different types of paragraph template make up texts of different styles. For example, the types of paragraph template include summary, background introduction, detailed body, conclusion, and appendix.
For each paragraph template, at least one short sentence that matches the paragraph template is determined, and the determined short sentences are combined into one paragraph such that the word count of the paragraph does not exceed the word limit of the paragraph template.
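One way to honor a per-template word limit is to add whole sentences until the next one would exceed the limit, as in the sketch below. Counting words by whitespace, the greedy in-order strategy, and the sample templates are all assumptions; for Chinese text a character count would be the natural unit, and the source does not say how matching sentences are prioritized.

```python
def fill_paragraph(sentences, word_limit):
    """Append whole sentences in order until the next one would exceed the limit."""
    chosen, used = [], 0
    for sentence in sentences:
        n = len(sentence.split())  # whitespace word count (an assumption)
        if used + n > word_limit:
            break
        chosen.append(sentence)
        used += n
    return " ".join(chosen)


def build_from_templates(templates, sentences_by_type):
    """templates: list of (paragraph_type, word_limit) pairs, in output order."""
    return {
        ptype: fill_paragraph(sentences_by_type.get(ptype, []), limit)
        for ptype, limit in templates
    }


# Hypothetical templates and matched short sentences.
templates = [("summary", 5), ("detailed body", 100)]
matched = {
    "summary": ["A wins the final.", "B takes silver."],
    "detailed body": ["Round one went well."],
}
print(build_from_templates(templates, matched))
```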
Step 505: perform keyword review on the generated text.
The review here includes checking for keywords; manuscripts with a higher risk weighting level can also be submitted to a manual review window for review.
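The keyword review could be sketched as a weighted keyword scan with a threshold for escalation. The keyword list, the weights, the threshold of 3, and the decision labels are hypothetical; the source says only that keywords are checked and that higher-risk manuscripts can go to manual review.

```python
def review_text(text, risk_keywords, manual_threshold=3):
    """Return (decision, total_weight, matched_keywords) for a generated text."""
    lowered = text.lower()
    # Collect every risk keyword that occurs in the text, with its weight.
    hits = {kw: w for kw, w in risk_keywords.items() if kw.lower() in lowered}
    total = sum(hits.values())
    decision = "manual_review" if total >= manual_threshold else "publish"
    return decision, total, sorted(hits)


# Hypothetical risk keywords and weights.
RISK_KEYWORDS = {"doping": 3, "injury": 1}

print(review_text("A routine win in the final.", RISK_KEYWORDS))
print(review_text("The final was overshadowed by a doping case.", RISK_KEYWORDS))
```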
Step 506: send the reviewed text to the client for display.
Fig. 7 is a schematic diagram of the displayed text according to one embodiment of the present invention. The display interface 700 of the client recommends and displays a report on a sports event to the user. The report is titled "Zhang Mengxue wins the Chinese delegation's first gold medal at the Rio Olympics". Block 720 shows the promoter "Tencent Sports", the date "2016-08-07", and the release time "22:23", and provides a "comment" option (see 721) and a "share" option (see 722) so that users can interact on the social platform. Block 730 gives the summary of the report, block 740 gives its key section "match focus", block 750 gives its detailed body "highlights replay", and block 760 gives its appendix "player data".
In this embodiment, at least one description phrase is determined according to the behavior expression data of the at least one object, the history performance data of each object, and the performance expected data, so that "behavior expression data" can be associated with multiple "description phrases". This enriches the content of the text and further improves its information content and readability. In addition, by setting different types of paragraph template when combining paragraphs, a style of expression can be selected intelligently according to the result of the match, so that a variety of vivid, human-like texts can be output for users to browse and read.
Structure diagrams of the Fig. 8 according to the server of one embodiment of the invention.As shown in figure 8, server 800 includes:
Acquisition module 810, for obtaining the real-time code data for a theme, real-time code data are according to machine language Write, carry content-data under the theme;
Identification module 820, for identifying at least one object from the real-time code data that acquisition module 810 obtains Behavior expression data;
Determining module 830, the behavior expression data of at least one object for being obtained according to identification module 820 are determined At least one description phrase;And
Generation module 840, for the behavior expression data of at least one object obtained according to identification module 820 and determines At least one description phrase that module 830 is determined, generates the text of the theme.
In one embodiment, server 800 further comprises:
Setup module 850, for setting real-time code data to be commented with object, behavior, performance according to the redaction rule of code Sentence the mapping relations between data;
Wherein, behavior expression data include one or more behaviors and judge number with the corresponding performance of each behavior According to identification module 820 is used for, and the mapping relations set according to setup module 850 identify the behavior and performance of each object Judge data.
In one embodiment, determining module 830 is used for, and for each object, the history for obtaining the object shows data;Will Behavior expression data and history the performance data of the object are contrasted respectively by multiple data types, from multiple data types The comparing result for possessing displaying value is filtered out in multiple comparing results, is selected from default description phrase and possesses displaying At least one description phrase that the comparing result of value matches.
In one embodiment, the determining module 830 is configured to: for each object, obtain the expected performance data of the object; compare the behavior expression data of the object with the expected performance data; and select, from preset description phrases, at least one description phrase that matches the comparison result.
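The comparison against expected performance can be sketched in the same spirit. The 10% tolerance band and the three phrases are hypothetical; the embodiment only requires that the selected phrase match the comparison result.

```python
def phrase_from_expectation(actual: float, expected: float, tolerance: float = 0.1) -> str:
    """Pick a preset phrase by comparing actual performance with the expected value."""
    if actual > expected * (1 + tolerance):
        return "exceeded expectations"
    if actual < expected * (1 - tolerance):
        return "fell short of expectations"
    return "performed as expected"
```

A tolerance band avoids describing a negligible difference as a surprise in either direction.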
In one embodiment, the generation module 840 is configured to: select a linking word for each description phrase from a preset corpus database; connect the behavior expression data of the at least one object, the linking word, and the at least one description phrase into at least one short sentence; and combine the at least one short sentence into at least one paragraph and connect the at least one paragraph to obtain the text.
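The three generation steps can be sketched as follows. The in-memory dictionary standing in for the corpus database, the sentence pattern, and the fixed number of sentences per paragraph are assumptions made for the example.

```python
# Hypothetical corpus: the linking word to use with each description phrase.
LINKING_WORDS = {
    "in fine scoring form": "who was",
    "short of his usual touch": "although",
}
DEFAULT_LINK = "who was"


def build_short_sentence(obj: str, behavior: str, phrase: str) -> str:
    """Connect the object's behavior, a linking word and a description phrase."""
    link = LINKING_WORDS.get(phrase, DEFAULT_LINK)
    return f"{obj}, {link} {phrase}, {behavior}."


def build_text(items: list, sentences_per_paragraph: int = 2) -> str:
    """Combine short sentences into paragraphs, then join the paragraphs into the text."""
    sentences = [build_short_sentence(*item) for item in items]
    paragraphs = [
        " ".join(sentences[i:i + sentences_per_paragraph])
        for i in range(0, len(sentences), sentences_per_paragraph)
    ]
    return "\n\n".join(paragraphs)
```

Each item passed to `build_text` is an `(object, behavior, phrase)` tuple; three items with the default setting give two paragraphs separated by a blank line.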
Fig. 9 is a structural diagram of a server according to another embodiment of the present invention. The server 900 may include a processor 910, a memory 920, a port 930, and a bus 940. The processor 910 and the memory 920 are interconnected by the bus 940. The processor 910 can receive and send data through the port 930. Specifically:
The processor 910 is configured to execute the machine-readable instruction modules stored in the memory 920.
The memory 920 stores machine-readable instruction modules executable by the processor 910. The instruction modules executable by the processor 910 include an acquisition module 921, an identification module 922, a determining module 923, and a generation module 924. Specifically:
When executed by the processor 910, the acquisition module 921 can: obtain real-time code data for a theme, where the real-time code data is written in a machine language and carries content data under the theme.
When executed by the processor 910, the identification module 922 can: identify behavior expression data of at least one object from the real-time code data obtained by the acquisition module 921.
When executed by the processor 910, the determining module 923 can: determine at least one description phrase according to the behavior expression data of the at least one object obtained by the identification module 922.
When executed by the processor 910, the generation module 924 can: generate the text of the theme according to the behavior expression data of the at least one object obtained by the identification module 922 and the at least one description phrase determined by the determining module 923.
In one embodiment, the instruction modules executable by the processor 910 further include a setup module 925. Specifically:
When executed by the processor 910, the setup module 925 can: set, according to the coding rules of the code, the mapping relations between the real-time code data and objects, behaviors, and performance judgment data;
wherein the behavior expression data includes one or more behaviors and the performance judgment data corresponding to each behavior, and the identification module 922, when executed by the processor 910, can: identify the behaviors and the performance judgment data of each object according to the mapping relations set by the setup module 925.
It can thus be seen that when the instruction modules stored in the memory 920 are executed by the processor 910, the functions of the acquisition module, identification module, determining module, generation module, and setup module in the foregoing embodiments can be implemented.
In the above apparatus and system embodiments, the specific way in which each module and unit implements its function has been described in the method embodiments and is not repeated here.
In addition, the functional modules in the embodiments of the present invention may be integrated into one processing unit, may each exist physically on their own, or two or more modules may be integrated into one unit. Such an integrated unit may be implemented in hardware or as a software functional unit.
In addition, each embodiment of the present invention can be implemented by a data processing program executed by a data processing device such as a computer. Such a data processing program evidently constitutes the present invention. Moreover, a data processing program that is usually stored in a storage medium is executed by reading the program directly from the storage medium, or by installing or copying the program onto a storage device (such as a hard disk and/or memory) of the data processing device. Such a storage medium therefore also constitutes the present invention. The storage medium may use any type of recording method, for example a paper storage medium (such as paper tape), a magnetic storage medium (such as a floppy disk, hard disk, or flash memory), an optical storage medium (such as a CD-ROM), or a magneto-optical storage medium (such as an MO).
Accordingly, the present invention also discloses a storage medium storing a data processing program, the data processing program being used to perform any embodiment of the above method of the present invention.
The foregoing is merely preferred embodiments of the present invention and is not intended to limit the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (12)

  1. A text generation method, characterized by comprising:
    obtaining real-time code data for a theme, wherein the real-time code data is written in a machine language and carries content data under the theme;
    identifying behavior expression data of at least one object from the real-time code data;
    determining at least one description phrase according to the behavior expression data of the at least one object; and
    generating the text of the theme according to the behavior expression data of the at least one object and the at least one description phrase.
  2. The method according to claim 1, further comprising:
    setting, according to the coding rules of the code, the mapping relations between the real-time code data and objects, behaviors, and performance judgment data;
    wherein the behavior expression data includes one or more behaviors and the performance judgment data corresponding to each behavior, and the identifying behavior expression data of at least one object from the real-time code data comprises:
    identifying the behaviors and the performance judgment data of each object according to the mapping relations.
  3. The method according to claim 1, wherein the determining at least one description phrase according to the behavior expression data of the at least one object comprises:
    comparing the behavior expression data of multiple objects under the theme, and selecting, from preset description phrases, at least one description phrase that matches the comparison result.
  4. The method according to claim 1, wherein the determining at least one description phrase according to the behavior expression data of the at least one object comprises:
    for each object, obtaining the historical performance data of the object;
    comparing the behavior expression data of the object with the historical performance data separately for each of multiple data types, selecting, from the multiple comparison results of the multiple data types, the comparison results that have display value, and selecting, from preset description phrases, at least one description phrase that matches the comparison results that have display value.
  5. The method according to claim 1, wherein the determining at least one description phrase according to the behavior expression data of the at least one object comprises:
    for each object, obtaining the expected performance data of the object;
    comparing the behavior expression data of the object with the expected performance data, and selecting, from preset description phrases, at least one description phrase that matches the comparison result.
  6. The method according to any one of claims 1 to 5, wherein the generating the text of the theme according to the behavior expression data of the at least one object and the at least one description phrase comprises:
    selecting a linking word for each description phrase from a preset corpus database;
    connecting the behavior expression data of the at least one object, the linking word, and the at least one description phrase into at least one short sentence;
    combining the at least one short sentence into at least one paragraph, and connecting the at least one paragraph to obtain the text.
  7. The method according to claim 6, further comprising:
    presetting multiple types of paragraph templates and a word count limit for each paragraph template;
    wherein the combining the at least one short sentence into at least one paragraph comprises:
    for each paragraph template, determining at least one short sentence that matches the paragraph template, and combining the determined at least one short sentence to obtain a paragraph such that the word count of the paragraph does not exceed the word count limit of the paragraph template.
  8. A server, characterized by comprising:
    an acquisition module, configured to obtain real-time code data for a theme, wherein the real-time code data is written in a machine language and carries content data under the theme;
    an identification module, configured to identify behavior expression data of at least one object from the real-time code data obtained by the acquisition module;
    a determining module, configured to determine at least one description phrase according to the behavior expression data of the at least one object obtained by the identification module; and
    a generation module, configured to generate the text of the theme according to the behavior expression data of the at least one object obtained by the identification module and the at least one description phrase determined by the determining module.
  9. The server according to claim 8, further comprising:
    a setup module, configured to set, according to the coding rules of the code, the mapping relations between the real-time code data and objects, behaviors, and performance judgment data;
    wherein the behavior expression data includes one or more behaviors and the performance judgment data corresponding to each behavior, and the identification module is configured to identify the behaviors and the performance judgment data of each object according to the mapping relations set by the setup module.
  10. The server according to claim 8, wherein the determining module is configured to: for each object, obtain the historical performance data of the object; compare the behavior expression data of the object with the historical performance data separately for each of multiple data types; select, from the multiple comparison results of the multiple data types, the comparison results that have display value; and select, from preset description phrases, at least one description phrase that matches the comparison results that have display value.
  11. The server according to claim 8, wherein the determining module is configured to: for each object, obtain the expected performance data of the object; compare the behavior expression data of the object with the expected performance data; and select, from preset description phrases, at least one description phrase that matches the comparison result.
  12. The server according to any one of claims 8 to 11, wherein the generation module is configured to: select a linking word for each description phrase from a preset corpus database; connect the behavior expression data of the at least one object, the linking word, and the at least one description phrase into at least one short sentence; and combine the at least one short sentence into at least one paragraph and connect the at least one paragraph to obtain the text.
CN201610920284.5A 2016-10-21 2016-10-21 Text generation method and server Active CN107977196B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201610920284.5A CN107977196B (en) 2016-10-21 2016-10-21 Text generation method and server
PCT/CN2017/101852 WO2018072577A1 (en) 2016-10-21 2017-09-15 Text generation method and server

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610920284.5A CN107977196B (en) 2016-10-21 2016-10-21 Text generation method and server

Publications (2)

Publication Number Publication Date
CN107977196A (en) 2018-05-01
CN107977196B (en) 2020-11-20

Family

ID=62004560

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610920284.5A Active CN107977196B (en) 2016-10-21 2016-10-21 Text generation method and server

Country Status (1)

Country Link
CN (1) CN107977196B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101714145A (en) * 2008-10-07 2010-05-26 英业达股份有限公司 Website news analyzing system and method thereof
CN102508830A (en) * 2011-11-28 2012-06-20 北京工商大学 Method and system for extracting social network from news document
US20120323558A1 (en) * 2011-02-14 2012-12-20 Decisive Analytics Corporation Method and apparatus for creating a predicting model
CN102929928A (en) * 2012-09-21 2013-02-13 北京格致璞科技有限公司 Multidimensional-similarity-based personalized news recommendation method
CN103186555A (en) * 2011-12-28 2013-07-03 腾讯科技(深圳)有限公司 Evaluation information generation method and system
CN103246710A (en) * 2013-04-22 2013-08-14 张经纶 Method and device for automatically generating multimedia travel notes
CN104239298A (en) * 2013-06-06 2014-12-24 腾讯科技(深圳)有限公司 Text message recommendation method, server, browser and system
US20150254219A1 (en) * 2014-03-05 2015-09-10 Adincon Networks LTD Method and system for injecting content into existing computerized data

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
KLINT FINLEY: "This News-Writing Bot Is Now Free for Everyone", 《HTTPS://WWW.WIRED.COM/2015/10/THIS-NEWS-WRITING-BOT-IS-NOW-FREE-FOR-EVERYONE/》 *

Also Published As

Publication number Publication date
CN107977196B (en) 2020-11-20

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant