CN107977196B - Text generation method and server - Google Patents

Publication number
CN107977196B
Authority
CN
China
Prior art keywords
data
performance
behavior
module
real
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610920284.5A
Other languages
Chinese (zh)
Other versions
CN107977196A (en
Inventor
刘康
石卫国
蔡静
张雪娇
窦晓妍
张秋明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Beijing Co Ltd
Original Assignee
Tencent Technology Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Beijing Co Ltd filed Critical Tencent Technology Beijing Co Ltd
Priority to CN201610920284.5A priority Critical patent/CN107977196B/en
Priority to PCT/CN2017/101852 priority patent/WO2018072577A1/en
Publication of CN107977196A publication Critical patent/CN107977196A/en
Application granted granted Critical
Publication of CN107977196B publication Critical patent/CN107977196B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/20 Software design
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying
    • G06F16/9032 Query formulation
    • G06F16/90332 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/194 Calculation of difference between files

Abstract

The application discloses a text generation method and a server. The method comprises: acquiring real-time code data for a theme, wherein the real-time code data is written in a machine language and carries content data under the theme; identifying behavioral performance data of at least one object from the real-time code data; determining at least one description phrase according to the behavioral performance data of the at least one object; and generating a text for the theme according to the behavioral performance data of the at least one object and the at least one description phrase. This technical solution can improve the efficiency of text generation and the resource utilization rate of the server.

Description

Text generation method and server
Technical Field
The present application relates to the field of data processing technologies, and in particular, to a text generation method and a server.
Background
At present, when a media organization edits manuscripts and publishes content on a daily basis, texts such as reports, articles, and manuscripts can be generated automatically by software. The main approach is to monitor the live text paragraphs in an image-and-text live broadcast system, pre-store data, capture some paragraphs from the live text, and fill a preset fixed template with the paragraphs and the data, thereby splicing together a text such as a report.
This way of generating text is built on manually written live text reports: the text is not written by the software itself, and the process relies on a third-party image-and-text live broadcast system. In addition, because only paragraphs are captured and the templates used to combine them are fixed, the resulting reports read as mechanical and visibly spliced together. The generated reports therefore have poor readability and convey a limited amount of information, cannot satisfy users who want detailed information, and reduce the resource utilization rate of the text generation device.
Disclosure of Invention
In view of this, the present invention provides a text generation method and a server, which can improve the efficiency of text generation and the resource utilization rate of the server.
The technical scheme of the invention is realized as follows:
The invention provides a text generation method, which comprises the following steps:
acquiring real-time code data for a theme, wherein the real-time code data is written in a machine language and carries content data under the theme;
identifying behavioral performance data of at least one object from the real-time code data;
determining at least one description phrase according to the behavioral performance data of the at least one object; and
generating a text for the theme according to the behavioral performance data of the at least one object and the at least one description phrase.
The present invention also provides a server, comprising:
an acquisition module, configured to acquire real-time code data for a theme, wherein the real-time code data is written in a machine language and carries content data under the theme;
an identification module, configured to identify behavioral performance data of at least one object from the real-time code data obtained by the acquisition module;
a determining module, configured to determine at least one description phrase according to the behavioral performance data of the at least one object obtained by the identification module; and
a generating module, configured to generate a text for the theme according to the behavioral performance data of the at least one object obtained by the identification module and the at least one description phrase determined by the determining module.
Compared with the prior art, the method provided by the invention removes the dependence on manually written live text reports. It can turn all the technical details of a match, together with the related performance evaluation data, into a vivid and humanized textual account of the match that is fast to produce, rich in information, and readable. This improves the efficiency of text generation, makes machine-written reports genuinely humanized, and improves the resource utilization rate of the server.
Drawings
In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort. In the drawings:
FIG. 1 is a schematic block diagram of an exemplary environment in accordance with an embodiment of the present invention;
FIG. 2 is an exemplary flow diagram of a text generation method in accordance with an embodiment of the present invention;
FIG. 3 is an exemplary flow chart for determining descriptive phrases in accordance with one embodiment of the present invention;
FIG. 4 is a diagram illustrating a generated text according to an embodiment of the present invention;
FIG. 5 is an exemplary flow diagram of a text generation method according to another embodiment of the invention;
FIG. 6 is a diagram illustrating a generated text according to another embodiment of the present invention;
FIG. 7 is a diagram of displayed text in accordance with one embodiment of the present invention;
FIG. 8 is a block diagram of a server according to an embodiment of the present invention;
FIG. 9 is a schematic structural diagram of a server according to another embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
FIG. 1 is a schematic structural diagram of an implementation environment according to an embodiment of the present invention. As shown in FIG. 1, the text presentation system 100 includes a server 110 and a client 120. The server 110 further includes a code database 111, a corpus database 112, and a text generation processing unit 113. The code database 111 stores code data for each theme and updates the code data in real time; the corpus database 112 stores the corpus, such as words and phrases, used in generating texts.
In an embodiment of the present invention, the text generation processing unit 113 is configured to read real-time code data from the code database 111, identify behavioral performance data, determine description phrases, and generate a text using the corpus database 112.
The server 110 then sends the generated text to the client 120. The client 120, as an application program of the media promoter, recommends and displays the text to the user and provides a social platform on which users can interact. The server 110 and the client 120 can be connected in a wired or wireless manner.
Fig. 2 is an exemplary flowchart of a text generation method according to an embodiment of the present invention. The method is applied to the server. As shown in fig. 2, the method may include the steps of:
Step 201, acquiring real-time code data for a theme, wherein the real-time code data is written in a machine language and carries content data under the theme.
In this step, the code database in the server obtains and stores real-time code data for a theme; the code data is written in a machine language and carries content data under the theme. The machine language is, for example, Java, PHP (Hypertext Preprocessor), ASP, or Ruby, each of which has its own set of programming rules.
For example, the theme may be a sports event, in which case the written code data includes content data for each event within it, such as match details and the scores of each athlete. As another example, the theme may be a singing competition, in which case the code data includes content data for each round, such as competition details and the scores of each singer. As yet another example, the theme may be a remote-controlled toy car tournament, in which case the code data includes content data for each track, such as the running speed, running time, and ranking of each toy car.
Step 202, identifying behavioral performance data of at least one object from the real-time code data.
This step converts machine-readable "code data" into "behavioral performance data" that the client user can read. Specifically, an object is a person, an animal, or a thing, that is, a participant under a given theme. For each object, the behavioral performance data includes one or more behaviors and the performance evaluation data corresponding to each behavior.
For example, when the theme is a sports event, the objects are athletes, and the behavioral performance data of each athlete corresponds to the details of that athlete's competition: the behaviors include each action of the athlete, and the performance evaluation data includes the score of each action, the judges' evaluation results, the total score, the medal result, and the like. Taking a diving report as an example, diving is a scored competition, and the behavioral performance data of each athlete comprises the technical details (that is, a plurality of behaviors), such as the board approach, platform run-up, take-off, height, degree of difficulty, position in the air, coordination, and water entry, together with the corresponding scores (that is, the performance evaluation data corresponding to each behavior).
To perform the identification, the server first sets a mapping relationship between the real-time code data and the object, behavior, and performance evaluation data according to the writing rules of the code, and then identifies the behavior and performance evaluation data of each object from the real-time code data according to the mapping relationship.
For example, taking Java as the machine language, the fields in the real-time code data are mapped to object, behavior, and performance evaluation data according to the grammar rules of Java: the field "object" is mapped to the object, the field "action" is mapped to the behavior, and the field "score" is mapped to the performance evaluation data.
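The field mapping described above can be illustrated with a minimal Python sketch. It assumes the real-time code data has already been parsed into dictionaries; the record layout, the name `FIELD_MAPPING`, the function `identify_performance`, and all sample values are hypothetical and are not taken from the patent.

```python
from collections import defaultdict

# Hypothetical mapping from code-data field names to the three roles named in
# the text. The field names follow the example above; a real feed may differ.
FIELD_MAPPING = {"object": "object", "action": "behavior", "score": "performance"}


def identify_performance(records, mapping=FIELD_MAPPING):
    """Group (behavior, performance evaluation) pairs by object."""
    # Invert the mapping so each role can be looked up by name.
    field_for = {role: field for field, role in mapping.items()}
    result = defaultdict(list)
    for record in records:
        obj = record[field_for["object"]]
        behavior = record[field_for["behavior"]]
        performance = record[field_for["performance"]]
        result[obj].append((behavior, performance))
    return dict(result)


# Illustrative records only; the values are invented for this example.
records = [
    {"object": "Athlete A", "action": "round 1 dive", "score": 51.0},
    {"object": "Athlete A", "action": "round 2 dive", "score": 49.8},
    {"object": "Athlete B", "action": "round 1 dive", "score": 50.4},
]
performance_data = identify_performance(records)
```

Keeping the mapping in a separate table means that a feed with different field names only requires a different mapping, not different identification logic.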
Table 1 lists the behavioral performance data identified according to an embodiment of the invention. The theme is the synchronized 3-meter springboard final at the Rio Olympic Games. The server identifies two athletes as the objects, Wu Minxia and Shi Tingmao, aged 31 and 24 respectively; the behaviors comprise five rounds of dives; and the performance evaluation data comprises the score and ranking for each round, the final total score, and the medal result.
[Table 1 is reproduced as images in the original publication: Figure BDA0001135560630000041 and Figure BDA0001135560630000051.]
Table 1. Identified behavioral performance data
Step 203, determining at least one description phrase according to the behavior data of at least one object.
This step associates the "behavioral performance data" with "description phrases". Specifically, there are three methods for determining a description phrase.
Method one: compare the behavioral performance data of a plurality of objects under the theme, and select at least one description phrase matching the comparison result from preset description phrases.
This method makes a horizontal comparison among multiple objects competing in the same event under the same theme. For example, the theme of Olympic diving comprises a total of 8 events: women's 3-meter springboard, men's 3-meter springboard, women's synchronized 3-meter springboard, men's synchronized 3-meter springboard, women's 10-meter platform, men's 10-meter platform, women's synchronized 10-meter platform, and men's synchronized 10-meter platform. For a given event, the data of the athletes are compared side by side, and description phrases are derived from the comparison result.
In an embodiment, the result of comparing the behavioral performance data of two objects is greater than, equal to, or less than, and the server presets a plurality of description phrases for each result. Table 2 shows the preset description phrases based on comparison with historical performance data according to an embodiment of the present invention.
[Table 2 is reproduced as images in the original publication: Figure BDA0001135560630000052 and Figure BDA0001135560630000061.]
Table 2. Preset description phrases based on comparison with historical performance data
For example, in the men's 400-meter freestyle final at the Rio Olympic Games, the athlete Sun Yang finished in 3 minutes 41.68 seconds and the athlete Horton finished in 3 minutes 41.55 seconds. Comparing the two results, Sun Yang finished slightly behind Horton, so the server can determine that the corresponding description phrases are "slightly behind", "lost to the opponent", and "regrettably".
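As a rough illustration of method one, the following Python sketch classifies one swimmer's time against another's and looks up preset phrases. The phrase table is hypothetical: only the "slightly_worse" entries are English renderings of the phrases in the example above, while the other entries, the 0.5-second margin, and the function names are assumptions made for this sketch.

```python
# Hypothetical phrase table keyed by comparison result. Only the
# "slightly_worse" phrases are renderings of those in the example above;
# everything else here is a placeholder.
PRESET_PHRASES = {
    "better": ["ahead of the opponent"],
    "equal": ["level with the opponent"],
    "slightly_worse": ["slightly behind", "lost to the opponent", "regrettably"],
    "worse": ["well behind"],
}


def to_seconds(minutes, seconds):
    """Convert a race time given as minutes and seconds into seconds."""
    return minutes * 60 + seconds


def compare_times(own, other, close_margin=0.5):
    """Classify a race time against an opponent's time (lower is better)."""
    if own == other:
        return "equal"
    if own < other:
        return "better"
    # The margin separating "slightly" from clearly worse is an assumption.
    return "slightly_worse" if own - other <= close_margin else "worse"


def phrases_for(own, other):
    """Return the preset phrases matching the comparison result."""
    return PRESET_PHRASES[compare_times(own, other)]


first = to_seconds(3, 41.68)   # the slower of the two times quoted above
second = to_seconds(3, 41.55)  # the faster of the two times quoted above
selected = phrases_for(first, second)
```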
Method two: for each object, compare the current real-time behavioral performance data longitudinally with the historical performance data.
Specifically, FIG. 3 is an exemplary flowchart for determining description phrases in accordance with one embodiment of the present invention. As shown in FIG. 3, the method comprises the following steps:
Step 2031, for each object, obtaining historical performance data of the object.
In the case of a sporting event, the historical performance data may include the athlete's past results, world rankings, project adjustments, and the like.
Step 2032, comparing the behavioral performance data of the object with the historical performance data for each of a plurality of data types.
This step uses a classified comparison: the data is divided into a plurality of data types according to its attributes, such as the venue of the competition, the age of the object, the actions in each stage, the scores, and the rankings.
Step 2033, selecting the comparison results that have display value from the comparison results of the plurality of data types.
Considering whether the finally generated text content has display value for the user, the comparison results with display value are screened out from the comparison results of the plurality of data types. One screening method is to score each comparison result according to its display value, sort the comparison results by score, and select from them a number of comparison results that have display value.
For example, for the athlete Sun Yang, comparing the venues of the freestyle events he took part in, such as the "Beijing Olympic Games" versus the "Rio Olympic Games", has no reporting value, so this comparison result is deemed to have no display value and is scored 0. As another example, comparing Sun Yang's speed over each 50 m segment of a freestyle race reveals gaps in performance; the larger the gap, the higher the score and the greater the reporting value, so this comparison result is selected as having display value.
Step 2034, selecting at least one description phrase matching the comparison results that have display value from the preset description phrases.
As in method one, the result of comparing the behavioral performance data with the historical performance data can be greater than, equal to, or less than, so that at least one matching description phrase can be selected from the preset description phrases.
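Steps 2032 and 2033 can be sketched in Python as follows. The scoring rule (zero for the venue, the absolute gap for numeric data types), the data-type names, and all numbers are assumptions chosen for illustration; the patent does not specify how display value is scored.

```python
def display_value(data_type, current, historical):
    """Hypothetical display-value score for one data type.

    The venue is scored 0, as in the example above; for numeric data types
    the absolute gap is used, so a larger gap yields a higher score.
    """
    if data_type == "venue":
        return 0.0
    return abs(current - historical)


def select_comparisons(current, historical, top_n=2):
    """Score each shared data type, sort by score, and keep the top results."""
    scored = []
    for data_type, value in current.items():
        if data_type not in historical:
            continue
        score = display_value(data_type, value, historical[data_type])
        if score > 0:  # drop comparisons with no display value
            scored.append((score, data_type))
    scored.sort(reverse=True)
    return [data_type for _, data_type in scored[:top_n]]


# Invented values, for illustration only.
current = {"venue": "Rio", "split_50m": 26.5, "final_time": 221.0}
historical = {"venue": "Beijing", "split_50m": 26.0, "final_time": 219.0}
selected_types = select_comparisons(current, historical)
```

Step 2034 would then look up preset phrases for each selected data type, in the same way as for method one.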
Method three: for each object, compare the current real-time behavioral performance data with performance expectation data.
Specifically, for each object, the performance expectation data of the object is acquired; the behavioral performance data of the object is compared with the performance expectation data, and at least one description phrase matching the comparison result is selected from a preset description phrase group.
Table 3 shows the preset description phrases based on comparison with performance expectation data according to an embodiment of the present invention. For example, for Sun Yang's performance in the men's 400 m freestyle final, because he finished behind Horton, the result is judged against the pre-final expectation as "regrettable" rather than as having "exceeded expectations".
[Table 3 is reproduced as an image in the original publication: Figure BDA0001135560630000071.]
Table 3. Preset description phrases based on comparison with performance expectation data
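Method three can be sketched in the same style. Because the contents of Table 3 are not reproduced in this text, the phrase groups below are placeholders, and expressing the expectation as an expected ranking is an assumption made for this sketch.

```python
# Placeholder phrase groups; the contents of Table 3 are not available here.
EXPECTATION_PHRASES = {
    "below": ["regrettably"],
    "met": ["as expected"],
    "above": ["exceeded expectations"],
}


def compare_with_expectation(actual_rank, expected_rank):
    """Classify a result against the expectation (rank 1 is best)."""
    if actual_rank > expected_rank:
        return "below"
    if actual_rank < expected_rank:
        return "above"
    return "met"


# Hypothetical case: an athlete expected to win finishes second.
outcome = compare_with_expectation(actual_rank=2, expected_rank=1)
expectation_phrases = EXPECTATION_PHRASES[outcome]
```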
Step 204, generating a text of the subject according to the behavior data of the at least one object and the at least one description phrase.
This step extends the scattered "behavioral performance data" and "description phrases" into a complete "text". Specifically, a linking word is selected for each description phrase from a preset corpus database; the behavioral performance data of the at least one object, the linking words, and the description phrases are connected into at least one short sentence; the at least one short sentence is combined into at least one paragraph; and the at least one paragraph is connected to obtain the text.
The linking words serve to open, carry forward, turn, and close the narrative, and specifically include contextual transition words, connective words of tone, logical connective words, background introductions drawn from historical performance data, and the like. For example, for the men's 400 m freestyle final and the object "Sun Yang", a plurality of description phrases are determined: "regrettably failed to defend the title", "has always carried the high hopes of the Chinese public", and "lost to opponent Horton". For "regrettably failed to defend the title", the linking word is determined to be the coordinating phrase "finished as runner-up"; for "has always carried the high hopes of the Chinese public", the linking word is determined to be the reason "as the first Chinese male swimmer to win an Olympic gold medal and the defending Olympic champion in this event"; for "lost to opponent Horton", the linking word is determined to be the adversative phrase "but in the end".
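The assembly described in this step (linking word, then short sentence, then paragraph, then text) can be sketched as below. The corpus lookup table, the sentence layout, and the placeholder athlete names are assumptions for illustration; the patent does not describe the corpus database at this level of detail.

```python
# Hypothetical corpus: maps a description phrase to its linking word.
CORPUS = {
    "lost to the opponent": "but in the end",
    "took the title": "and",
}


def build_sentence(obj, performance, phrase, corpus=CORPUS):
    """Connect performance data, a linking word, and a phrase into a sentence."""
    link = corpus.get(phrase, "")
    parts = [obj, performance, link, phrase]
    # Skip empty parts so a missing linking word leaves no double space.
    return " ".join(part for part in parts if part) + "."


def build_text(paragraphs):
    """Join sentences into paragraphs and paragraphs into the final text."""
    return "\n\n".join(" ".join(sentences) for sentences in paragraphs)


sentence_a = build_sentence("Athlete A", "finished in 3:41.68", "lost to the opponent")
sentence_b = build_sentence("Athlete B", "finished in 3:41.55", "took the title")
text = build_text([[sentence_b], [sentence_a]])
```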
FIG. 4 is a diagram illustrating a generated text according to an embodiment of the invention. The generated text describes a number of objects under the theme of the men's 400 m freestyle final, including the athletes Horton, Sun Yang, Qiubao, Gai, Debyler, and Daiti. The generated text includes three paragraphs, each of which includes athletes' performance and ranking together with description phrases, which are identified by underlining.
In this embodiment, real-time code data for a theme is acquired, behavioral performance data of at least one object is identified from the real-time code data, at least one description phrase is determined according to the behavioral performance data of the at least one object, and a text for the theme is generated according to the behavioral performance data of the at least one object and the at least one description phrase. Because the method connects directly to the code database rather than relying on a live broadcast system, it removes the dependence on manually written live text reports found in the prior art. All the technical details of a match and the related performance evaluation data can be turned into a vivid and humanized textual account of the match that is fast to produce, rich in information, and readable, so that machine-written reports become genuinely humanized.
Based on the text generation method provided in the above embodiment, the machine can, through its own learning and algorithms, independently achieve humanized expression in machine-written reports. The report text produced by this humanized expression technique passes the Turing test (that is, if a computer answers a series of questions posed by a human tester within 5 minutes and more than 30% of its answers are mistaken for those of a human, the computer passes the test), and the quality of the manuscripts does not differ from that of manually written reports.
In addition, consider the alternative method of splicing together match reports by converting live television commentary into text through speech recognition. Because of the technical limitations of speech-to-text conversion, that method has a quite high conversion error rate and cannot be applied at scale.
Fig. 5 is an exemplary flowchart of a text generation method according to another embodiment of the present invention. The method is applied to the server. As shown in fig. 5, the method may include the steps of:
Step 501, acquiring real-time code data for a theme, wherein the real-time code data is written in a machine language and carries content data under the theme.
Step 502, identifying the behavioral performance data of at least one object from the real-time code data according to the mapping relationship, wherein the behavioral performance data comprises one or more behaviors and performance judgment data corresponding to each behavior.
Referring to the description of step 202, the server first sets a mapping relationship between the real-time code data and the object, behavior, and performance evaluation data according to the writing rule of the code, and then performs recognition according to the mapping relationship.
FIG. 6 is a diagram illustrating a generated text according to another embodiment of the present invention. The interface 600 shown in FIG. 6 presents a generated text: box 610 gives the title, "First Olympic diving gold! Wu Minxia/Shi Tingmao"; box 620 gives the promoter of the text, "Tencent Sports"; and box 630 gives the body of the text, which includes all the behavioral performance data given in Table 1 above, identified by underlining.
Step 503, determining at least one description phrase according to the behavior performance data of at least one object, the historical performance data of each object and the performance expectation data.
This step combines the three methods for determining the descriptive phrases given in step 203, and will not be described herein again.
Step 504, based on the corpus database, generating a text for the theme according to the behavioral performance data of the at least one object and the at least one description phrase.
As described in step 204, when generating a text, a linking word is selected for each description phrase from a preset corpus database; the behavioral performance data of the at least one object, the linking words, and the description phrases are connected into at least one short sentence; the at least one short sentence is combined into at least one paragraph; and the at least one paragraph is connected to obtain the text.
Considering that texts may have various styles, when at least one short sentence is combined into at least one paragraph, a plurality of types of paragraph templates and a word count limit of each paragraph template are preset. These different types of paragraph templates constitute different styles of text. For example, types of paragraph templates include abstract, background introduction, detailed text, overview, appendix, and the like.
And for each paragraph template, determining at least one short sentence matched with the paragraph template, and combining the determined at least one short sentence to obtain a paragraph, wherein the number of words of the paragraph does not exceed the word number limit of the paragraph template.
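A paragraph template with a word limit might be modeled as in the following Python sketch. How a short sentence is matched to a template is not specified in the text, so the sketch assumes each sentence carries a tag naming its template type; counting words by whitespace is likewise an assumption (a Chinese-language system would more likely count characters).

```python
from dataclasses import dataclass


@dataclass
class ParagraphTemplate:
    kind: str        # e.g. "abstract", "background introduction", "detailed text"
    word_limit: int  # maximum number of words allowed in the paragraph


def fill_paragraph(template, tagged_sentences):
    """Combine the sentences tagged for this template within its word limit."""
    chosen, used = [], 0
    for kind, sentence in tagged_sentences:
        if kind != template.kind:
            continue  # sentence belongs to another template type
        count = len(sentence.split())
        if used + count > template.word_limit:
            break  # adding this sentence would exceed the limit
        chosen.append(sentence)
        used += count
    return " ".join(chosen)


# Invented sentences, for illustration only.
sentences = [
    ("abstract", "A wins the final."),
    ("detailed text", "A led from the first turn."),
    ("abstract", "B finishes second overall."),
    ("abstract", "C is third."),
]
abstract = fill_paragraph(ParagraphTemplate("abstract", word_limit=8), sentences)
```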
Step 505, performing keyword review on the generated text.
The review includes keyword review, and manuscripts with a higher risk weighting level can be submitted to a manual review window for review.
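One possible shape for the keyword review is sketched below. The keyword list, the weights, and the threshold for routing a manuscript to manual review are all hypothetical; the text only states that manuscripts with a higher risk weighting level can be sent for manual review.

```python
# Hypothetical risk keywords and weights.
RISK_KEYWORDS = {"doping": 3, "injury": 1, "protest": 2}


def review(text, keywords=RISK_KEYWORDS, manual_threshold=3):
    """Return a routing decision and the risk keywords found in the text."""
    lowered = text.lower()
    hits = [word for word in keywords if word in lowered]
    risk = sum(keywords[word] for word in hits)
    decision = "manual_review" if risk >= manual_threshold else "publish"
    return decision, hits


clean = review("The athlete won gold.")
flagged = review("A protest followed the injury.")
```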
Step 506, sending the reviewed text to the client for display.
FIG. 7 is a diagram of displayed text according to an embodiment of the invention. The display interface 700 of the client recommends a report on a sporting event to the user. The title is "Zhang Mengxue wins the first gold of the Rio Olympics for the Chinese team", the promoter is "Tencent Sports", the date is "2016-08-07", and the publication time of the report is "22:23". A "comment" option (see 721) and a "share" option (see 722) are provided for the user to interact on the social platform. Block 730 gives the abstract of the report; block 740, "game focus", gives the highlights of the report; block 750, "highlight playback", gives the detailed text of the report; and block 760, "player material", gives the appendix of the report.
In this embodiment, at least one description phrase is determined according to the behavioral performance data of at least one object together with the historical performance data and the performance expectation data of each object, so that the behavioral performance data can be associated with a plurality of description phrases. This enriches the content of the text and further improves its information content and readability. In addition, when paragraphs are combined, setting different types of paragraph templates allows a text style to be selected according to the match result, so that a variety of vivid and humanized texts can be output for the user to browse and read.
Fig. 8 is a schematic structural diagram of a server according to an embodiment of the present invention. As shown in fig. 8, the server 800 includes:
an acquiring module 810, configured to acquire real-time code data for a theme, where the real-time code data is written in a machine language and carries content data under the theme;
an identifying module 820, configured to identify behavioral performance data of at least one object from the real-time code data obtained by the acquiring module 810;
a determining module 830, configured to determine at least one description phrase according to the behavioral performance data of the at least one object obtained by the identifying module 820; and
a generating module 840, configured to generate a text for the theme according to the behavioral performance data of the at least one object obtained by the identifying module 820 and the at least one description phrase determined by the determining module 830.
In an embodiment, the server 800 further comprises:
a setting module 850, configured to set a mapping relationship between the real-time code data and the object, behavior, and performance evaluation data according to a writing rule of the code;
wherein the behavioral performance data includes one or more behaviors and performance evaluation data corresponding to each behavior, and the identifying module 820 is configured to identify the behavior and the performance evaluation data of each object according to the mapping relationship set by the setting module 850.
In an embodiment, the determining module 830 is configured to: for each object, obtain historical performance data of the object; compare the behavioral performance data of the object with the historical performance data for each of a plurality of data types; screen out the comparison results that have display value from the comparison results of the plurality of data types; and select at least one description phrase matching the comparison results that have display value from a preset description phrase group.
In one embodiment, the determining module 830 is configured to: for each object, obtain performance expectation data of the object; compare the behavioral performance data of the object with the performance expectation data; and select at least one description phrase matching the comparison result from a preset description phrase group.
In an embodiment, the generating module 840 is configured to: select a linking word for each description phrase from a preset corpus database; connect the behavioral performance data of the at least one object, the linking words, and the description phrases into at least one short sentence; combine the at least one short sentence into at least one paragraph; and connect the at least one paragraph to obtain the text.
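The way the four modules of the server hand data to one another can be summarized in a short Python sketch. The class name, the callable interfaces, and the stub implementations are hypothetical and greatly simplified; they only show the order of the calls, not how any module is actually implemented.

```python
class TextGenerationServer:
    """Chains four modules: acquire -> identify -> determine -> generate."""

    def __init__(self, acquire, identify, determine, generate):
        self.acquire = acquire
        self.identify = identify
        self.determine = determine
        self.generate = generate

    def run(self, theme):
        code_data = self.acquire(theme)
        performance = self.identify(code_data)
        phrases = self.determine(performance)
        return self.generate(performance, phrases)


def acquire(theme):
    # Stand-in for reading the code database; the record is invented.
    return [{"object": "Athlete A", "action": "final", "score": 2}]


def identify(code_data):
    # Map each object to its (behavior, performance evaluation) pair.
    return {r["object"]: (r["action"], r["score"]) for r in code_data}


def determine(performance):
    # Placeholder rule: any ranking other than 1 is described as "regrettably".
    return {
        obj: ("as expected" if rank == 1 else "regrettably")
        for obj, (_, rank) in performance.items()
    }


def generate(performance, phrases):
    return " ".join(
        f"{obj} placed {rank} in the {action}, {phrases[obj]}."
        for obj, (action, rank) in performance.items()
    )


server = TextGenerationServer(acquire, identify, determine, generate)
report = server.run("men's 400 m freestyle final")
```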
Fig. 9 is a schematic structural diagram of a server according to another embodiment of the present invention. The server 900 may include a processor 910, a memory 920, a port 930, and a bus 940. The processor 910 and the memory 920 are interconnected by the bus 940. The processor 910 may receive and transmit data through the port 930. Wherein,
the processor 910 is configured to execute the machine-readable instruction modules stored in the memory 920.
The memory 920 stores machine-readable instruction modules executable by the processor 910. The instruction modules executable by the processor 910 include: an obtaining module 921, an identifying module 922, a determining module 923, and a generating module 924. Wherein,
the obtaining module 921, when executed by the processor 910, may: obtain real-time code data for a theme, where the real-time code data is written in a machine language and carries content data under the theme.
The identifying module 922, when executed by the processor 910, may: identify behavior performance data of at least one object from the real-time code data obtained by the obtaining module 921.
The determining module 923, when executed by the processor 910, may: determine at least one description phrase according to the behavior performance data of the at least one object obtained by the identifying module 922.
The generating module 924, when executed by the processor 910, may: generate a text of the theme according to the behavior performance data of the at least one object obtained by the identifying module 922 and the at least one description phrase determined by the determining module 923.
In one embodiment, the instruction modules executable by the processor 910 further include a setting module 925. Wherein,
the setting module 925, when executed by the processor 910, may: set a mapping relationship between the real-time code data and object, behavior, and performance evaluation data according to a writing rule of the code;
where the behavior performance data includes one or more behaviors and performance evaluation data corresponding to each behavior, the identifying module 922, when executed by the processor 910, may: identify the behavior and the performance evaluation data of each object according to the mapping relationship set by the setting module 925.
It can be seen that the instruction modules stored in the memory 920, when executed by the processor 910, can implement the functions of the obtaining module, the identifying module, the determining module, the generating module, and the setting module in the foregoing embodiments.
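How the instruction modules cooperate can be sketched as a simple pipeline. The class name `TextGenerationServer`, the stub functions, and the sample record are hypothetical and serve only to show the order in which the obtaining, identifying, determining, and generating steps are chained.

```python
class TextGenerationServer:
    """Chains four instruction modules in the order described above."""

    def __init__(self, obtain, identify, determine, generate):
        self.obtain = obtain
        self.identify = identify
        self.determine = determine
        self.generate = generate

    def run(self, theme):
        code_data = self.obtain(theme)            # obtaining module
        performance = self.identify(code_data)    # identifying module
        phrases = self.determine(performance)     # determining module
        return self.generate(performance, phrases)  # generating module


# Stub modules with made-up data, standing in for real implementations.
def obtain(theme):
    return [{"object": "Player A", "behavior": "points", "value": 31}]


def identify(code_data):
    performance = {}
    for record in code_data:
        performance.setdefault(record["object"], {})[record["behavior"]] = record["value"]
    return performance


def determine(performance):
    return {obj: "had a strong night" for obj in performance}


def generate(performance, phrases):
    sentences = []
    for obj, behaviors in performance.items():
        stats = ", ".join(f"{value} {name}" for name, value in behaviors.items())
        sentences.append(f"{obj} {phrases[obj]} with {stats}.")
    return " ".join(sentences)


server = TextGenerationServer(obtain, identify, determine, generate)
report = server.run("game-1")
```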
In the above device and system embodiments, the specific method for each module and unit to implement its own function is described in the method embodiment, and is not described here again.
In addition, functional modules in the embodiments of the present invention may be integrated into one processing unit, or each module may exist alone physically, or two or more modules are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
In addition, each of the embodiments of the present invention can be realized by a data processing program executed by a data processing apparatus such as a computer. It is clear that the data processing program constitutes the invention. Further, the data processing program, which is generally stored in one storage medium, is executed by directly reading the program out of the storage medium or by installing or copying the program into a storage device (such as a hard disk and/or a memory) of the data processing device. Such a storage medium therefore also constitutes the present invention. The storage medium may use any type of recording means, such as a paper storage medium (e.g., paper tape, etc.), a magnetic storage medium (e.g., a flexible disk, a hard disk, a flash memory, etc.), an optical storage medium (e.g., a CD-ROM, etc.), a magneto-optical storage medium (e.g., an MO, etc.), and the like.
The invention therefore also discloses a storage medium in which a data processing program is stored which is designed to carry out any one of the embodiments of the method according to the invention described above.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (14)

1. A text generation method, comprising:
acquiring real-time code data for a theme, wherein the real-time code data is written in a machine language and carries content data under the theme;
identifying behavior performance data of at least one object from the real-time code data;
determining at least one description phrase according to the behavior performance data of the at least one object; and
generating a text of the theme according to the behavior performance data of the at least one object and the at least one description phrase;
wherein the method further comprises:
setting a mapping relationship between the real-time code data and object, behavior, and performance evaluation data according to a writing rule of the code;
wherein the behavior performance data includes one or more behaviors and performance evaluation data corresponding to each behavior, and the identifying of the behavior performance data of at least one object from the real-time code data includes:
identifying the behavior and the performance evaluation data of each object according to the mapping relationship.
2. The method of claim 1, wherein the setting of the mapping relationship between the real-time code data and the object, behavior, and performance evaluation data according to the writing rule of the code comprises:
mapping a plurality of fields in the real-time code data to object, behavior, and performance evaluation data according to a grammatical rule of the machine language.
3. The method of claim 1, wherein the determining of at least one description phrase according to the behavior performance data of the at least one object comprises:
comparing the behavior performance data of a plurality of objects under the theme, and selecting, from a preset description phrase group, at least one description phrase matching the comparison result.
4. The method of claim 1, wherein the determining of at least one description phrase according to the behavior performance data of the at least one object comprises:
for each object, acquiring historical performance data of the object;
comparing the behavior performance data of the object with the historical performance data for each of a plurality of data types, screening out, from the comparison results for the plurality of data types, the comparison results that have display value, and selecting, from a preset description phrase group, at least one description phrase matching the comparison results having display value.
5. The method of claim 1, wherein the determining of at least one description phrase according to the behavior performance data of the at least one object comprises:
for each object, acquiring performance expectation data of the object;
comparing the behavior performance data of the object with the performance expectation data, and selecting, from a preset description phrase group, the at least one description phrase matching the comparison result.
6. The method of any one of claims 1 to 5, wherein the generating of a text of the theme according to the behavior performance data of the at least one object and the at least one description phrase comprises:
selecting, from a preset corpus database, a conjunction for each description phrase;
connecting the behavior performance data of the at least one object, the conjunctions, and the at least one description phrase into at least one short sentence;
combining the at least one short sentence into at least one paragraph, and connecting the at least one paragraph to obtain the text.
7. The method of claim 6, further comprising:
presetting a plurality of types of paragraph templates and a word count limit for each paragraph template;
wherein the combining of the at least one short sentence into at least one paragraph comprises:
for each paragraph template, determining at least one short sentence matching the paragraph template, and combining the determined at least one short sentence to obtain a paragraph, wherein the word count of the paragraph does not exceed the word count limit of the paragraph template.
8. A server, comprising:
an acquisition module, configured to acquire real-time code data for a theme, wherein the real-time code data is written in a machine language and carries content data under the theme;
an identifying module, configured to identify behavior performance data of at least one object from the real-time code data obtained by the acquisition module;
a determining module, configured to determine at least one description phrase according to the behavior performance data of the at least one object obtained by the identifying module; and
a generating module, configured to generate a text of the theme according to the behavior performance data of the at least one object obtained by the identifying module and the at least one description phrase determined by the determining module;
wherein the server further comprises:
a setting module, configured to set a mapping relationship between the real-time code data and object, behavior, and performance evaluation data according to a writing rule of the code;
wherein the behavior performance data comprises one or more behaviors and performance evaluation data corresponding to each behavior, and the identifying module is configured to identify the behavior and the performance evaluation data of each object according to the mapping relationship set by the setting module.
9. The server of claim 8, wherein the setting module is configured to map a plurality of fields in the real-time code data to object, behavior, and performance evaluation data according to a grammatical rule of the machine language.
10. The server of claim 8, wherein the determining module is configured to: for each object, obtain historical performance data of the object; compare the behavior performance data of the object with the historical performance data for each of a plurality of data types; screen out, from the comparison results for the plurality of data types, the comparison results that have display value; and select, from a preset description phrase group, at least one description phrase matching the comparison results having display value.
11. The server of claim 8, wherein the determining module is configured to: for each object, obtain performance expectation data of the object; compare the behavior performance data of the object with the performance expectation data; and select, from a preset description phrase group, the at least one description phrase matching the comparison result.
12. The server according to any one of claims 8 to 11, wherein the generating module is configured to: select, from a preset corpus database, a conjunction for each description phrase; connect the behavior performance data of the at least one object, the conjunctions, and the at least one description phrase into at least one short sentence; combine the at least one short sentence into at least one paragraph; and connect the at least one paragraph to obtain the text.
13. A server comprising a memory and a processor, the memory having stored therein computer-readable instructions which, when executed by the processor, implement the method of any one of claims 1 to 7.
14. A computer-readable storage medium having computer-readable instructions stored thereon which, when executed by at least one processor, implement the method of any one of claims 1 to 7.
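By way of illustration only, the paragraph-template step recited in claim 7 might be sketched as follows. The template names, the limits, and the greedy selection are assumptions; in particular, the sketch counts whitespace-separated words, whereas a Chinese-language implementation would more likely count characters.

```python
# Hypothetical paragraph templates: template type -> word count limit.
TEMPLATES = {"summary": 12, "player": 8}


def fill_paragraph(template_type, sentences, word_limit):
    """Greedily add sentences of the matching type without exceeding the limit."""
    chosen = []
    used = 0
    for sentence_type, text in sentences:
        if sentence_type != template_type:
            continue
        words = len(text.split())
        if used + words > word_limit:
            continue  # this sentence would break the template's limit
        chosen.append(text)
        used += words
    return " ".join(chosen)


# Short sentences tagged with the template type they match (made-up data).
sentences = [
    ("summary", "Team X beat Team Y 101 to 95."),
    ("player", "Player A had 31 points."),
    ("summary", "The lead changed hands nine times."),
    ("player", "Player B added 8."),
]
paragraphs = {
    name: fill_paragraph(name, sentences, limit)
    for name, limit in TEMPLATES.items()
}
```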
CN201610920284.5A 2016-10-21 2016-10-21 Text generation method and server Active CN107977196B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201610920284.5A CN107977196B (en) 2016-10-21 2016-10-21 Text generation method and server
PCT/CN2017/101852 WO2018072577A1 (en) 2016-10-21 2017-09-15 Text generation method and server

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610920284.5A CN107977196B (en) 2016-10-21 2016-10-21 Text generation method and server

Publications (2)

Publication Number Publication Date
CN107977196A CN107977196A (en) 2018-05-01
CN107977196B true CN107977196B (en) 2020-11-20

Family

ID=62004560

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610920284.5A Active CN107977196B (en) 2016-10-21 2016-10-21 Text generation method and server

Country Status (1)

Country Link
CN (1) CN107977196B (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102929928A (en) * 2012-09-21 2013-02-13 北京格致璞科技有限公司 Multidimensional-similarity-based personalized news recommendation method

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101714145B (en) * 2008-10-07 2011-12-07 英业达股份有限公司 Website news analyzing system and method thereof
US8818932B2 (en) * 2011-02-14 2014-08-26 Decisive Analytics Corporation Method and apparatus for creating a predictive model
CN102508830A (en) * 2011-11-28 2012-06-20 北京工商大学 Method and system for extracting social network from news document
CN103186555B (en) * 2011-12-28 2016-05-11 腾讯科技(深圳)有限公司 Evaluation information generates method and system
CN103246710A (en) * 2013-04-22 2013-08-14 张经纶 Method and device for automatically generating multimedia travel notes
CN104239298B (en) * 2013-06-06 2018-10-30 腾讯科技(深圳)有限公司 Text message recommends method, server, browser and system
US20150254219A1 (en) * 2014-03-05 2015-09-10 Adincon Networks LTD Method and system for injecting content into existing computerized data

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
This News-Writing Bot Is Now Free for Everyone; KLINT FINLEY; https://www.wired.com/2015/10/this-news-writing-bot-is-now-free-for-everyone/; 20151020; full text *

Also Published As

Publication number Publication date
CN107977196A (en) 2018-05-01

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant