CN116956019A - Text generation method, text generation device, electronic equipment and computer readable storage medium - Google Patents


Info

Publication number
CN116956019A
CN116956019A
Authority
CN
China
Prior art keywords
text
written
sentence
sentences
game
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310532220.8A
Other languages
Chinese (zh)
Inventor
代勇
陈万顺
周聪
陈梓阳
陈凌
牛彦琦
张玉律
杜楠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN202310532220.8A
Publication of CN116956019A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • A: HUMAN NECESSITIES
    • A63: SPORTS; GAMES; AMUSEMENTS
    • A63F: CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F 13/00: Video games, i.e. games using an electronically generated display having two or more dimensions
    • A63F 13/40: Processing input control signals of video game devices, e.g. signals generated by the player or derived from the environment
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/24: Classification techniques
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00: Handling natural language data
    • G06F 40/20: Natural language analysis
    • G06F 40/279: Recognition of textual entities
    • G06F 40/289: Phrasal analysis, e.g. finite state techniques or chunking
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Machine Translation (AREA)

Abstract

Embodiments of this application provide a text generation method, a text generation device, an electronic device, and a computer-readable storage medium, applicable at least to the fields of artificial intelligence and game commentary. The method includes: in response to receiving an operation instruction for a current virtual scene, acquiring a pre-written text corresponding to the operation instruction; splitting the pre-written text into a plurality of pre-written sentences; performing a factuality judgment on each pre-written sentence to obtain a factuality judgment result for that sentence; performing text processing on the pre-written sentences based on the factuality judgment results to obtain a generated sentence corresponding to each pre-written sentence; and splicing all the generated sentences to obtain a scene commentary text suited to the current virtual scene. The generated scene commentary text can be applied to any game event and is diverse.

Description

Text generation method, text generation device, electronic equipment and computer readable storage medium
Technical Field
Embodiments of this application relate to the Internet field, and in particular to a text generation method, a text generation device, an electronic device, and a computer-readable storage medium.
Background
With the development of society, games have become an important part of people's lives; the ecosystem built around games enriches daily life to a large extent, promotes social and economic development, and contributes to individual happiness and overall social well-being. Game commentary is very important to a game: it not only lets players learn from one another, but also makes the game atmosphere livelier, greatly increases the interactivity and interest of the game, and improves user stickiness and retention. Mainstream game commentary today relies on live commentary, i.e., the host invites commentators to narrate matches live while an actual event is held. Live commentary places high demands on the commentator, who must understand the game deeply, be articulate, and deliver rich commentary content; as a result, manual commentary is very expensive. Moreover, most players cannot participate in such large events, which means they can neither enjoy the lively atmosphere and heightened interest that game commentary brings, nor learn and communicate through commentary of ordinary matches. How to make game commentary serve every event therefore becomes critical.
Game commentary in the related art is mainly based on manual writing, with commentary for the current game event triggered by rules and driven by data. However, such commentary is fixed: every time the same event is triggered, the commentary is identical, which strips game commentary of the diversity and interest it should have and makes it predictable and rather boring. Generating diverse commentary text, on the other hand, requires considerable manual intervention to modify the scripts. How to extend game commentary text to every event while maintaining its diversity and interest is therefore a problem to be solved.
Disclosure of Invention
Embodiments of this application provide a text generation method, a text generation device, an electronic device, and a computer-readable storage medium, applicable at least to the fields of artificial intelligence and game commentary; the generated scene commentary text can be applied to any game event and is diverse.
The technical solutions of the embodiments of this application are implemented as follows:
An embodiment of this application provides a text generation method, including: in response to receiving an operation instruction for a current virtual scene, acquiring a pre-written text corresponding to the operation instruction; splitting the pre-written text into a plurality of pre-written sentences; performing a factuality judgment on each pre-written sentence to obtain a factuality judgment result for that sentence; performing text processing on the pre-written sentences based on the factuality judgment results to obtain a generated sentence corresponding to each pre-written sentence; and splicing all the generated sentences to obtain a scene commentary text suited to the current virtual scene.
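The five claimed steps can be sketched end to end as a small pipeline. This is an illustrative reconstruction, not the patent's implementation: `is_factual`, `rewrite`, and `free_generate` stand in for the trained factuality model and the rewriting/generation steps described later.

```python
import re

def generate_commentary(prewritten, is_factual, rewrite, free_generate):
    """Sketch of the claimed pipeline: split -> judge -> process -> splice."""
    # Step 2: split the pre-written text into sentences at terminators.
    sentences = [s.strip() for s in re.split(r"[。！？.!?]", prewritten) if s.strip()]
    generated = []
    for s in sentences:
        # Steps 3-4: factual sentences are rewritten conservatively,
        # non-factual sentences are regenerated freely.
        generated.append(rewrite(s) if is_factual(s) else free_generate(s))
    # Step 5: splice all generated sentences into the commentary text.
    return " ".join(generated)
```

Passing trivial callables shows the control flow without any model in the loop.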
An embodiment of this application provides a text generation device, including: an acquisition module, configured to acquire, in response to receiving an operation instruction for a current virtual scene, a pre-written text corresponding to the operation instruction; a text splitting module, configured to split the pre-written text into a plurality of pre-written sentences; a factuality judgment module, configured to perform a factuality judgment on each pre-written sentence to obtain a factuality judgment result for that sentence; a text processing module, configured to perform text processing on the pre-written sentences based on the factuality judgment results to obtain a generated sentence corresponding to each pre-written sentence; and a text splicing module, configured to splice all the generated sentences to obtain a scene commentary text suited to the current virtual scene.
In some embodiments, the acquisition module is further configured to: acquire an operation instruction for the current virtual scene; if there is one operation instruction, determine the event identifier of the virtual scene event corresponding to that instruction, and acquire the pre-written text corresponding to the event identifier from a preset script library; if there are multiple operation instructions, acquire the priority of each instruction, determine the event identifier of the virtual scene event corresponding to the highest-priority instruction, and acquire the pre-written text corresponding to that event identifier from the preset script library.
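The instruction-selection logic can be sketched as follows. The tuple shape and the name `script_library` are illustrative assumptions; the patent only specifies that the highest-priority instruction's event wins when several instructions arrive.

```python
def select_prewritten_text(instructions, script_library):
    """Pick the pre-written text for the received instruction(s).

    instructions: list of (event_id, priority) pairs; script_library maps
    event_id -> pre-written text. With a single instruction its event is
    used directly; with several, the highest-priority one is chosen.
    """
    if not instructions:
        return None  # nothing to comment on
    event_id, _ = max(instructions, key=lambda pair: pair[1])
    return script_library.get(event_id)
```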
In some embodiments, each event identifier corresponds to a plurality of pre-written texts in the preset script library; the acquisition module is further configured to: acquire the plurality of pre-written texts corresponding to the event identifier from the preset script library, and randomly select one of them as the pre-written text corresponding to the operation instruction.
In some embodiments, the text splitting module is further configured to: perform symbol recognition on the pre-written text to obtain at least one symbol identifier in the pre-written text; determine, from the at least one symbol identifier, those of the same type as preset symbol identifiers as target symbol identifiers; and split the pre-written text into a plurality of pre-written sentences based on the target symbol identifiers.
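A minimal sketch of symbol-based splitting. The concrete punctuation set is an assumption; the patent only requires that recognized symbols be matched against preset symbol identifiers before splitting.

```python
import re

# Punctuation treated as sentence-ending symbol identifiers (assumed set,
# covering both Chinese and ASCII terminators).
PRESET_SYMBOLS = "。！？；.!?;"

def split_prewritten_text(text):
    """Split a pre-written text into pre-written sentences at the target
    symbol identifiers, i.e. those matching the preset set."""
    pattern = "[" + re.escape(PRESET_SYMBOLS) + "]"
    return [s.strip() for s in re.split(pattern, text) if s.strip()]
```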
In some embodiments, the factuality judgment is a factuality classification; the factuality judgment module is further configured to: perform the factuality classification on each pre-written sentence through a pre-trained factuality judgment model to obtain the factuality judgment result of that sentence, where the result is either a first-type judgment result, indicating that the pre-written sentence is a factual sentence, or a second-type judgment result, indicating that the pre-written sentence is a non-factual sentence.
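The binary interface of the judgment model can be illustrated with a toy stand-in. The keyword heuristic below is purely illustrative; the patent's model is a trained classifier, not a regex.

```python
import re

FIRST_TYPE, SECOND_TYPE = "factual", "non-factual"

# Toy stand-in for the pre-trained factuality judgment model: sentences
# mentioning concrete game entities or numbers count as factual.
_FACT_MARKERS = re.compile(r"\d|kill|tower|dragon|gold|baron")

def judge_factuality(sentence):
    """Return the first-type or second-type judgment result."""
    return FIRST_TYPE if _FACT_MARKERS.search(sentence.lower()) else SECOND_TYPE
```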
In some embodiments, the text processing module is further configured to: in response to the factuality judgment result being the first-type judgment result, rewrite the pre-written sentence to obtain a rewritten sentence corresponding to it; in response to the factuality judgment result being the second-type judgment result, perform free sentence generation on the pre-written sentence to obtain a freely generated sentence corresponding to it. The rewritten sentences and the freely generated sentences together constitute the generated sentences.
In some embodiments, the text processing module is further configured to: in response to the factuality judgment result being the first-type judgment result, perform word segmentation on the pre-written sentence to obtain at least one factual-sentence token; perform part-of-speech recognition on each token to obtain its part of speech; determine, based on the part of speech of each token, the keywords and non-keywords of the pre-written sentence; and rewrite the non-keywords in the pre-written sentence to obtain the rewritten sentence corresponding to it.
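The keyword-preserving rewrite can be sketched on pre-tokenized input. The choice of which parts of speech count as keywords, and the synonym table, are assumptions; the patent leaves both open.

```python
# Parts of speech treated as fact-bearing keywords (assumed criterion).
KEYWORD_POS = {"NOUN", "PROPN", "VERB", "NUM"}

def rewrite_factual_sentence(tokens, pos_tags, synonyms):
    """Rewrite only the non-keywords of a factual sentence.

    tokens and pos_tags are aligned lists from the word segmentation and
    part-of-speech steps; synonyms maps non-keywords to replacements.
    """
    out = []
    for tok, pos in zip(tokens, pos_tags):
        if pos in KEYWORD_POS:
            out.append(tok)                     # keywords: keep the facts intact
        else:
            out.append(synonyms.get(tok, tok))  # non-keywords: may be varied
    return " ".join(out)
```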
In some embodiments, the text processing module is further configured to: in response to the factuality judgment result being the first-type judgment result, analyze the pre-written sentence to obtain its sentence pattern type; determine a modified sentence pattern type corresponding to that type; and modify the pre-written sentence according to the modified sentence pattern type to obtain the rewritten sentence corresponding to it.
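One possible sentence-pattern modification, shown on punctuation-marked English sentences. The declarative/exclamatory mapping is an assumption; the patent only requires that a modified type correspond to each detected type.

```python
# Mapping from a detected sentence-pattern type to its modified type
# (assumed mapping for illustration).
PATTERN_MAP = {"declarative": "exclamatory", "exclamatory": "declarative"}

def modify_sentence_pattern(sentence):
    """Detect the pattern, look up the modified type, and rewrite."""
    detected = "exclamatory" if sentence.endswith("!") else "declarative"
    target = PATTERN_MAP[detected]
    body = sentence.rstrip("!.")
    return body + ("!" if target == "exclamatory" else ".")
```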
In some embodiments, the text processing module is further configured to: in response to the factuality judgment result being the second-type judgment result, mask the non-factual sentences in the pre-written text to obtain a sentence-masked pre-written text; input the sentence-masked pre-written text into a pre-trained text generation model, which predicts the sentences at the masked positions based on the sentences at the unmasked positions, obtaining generated sentences at the masked positions; and take the generated sentence at each masked position as the freely generated sentence corresponding to the pre-written sentence.
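The mask-and-infill flow can be sketched as two small functions, with `generate_fn` standing in for the pre-trained text generation model; the `<mask>` token and the callable interface are assumptions.

```python
MASK = "<mask>"

def mask_non_factual(sentences, factuality_results):
    """Replace each non-factual sentence with a mask token."""
    return [s if r == "factual" else MASK
            for s, r in zip(sentences, factuality_results)]

def infill_masked(masked_sentences, generate_fn):
    """Predict each masked position from the unmasked context.

    generate_fn receives the whole masked sequence and the index of the
    slot to fill, mimicking conditional infilling by a generation model.
    """
    return [generate_fn(masked_sentences, i) if s == MASK else s
            for i, s in enumerate(masked_sentences)]
```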
In some embodiments, the device further includes a model training module, configured to train the text generation model by: acquiring a first-type training sample set containing training data from multiple general domains and a second-type training sample set containing training data from multiple virtual-scene domains; acquiring the model parameters of an initial text generation model; inputting the training data of the first-type set into the initial model to obtain initial model output data, and correcting the model parameters based on that output to obtain a trained initial text generation model; then inputting the training data of the second-type set into the trained initial model to obtain training model output data, and correcting its model parameters based on that output to obtain the trained text generation model.
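The two-stage schedule can be sketched abstractly; `model` and `update` are placeholders for a real model and its parameter-correction step, not the patent's implementation.

```python
def train_two_stage(model, general_set, scene_set, update):
    """Two-stage schedule: first correct the model parameters on
    general-domain data, then fine-tune on virtual-scene (game) data.
    """
    for sample in general_set:            # stage 1: general-domain training
        output = model["forward"](sample)
        update(model, output)             # correct parameters from output
    for sample in scene_set:              # stage 2: in-domain fine-tuning
        output = model["forward"](sample)
        update(model, output)
    return model
```

A counting stub is enough to exercise the schedule and confirm both stages run.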
In some embodiments, the model training module is further configured to preprocess the training data in the first-type and second-type training sample sets to obtain a preprocessed first-type training sample set and a preprocessed second-type training sample set; correspondingly, the training data of the preprocessed first-type set are input into the initial text generation model, and the training data of the preprocessed second-type set are input into the trained initial text generation model.
In some embodiments, the device further includes: a historical text acquisition module, configured to acquire the historical scene commentary text before the current moment; a text splicing module, configured to splice the historical scene commentary text, a blank text, and the scene commentary text of the current moment, in that order, into a spliced text; a blank text prediction module, configured to input the spliced text into a pre-trained text generation model, which predicts the sentence corresponding to the blank text based on the historical scene commentary text and the current scene commentary text; and a sentence adding module, configured to add the predicted sentence at the position of the blank text in the spliced text, obtaining a continuous commentary text for the current virtual scene.
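The history-blank-current splicing can be sketched as below, with `predict_blank` standing in for the pre-trained generation model; the `<blank>` token is an assumed placeholder.

```python
BLANK = "<blank>"

def continuous_commentary(history_text, current_text, predict_blank):
    """Splice history, a blank slot, and the current commentary, then let
    the generation model fill the blank so the two connect smoothly."""
    spliced = [history_text, BLANK, current_text]
    # Predict the transition sentence from both sides of the blank.
    spliced[1] = predict_blank(history_text, current_text)
    return " ".join(spliced)
```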
An embodiment of this application provides an electronic device, including: a memory, configured to store executable instructions; and a processor, configured to implement the above text generation method when executing the executable instructions stored in the memory.
An embodiment of this application provides a computer program product, including executable instructions stored in a computer-readable storage medium; a processor of an electronic device reads the executable instructions from the computer-readable storage medium and executes them to implement the above text generation method.
An embodiment of this application provides a computer-readable storage medium storing executable instructions which, when executed by a processor, implement the above text generation method.
The embodiments of this application have the following beneficial effects: when an operation instruction for the current virtual scene is received, a pre-written text corresponding to the instruction is acquired; the pre-written text is split into a plurality of pre-written sentences; a factuality judgment is performed on each pre-written sentence, and the scene commentary text ultimately suited to the current virtual scene is obtained based on the judgment results. Because the text processing starts from pre-written texts prepared in advance for the operation instructions, first judges the factuality of each pre-written sentence, and then rewrites or freely generates sentences based on the judgment results, diverse scene commentary texts can be produced from the pre-written texts without manual processing, and the generated scene commentary texts can be applied to any game event while remaining diverse.
Drawings
FIG. 1 is a diagram of a real-time battle situation of a game in the related art;
FIG. 2 is a flow chart of game commentary based on manual writing in the related art;
FIG. 3 is a schematic diagram of an alternative architecture of a text generation system provided by an embodiment of the present application;
FIG. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present application;
FIG. 5 is a schematic flow chart of an alternative text generation method according to an embodiment of the present application;
FIG. 6 is a schematic flow chart of another alternative text generation method provided by an embodiment of the present application;
FIG. 7 is a schematic diagram of an implementation flow for performing statement rewrite on a pre-written statement according to an embodiment of the present application;
FIG. 8 is a schematic diagram of an implementation flow of free sentence generation for a pre-written sentence according to an embodiment of the present application;
FIG. 9 is a flowchart of a training method of a text generation model according to an embodiment of the present application;
FIG. 10 is a schematic diagram of a game according to an embodiment of the present application;
FIG. 11 is a flow chart of the game commentary script generation algorithm provided by an embodiment of the present application;
FIG. 12 is a schematic diagram of the model inference process of the text generation model provided by an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions, and advantages of the present application clearer, the application is described in further detail below with reference to the accompanying drawings. The described embodiments should not be construed as limiting the present application, and all other embodiments obtained by those skilled in the art without inventive effort fall within the scope of the present application.
In the following description, "some embodiments" refers to a subset of all possible embodiments; "some embodiments" may denote the same subset or different subsets of all possible embodiments, and these may be combined with one another where there is no conflict. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which the embodiments of this application belong. The terminology used herein is for the purpose of describing the embodiments only and is not intended to limit the application.
Before explaining the text generation method of the embodiments of this application, the technical terms involved are explained first.
(1) Game commentary: in a relatively large game event, the game events currently occurring are interpreted, commented on, predicted, and so on, such as the live or AI commentary on matches in a national tournament.
(2) Text generation: one of the two broad natural language tasks, the other being natural language understanding. Text generation includes free text generation and controllable text generation. Free text generation lets the model output text without restriction, while controllable text generation makes the model output text that satisfies user-given constraints (such as topic or keyword constraints) as closely as possible. Controllable text generation is the variant most often deployed in business scenarios, and the embodiments of this application focus on applying it to game commentary.
(3) Factual controllability: in the embodiments of this application, factually controllable text generation refers to text generation under the constraint of the manually written commentary scripts. More specifically, the controlling element is a pre-provided, manually written sentence; the model is expected to generate text that is consistent with the manually written facts and introduces no new facts, while differing from the manually written wording as much as possible. This is factually controllable generation and rewriting.
The method in the related art will be described below.
Games have become an important part of people's work and life, and commentary scripts are a key element for increasing the interest of games and user stickiness. The embodiments of this application provide a factually controllable text generation algorithm to increase the diversity of game commentary scripts and raise the overall interest of game commentary. The models and algorithms involved relate to two technologies, free text generation and controllable text generation, both within the scope of text generation; the relationship between game commentary scripts and these two technologies is explained below, starting from the game commentary itself.
The flow of game commentary based on manual writing is as follows: the game data are processed into feature vectors, predefined events are triggered, the triggered event randomly selects a manually pre-written commentary script, and the commentary is delivered in real time through a speech system. For example, FIG. 1 is a diagram of the real-time battle situation of a game in the related art, and FIG. 2 is a flow chart of game commentary based on manual writing. As shown in FIG. 1 and FIG. 2, the real-time battle situation in FIG. 1 is converted into feature vectors to obtain a game feature 201; the game feature 201 triggers a predefined event 202, and the predefined event 202 randomly picks a corresponding manual commentary script 203, which is delivered through the speech system. The models and algorithms provided by the embodiments of this application rewrite the manually written scripts, so that many more scripts are obtained from them at lower cost.
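The related-art flow of FIG. 2 can be sketched as follows; the rule predicates and script names are illustrative, and passing single-entry script lists makes the random pick deterministic for testing.

```python
import random

def rule_based_commentary(features, event_rules, scripts, rng=random):
    """Related-art flow: game features trigger the first matching
    predefined event, which randomly picks one manually written script."""
    for event_id, predicate in event_rules:
        if predicate(features):                   # predefined event triggers
            return rng.choice(scripts[event_id])  # random manual script
    return None                                   # no event: stay silent
```

This makes the fixed-script drawback visible: with one script per event, the same event always yields the same line.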
Manually written game commentary, however, has a significant drawback: writing scripts for hundreds or more events is time-consuming, labor-intensive, and expensive, and the quality of the scripts depends directly on the expertise of the writer. It is therefore impractical to write a large number of scripts for each event, which means the same event is very likely to be commented on with the same script, making the commentary progressively less diverse and less interesting.
Deep-learning-based free text generation lets the model output a passage of text whether or not the user provides an opening; the user imposes no constraint on the algorithm, and the text is generated according to the characteristics of the model. Controllable text generation, by contrast, designs models and algorithms that generate text under user-provided constraints, for example a given topic or a set of keyword constraints. The difference between controlled and free generation is whether the user gives explicit constraints.
Free text generation, as the name suggests, is not directly applicable to the game commentary business, because the text generated by these models and algorithms is uncontrolled and may differ greatly from the game commentary scripts of the embodiments of this application. General controllable text generation cannot guarantee factual consistency between the generated text and the original script. Text rewriting, a subclass of controllable text generation, rewrites at the sentence-pattern and word level while preserving the facts.
Based on at least one of the above problems in the related art, in order to better rewrite and generate the manually written scripts, the embodiments of this application design models and algorithms that generate and rewrite along three dimensions: fluency, factual consistency, and diversity. For fluency, a large amount of game-related commentary and bullet-screen data is collected to train a large-scale pre-trained model, which ensures the fluency of the generated text. For factual consistency, since game commentary places high demands on it, text rewriting is applied to the factual parts of the script, fully preserving the facts while modifying the sentence pattern and individual words to some extent. For diversity, the non-factual parts are generated freely, increasing the diversity of the overall generated text.
According to the text generation method provided by the embodiments of this application: first, in response to receiving an operation instruction for the current virtual scene, a pre-written text corresponding to the operation instruction is acquired; the pre-written text is then split into a plurality of pre-written sentences; a factuality judgment is performed on each pre-written sentence to obtain a factuality judgment result for that sentence; based on the factuality judgment results, text processing is performed on the pre-written sentences to obtain a generated sentence corresponding to each pre-written sentence; finally, all the generated sentences are spliced to obtain a scene commentary text suited to the current virtual scene. Because the text processing starts from pre-written texts prepared in advance for the operation instructions, first judges the factuality of each pre-written sentence, and then rewrites or freely generates sentences based on the judgment results, diverse scene commentary texts can be produced from the pre-written texts without manual processing, and the generated texts can be applied to any game event while remaining diverse.
An exemplary application of the text generation device of the embodiments of this application, i.e., the electronic device implementing the text generation method, is described first. The text generation device may be implemented as a terminal or as a server. As a terminal, it may be any terminal with game-running and game-commentary functions, such as a notebook computer, tablet computer, desktop computer, mobile phone, portable music player, personal digital assistant, dedicated messaging device, portable game device, intelligent robot, smart home appliance, or smart in-vehicle device. As a server, it may be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, content delivery networks (CDN), big data, and artificial intelligence platforms. The terminal and the server may be connected directly or indirectly through wired or wireless communication, which is not limited in the embodiments of this application. An exemplary application with the text generation device implemented as a server is described below.
Referring to fig. 3, fig. 3 is an optional architecture diagram of a text generation system provided by an embodiment of the present application, taking as an example the application of the text generation method to commentary for any type of game. While the game application is running, the player's operations need to be commented on; for this, the text generation method provided by the embodiments of the present application can be used to generate the scene commentary text for the current game and output it as speech. In the embodiment of the present application, the text generation system 10 includes at least a terminal 100, a network 200, and a server 300. The server 300 may be the server of the game application, or may be an independent text generation server separate from the server of the game application. The server 300 may constitute the text generating device of the embodiments of the present application. The terminal 100 is connected to the server 300 through the network 200, and the network 200 may be a wide area network, a local area network, or a combination of both.
In the embodiment of the present application, a game application runs on the terminal 100. When text generation is performed, the terminal receives an operation instruction input by the player for the currently running game and sends the operation instruction to the server 300 through the network 200. In response to receiving the operation instruction for the current virtual scene, the server 300 acquires a pre-written text corresponding to the operation instruction; splits the pre-written text into a plurality of pre-written sentences; performs a factuality determination on each pre-written sentence to obtain a factuality determination result for that sentence; performs text processing on the pre-written sentences based on the factuality determination results to obtain a generated sentence corresponding to each pre-written sentence; and finally splices all the generated sentences to obtain a scene commentary text applicable to the current virtual scene. After obtaining the scene commentary text, the server 300 may convert it into scene commentary speech and send the speech to the terminal 100 for output; alternatively, the server 300 may send the scene commentary text directly to the terminal 100, which converts it into scene commentary speech and outputs the speech.
In some embodiments, the text generation method may also be implemented entirely by the terminal; that is, a game application is installed on the terminal 100, and the player inputs operation instructions through the client of the game application. In response to receiving an operation instruction for the current virtual scene, the terminal acquires a pre-written text corresponding to the operation instruction; splits the pre-written text into a plurality of pre-written sentences; performs a factuality determination on each pre-written sentence to obtain a factuality determination result for that sentence; performs text processing on the pre-written sentences based on the factuality determination results to obtain a generated sentence corresponding to each pre-written sentence; and splices all the generated sentences to obtain a scene commentary text applicable to the current virtual scene.
The text generation method provided by the embodiments of the present application may also be implemented through cloud technology on a cloud platform; for example, the server 300 may be a cloud server. In that case, the cloud server acquires the pre-written text corresponding to the operation instruction, splits the pre-written text, performs the factuality determination on each pre-written sentence, and performs text processing on the pre-written sentences based on the factuality determination results.
In some embodiments, cloud storage may also be provided, and the pre-written texts corresponding to each operation instruction, or the scene commentary texts, may be stored in the cloud storage. In this way, when any operation instruction is received, the corresponding pre-written text can be fetched directly from cloud storage, improving text generation efficiency.
Here, cloud technology refers to a hosting technology that unifies resources such as hardware, software, and networks within a wide area network or a local area network to realize the computation, storage, processing, and sharing of data. Cloud technology is a general term for the network technology, information technology, integration technology, management platform technology, and application technology applied under the cloud computing business model; it can form a resource pool that is used on demand with flexibility and convenience. The background services of networked systems, such as video websites, image websites, and portal sites, require large amounts of computing and storage resources, for which cloud computing technology will become an important support. As the internet industry develops, each object may in the future carry its own identification mark that must be transmitted to a background system for logical processing; data of different levels will be processed separately, and all kinds of industry data will need strong supporting systems, which can be realized through cloud computing.
Fig. 4 is a schematic structural diagram of an electronic device provided by an embodiment of the present application. The electronic device shown in fig. 4 may be the text generating device, which includes: at least one processor 310, a memory 350, at least one network interface 320, and a user interface 330. The components of the text generating device are coupled together by a bus system 340, which enables communication among them. In addition to a data bus, the bus system 340 includes a power bus, a control bus, and a status signal bus; for clarity of illustration, however, the various buses are all labeled as bus system 340 in fig. 4.
The processor 310 may be an integrated circuit chip with signal processing capabilities, such as a general-purpose processor (which may be a microprocessor or any conventional processor), a digital signal processor (DSP, Digital Signal Processor), another programmable logic device, a discrete gate or transistor logic device, or discrete hardware components.
The user interface 330 includes one or more output devices 331 that enable presentation of media content, and one or more input devices 332.
The memory 350 may be removable, non-removable, or a combination thereof. Exemplary hardware devices include solid-state memory, hard disk drives, optical disk drives, and the like. The memory 350 optionally includes one or more storage devices physically located remote from the processor 310. The memory 350 includes volatile memory or nonvolatile memory, and may include both. The nonvolatile memory may be a read-only memory (ROM, Read Only Memory), and the volatile memory may be a random access memory (RAM, Random Access Memory). The memory 350 described in the embodiments of the present application is intended to comprise any suitable type of memory. In some embodiments, the memory 350 stores data to support various operations; examples of such data include programs, modules, and data structures, or subsets or supersets thereof, as illustrated below.
The operating system 351 includes system programs for handling various basic system services and performing hardware-related tasks, such as a framework layer, a core library layer, and a driver layer, for implementing various basic services and handling hardware-based tasks. The network communication module 352 is used to reach other computing devices via one or more (wired or wireless) network interfaces 320; exemplary network interfaces 320 include Bluetooth, wireless compatibility authentication (WiFi), universal serial bus (USB, Universal Serial Bus), and the like. The input processing module 353 is used to detect one or more user inputs or interactions from the one or more input devices 332 and to translate the detected inputs or interactions.
In some embodiments, the apparatus provided by the embodiments of the present application may be implemented in software. Fig. 4 shows a text generating apparatus 354 stored in the memory 350; the text generating apparatus 354 may be the text generating apparatus in the electronic device and may be software in the form of a program, a plug-in, or the like, including the following software modules: an acquisition module 3541, a text splitting module 3542, a factuality determination module 3543, a text processing module 3544, and a text splicing module 3545. These modules are logical, and may therefore be arbitrarily combined or further split according to the functions implemented. The functions of the respective modules are described hereinafter.
In other embodiments, the apparatus provided by the embodiments of the present application may be implemented in hardware. By way of example, the apparatus may be a processor in the form of a hardware decoding processor programmed to perform the text generation method provided by the embodiments of the present application; for example, the hardware decoding processor may employ one or more application-specific integrated circuits (ASIC, Application Specific Integrated Circuit), DSPs, programmable logic devices (PLD, Programmable Logic Device), complex programmable logic devices (CPLD, Complex Programmable Logic Device), field-programmable gate arrays (FPGA, Field-Programmable Gate Array), or other electronic components.
The text generation method provided by the embodiments of the present application may be executed by an electronic device, where the electronic device may be a server or a terminal; that is, the method may be executed by the server alone, by the terminal alone, or through interaction between the server and the terminal.
Fig. 5 is a schematic flowchart of an optional text generation method according to an embodiment of the present application. As shown in fig. 5, the method includes the following steps S101 to S105:
Step S101, in response to receiving an operation instruction for the current virtual scene, acquiring a pre-written text corresponding to the operation instruction.
Here, the current virtual scene may be a game scene of any game running on the terminal. In the current virtual scene, a user (i.e., a player of the game) may perform any operation, thereby issuing an operation instruction. The operation instructions include, but are not limited to: an attack instruction, a defense instruction, a kill instruction, a skill release instruction, and instructions corresponding to any other function provided by the game.
In the embodiment of the present application, the operation input by the user at the game client can be acquired through the terminal, and the operation instruction is generated based on the user's operation.
In one implementation, the server of the game application may, in response to the operation instruction, execute the corresponding game data processing and also acquire, based on the operation instruction, the pre-written text corresponding to the operation instruction from a preset script library, thereby enabling commentary while the game runs. For example, the server of the game application, i.e., the server executing the text generation method of the embodiments of the present application, may be a server in a distributed system. The server in the distributed system invokes one set of processes for game data processing, responding to the player's operation instructions and keeping the game logic running normally; at the same time, it invokes other processes to acquire the operation instruction and fetch the corresponding pre-written text from the preset script library, generating a scene commentary text applicable to the current virtual scene based on the pre-written text. In this way, the currently running game scene is commented on based on the scene commentary text, realizing commentary synchronized with gameplay.
In another implementation, the client of the game application may report all operation instructions in a time period to the server once every fixed interval, and the server acquires, from the preset script library, the pre-written text corresponding to at least one of those operation instructions, thereby providing commentary while the game runs. Here, the operation instructions need not be transmitted to the server in real time; the server acquires them once per interval, and the instructions acquired in one interval may number one or more. When a single operation instruction is acquired, the corresponding pre-written text can be obtained from it and the scene commentary text generated; when multiple operation instructions are acquired, one of them may be selected to generate the scene commentary text, or several of them may be used together to generate it.
In yet another implementation, after the game application finishes running the current round of a game, game data of that round may be generated, where the game data includes the operation instructions corresponding to the operations performed by the user at each moment or time point. The pre-written text corresponding to each operation instruction is then acquired, enabling commentary after the game has ended. That is, the text generation method of the embodiments of the present application generates a corresponding scene commentary text for each operation instruction in the game, from which commentary speech for the game video is generated; the commentary speech is then merged into the game video to produce a commented game video.
The pre-written text refers to a manually pre-written commentary text corresponding to a certain operation instruction. For each operation instruction, a plurality of pre-written texts may be prepared in advance; when generating the scene commentary text, one of the plurality of pre-written texts can then be selected at random. In the embodiment of the present application, the mapping from each operation instruction to its plurality of pre-written texts can be stored in a pre-written text library, so that once the current operation instruction is determined, the corresponding pre-written text can be fetched from the library.
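As an illustrative sketch (the data structure and all names below are assumptions, not specified by this application), the pre-written text library described above can be modeled as a mapping from an operation-instruction identifier to its several pre-written texts, from which one is selected at random:

```python
import random

# Hypothetical pre-written text library: each operation-instruction
# identifier maps to several manually pre-written commentary texts.
SCRIPT_LIBRARY = {
    "skill_a_release": [
        "Skill A is unleashed, and the battlefield trembles!",
        "A perfectly timed Skill A turns the tide.",
    ],
    "move_right": [
        "The hero shifts toward the right flank.",
    ],
}

def get_prewritten_text(instruction_id, rng=random):
    """Return one pre-written text for the instruction, chosen at random;
    None if the instruction has no entry in the library."""
    candidates = SCRIPT_LIBRARY.get(instruction_id)
    if not candidates:
        return None
    return rng.choice(candidates)
```

The random selection is what gives repeated occurrences of the same operation instruction varied commentary even before any rewriting takes place.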
Step S102, performing text splitting on the pre-written text to obtain a plurality of pre-written sentences.
Here, text splitting refers to dividing the pre-written text into a plurality of sentences, each of which constitutes one pre-written sentence. For example, the pre-written text may be split at punctuation marks to obtain the plurality of pre-written sentences.
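A minimal sketch of such punctuation-based splitting might look as follows; the exact splitting rule (the regular expression below) is an assumption, since this application does not fix one:

```python
import re

def split_prewritten_text(text):
    """Split a pre-written text into pre-written sentences at
    sentence-ending punctuation (ASCII and CJK marks alike)."""
    # Split after each terminal mark, discarding any whitespace that follows it.
    parts = re.split(r"(?<=[.!?;。！？；])\s*", text)
    return [p for p in parts if p.strip()]
```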
Step S103, performing a factuality determination on each pre-written sentence to obtain a factuality determination result for the corresponding pre-written sentence.
Here, the factuality determination refers to determining whether the content of each pre-written sentence is factual or non-factual. Factual content is content whose corresponding event actually occurred or actually exists; that is, under given external conditions the event can be accurately predicted and exactly reproduced. Accuracy, mechanical predictability, and reproducibility are the defining marks of factual content, which corresponds to objective facts; such facts are the premise and basis on which natural science is established and continues to advance, and they derive from the inert way in which natural things react to their objective environment — a passive, simple mode of reaction that is the same whenever the objective environment is the same. In the embodiment of the present application, the factual content may be the content corresponding to an event that has occurred or actually exists in the game application.
For example, in a game application, when a player performs an in-game operation during play, that operation constitutes factual content, and the factual event corresponding to it is the event that the operation was performed.
Non-factual content is any content other than factual content; for example, it may include modifiers in the text, interjections, and free sentences unrelated to facts. In the embodiment of the present application, a factuality determination can be performed on each pre-written sentence to obtain its factuality determination result. In making the determination, the factual content may be identified first, and the remaining content is then determined to be non-factual.
The factuality determination results include: a first-type result, indicating that the corresponding pre-written sentence is a factual sentence; and a second-type result, indicating that the corresponding pre-written sentence is a non-factual sentence. When the factuality determination result of any pre-written sentence is of the first type, that sentence is a factual sentence; when it is of the second type, that sentence is a non-factual sentence.
In some embodiments, the factuality determination may employ a factuality determination model, which may be a binary classification model with two possible outputs: 0 and 1. When the output is 0, the factuality determination result of the input pre-written sentence is of the second type; when the output is 1, it is of the first type.
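The 0/1 interface of such a model can be sketched as follows; note that a trivial keyword heuristic stands in here for the trained binary classifier, whose architecture this application does not specify:

```python
# Stand-in for the factuality determination model. A real implementation
# would be a trained binary classifier; this keyword heuristic only
# mimics its 0/1 output interface. The marker list is illustrative.
FACTUAL_MARKERS = ("kill", "release", "attack", "defend", "move")

def factuality_judgment(sentence):
    """Return 1 (first-type result: factual sentence) or
    0 (second-type result: non-factual sentence)."""
    lowered = sentence.lower()
    return 1 if any(m in lowered for m in FACTUAL_MARKERS) else 0
```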
Step S104, performing text processing on the pre-written sentences based on the factuality determination results to obtain the generated sentence corresponding to each pre-written sentence.
In the embodiment of the present application, different text processing modes can be adopted for different factuality determination results.
For example, a factual sentence may be rewritten in a way that preserves its facts; that is, the rewriting does not change the meaning the sentence is meant to express. For a non-factual sentence, free generation may be used: based on the context and the semantics of the sentences before and after the non-factual sentence, a generated sentence is produced freely, and it may be of any type and any form of expression. The diversity of the generated sentences, and in turn of the final scene commentary texts, is thereby improved.
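The two processing modes can be combined in a small dispatcher; the rewrite and free-generation functions are passed in as parameters here, since this application leaves their concrete implementations open:

```python
def process_sentence(sentence, judgment, rewrite_fn, free_generate_fn):
    """Dispatch on the factuality determination result: factual
    sentences (judgment == 1) are rewritten without changing their
    meaning; non-factual sentences (judgment == 0) are freely
    regenerated from context."""
    if judgment == 1:
        return rewrite_fn(sentence)
    return free_generate_fn(sentence)
```

For instance, `rewrite_fn` could be a paraphrasing model constrained to preserve the stated facts, while `free_generate_fn` could be an unconstrained language model conditioned on the surrounding sentences.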
Step S105, performing text splicing on all the generated sentences to obtain the scene commentary text applicable to the current virtual scene.
In the embodiment of the present application, after all the generated sentences are obtained, they can be spliced together in the order of their corresponding pre-written sentences in the pre-written text, yielding the scene commentary text applicable to the current virtual scene.
Here, text splicing refers to connecting successive generated sentences into one continuous, longer text. During splicing, the punctuation mark between two generated sentences can be chosen according to the relationship between them and inserted between the two sentences, forming the scene commentary text.
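A simple sketch of the splicing step follows; the punctuation rule here (append a period when a sentence lacks terminal punctuation) is an assumed default, whereas the embodiment chooses marks based on the relationship between adjacent sentences:

```python
def splice_sentences(generated_sentences):
    """Join generated sentences, kept in the order of their source
    pre-written sentences, into one commentary text. Sentences lacking
    terminal punctuation get a period appended before joining."""
    terminal = (".", "!", "?", "。", "！", "？")
    pieces = []
    for s in generated_sentences:
        s = s.strip()
        pieces.append(s if s.endswith(terminal) else s + ".")
    return " ".join(pieces)
```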
It should be noted that the text generation method of the embodiments of the present application generates the scene commentary text corresponding to the current one or more operation instructions. A large number of operation instructions are received while the game runs, and a game lasts a certain duration; therefore, over the whole game, the method generates one passage of scene commentary text for the game segment of the current period based on the current operation instructions, then another passage for the game segment of the next period, and so on in a loop, until the whole game ends and the acquisition of operation instructions and the generation of scene commentary texts stop.
According to the text generation method provided by the embodiments of the present application, when an operation instruction for the current virtual scene is received, the pre-written text corresponding to the operation instruction is acquired; the pre-written text is split into a plurality of pre-written sentences; and a factuality determination is performed on each pre-written sentence, so that the scene commentary text finally applicable to the current virtual scene is obtained based on the factuality determination results. Because the text processing starts from a pre-written text associated with the operation instruction, and each pre-written sentence is first classified as factual or non-factual and then rewritten or freely regenerated according to that classification, a single pre-written text can yield diverse scene commentary texts without manual processing, and the generated commentary texts are applicable to any game event.
In some embodiments, the text generation system includes at least a terminal and a server. The terminal may have installed on it a game application, or both a game application and a text generation application; the server constitutes the server of the game application or the server of the text generation application. Through interaction between the terminal and the server, the text generation method is carried out and commentary on the game scenes in the game application is realized.
The text generation method of the embodiments of the present application can be applied in any of the following scenarios:
scene one: the terminal is provided with a game application, and the server forms a server of the game application. The server of the game application not only can realize the processing of game data, but also can finish the running of the game application; the game commentary function is also provided in the game application, and the server of the game application can realize the game commentary on the game scene in the game application through the text generation method provided by the embodiment of the application. In the implementation process, when the player starts playing the game by running the game application, the player can select the game comment function, so that the scene comment text of the operation instruction for the player can be generated in the process of running the game application by the player, and the scene comment text of the current game of the player can be output in the form of voice on the terminal of the player in the process of running the game application.
Scene II: the terminal is provided with a game application, and the server forms a server of the game application. The server of the game application not only can realize the processing of game data, but also can finish the running of the game application; the game application also provides a game commentary function and a game recording function, and the server of the game application can realize the game commentary on the game scene in the game application through the text generation method provided by the embodiment of the application. In the implementation process, when the player starts playing the game by running the game application, the player can select the game recording function and the game explanation function, so that in the process of playing the game by running the game application by the player, the scene explanation text aiming at each operation instruction of the player can be generated by adopting the text generation method provided by the embodiment of the application while the game is recorded on the game picture. After generating the scene narrative text, the scene narrative text may be converted into scene narrative speech and the scene narrative speech is fused into a recorded game video to generate a game recording video with game narratives.
Scene III: the terminal is provided with a game application and a text generation application, and the server forms a server of the text generation application. In the implementation process, the game application server and the text generation application server can be linked, wherein the linkage refers to that data transmission and sharing can be performed between the game application server and the text generation application server. When a player starts playing a game by running a game application, a game comment key can be clicked on the current interface of the game application, and then a server of the game application can respond to clicking operation on the game comment key to realize linkage with a server of the text generation application. After linkage is completed between the server of the game application and the server of the text generation application, the server of the game application can send an operation instruction to the server of the text generation application after receiving each operation instruction input by a player at the client of the game application, and at this time, the server of the text generation application can adopt the text generation method provided by the embodiment of the application to obtain the scene comment text (namely, the game comment text) of the current game. After the server of the text generation application obtains the scene comment text, the scene comment text may be sent to the server of the game application, or scene comment voices may be generated based on the scene comment text, and the scene comment voices may be sent to the server of the game application.
Scene four: the terminal is provided with a game application and a text generation application, and the server forms a server of the text generation application. In the implementation process, the server of the game application and the server of the text generation application can be linked, and when the player starts playing the game by running the game application under the condition of linking the server of the game application and the server of the text generation application, the player can select the game recording function and the game commentary function, so that in the process of playing the game by the player running the game application, the game recording of the game picture can be realized through the server of the game application, and the game recording video and the operation instruction of the player are simultaneously transmitted to the server of the text generation application. The server of the text generation application may generate the scene comment text for each operation instruction of the player by adopting the text generation method provided by the embodiment of the present application. After generating the scene comment text, the scene comment text may be converted into scene comment voices, and the scene comment voices are fused into game record videos sent by a server of the game application, so as to generate the game record videos with game comments. After the server of the text generation application generates the game recorded video with the game comment, the game recorded video with the game comment may be displayed at the client of the text generation application, or the game recorded video with the game comment may also be sent to the server of the game application, so as to realize that the game recorded video with the game comment is displayed on the client of the game application.
Next, the text generation method according to an embodiment of the present application is described with reference to scenario one.
Fig. 6 is another optional flowchart of the text generation method according to an embodiment of the present application. As shown in fig. 6, the method includes the following steps S201 to S220:
Step S201, when the terminal runs the current virtual scene, the terminal receives a selection operation for the game commentary function.
Here, the current virtual scene may be a game scene of any game running on the terminal. The game application provides a game commentary function; the player can select this function while the game runs, and the text generation method of the embodiments of the present application is carried out by the server of the game application, providing commentary on the running game.
In practice, a game commentary button is displayed on the client of the game application, and the player inputs the selection operation for the game commentary function by clicking this button on the current interface of the game application.
Step S202, the terminal sends the selection operation for the game commentary function to the server.
Step S203, the server starts the game commentary function in response to the selection operation.
In the embodiment of the present application, after receiving the selection operation for the game commentary function, the server starts the function in response to the operation; from then on it continuously receives the operation instructions the user inputs in the game and, based on them, generates the commentary text of the game, i.e., the scene commentary text applicable to the current virtual scene.
Step S204, the terminal receives an operation instruction for the current virtual scene.
Here, once the player starts playing, the player continually inputs operation instructions to control the course of the game. For each operation instruction input by the player, the terminal both responds to it within the game scene and sends it to the server, so that the game commentary function can be carried out by the server based on the player's operation instructions.
In step S205, the terminal sends an operation instruction to the server.
In step S206, the server determines the number of operation instructions.
Step S207, if the number of operation instructions is one, the server determines the event identification of the virtual scene event corresponding to the operation instruction, and acquires a pre-written text corresponding to the event identification from a preset speech library.
In some embodiments, each operation instruction corresponds to a virtual scene event, which refers to an event generated or represented in response to the operation instruction. For example, when the operation instruction is a rightward movement instruction, the virtual scene event corresponding to the operation instruction is a rightward movement event; when the operation instruction is a release instruction of the skill a, the virtual scene event corresponding to the operation instruction is a skill a release event. Each virtual scene event has an event identification by which the corresponding virtual scene event can be uniquely identified.
In the embodiment of the application, after acquiring an operation instruction, a server firstly determines a virtual scene event corresponding to the operation instruction and an event identifier corresponding to the virtual scene event based on the type of the operation instruction, and then acquires a pre-written text corresponding to the event identifier from a preset speech library.
The preset speech library stores the pre-written texts corresponding to the plurality of virtual scene events. During storage, the event identification of each virtual scene event and its corresponding plurality of pre-written texts are mapped to one another and then stored in the library, so that, after the virtual scene event corresponding to the current operation instruction is determined, the corresponding pre-written texts can be queried from the preset speech library according to the event identification of that virtual scene event.
For example, in a game, a plurality of types of operation instructions may be executed, each type corresponding to one virtual scene event, so that there are a plurality of virtual scene events. A unique event identification can be set for each virtual scene event, and corresponding speech is manually written in advance for each virtual scene event, forming a plurality of pre-written texts for that event. The event identification of each virtual scene event and its corresponding pre-written texts can then be mapped to one another and stored in the preset speech library.
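The mapping described above can be sketched as a simple dictionary keyed by event identification. The event names and texts below are illustrative assumptions, not taken from the patent:

```python
# Hypothetical sketch of the preset speech library: each virtual scene
# event's identification maps to the manually pre-written commentary
# texts for that event.
SPEECH_LIBRARY = {}

def register_event(event_id, pre_written_texts):
    """Map an event identification to its pre-written texts when the library is built."""
    SPEECH_LIBRARY.setdefault(event_id, []).extend(pre_written_texts)

def lookup_pre_written_texts(event_id):
    """Query the pre-written texts for a determined virtual scene event (step S207)."""
    return SPEECH_LIBRARY.get(event_id, [])
```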
In step S208, if the number of operation instructions is plural, the server acquires the priority of each operation instruction.
In some embodiments, for the multiple types of operation instructions in the game, a priority may be set for each type in advance, and the priority of each operation instruction may be stored correspondingly in the preset speech library. For example, the priority of a movement instruction may be lower than that of a skill release instruction.
When the terminal sends a plurality of operation instructions to the server, the priority of each operation instruction can be obtained from the preset speech library.
In some embodiments, the plurality of operation instructions may be sent by the player simultaneously; for example, the player releases a skill while performing a movement operation, in which case the terminal acquires the movement instruction and the skill release instruction input by the player at the same time. In other embodiments, the plurality of operation instructions may be triggered by the player successively within a period of time; that is, the server receives the operation instructions fed back by the terminal at certain intervals, and each batch may contain a plurality of operation instructions.
Step S209, the server determines the event identification of the virtual scene event corresponding to the operation instruction with the highest priority, and acquires a pre-written text corresponding to the event identification from the preset speech library.
In the embodiment of the application, the operation instruction with the highest priority can be selected from the plurality of operation instructions, the event identification of the virtual scene event corresponding to that instruction is determined, and the pre-written text corresponding to the event identification is then acquired from the preset speech library.
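The priority-based selection of steps S208 and S209 can be sketched as follows; the instruction type names and priority values are invented for illustration:

```python
# Hypothetical priority table for instruction types; a larger value means
# a higher priority (e.g. a skill release outranks plain movement).
INSTRUCTION_PRIORITY = {"move": 1, "attack": 2, "skill_release": 3}

def pick_highest_priority(instructions):
    """Of several simultaneous operation instructions, keep only the one
    with the highest priority (steps S208-S209)."""
    return max(instructions, key=lambda kind: INSTRUCTION_PRIORITY.get(kind, 0))
```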
In some embodiments, each event identification corresponds to a plurality of pre-written texts. The acquisition of the pre-written text corresponding to the event identification from the preset speech library in step S207 and step S209 may be implemented as follows: first, the plurality of pre-written texts corresponding to the event identification are acquired from the preset speech library; then, one of them is randomly selected as the pre-written text corresponding to the operation instruction. That is, all or part of the pre-written texts corresponding to the event identification are acquired first, and one pre-written text is then randomly selected from them as the finally selected pre-written text.
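The random selection step can be sketched in one line; `random.choice` stands in for whatever sampling strategy an implementation actually uses:

```python
import random

def select_pre_written_text(candidates):
    """Randomly choose one of the event's pre-written texts, so that repeated
    occurrences of the same event do not always yield identical commentary."""
    return random.choice(candidates)
```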
Step S210, the server carries out text symbol recognition on the pre-written text to obtain at least one symbol identifier in the pre-written text.
Here, the text symbol recognition may be to recognize punctuation marks in the pre-written text, resulting in at least one punctuation mark (i.e., symbol identification) in the pre-written text.
In step S211, the server determines, from at least one symbol identifier, a symbol identifier belonging to the same type as the preset symbol identifier as the target symbol identifier.
Here, the preset symbol mark may be at least one of a comma or a semicolon. In one implementation, commas and semicolons in the pre-written text may be determined as target symbol identifications.
And step S212, the server performs text splitting on the pre-written text based on the target symbol identification to obtain a plurality of pre-written sentences.
Here, the pre-written text may be text-split according to the positions of commas and semicolons in the pre-written text, to form a plurality of pre-written sentences.
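A minimal sketch of steps S210 to S212, assuming commas and semicolons (in both Western and Chinese forms) are the target symbol identifications:

```python
import re

def split_into_pre_written_sentences(pre_written_text):
    """Split a pre-written text at commas and semicolons (steps S210-S212).

    Both Western and Chinese comma/semicolon forms are treated as target
    symbol identifications; empty fragments are discarded.
    """
    parts = re.split(r"[,;，；]", pre_written_text)
    return [p.strip() for p in parts if p.strip()]
```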
In step S213, the server performs factuality classification processing on each pre-written sentence through a pre-trained factuality judgment model, obtaining the factuality determination result of the pre-written sentence.
Here, the factuality determination result includes: a first type of determination result, indicating that the corresponding pre-written sentence is a factual sentence, and a second type of determination result, indicating that the corresponding pre-written sentence is a non-factual sentence.
In step S214, in response to the factuality determination result being the first type determination result, the server performs sentence rewriting on the pre-written sentence to obtain a rewritten sentence corresponding to the pre-written sentence.
In some embodiments, referring to fig. 7, fig. 7 shows that the sentence rewriting of step S214 may be implemented by the following steps S2141 to S2144:
in step S2141, in response to the factuality determination result being the first type determination result, the pre-written sentence is subjected to word segmentation processing to obtain at least one factual sentence word segment.
Step S2142, part-of-speech recognition is carried out on each factual sentence word segment to obtain the part of speech of the corresponding word segment.
Step S2143, based on the part of speech of each factual sentence word segment, the keywords and non-keywords of the pre-written sentence are determined from the at least one factual sentence word segment.
Here, the parts of speech of the factual sentence word segments include: nouns, verbs, pronouns, prepositions, adjectives, and the like.
In one implementation, the word segments whose parts of speech are nouns or verbs may be determined as keywords, and the word segments other than nouns and verbs may be determined as non-keywords.
Step S2144, the non-keywords in the pre-written sentence are rewritten to obtain a rewritten sentence corresponding to the pre-written sentence.
Here, non-keywords are rewritten while keywords are retained. That is, the keywords in the pre-written sentence are not changed; only the non-keywords are modified. When rewriting a non-keyword, means such as synonym substitution and position adjustment of the non-keyword can be used, so that a rewritten sentence is obtained whose expression differs from the pre-written sentence but whose semantics are unchanged.
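Steps S2141 to S2144 can be sketched as below. The input is assumed to be already word-segmented and part-of-speech tagged (a real system would use a segmenter and POS tagger), and the synonym table is invented for illustration:

```python
# Hypothetical synonym table used for non-keyword substitution.
SYNONYMS = {"quickly": "swiftly", "very": "extremely"}

def rewrite_non_keywords(tagged_segments):
    """Keep noun/verb word segments (keywords) verbatim; allow synonym
    substitution for all other word segments (non-keywords)."""
    rewritten = []
    for word, pos in tagged_segments:
        if pos in ("noun", "verb"):       # keyword: semantics must not change
            rewritten.append(word)
        else:                             # non-keyword: substitution allowed
            rewritten.append(SYNONYMS.get(word, word))
    return " ".join(rewritten)
```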
With continued reference to fig. 7, fig. 7 also shows that the sentence rewriting of step S214 may alternatively be implemented by the following steps S2145 to S2147:
in step S2145, in response to the factuality determination result being the first type determination result, sentence pattern analysis is performed on the pre-written sentence to obtain the sentence pattern type of the pre-written sentence.
Step S2146, a modified sentence pattern type corresponding to that sentence pattern type is determined.
Step S2147, sentence pattern modification is carried out on the pre-written sentence according to the modified sentence pattern type, obtaining a rewritten sentence corresponding to the pre-written sentence.
For example, if the sentence pattern type of the pre-written sentence is the active "ba"-construction (a Chinese sentence pattern in which the verb acts on its object through the marker "ba"), the corresponding modified sentence pattern type may be determined to be the passive "bei"-construction; thus, a pre-written sentence of the "ba" type can be modified into a rewritten sentence of the "bei" type.
In step S215, in response to the factuality determination result being the second type determination result, the server performs free sentence generation on the pre-written sentence to obtain a freely generated sentence corresponding to the pre-written sentence.
Here, the rewritten sentences and the freely generated sentences together constitute the generated sentences corresponding to the pre-written sentences.
In some embodiments, referring to fig. 8, fig. 8 illustrates that step S215 performs free sentence generation on the pre-written sentence, which may be implemented by the following steps S2151 to S2153:
in step S2151, in response to the fact determination result being the second type determination result, the non-fact sentence in the pre-written text is subjected to shielding processing (i.e., MASK processing), so as to obtain the pre-written text after sentence shielding.
Step S2152, the pre-written text after sentence shielding is input into a pre-trained text generation model, and the sentence at the shielding position is predicted based on the sentence at the non-shielding position in the pre-written text after sentence shielding through the text generation model, so as to obtain the generated sentence at the shielding position.
Step S2153, the generation sentence of the shielding position is determined as the free generation sentence corresponding to the pre-written sentence.
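The masking of step S2151 can be sketched as follows; the mask token string is an assumption (a real BART-style model uses its own special token), and the downstream model call is omitted:

```python
MASK = "<MASK>"

def mask_non_factual_sentences(pre_written_sentences, factuality_flags):
    """Replace each non-factual clause with a mask token (step S2151); the
    masked text is what the text generation model receives, and the model
    predicts new content for the masked slots conditioned on the unmasked
    clauses (steps S2152-S2153)."""
    return [sentence if is_fact else MASK
            for sentence, is_fact in zip(pre_written_sentences, factuality_flags)]
```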
In some embodiments, the text generation model may be implemented with a model structure such as GPT (Generative Pre-trained Transformer, an autoregressive pre-trained language model), T5 (Text-to-Text Transfer Transformer), or BART. GPT is an autoregressive generation model that generates text from front to back and consists of a decoder only; T5 and BART are Seq2Seq (Sequence to Sequence) structures, comprising an encoder that reads the text input by the user bidirectionally and a decoder that decodes the feature vectors produced by the encoder, together with the text it has already generated, from left to right for output. In the embodiment of the application, in order to ensure the accuracy of the generated result, a BART model may be adopted for model training to obtain the text generation model.
And step S216, the server performs text splicing processing on all the generated sentences to obtain scene explanation texts suitable for the current virtual scene.
In some embodiments, the text generation method of the embodiment of the application is used to generate the scene explanation text corresponding to the current one or more operation instructions. A large number of operation instructions are received while the game runs, and a game lasts for a certain duration; therefore, over the whole game, the text generation method of the embodiment of the application generates one section of scene explanation text from the current operation instructions to provide commentary for the game segment of the current period, generates another section for the game segment of the next period, and repeats this cycle until the acquisition of operation instructions and the generation of scene explanation text stop when the whole game ends.
Then, after the scene explanation text is obtained, the following processing may also be performed: first, the historical scene explanation text before the current moment is acquired; text splicing is performed in the order of the historical scene explanation text, a blank text, and the scene explanation text of the current moment, forming a spliced text. The spliced text is then input into the pre-trained text generation model, which predicts the sentence corresponding to the blank text based on the historical scene explanation text and the current scene explanation text. Finally, the predicted sentence is added at the text position of the blank text in the spliced text, obtaining a continuous explanation text for the current virtual scene. After the continuous explanation text is obtained, it may be determined as the explanation text of the current virtual scene.
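The splice-and-fill procedure above can be sketched with a placeholder blank token (the token string and space-joined splicing are assumptions; the model call that predicts the transition sentence is omitted):

```python
BLANK = "<BLANK>"

def build_spliced_text(history_text, current_text):
    """Splice history commentary, a blank slot, and the current commentary
    in order; the model later predicts a transition sentence for the blank."""
    return " ".join([history_text, BLANK, current_text])

def fill_blank(spliced_text, predicted_sentence):
    """Insert the predicted sentence at the blank's position, yielding the
    continuous explanation text."""
    return spliced_text.replace(BLANK, predicted_sentence)
```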
In step S217, the server performs a voice conversion process on the scene interpretation text to obtain a scene interpretation voice corresponding to the scene interpretation text.
Here, any voice conversion processing method may be employed to perform voice conversion on the scene explanation text, and voices with different timbres may also be provided during conversion. For example, a user may select the voice of a figure such as a star, an expert, or a player at the client of the game application, so as to generate scene explanation speech with that figure's timbre.
In step S218, the server transmits the scene interpretation voice to the terminal.
In step S219, the terminal fuses the scene interpretation voice to the scene video corresponding to the current virtual scene, and generates a virtual scene interpretation video.
Here, any audio/video fusion method may be adopted to fuse the scene interpretation voice into the scene video corresponding to the current virtual scene, and it should be noted that, for the scene interpretation voice, a fusion position corresponding to each voice segment in the scene interpretation voice in the scene video may be determined, and then, according to the fusion position, each voice segment is fused into a scene video frame to generate the virtual scene interpretation video.
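As a toy illustration of the fusion-position idea (real audio/video muxing involves timestamps and codecs, which are out of scope here), each voice segment can be attached to the frame at its fusion position:

```python
def fuse_voice_into_video(video_frames, voice_segments):
    """Attach each commentary voice segment to the frame at its fusion
    position (modeled here simply as a frame index), yielding a narrated
    timeline of [frame, segment-or-None] pairs."""
    timeline = [[frame, None] for frame in video_frames]
    for position, segment in voice_segments:
        timeline[position][1] = segment
    return timeline
```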
Step S220, the terminal displays the virtual scene description video on the current interface.
According to the text generation method provided by the embodiment of the application, pre-written texts for different operation instructions are prepared in advance, so that when an operation instruction is received, the corresponding pre-written text can be obtained and processed. During text processing, the factuality of each pre-written sentence in the pre-written text is judged, and based on the factuality determination result, the pre-written sentence is either rewritten or freely generated. In this way, a variety of scene explanation texts can be generated from the pre-written texts without manual processing, so the generated scene explanation texts are applicable to any game event and are diverse.
For the above-mentioned text generation model, the embodiment of the present application further provides a training method for the text generation model. Fig. 9 is a schematic flow chart of the training method for the text generation model provided in the embodiment of the present application; the training method may be executed by a model training module. The model training module may be a module in the text generation device (namely the electronic device), that is, the model training module may be the server or the terminal; alternatively, the model training module may be another device independent of the text generation device, i.e., an electronic device different from the server and the terminal that implement the text generation method. As shown in fig. 9, the training method of the text generation model includes the following steps S301 to S306:
in step S301, the model training module acquires a first class training sample set and a second class training sample set, where the first class training sample set includes training data of multiple general fields, and the second class training sample set includes training data of multiple virtual scene fields.
In the embodiment of the application, the data in different general fields can be obtained from different channels, for example, the data in different fields can be obtained from channels such as web pages, games, video comments and the like as training data. For the virtual scene field, data of the field can be acquired as training data through channels such as a webpage and a video corresponding to the virtual scene field, for example, for the game field, data of channels such as a game introduction webpage, a game barrage and a comment of a game video can be acquired as training data of the game field.
In some embodiments, after the first type training sample set and the second type training sample set are obtained, data preprocessing may be further performed on training data in the first type training sample set and the second type training sample set, so as to obtain a preprocessed first type training sample set and a preprocessed second type training sample set. After that, the training data in the first training sample set after preprocessing can be input into the initial text generation model, and the training data in the second training sample set after preprocessing can be input into the initial text generation model after training.
Here, data preprocessing includes, but is not limited to: removal of illegal data, de-duplication of repeated data, and other processes. For example, for training data obtained from a web page, picture links and www links in the web page may be pruned; for text content, illegal characters may be removed.
In the embodiment of the application, after the training data in the first type training sample set and the second type training sample set are subjected to data preprocessing, training data in the preprocessed first type training sample set and training data in the preprocessed second type training sample set are adopted to train the text generation model.
In step S302, the model training module obtains model parameters of the initial text generation model.
In step S303, the model training module inputs training data in the first training sample set to the initial text generating model to obtain initial model output data.
In step S304, the model training module corrects the model parameters in the initial text generating model based on the initial model output data, so as to obtain a trained initial text generating model.
In the embodiment of the application, the training data in the preprocessed first-class training sample set can be input into the initial text generation model, which outputs generated text, namely the initial model output data. Then, by analyzing the text corresponding to the initial model output data, it is determined whether the model parameters in the current initial text generation model are reasonable and whether training needs to continue, that is, whether the model parameters need to be modified, so that the initial text generation model can output text that better conforms to human language habits; in other words, the smoothness of the text output by the initial text generation model is guaranteed. The initial text generation model is trained in a loop over the training data in the preprocessed first-class training sample set, obtaining the finally trained initial text generation model.
In step S305, the model training module inputs training data in the second training sample set to the trained initial text generating model, so as to obtain training model output data.
In step S306, the model training module corrects the model parameters in the trained initial text generation model based on the training model output data, so as to obtain the trained text generation model.
In the embodiment of the application, the training data in the preprocessed second-class training sample set can be input into the trained initial text generation model, which outputs generated text, namely the training model output data. Then, by analyzing the text corresponding to the training model output data, it is determined whether the model parameters in the current trained initial text generation model are reasonable and whether training needs to continue, that is, whether the model parameters need to be modified, so that the model can output text in the virtual scene field that better conforms to human language habits; in other words, it is ensured that the text output by the model belongs to the virtual scene field and has smoothness. The trained initial text generation model is trained in a loop over the training data in the preprocessed second-class training sample set, obtaining the finally trained text generation model.
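The two-stage scheme of steps S301 to S306 can be illustrated with a deliberately toy stand-in for the model: a word-frequency counter plays the role of the BART parameters. The only point of the sketch is that stage 2 continues from stage 1's parameters rather than training from scratch; the corpora are invented examples:

```python
from collections import Counter

def training_stage(parameters, corpus):
    """One 'training stage': refine the same parameters with a new corpus."""
    for sentence in corpus:
        parameters.update(sentence.split())
    return parameters

# Stage 1: general-field corpus (first-class training samples).
model = training_stage(Counter(), ["the weather is nice today", "please read the news"])
# Stage 2: continue from the stage-1 parameters with the game-field corpus
# (second-class training samples) instead of starting over.
model = training_stage(model, ["the hero pushes the tower", "the skill hits"])
```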
According to the training method for the text generation model provided by the embodiment of the application, the initial text generation model is trained sequentially on the first-class training sample set of the general field and the second-class training sample set of the virtual scene field, ensuring that the text output by the trained text generation model has high smoothness and is well suited to generation in the virtual scene field.
In some embodiments, after the initial text generation model has been trained on the first-class training sample set of the general field and the second-class training sample set of the virtual scene field, the text generation model can be further trained with a pre-built rewrite corpus. The rewrite corpus ensures that, when the trained text generation model rewrites a factual sentence, the semantics are not changed, and the smoothness and factuality of the rewritten factual sentence are preserved.
Here, the rewrite corpus may be training data composed of a plurality of factual sentences. The training data formed from each factual sentence can be input into the text generation model already trained on the first-class training sample set of the general field and the second-class training sample set of the virtual scene field; the model rewrites the factual sentence and outputs the rewritten sentence. The rewritten sentence is then analyzed, realizing training of the text generation model on factual rewriting and obtaining the trained text generation model.
In the embodiment of the application, when the pre-written text after sentence shielding is input into the pre-trained text generation model and the sentence at the shielding position is predicted based on the sentences at the non-shielding positions, the prediction is performed by the text generation model trained in sequence on the first-class training sample set, the second-class training sample set, and the rewrite corpus, so that the generated sentence at the shielding position has both smoothness and factuality. Meanwhile, since the generated sentence is freely generated, it also has diversity.
In the following, an exemplary application of the embodiment of the present application in a practical application scenario will be described.
The embodiment of the application provides a text generation method that adopts the controllable text generation (Controllable Text Generation) technique of text generation (Text Generation) in deep learning: manually written speech is taken as input, and the proposed algorithm or model is used as a constraint to generate more speech. The generated speech satisfies three properties: smoothness, factual consistency, and diversity. Smoothness is the most basic property: the generated text conforms to human reading habits, reads fluently and reasonably, and contains no grammatical errors; it is guaranteed by using a powerful language model. Factual consistency means that the generated text must be completely consistent with the input text and must not introduce new content, so as to satisfy the game explanation scenario; the embodiment of the application performs structural rewriting (paraphrasing) and synonym replacement of the factual parts to maintain factual consistency to the greatest extent. Diversity means that the generated text differs from the input speech as much as possible, for example in structure and wording, and includes the generation of sentences that add no new facts. Relying on a large end-to-end pre-trained model, and adopting a rewriting technique and a masking technique (namely the shielding processing above), the embodiment of the application realizes the generation of fact-controllable game speech (namely the scene explanation text).
The embodiment of the application is a model and algorithm scheme whose function is to provide a better text rewriting and generation model for intelligent explanation of multiplayer online battle arena (MOBA, Multiplayer Online Battle Arena) games.
The embodiment of the application can be used as an iteration scheme for speech rewriting in the AI explanation system of any MOBA game. The solutions in the related art are explanation solutions based entirely on manual speech; producing new speech this way is very costly, requires writers who know the game and have good Chinese skills, and the number of speech entries per event is typically ten or even fewer. Fig. 10 is a functional schematic diagram of a game according to an embodiment of the present application. As can be seen from fig. 10, the real-time battle situation of the game is converted into the feature vector in fig. 10, obtaining the game feature 1001; the game feature 1001 then triggers the predefined event 1002, and the predefined event 1002 randomly selects the corresponding manual script 1003 (corresponding to the above pre-written text). The scheme provided by the embodiment of the application rewrites and generates from the manual script 1003, so that more narrations 1004 can be generated quickly and at lower cost, obtaining the scene explanation text.
The game speech generation algorithm (namely the text generation method) provided by the embodiment of the application is an end-to-end learning algorithm. The input of the model is manually written speech, that is, a text segment describing a game event, and the output is another text segment; the ideal output text has the following properties: smoothness, factual consistency, and diversity. Fig. 11 is a flowchart of the game speech generation algorithm provided by an embodiment of the present application. The embodiment of the application uses the language modeling capability of a large model to ensure the smoothness of the generated text; trains the model with a rewrite corpus, giving the model rewriting capability; and uses a factuality judgment model to process the manually written text, regenerating the non-factual parts, so that the generated sentences have diversity.
As shown in fig. 11, for smoothness, the embodiment of the present application uses the strong language modeling capability of a large pre-trained model: first, the BART model 1102 is pre-trained with a large amount of general-field training data (i.e., the pre-training corpus 1101 in fig. 11), so that BART acquires the ability to understand ordinary text, obtaining a trained BART model. Then, the game corpus 1103, such as comment data and bullet-screen data from the game field, is used to continue training the model (that is, the trained BART model is trained with game-field data), so that the trained BART model 1104 (corresponding to the text generation model) can better understand game-field data and better grasp the specialized knowledge of the game field.
Regarding factuality: facts are extremely important for game explanation. For example, if a character that does not appear in the game itself appears in the commentary, that is fatal for game explanation. The embodiment of the application therefore performs only the most conservative rewriting on the factual sentences in the manual speech, i.e., only changing some words of the manual speech, changing its sentence pattern, and the like, without changing any semantics. The BART model may be trained with the rewrite corpus 1105, where the BART model may be the BART model 1102 or the trained BART model 1104. That is, the training method in fig. 11 may first be used to train the BART model 1102 for factuality, obtaining a trained BART model, which is then retrained with the training method of fig. 11 for smoothness; alternatively, the BART model 1102 may first be trained for smoothness, and the trained BART model then retrained for factuality.
For game explanation, if only sentence patterns and some words are changed, diversity becomes very limited, and players will find the commentary boring after hearing only a few entries. In order to make the scene explanation text generated by the model more diverse, the embodiment of the application first splits the game corpus 1106 into sentences, then uses the factuality judgment model 1107 to make a factuality judgment on each sentence; if a sentence is factual, it is rewritten, and if it is non-factual, a masking (MASK) strategy 1108 is adopted so that the BART model freely generates the non-factual part. Through these two operations, the BART model can be further trained for diversity.
In the actual text generation stage, the manually written text is divided into a plurality of sentences (corresponding to the pre-written sentences), and each sentence then undergoes factuality judgment; if a sentence is a factual sentence it is rewritten, and if it is a non-factual sentence a masking (MASK) strategy is adopted, so that the text generation model (i.e., the trained BART model) is free to generate the non-factual part. Through these two operations, the scene explanation text generated by the embodiment of the application preserves the factuality of the manual writing while remaining diverse.
FIG. 12 is a schematic diagram of the model inference process of the text generation model provided in an embodiment of the present application. As shown in FIG. 12, in the inference stage after training is completed, the user only needs to input a manually written text 1201 (corresponding to the pre-written text above). The text 1201 is first split into sentences, which are then input into the factuality judgment model 1202 for factuality judgment; finally, through the rewriting of factual sentences and the continuation of non-factual sentences, the rewriting and generation of the game commentary is completed. As shown in FIG. 12, the text 1201 is first split into two sentences; the first sentence (the bold text in FIG. 12) is judged to be a factual sentence and the second a non-factual sentence. The BART model 1203 (i.e., the text generation model) rewrites the factual sentence, the non-factual sentence is masked, and the BART model 1203 freely generates the non-factual part. In this way the factual sentence is preserved, and the continuation of the non-factual sentence increases the diversity of the generated commentary, producing the final scene explanation text 1204.
To address the insufficient diversity of existing artificial-intelligence game commentary, the embodiment of the application provides a low-cost, highly flexible end-to-end commentary generation scheme. Its greatest characteristics are that it exploits the modeling capability of a language model to ensure the smoothness of the generated commentary; it ensures the factual consistency of the generated commentary by training the model on a rewriting corpus; and, through the continuation of non-factual sentences, it ensures that the generated commentary has a certain diversity while factuality is guaranteed. The embodiment of the application is a very robust and reliable scheme for putting language models into practice for game commentary; it solves the problems of manual commentary being expensive and scarce, enriches game commentary, makes the commentary more varied and interesting, and allows the commentary process to better attract audiences.
It should be noted that, in one implementation, the technical solution of the embodiment of the present application relies on a factuality judgment model to perform factuality judgment on the manual writing, uses a rewriting model to rewrite factual sentences, and uses a game-domain corpus together with a masking policy to perform continuation. In another implementation, an end-to-end training mode can be provided: starting from data collection, organization, and labeling, the models can be trained using the higher accuracy of the latest diffusion models, thereby reducing the error accumulation across multiple models and further ensuring the accuracy of the generated scene explanation text.
It may be appreciated that, in the embodiment of the present application, where user information is involved — for example, a player's operation instructions, the pre-written text corresponding to each operation instruction, the scene explanation text, and the like — and where data relates to user or enterprise information, user permission or consent needs to be obtained when the embodiment is applied to a specific product or technology. The related data collection process should obtain the informed consent or independent consent of the personal-information subject in strict accordance with the requirements of relevant national laws and regulations, and subsequent data use and processing should be carried out within the scope authorized by the laws, regulations, and the personal-information subject.
The following continues the description of an exemplary structure in which the text generation device 354 provided in an embodiment of the present application is implemented as software modules. In some embodiments, as shown in FIG. 4, the text generation device 354 includes: an obtaining module 3541, configured to obtain, in response to receiving an operation instruction for a current virtual scene, a pre-written text corresponding to the operation instruction; a text splitting module 3542, configured to perform text splitting on the pre-written text to obtain a plurality of pre-written sentences; a facts judgment module 3543, configured to perform fact judgment on each of the pre-written sentences to obtain a fact judgment result for the corresponding pre-written sentence; a text processing module 3544, configured to perform text processing on the pre-written sentences based on the fact judgment results to obtain a generated sentence corresponding to each pre-written sentence; and a text splicing module 3545, configured to perform text splicing processing on all the generated sentences to obtain a scene explanation text applicable to the current virtual scene.
In some embodiments, the obtaining module is further configured to: obtain an operation instruction for the current virtual scene; if the number of operation instructions is one, determine the event identifier of the virtual scene event corresponding to the operation instruction, and obtain the pre-written text corresponding to the event identifier from a preset speech library; if the number of operation instructions is multiple, obtain the priority of each operation instruction, determine the event identifier of the virtual scene event corresponding to the operation instruction with the highest priority, and obtain the pre-written text corresponding to the event identifier from the preset speech library.
In some embodiments, in the preset speech library, each event identifier corresponds to a plurality of pre-written texts; the obtaining module is further configured to: obtain the plurality of pre-written texts corresponding to the event identifier from the preset speech library; and randomly select one pre-written text from the plurality of pre-written texts as the pre-written text corresponding to the operation instruction.
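The obtaining module's behavior can be sketched as follows — a hedged, minimal Python illustration, not the patent's code. The library contents, event identifiers, and function names are hypothetical assumptions.

```python
import random

# Hypothetical preset speech library: event identifier -> candidate pre-written texts.
SPEECH_LIBRARY = {
    "first_blood": ["Player A draws first blood!", "First blood goes to player A!"],
    "tower_down":  ["The outer tower falls."],
}

def select_pre_written_text(instructions, library=SPEECH_LIBRARY, rng=random):
    # instructions: list of (event_id, priority) pairs. With one instruction,
    # use it directly; with several, keep only the highest-priority one.
    # Then pick one pre-written text at random among that event's candidates.
    if not instructions:
        return None
    event_id, _ = max(instructions, key=lambda item: item[1])
    candidates = library.get(event_id, [])
    return rng.choice(candidates) if candidates else None

text = select_pre_written_text([("tower_down", 1), ("first_blood", 9)])
# text is one of the two "first_blood" candidates
```

The random selection among several candidates per event is what keeps repeated occurrences of the same event from always producing identical commentary.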
In some embodiments, the text splitting module is further configured to: perform text symbol recognition on the pre-written text to obtain at least one symbol identifier in the pre-written text; determine, from the at least one symbol identifier, a symbol identifier belonging to the same type as a preset symbol identifier as a target symbol identifier; and perform text splitting on the pre-written text based on the target symbol identifier to obtain a plurality of pre-written sentences.
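A sketch of this recognize-filter-split sequence, under the assumption that the preset symbol identifiers are sentence-ending punctuation marks (the patent does not fix the set; the choice below is illustrative):

```python
import re

# Assumed preset symbol identifiers: Chinese and English sentence-ending marks.
PRESET_SYMBOLS = {"。", "！", "？", ".", "!", "?"}

def split_pre_written_text(text):
    # 1) text symbol recognition: find every punctuation symbol in the text;
    # 2) keep only symbols of the same type as the preset ones (targets);
    # 3) split the text at the target symbols.
    symbols = re.findall(r"[^\w\s]", text)
    targets = [s for s in symbols if s in PRESET_SYMBOLS]
    if not targets:
        return [text]
    pattern = "[" + re.escape("".join(targets)) + "]"
    return [p.strip() for p in re.split(pattern, text) if p.strip()]

split_pre_written_text("Hello, world. Bye!")
# -> ["Hello, world", "Bye"]  (the comma is recognized but is not a target symbol)
```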
In some embodiments, the fact judgment is a fact classification process; the facts judgment module is further configured to: perform the fact classification process on each pre-written sentence through a pre-trained fact judgment model to obtain the fact judgment result of the pre-written sentence, where the fact judgment result includes: a first type judgment result indicating that the corresponding pre-written sentence is a factual sentence, and a second type judgment result indicating that the corresponding pre-written sentence is a non-factual sentence.
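To make the two-way classification concrete, here is a deliberately tiny word-score classifier as a stand-in for the pre-trained fact judgment model. The training samples, scoring rule, and labels are illustrative assumptions; the patent's model would be a learned neural classifier.

```python
def train_factuality_classifier(samples):
    # samples: list of (sentence, label) with label 1 = factual, 0 = non-factual.
    # Builds a word-weight table: words from factual sentences score +1,
    # words from non-factual sentences score -1.
    weights = {}
    for sentence, label in samples:
        for word in sentence.lower().split():
            weights[word] = weights.get(word, 0) + (1 if label else -1)
    return weights

def judge(weights, sentence):
    # First type judgment result: "factual"; second type: "non-factual".
    score = sum(weights.get(w, 0) for w in sentence.lower().split())
    return "factual" if score > 0 else "non-factual"

w = train_factuality_classifier([
    ("red team destroyed the tower", 1),
    ("the dragon was slain at minute ten", 1),
    ("what an unbelievable thrilling moment", 0),
    ("the crowd goes absolutely wild", 0),
])
judge(w, "the tower was destroyed")   # -> "factual"
judge(w, "absolutely thrilling moment")  # -> "non-factual"
```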
In some embodiments, the text processing module is further configured to: in response to the fact judgment result being the first type judgment result, perform sentence rewriting on the pre-written sentence to obtain a rewritten sentence corresponding to the pre-written sentence; and in response to the fact judgment result being the second type judgment result, perform free sentence generation on the pre-written sentence to obtain a freely generated sentence corresponding to the pre-written sentence; where the rewritten sentences and the freely generated sentences constitute the generated sentences.
In some embodiments, the text processing module is further configured to: in response to the fact judgment result being the first type judgment result, perform word segmentation on the pre-written sentence to obtain at least one factual-sentence word; perform part-of-speech recognition on each factual-sentence word to obtain the part of speech of the corresponding word; determine keywords and non-keywords of the pre-written sentence from the at least one factual-sentence word based on the part of speech of each word; and rewrite the non-keywords in the pre-written sentence to obtain the rewritten sentence corresponding to the pre-written sentence.
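The segment-tag-rewrite pipeline can be sketched like this. The POS lexicon, synonym table, and the rule "nouns/verbs are keywords" are all illustrative assumptions; a real system would use a proper segmenter and POS tagger (for Chinese, e.g. a tool such as jieba) and a learned rewriting model.

```python
# Toy POS lexicon and synonym table (hypothetical).
POS = {
    "tower": "noun", "team": "noun", "destroys": "verb",
    "quickly": "adverb", "mighty": "adjective", "the": "other",
}
SYNONYMS = {"quickly": "swiftly", "mighty": "formidable"}

def rewrite_factual_sentence(sentence):
    words = sentence.split()                      # word segmentation
    tags = [POS.get(w, "other") for w in words]   # part-of-speech recognition
    out = []
    for w, t in zip(words, tags):
        if t in ("noun", "verb"):                 # keyword: must be preserved
            out.append(w)
        else:                                     # non-keyword: conservative rewrite
            out.append(SYNONYMS.get(w, w))
    return " ".join(out)

rewrite_factual_sentence("the mighty team quickly destroys the tower")
# -> "the formidable team swiftly destroys the tower"
```

Only non-keywords are touched, which is why this rewriting cannot alter the factual content carried by the nouns and verbs.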
In some embodiments, the text processing module is further configured to: in response to the fact judgment result being the first type judgment result, perform sentence analysis on the pre-written sentence to obtain the sentence pattern type of the pre-written sentence; determine a modified sentence pattern type corresponding to the sentence pattern type; and modify the pre-written sentence according to the modified sentence pattern type to obtain the rewritten sentence corresponding to the pre-written sentence.
In some embodiments, the text processing module is further configured to: in response to the fact judgment result being the second type judgment result, perform masking processing on the non-factual sentences in the pre-written text to obtain a sentence-masked pre-written text; input the sentence-masked pre-written text into a pre-trained text generation model, where the text generation model predicts the sentence at each masked position based on the sentences at the unmasked positions, to obtain a generated sentence at the masked position; and determine the generated sentence at the masked position as the freely generated sentence corresponding to the pre-written sentence.
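The mask-then-fill step can be sketched as below. `generate_fn` is a placeholder for the pre-trained text generation model (e.g., a BART-style fill-in model); the sentences and names are invented for illustration.

```python
MASK = "[MASK]"

def mask_non_factual(sentences, factual_flags):
    # Replace each non-factual sentence with a MASK token.
    return [s if f else MASK for s, f in zip(sentences, factual_flags)]

def fill_masks(masked_sentences, generate_fn):
    # generate_fn stands in for the pre-trained text generation model: it is
    # given the unmasked context and returns a freely generated sentence.
    filled = []
    for i, s in enumerate(masked_sentences):
        if s == MASK:
            context = [x for x in masked_sentences if x != MASK]
            filled.append(generate_fn(context, position=i))
        else:
            filled.append(s)
    return filled

sentences = ["Blue side secures the dragon", "an amazing play"]
masked = mask_non_factual(sentences, [True, False])
result = fill_masks(masked, lambda ctx, position: "what a decisive call by the jungler")
# result == ["Blue side secures the dragon", "what a decisive call by the jungler"]
```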
In some embodiments, the apparatus further includes a model training module configured to train the text generation model by: obtaining a first type training sample set and a second type training sample set, where the first type training sample set includes training data from a plurality of general domains and the second type training sample set includes training data from a plurality of virtual-scene domains; obtaining model parameters of an initial text generation model; inputting the training data of the first type training sample set into the initial text generation model to obtain initial model output data; correcting the model parameters of the initial text generation model based on the initial model output data to obtain a trained initial text generation model; inputting the training data of the second type training sample set into the trained initial text generation model to obtain training model output data; and correcting the model parameters of the trained initial text generation model based on the training model output data to obtain the trained text generation model.
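The two-stage (general domain, then virtual-scene domain) training schedule can be illustrated with a deliberately tiny stand-in model — here a bigram count table updated in two passes. This is a sketch of the schedule only; the patent's model is a BART-style network whose parameters are corrected from a training loss, not counts.

```python
from collections import Counter

def train_bigrams(model, corpus):
    # One "training stage": update bigram counts, standing in for correcting
    # the model parameters based on the model output data.
    for sentence in corpus:
        words = sentence.split()
        model.update(zip(words, words[1:]))
    return model

general_corpus = ["the cat sat", "the dog ran"]                   # first type samples
game_corpus = ["the jungler ganked mid", "the team took baron"]   # second type samples

model = Counter()                              # initial model parameters
model = train_bigrams(model, general_corpus)   # stage 1: general domain
model = train_bigrams(model, game_corpus)      # stage 2: virtual-scene domain
# model now reflects both corpora, e.g. ("the", "cat") and ("the", "jungler")
```

The point of the schedule is that stage 2 starts from the stage-1 model rather than from scratch, which is exactly the continue-training arrangement shown in FIG. 11.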
In some embodiments, the model training module is further configured to: perform data preprocessing on the training data in the first type training sample set and the second type training sample set to obtain a preprocessed first type training sample set and a preprocessed second type training sample set; correspondingly, the training data of the preprocessed first type training sample set is input into the initial text generation model, and the training data of the preprocessed second type training sample set is input into the trained initial text generation model.
In some embodiments, the apparatus further includes: a historical text obtaining module, configured to obtain a historical scene explanation text before the current moment; a second text splicing module, configured to perform text splicing in the order of the historical scene explanation text, a blank text, and the scene explanation text at the current moment to form a spliced text; a blank text prediction module, configured to input the spliced text into a pre-trained text generation model and predict, through the text generation model, the sentence corresponding to the blank text based on the historical scene explanation text and the scene explanation text; and a sentence adding module, configured to add the sentence corresponding to the blank text at the text position of the blank text in the spliced text to obtain a continuous explanation text for the current virtual scene.
It should be noted that the description of the apparatus in the embodiment of the present application is similar to the description of the method embodiments above and has similar beneficial effects, and therefore a detailed description is omitted. For technical details not disclosed in this apparatus embodiment, please refer to the description of the method embodiments of the present application.
Embodiments of the present application provide a computer program product including executable instructions, the executable instructions being computer instructions stored in a computer-readable storage medium. When a processor of an electronic device reads the executable instructions from the computer-readable storage medium and executes them, the electronic device is caused to perform the method of the embodiments of the present application described above.
Embodiments of the present application provide a storage medium having stored therein executable instructions which, when executed by a processor, cause the processor to perform a method provided by embodiments of the present application, for example, as shown in fig. 5.
In some embodiments, the storage medium may be a computer-readable storage medium, such as a Ferroelectric Random Access Memory (FRAM), a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), a flash memory, a magnetic surface memory, an optical disc, or a Compact Disc Read-Only Memory (CD-ROM); it may also be any device including one of the above memories or any combination thereof.
In some embodiments, the executable instructions may be in the form of programs, software modules, scripts, or code, written in any form of programming language (including compiled or interpreted languages, or declarative or procedural languages), and they may be deployed in any form, including as stand-alone programs or as modules, components, subroutines, or other units suitable for use in a computing environment.
As an example, the executable instructions may, but need not, correspond to files in a file system, and may be stored as part of a file that holds other programs or data, for example, in one or more scripts in a HyperText Markup Language (HTML) document, in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, subprograms, or portions of code). As an example, the executable instructions may be deployed to be executed on one electronic device, on multiple electronic devices located at one site, or on multiple electronic devices distributed across multiple sites and interconnected by a communication network.
The foregoing is merely exemplary embodiments of the present application and is not intended to limit the scope of the present application. Any modification, equivalent replacement, improvement, etc. made within the spirit and scope of the present application are included in the protection scope of the present application.

Claims (15)

1. A method of text generation, the method comprising:
responding to receiving an operation instruction aiming at a current virtual scene, and acquiring a pre-written text corresponding to the operation instruction;
performing text splitting on the pre-written text to obtain a plurality of pre-written sentences;
carrying out factual judgment on each pre-written sentence to obtain a factual judgment result of the corresponding pre-written sentence;
based on the facts judging result, carrying out text processing on the pre-written sentences to obtain a generated sentence corresponding to each pre-written sentence;
and performing text splicing processing on all the generated sentences to obtain scene explanation texts suitable for the current virtual scene.
2. The method of claim 1, wherein the obtaining, in response to receiving an operation instruction for a current virtual scene, a pre-written text corresponding to the operation instruction comprises:
Acquiring an operation instruction aiming at a current virtual scene;
if the number of the operation instructions is one, determining an event identifier of a virtual scene event corresponding to the operation instruction; and acquiring a pre-written text corresponding to the event identifier from a preset speech library;
if the number of the operation instructions is multiple, acquiring the priority of each operation instruction;
determining an event identifier of a virtual scene event corresponding to the operation instruction with the highest priority; and acquiring a pre-written text corresponding to the event identifier from the preset speech library.
3. The method of claim 2, wherein, in the preset speech library, each of the event identifiers corresponds to a plurality of pre-written texts; and the acquiring the pre-written text corresponding to the event identifier from the preset speech library comprises:
acquiring the plurality of pre-written texts corresponding to the event identifier from the preset speech library;
and randomly selecting one pre-written text from the plurality of pre-written texts as the pre-written text corresponding to the operation instruction.
4. The method of claim 1, wherein the text splitting the pre-written text to obtain a plurality of pre-written sentences comprises:
Performing text symbol recognition on the pre-written text to obtain at least one symbol mark in the pre-written text;
determining a symbol identifier belonging to the same type as a preset symbol identifier from the at least one symbol identifier as a target symbol identifier;
and carrying out text splitting on the pre-written text based on the target symbol mark to obtain a plurality of pre-written sentences.
5. The method of claim 1, wherein the factual judgment is a factual classification process; and
the performing factual judgment on each pre-written sentence to obtain a factual judgment result of the corresponding pre-written sentence comprises:
performing the factual classification process on each pre-written sentence through a pre-trained factual judgment model to obtain the factual judgment result of the pre-written sentence, wherein the factual judgment result comprises: a first type judgment result indicating that the corresponding pre-written sentence is a factual sentence, and a second type judgment result indicating that the corresponding pre-written sentence is a non-factual sentence.
6. The method of claim 5, wherein the text processing the pre-written sentences based on the factual determination results to obtain a generated sentence corresponding to each pre-written sentence, comprises:
Responding to the fact judgment result as the first type judgment result, and carrying out sentence rewriting on the pre-written sentence to obtain a rewritten sentence corresponding to the pre-written sentence;
responding to the fact judging result as the second type judging result, and performing free sentence generation on the pre-written sentences to obtain free generation sentences corresponding to the pre-written sentences; wherein the rewrite sentence and the free-form sentence constitute the generation sentence.
7. The method of claim 6, wherein the writing the pre-written sentence in response to the fact determination result being the first type determination result, to obtain a written sentence corresponding to the pre-written sentence, comprises:
responding to the fact judgment result as the first type judgment result, and performing word segmentation processing on the pre-written sentence to obtain at least one fact sentence word segmentation;
performing part-of-speech recognition on each fact sentence segmentation to obtain the part-of-speech of the corresponding fact sentence segmentation;
determining keywords and non-keywords of the pre-written sentence from the at least one factual sentence word based on the part of speech of each factual sentence word;
and rewriting the non-keywords in the pre-written sentences to obtain the rewritten sentences corresponding to the pre-written sentences.
8. The method of claim 6, wherein the writing the pre-written sentence in response to the fact determination result being the first type determination result, to obtain a written sentence corresponding to the pre-written sentence, comprises:
responding to the fact judgment result as the first type judgment result, and carrying out sentence analysis on the pre-written sentence to obtain the sentence pattern type of the pre-written sentence;
determining a modified sentence type corresponding to the sentence type;
and carrying out sentence modification on the pre-written sentence according to the modified sentence type to obtain a rewritten sentence corresponding to the pre-written sentence.
9. The method of claim 6, wherein the performing free sentence generation on the pre-written sentence in response to the fact determination result being the second type determination result, to obtain a free-generated sentence corresponding to the pre-written sentence, comprises:
responding to the fact judging result as the second type judging result, and carrying out shielding treatment on non-fact sentences in the pre-written text to obtain the pre-written text after sentence shielding;
Inputting the pre-written text after the sentence shielding into a pre-trained text generation model, and predicting sentences at shielding positions based on sentences at non-shielding positions in the pre-written text after the sentence shielding by the text generation model to obtain generated sentences at the shielding positions;
and determining the generation statement of the shielding position as a free generation statement corresponding to the pre-written statement.
10. The method according to claim 9, wherein the method further comprises: training the text generation model by:
acquiring a first type training sample set and a second type training sample set, wherein the first type training sample set comprises training data of a plurality of general fields, and the second type training sample set comprises training data of a plurality of virtual scene fields;
obtaining model parameters of an initial text generation model;
inputting training data in the first type training sample set into the initial text generation model to obtain initial model output data;
correcting model parameters in the initial text generation model based on the initial model output data to obtain a trained initial text generation model;
inputting training data in the second type training sample set into the trained initial text generation model to obtain training model output data;
and correcting model parameters in the trained initial text generation model based on the training model output data to obtain a trained text generation model.
11. The method according to claim 10, wherein the method further comprises:
performing data preprocessing on the training data in the first type training sample set and the second type training sample set to obtain a preprocessed first type training sample set and a preprocessed second type training sample set;
correspondingly, training data in the preprocessed first-type training sample set is input into the initial text generation model, and training data in the preprocessed second-type training sample set is input into the trained initial text generation model.
12. The method according to any one of claims 1 to 11, further comprising:
acquiring a historical scene explanation text before the current moment;
performing text splicing according to the sequence of the historical scene explanation text, the blank text and the scene explanation text at the current moment to form a spliced text;
Inputting the spliced text into a pre-trained text generation model, and predicting sentences corresponding to the blank text based on the historical scene explanation text and the scene explanation text through the text generation model;
and adding sentences corresponding to the blank texts to the text positions of the blank texts in the spliced text to obtain continuous explanation texts aiming at the current virtual scene.
13. A text generation apparatus, the apparatus comprising:
the acquisition module is used for responding to the received operation instruction aiming at the current virtual scene and acquiring a pre-written text corresponding to the operation instruction;
the text splitting module is used for splitting the text of the pre-written text to obtain a plurality of pre-written sentences;
the facts judging module is used for carrying out facts judgment on each pre-written statement to obtain facts judging results of the corresponding pre-written statement;
the text processing module is used for carrying out text processing on the pre-written sentences based on the facts judgment result to obtain a generated sentence corresponding to each pre-written sentence;
and the text splicing module is used for carrying out text splicing processing on all the generated sentences to obtain scene explanation texts suitable for the current virtual scene.
14. An electronic device, comprising:
a memory for storing executable instructions; a processor for implementing the text generation method of any of claims 1 to 12 when executing executable instructions stored in the memory.
15. A computer readable storage medium, characterized in that executable instructions are stored for causing a processor to execute the executable instructions for implementing the text generation method of any of claims 1 to 12.
CN202310532220.8A 2023-05-11 2023-05-11 Text generation method, text generation device, electronic equipment and computer readable storage medium Pending CN116956019A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310532220.8A CN116956019A (en) 2023-05-11 2023-05-11 Text generation method, text generation device, electronic equipment and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310532220.8A CN116956019A (en) 2023-05-11 2023-05-11 Text generation method, text generation device, electronic equipment and computer readable storage medium

Publications (1)

Publication Number Publication Date
CN116956019A true CN116956019A (en) 2023-10-27

Family

ID=88453694

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310532220.8A Pending CN116956019A (en) 2023-05-11 2023-05-11 Text generation method, text generation device, electronic equipment and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN116956019A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117316159A (en) * 2023-11-30 2023-12-29 深圳市天之眼高新科技有限公司 Vehicle voice control method, device, equipment and storage medium
CN117316159B (en) * 2023-11-30 2024-01-26 深圳市天之眼高新科技有限公司 Vehicle voice control method, device, equipment and storage medium

Similar Documents

Publication Publication Date Title
CN110717017B (en) Method for processing corpus
Singh et al. Survey of various AI chatbots based on technology used
US8285654B2 (en) Method and system of providing a personalized performance
CN116702737B (en) Document generation method, device, equipment, storage medium and product
CN110234018B (en) Multimedia content description generation method, training method, device, equipment and medium
JP2017517028A (en) Method and system for handling dialogue with robots
Spierling et al. Towards accessible authoring tools for interactive storytelling
McTear et al. Creating a conversational interface using chatbot technology
JP2021009665A (en) Method, apparatus, and device for generating file, and storage medium
Tingiris et al. Exploring GPT-3: An unofficial first look at the general-purpose language processing API from OpenAI
US11036996B2 (en) Method and apparatus for determining (raw) video materials for news
CN113392331A (en) Text processing method and equipment
CN116700839B (en) Task processing method, device, equipment, storage medium and program product
CN115221294A (en) Dialogue processing method, dialogue processing device, electronic equipment and storage medium
CN116956019A (en) Text generation method, text generation device, electronic equipment and computer readable storage medium
CN117216234A (en) Artificial intelligence-based speaking operation rewriting method, device, equipment and storage medium
CN111553138A (en) Auxiliary writing method and device for standardizing content structure document
Devi et al. ChatGPT: Comprehensive Study On Generative AI Tool
Kublik et al. GPT-3: The Ultimate Guide to Building NLP Products with OpenAI API
CN113609866A (en) Text marking method, device, equipment and storage medium
CN116956902A (en) Text rewriting method, device, equipment and computer readable storage medium
CN117453880A (en) Multi-mode data processing method and device, electronic equipment and storage medium
CN117453885A (en) Question information processing method, device, equipment, storage medium and product
Kulkarni et al. Applied Generative AI for Beginners
CN112749553B (en) Text information processing method and device for video file and server

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 40100497

Country of ref document: HK