CN113822034B - Method, device, computer equipment and storage medium for repeating (paraphrasing) text

Info

Publication number: CN113822034B
Application number: CN202110630068.8A
Authority: CN
Inventors: 闫昭; 刘昊岩; 周辉阳
Original assignee: Tencent Technology Shenzhen Co Ltd
Current assignee: Tencent Technology Shenzhen Co Ltd
Priority date: 2021-06-07
Filing date: 2021-06-07
Publication date: 2024-04-19
Anticipated expiration: 2041-06-07
Also published as: CN113822034A

Abstract

The application provides a method, a device, computer equipment and a storage medium for repeating (paraphrasing) text, which can be applied to the field of cloud computing or artificial intelligence and are used to solve the problem of low accuracy of repeated text. The method comprises the following steps: obtaining a text to be repeated; analyzing the text to be repeated based on a preset analysis strategy to obtain grammar structure information and semantic information of the text to be repeated; determining a text composition template of the text to be repeated based on the grammar structure information, and screening, from a pre-stored candidate text template set, at least one target text template that meets a preset matching condition with the text composition template; and combining the semantic information with each of the at least one target text template to obtain at least one target repeated text of the text to be repeated.

Description

Method, device, computer equipment and storage medium for repeating (paraphrasing) text

Technical Field

The present application relates to the field of computer technologies, and in particular to a method, an apparatus, a computer device, and a storage medium for repeating text.

Background

With the continuous development of technology, devices can perform more and more intelligent tasks. For example, the device may generate output text for the received input text in response thereto. However, due to the diversity of language expressions, the expressions of the input text received by the device may be varied, and for some expressions, the device may not recognize the semantics expressed by the input text and thus may not reply accordingly.

To enable the device to recognize more ways of expressing the input text, the device can repeat (paraphrase) the input text after receiving it and select, from the resulting repeated texts, one that the device can recognize, so that an output text can be generated in reply.

However, the repeated text obtained in conventional ways may not be a commonly used expression. For example, replacing keywords in the input text with synonyms does not change the grammatical structure of the input text, and the substituted synonyms may not normally be used together with the other words in the input text, so the resulting repeated text is not a commonly used expression. It can be seen that the accuracy of the repeated text is low.

Disclosure of Invention

The embodiments of the application provide a method, a device, computer equipment and a storage medium for repeating text, which are used to solve the problem of low accuracy of repeated text.

In a first aspect, a method of repeating text is provided, comprising:

obtaining a text to be repeated;

Analyzing the text to be repeated based on a preset analysis strategy to obtain grammar structure information and semantic information of the text to be repeated;

Determining a text composition template of the text to be repeated based on the grammar structure information, and screening at least one target text template meeting a preset matching condition with the text composition template from a pre-stored candidate text template set;

and combining the semantic information with each of the at least one target text template to obtain at least one target repeated text of the text to be repeated.

In a second aspect, there is provided an apparatus for repeating text, comprising:

an acquisition module, configured to obtain a text to be repeated;

a processing module, configured to analyze the text to be repeated based on a preset analysis strategy to obtain grammar structure information and semantic information of the text to be repeated;

the processing module is further configured to determine a text composition template of the text to be repeated based on the grammar structure information, and to screen, from a pre-stored candidate text template set, at least one target text template that meets a preset matching condition with the text composition template;

the processing module is further configured to combine the semantic information with each of the at least one target text template to obtain at least one target repeated text of the text to be repeated.

Optionally, the acquiring module is specifically configured to:

responding to input operation of a target client to obtain at least one piece of input information;

for each piece of the at least one piece of input information, the following operations are performed:

if the input information is text information, performing word segmentation processing on the text information to obtain a plurality of corresponding input sub-texts;

if the input information is multimedia information, performing text recognition processing on the multimedia information to obtain at least one input sub-text;

and combining the obtained input sub-texts according to a preset grammar rule to obtain the text to be repeated.

Optionally, the processing module is further configured to:

Before analyzing the text to be repeated based on the preset analysis strategy to obtain the grammar structure information and semantic information of the text to be repeated, acquiring a preset reference text set;

Respectively performing word segmentation processing on each reference text in the reference text set to obtain a plurality of reference sub-texts corresponding to each reference text;

and respectively counting the occurrence frequency of different reference sub-texts in the obtained reference sub-texts to obtain a first mapping relation between the reference sub-texts and word frequencies.

Optionally, the processing module is specifically configured to:

Word segmentation processing is carried out on the text to be repeated to obtain corresponding sub-texts to be repeated;

Based on the first mapping relation, determining the word frequency of each sub-text to be repeated respectively;

screening target sub-texts with word frequency meeting preset word frequency screening conditions from the sub-texts to be repeated, and analyzing the obtained target sub-texts to obtain grammar structure information of the text to be repeated;

And analyzing the sub-texts to be repeated except the screened target sub-text in the sub-texts to be repeated to obtain semantic information of the text to be repeated.
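The word-frequency steps above can be sketched as follows. This is a minimal illustration only, under stated assumptions: whitespace splitting stands in for word segmentation (Chinese text would need a real segmenter), the "preset word frequency screening condition" is assumed to be "relative frequency at or above a threshold", and the function names, the sample reference texts and the threshold value are hypothetical rather than taken from the application.

```python
from collections import Counter

def build_word_frequency(reference_texts):
    """First mapping relation: reference sub-text -> relative frequency."""
    # Assumption: whitespace splitting stands in for word segmentation.
    counts = Counter(tok for text in reference_texts for tok in text.split())
    total = sum(counts.values())
    return {tok: n / total for tok, n in counts.items()}

def split_by_frequency(text, word_freq, threshold):
    """Split a text to be repeated into (structure sub-texts, semantic sub-texts)."""
    structure, semantic = [], []
    for tok in text.split():
        # Assumed screening condition: frequent words carry grammar structure.
        if word_freq.get(tok, 0.0) >= threshold:
            structure.append(tok)
        else:
            semantic.append(tok)
    return structure, semantic

# Hypothetical reference text set.
reference_texts = [
    "how to modify the password",
    "how to query the balance",
    "how to cancel the order",
]
word_freq = build_word_frequency(reference_texts)
structure, semantic = split_by_frequency(
    "how to modify the password", word_freq, threshold=0.1
)
print(structure)  # ['how', 'to', 'the']
print(semantic)   # ['modify', 'password']
```

In this sketch the high-frequency sub-texts form the grammar skeleton and the remaining sub-texts carry the semantics; a different screening condition would change only the comparison inside `split_by_frequency`.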

Optionally, the processing module is specifically configured to:

Analyzing the part-of-speech information of each sub-text to be repeated and the association relation of each sub-text to be repeated in the text to be repeated to obtain grammar structure information and semantic information of the text to be repeated.

Optionally, the processing module is specifically configured to:

determining the sub-text grade corresponding to the part-of-speech information of each sub-text to be repeated based on a second mapping relation between the preset part-of-speech information and the sub-text grade;

screening target sub-texts with sub-text grades within a preset grade range and no association relation with the appointed sub-text grades based on the association relation of each sub-text to be repeated in the text to be repeated, and analyzing the obtained target sub-texts to obtain grammar structure information of the text to be repeated;

Optionally, the processing module is further configured to:

Before at least one target text template meeting the preset matching condition with the text composition template is screened out from the pre-stored candidate text template set, obtaining a pre-stored sample text set, wherein the sample text set comprises a plurality of sample texts that are in a repeat relation with one another;

determining respective sample text templates of the respective sample texts based on respective grammar structure information of the respective sample texts;

and establishing the candidate text template set based on the sample text template and the repeated mark of each sample text, wherein, among the candidate text templates contained in the candidate text template set, candidate text templates that are in a repeat relation with one another carry the same repeated mark.
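One way to picture the construction of the candidate text template set is the sketch below. It is illustrative only: the structure-word list, the `[X]` slot marker, the use of a group index as the repeated mark, and all function names are assumptions introduced here, not details given in the application.

```python
from collections import defaultdict

# Hypothetical list of words treated as grammar structure.
STRUCTURE_WORDS = {"how", "to", "the", "can", "i", "my", "what", "is", "tell", "me"}

def to_template(text):
    """Keep structure words and replace every other word with a slot marker."""
    return " ".join(
        tok if tok in STRUCTURE_WORDS else "[X]" for tok in text.lower().split()
    )

def build_candidate_templates(sample_groups):
    """Map each repeated mark to the sample text templates that share it.

    sample_groups: list of groups; the texts inside one group repeat one another.
    The group index is used as the repeated mark (an assumption of this sketch).
    """
    templates_by_mark = defaultdict(set)
    for mark, group in enumerate(sample_groups):
        for sample_text in group:
            templates_by_mark[mark].add(to_template(sample_text))
    return {mark: sorted(templates) for mark, templates in templates_by_mark.items()}

# Hypothetical sample text set.
sample_groups = [
    ["how to modify the password", "how can i modify my password"],
    ["what is the balance", "tell me the balance"],
]
candidate_templates = build_candidate_templates(sample_groups)
print(candidate_templates[0])  # ['how can i [X] my [X]', 'how to [X] the [X]']
```

Templates that carry the same mark are interchangeable ways of phrasing the same kind of request, which is what later allows one template to be swapped for another.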

Optionally, the processing module is specifically configured to:

Extracting template feature vectors of the text composition templates and template feature vectors of the candidate text templates;

Determining the vector similarity between the template feature vector of the text composition template and the template feature vector of each candidate text template, to obtain corresponding similarity results;

and, based on the obtained similarity results, screening out from the candidate text template set the candidate text templates that carry the same repeated mark as the candidate text template corresponding to the maximum similarity result, and taking them as target text templates.
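The similarity-based screening can be illustrated with a small sketch. The application does not specify how template feature vectors are extracted or which similarity measure is used; here a bag-of-words count vector and cosine similarity are assumed purely for illustration, the function names are hypothetical, and excluding a template identical to the text composition template is a design choice of this sketch.

```python
import math
from collections import Counter

def template_vector(template):
    """Assumed feature extraction: bag-of-words counts over template tokens."""
    return Counter(template.split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

def select_target_templates(composition_template, candidates):
    """candidates: list of (candidate text template, repeated mark) pairs."""
    query = template_vector(composition_template)
    # Candidate with the maximum similarity result.
    _, best_mark = max(candidates, key=lambda c: cosine(query, template_vector(c[0])))
    # Keep the candidates that share its repeated mark, skipping the
    # composition template itself so that the result is a different phrasing.
    return [t for t, mark in candidates if mark == best_mark and t != composition_template]

# Hypothetical candidate text template set.
candidates = [
    ("how to [X] the [X]", 0),
    ("how can i [X] my [X]", 0),
    ("what is the [X]", 1),
    ("tell me the [X]", 1),
]
targets = select_target_templates("how to [X] the [X]", candidates)
print(targets)  # ['how can i [X] my [X]']
```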

Optionally, the processing module is further configured to:

After the semantic information is combined with each of the at least one target text template to obtain at least one target repeated text of the text to be repeated, updating the sample text set and the candidate text template set based on the text to be repeated and the at least one target text template.

Optionally, the processing module is specifically configured to:

Adopting a trained template recognition model to fuse the text features of the text to be repeated with the template features of the text composition template, so as to obtain a text feature vector of the text to be repeated;

respectively matching the text feature vector of the text to be repeated with the template feature vector of each candidate text template to obtain a corresponding matching result;

and screening candidate text templates with the matching results meeting preset matching conditions based on the obtained matching results, and taking the candidate text templates as target text templates.

Optionally, the processing module is specifically configured to:

for each target text template in the at least one target text template, respectively performing the following operations:

Determining a plurality of sub-text combination strategies corresponding to a target text template based on template characteristics of the target text template by adopting a trained text prediction model;

combining the semantic information with the target text template by adopting the obtained sub-text combination strategies to obtain corresponding combination results;

And screening out target combination results with the combination results meeting preset combination conditions based on the obtained combination results, and taking the target combination results as the target repeated text.

Optionally, the processing module is specifically configured to:

Acquiring a pre-stored reference semantic sub-text set;

Screening at least one reference semantic sub-text meeting preset semantic matching conditions with the semantic information from the reference semantic sub-text set based on the semantic information;

And combining the at least one reference semantic sub-text with the target text template to obtain a corresponding combination result.

In a third aspect, there is provided a computer device comprising:

A memory for storing program instructions;

and a processor for calling program instructions stored in the memory and executing the method according to the first aspect according to the obtained program instructions.

In a fourth aspect, there is provided a computer-readable storage medium storing computer-executable instructions for causing a computer to perform the method of the first aspect.

In the embodiments of the application, the text to be repeated is analyzed to obtain its grammar structure information and semantic information, and at least one target text template is screened out from the candidate text template set based on the grammar structure information. This yields several reasonable, commonly used expressions of the text to be repeated at the level of grammar structure, rather than simple synonym substitution or word-order conversion. After the target text templates are obtained, the semantic information is combined with each target text template so that the semantic information fits the template. The resulting target repeated text therefore conforms to the current language environment, avoiding the word collocation errors caused by synonym replacement and the unnatural expressions caused by word-order transformation. Because the target repeated text is determined from both the grammar structure and the semantics, the method simulates how a person would actually repeat a text, which greatly improves the accuracy of the repeated text.

Drawings

FIG. 1 is a first schematic diagram of a method for repeating text provided in an embodiment of the present application;

FIG. 2 is an application scenario of a method for repeating text provided in an embodiment of the present application;

FIG. 3 is a schematic flow chart of a method for repeating text provided in an embodiment of the present application;

FIG. 4a is a second schematic diagram of a method for repeating text provided in an embodiment of the present application;

FIG. 4b is a third schematic diagram of a method for repeating text provided in an embodiment of the present application;

FIG. 5 is a fourth schematic diagram of a method for repeating text provided in an embodiment of the present application;

FIG. 6a is a fifth schematic diagram of a method for repeating text provided in an embodiment of the present application;

FIG. 6b is a sixth schematic diagram of a method for repeating text provided in an embodiment of the present application;

FIG. 6c is a seventh schematic diagram of a method for repeating text provided in an embodiment of the present application;

FIG. 6d is an eighth schematic diagram of a method for repeating text provided in an embodiment of the present application;

FIG. 6e is a ninth schematic diagram of a method for repeating text provided in an embodiment of the present application;

FIG. 6f is a tenth schematic diagram of a method for repeating text provided in an embodiment of the present application;

FIG. 7 is a first schematic structural diagram of a device for repeating text provided in an embodiment of the present application;

FIG. 8 is a second schematic structural diagram of a device for repeating text provided in an embodiment of the present application.

Detailed Description

In order to make the objects, technical solutions and advantages of the embodiments of the present application more clear, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application.

Some terms in the embodiments of the present application are explained below to facilitate understanding by those skilled in the art.

(1) Repeat (paraphrase generation):

Repeating means that, given a natural language text, another natural language text expressed in a different way is automatically generated, and the new natural language text has the same or similar semantics as the original natural language text.

(2) Template (semantic structural template):

A template is structured information that represents how a natural language text is expressed; it may be a sub-text containing the structural frame of the natural language text. Combining the same template with different words yields natural language texts with different semantics.

Embodiments of the application relate to cloud technology and artificial intelligence (Artificial Intelligence, AI) technology. They are designed based on cloud computing and cloud storage in cloud technology, and on computer vision (Computer Vision, CV), speech technology (Speech Technology), natural language processing (Natural Language Processing, NLP), machine learning (Machine Learning, ML), and the like in artificial intelligence technology.

Cloud technology refers to a hosting technology that unifies a series of resources such as hardware, software and networks in a wide area network or a local area network to realize the computation, storage, processing and sharing of data. Cloud technology is the general term for the network technology, information technology, integration technology, management platform technology, application technology and the like that are applied on the basis of the cloud computing business model; these can form a resource pool that is used on demand, flexibly and conveniently. Cloud computing technology will become an important support. Background services of technical network systems, such as video websites, picture websites and other portal websites, require a large amount of computing and storage resources. With the rapid development and application of the internet industry, each item may in the future have its own identification mark, which needs to be transmitted to a background system for logical processing; data of different levels will be processed separately, and all kinds of industry data require strong back-end system support, which can only be realized through cloud computing.

Cloud computing is a computing model that distributes computing tasks over a resource pool formed by a large number of computers, enabling various application systems to acquire computing power, storage space and information services as needed. The network that provides the resources is referred to as the "cloud". From the user's point of view, resources in the cloud are infinitely expandable and can be acquired at any time, used on demand, expanded at any time and paid for according to use.

As a basic capability provider of cloud computing, a cloud computing resource pool (cloud platform for short, generally called an Infrastructure as a Service (IaaS) platform) is established, in which multiple types of virtual resources are deployed for external clients to select and use.

According to the logical function division, a Platform as a Service (PaaS) layer can be deployed on the IaaS layer, and a Software as a Service (SaaS) layer can be deployed on the PaaS layer, or the SaaS layer can be deployed directly on the IaaS layer. PaaS is a platform on which software runs, such as a database or a web container. SaaS covers a wide variety of business software, such as web portals and SMS mass-sending services. Generally, SaaS and PaaS are upper layers relative to IaaS.

Cloud storage is a new concept which extends and develops in the concept of cloud computing, and a distributed cloud storage system (hereinafter referred to as a storage system for short) refers to a storage system which integrates a large number of storage devices (storage devices are also called storage nodes) of different types in a network through application software or application interfaces to cooperatively work and jointly provides data storage and service access functions for the outside through functions such as cluster application, grid technology, a distributed storage file system and the like.

At present, the storage method of the storage system is as follows: when creating logical volumes, each logical volume is allocated a physical storage space, which may be a disk composition of a certain storage device or of several storage devices. The client stores data on a certain logical volume, that is, the data is stored on a file system, the file system divides the data into a plurality of parts, each part is an object, the object not only contains the data but also contains additional information such as an Identity (ID) of the data, the file system writes each object into a physical storage space of the logical volume, and the file system records storage position information of each object, so that when the client requests to access the data, the file system can enable the client to access the data according to the storage position information of each object.

The process by which the storage system allocates physical storage space for a logical volume is specifically as follows: physical storage space is divided into stripes in advance according to an estimate of the capacity of the objects to be stored on the logical volume (the estimate often leaves a large margin relative to the capacity of the objects actually stored) and the redundant array of independent disks (Redundant Array of Independent Disks, RAID) group, and a logical volume can be understood as a stripe, whereby physical storage space is allocated for the logical volume.

Artificial intelligence is the theory, method, technique and application system that uses a digital computer or a digital computer-controlled machine to simulate, extend and expand human intelligence, sense the environment, acquire knowledge and use the knowledge to obtain optimal results. In other words, artificial intelligence is an integrated technology of computer science that attempts to understand the essence of intelligence and to produce a new intelligent machine that can react in a similar way to human intelligence. Artificial intelligence, i.e. research on design principles and implementation methods of various intelligent machines, enables the machines to have functions of sensing, reasoning and decision.

The artificial intelligence technology is a comprehensive subject, and relates to the technology with wide fields, namely the technology with a hardware level and the technology with a software level. Artificial intelligence infrastructure technologies generally include technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a voice processing technology, a natural language processing technology, machine learning/deep learning and other directions.

Computer vision is the science of how to make a machine "see". More specifically, it means using cameras and computers instead of human eyes to recognize, track and measure targets, and further performing graphic processing so that the computer produces images more suitable for human observation or for transmission to instruments for detection. As a scientific discipline, computer vision studies related theories and technologies in an attempt to build artificial intelligence systems that can acquire information from images or multidimensional data. Computer vision techniques typically include image processing, image recognition, image semantic understanding, image retrieval, OCR, video processing, video semantic understanding, video content/behavior recognition, three-dimensional object reconstruction, 3D techniques, virtual reality, augmented reality, and simultaneous localization and mapping, as well as common biometric recognition techniques such as face recognition and fingerprint recognition.

The key technologies of speech technology are automatic speech recognition (ASR), speech synthesis (TTS) and voiceprint recognition. Enabling computers to listen, see, speak and feel is the future direction of human-computer interaction, and voice is expected to become one of the best human-computer interaction modes.

Natural language processing is an important direction in the fields of computer science and artificial intelligence. It studies the theories and methods that enable effective communication between people and computers in natural language. Natural language processing is a science that integrates linguistics, computer science and mathematics. Research in this field involves natural language, that is, the language people use every day, so it is closely related to the study of linguistics. Natural language processing techniques typically include text processing, semantic understanding, machine translation, question answering by robots, knowledge graph techniques, and the like.

Machine learning is an interdisciplinary field that draws on probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory and other disciplines. It studies how a computer can simulate or implement human learning behavior to acquire new knowledge or skills, and how it can reorganize existing knowledge structures to continuously improve its own performance. Machine learning is the core of artificial intelligence and the fundamental way to give computers intelligence, and it is applied throughout the various areas of artificial intelligence. Machine learning and deep learning typically include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and learning from demonstration.

The application field of the method for repeating text provided by the embodiments of the application is briefly described below.

To enable the device to recognize more ways of expressing the input text, the device can repeat the input text after receiving it and select, from the resulting repeated texts, one that the device can recognize, so that an output text can be generated in reply. The device can also repeat the output text after generating it, select from the resulting repeated texts one that is close to the user's usual way of expression, and reply with that text.

The repeated text obtained in conventional ways may not be a commonly used expression. For example, replacing keywords in the input text with synonyms does not change the grammatical structure of the input text, and the substituted synonyms may not normally be used together with the other words in the input text, so the resulting repeated text is not a commonly used expression.

As another example, the input text may be translated from the source language into another language and then translated back into the source language. Because the grammar structure of the other language differs from that of the source language, grammatical or semantic errors may appear after the text is translated back.

As a further example, the word order of the input text may be changed, such as by converting the active voice into the passive voice. Not every word order is a common expression in every language environment and every language, so a repeated text obtained by changing the word order is likely to be hard to understand.

It can be seen that the accuracy of repeated text obtained in conventional ways is low.

To solve the problem of low accuracy of repeated text, the application provides a method for repeating text. Referring to FIG. 1, after a text to be repeated is obtained, the method analyzes the text to be repeated based on a preset analysis strategy to obtain grammar structure information and semantic information of the text to be repeated. A text composition template of the text to be repeated is determined based on the grammar structure information, and at least one target text template that meets a preset matching condition with the text composition template is screened out from a pre-stored candidate text template set. The semantic information is then combined with each of the at least one target text template to obtain at least one target repeated text of the text to be repeated.
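The overall flow described above can be sketched end to end as follows. This is a simplified illustration under several assumptions that are not part of the application: structure words come from a fixed hypothetical list, the preset matching condition is reduced to "the text composition template appears in the candidate set and shares a repeated mark with other templates", and semantic words are filled into the `[X]` slots in their original order. All names and data are hypothetical.

```python
# Hypothetical structure-word list and candidate text template set
# (candidate text template -> repeated mark).
STRUCTURE_WORDS = {"how", "to", "the", "can", "i", "my", "what", "is", "way"}
CANDIDATES = {
    "how to [X] the [X]": 0,
    "how can i [X] my [X]": 0,
    "what is the way to [X] the [X]": 0,
}

def analyze(text):
    """Return (text composition template, semantic words) for the input text."""
    tokens = text.lower().split()
    template = " ".join(t if t in STRUCTURE_WORDS else "[X]" for t in tokens)
    semantic = [t for t in tokens if t not in STRUCTURE_WORDS]
    return template, semantic

def screen_templates(template):
    """Simplified matching: other candidates sharing the template's repeated mark."""
    mark = CANDIDATES.get(template)
    if mark is None:
        return []
    return [t for t, m in CANDIDATES.items() if m == mark and t != template]

def combine(semantic, template):
    """Fill the semantic words into the template slots in order."""
    words = iter(semantic)
    return " ".join(next(words) if tok == "[X]" else tok for tok in template.split())

def repeat_text(text):
    template, semantic = analyze(text)
    return [
        combine(semantic, target)
        for target in screen_templates(template)
        # Only use templates whose slot count matches the semantic words.
        if target.count("[X]") == len(semantic)
    ]

results = repeat_text("how to modify the password")
print(results)
# ['how can i modify my password', 'what is the way to modify the password']
```

The sketch changes the grammar structure of the sentence while leaving the semantic words untouched, which is the contrast with synonym replacement that the passage draws.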

The application scenario of the method for repeating text provided by the application is described below.

Please refer to FIG. 2, which shows an application scenario of the method for repeating text provided in an embodiment of the present application. The application scenario comprises a client 101 and a server 102. The client 101 and the server 102 may communicate with each other using a wired communication technology, for example through a network cable or a serial port line; they may also communicate using a wireless communication technology, for example Bluetooth or wireless fidelity (Wireless Fidelity, Wi-Fi), which is not specifically limited.

The client 101 generally refers to a device that can provide a text to be repeated to the server 102, for example a terminal device, a third-party application that the terminal device can access, or a web page that the terminal device can access. Terminal devices include, but are not limited to, mobile phones, computers, intelligent voice interaction devices, smart home appliances, vehicle-mounted terminals, and the like. The server 102 generally refers to a device that can repeat the text to be repeated, such as a terminal device or a server. The server 102 may be a cloud server associated with the client 101, a local server of the client 101, or a third-party server associated with the client 101. For example, the server 102 is a local server of an intelligent voice interaction device; as another example, the server 102 is a cloud server of a vehicle-mounted terminal. Both the client 101 and the server 102 can adopt cloud computing to reduce the occupation of local computing resources, and can also adopt cloud storage to reduce the occupation of local storage resources.

As an embodiment, the client 101 and the server 102 may be the same device, which is not limited in particular. In the embodiment of the present application, the description will be given by taking the case that the client 101 and the server 102 are different devices respectively.

The method for repeating text provided in the embodiments of the present application is described in detail below based on FIG. 2, taking as an example the case in which the target client is the client 101, the server is the server 102, and the storage cluster is the storage 103.

Referring to FIG. 3, a flow chart of the method for repeating text provided in an embodiment of the present application is shown.

S301, the server obtains the text to be repeated.

The server may receive the text to be repeated sent by other devices, or may receive information sent by other devices, and determine the text to be repeated according to the received information, which is not limited in particular. There are various methods for the server to obtain the text to be repeated, and two of them will be described as examples.

Method one:

At least one piece of input information is obtained in response to an input operation of the target client. If the at least one piece of input information includes text information, word segmentation processing is performed on the text information to obtain a plurality of corresponding input sub-texts. The obtained input sub-texts are combined according to a preset grammar rule to obtain the text to be repeated.

Specifically, if the server obtains a single piece of input information and that input information is text information, the text information can be used directly as the text to be repeated. The server can also perform word segmentation processing on the text information to obtain the input sub-texts, and combine the obtained input sub-texts with preset sub-texts according to a preset grammar rule to obtain the text to be repeated. The server can also combine only some of the obtained input sub-texts with the preset sub-texts according to the preset grammar rule to obtain the text to be repeated.

If the server obtains a plurality of input information and the plurality of input information are text information, the server can perform word segmentation processing for each text information to obtain input sub-texts corresponding to the text information. And the server combines the obtained input sub-texts according to a preset grammar rule to obtain the text to be repeated. The server can also combine the obtained input sub-texts with preset sub-texts according to preset grammar rules to obtain the text to be repeated. The server can also combine the obtained partial input sub-texts according to a preset grammar rule to obtain the text to be repeated.

As an embodiment, the preset sub-text may be a sub-text preset according to the repetition scenario, and is used to improve the completeness of the combined text to be repeated. The preset grammar rule may be a grammar rule corresponding to the current language. For example, if the current language is Chinese, the corresponding grammar rules include "subject + predicate + object" and the like, which is not specifically limited.

For example, in an intelligent customer service scenario, the preset sub-text may be "how", "modify" or "query", and if the obtained input sub-text includes "modify" and "password", the server may combine "how", "modify" and "password" to obtain the text to be repeated "how to modify the password".

The second method is as follows:

At least one input information is obtained in response to an input operation of the target client. If the at least one input message includes multimedia information, text recognition processing is performed on the multimedia information to obtain at least one input sub-text. And combining the obtained input sub-texts according to a preset grammar rule to obtain the text to be repeated.

At least one input information is obtained in response to an input operation of the target client. If the input information obtained by the server includes multimedia information, the server may perform text recognition processing on the multimedia information to obtain at least one input sub-text. For example, referring to fig. 4a (1), the input information is a screenshot of prompt information that a video conference cannot be entered, and referring to fig. 4a (2), after the target client sends the screenshot to the server, the server may perform text recognition processing on the screenshot to obtain at least one input sub-text included in the screenshot, i.e. "cannot", "enter" and "video conference".

The server can combine the obtained input sub-texts according to a preset grammar rule to obtain a text to be repeated; the server can also combine the obtained input sub-texts with preset sub-texts according to preset grammar rules to obtain a text to be repeated; the server may also combine the obtained partial input sub-text with a preset sub-text according to a preset grammar rule, to obtain a text to be repeated, and the like, which is not particularly limited. The preset sub-text and the preset grammar rules are described with reference to the foregoing, and are not described in detail herein. For example, referring to fig. 4b, if the preset sub-text is "why," after obtaining the input sub-text "cannot", "enter" and "video conference", the preset sub-text and each input sub-text may be combined to obtain the text to be repeated "why cannot enter video conference".

S302, the server analyzes the text to be repeated based on a preset analysis strategy to obtain grammar structure information and semantic information of the text to be repeated.

After obtaining the text to be repeated, the server can analyze the text to be repeated based on a preset analysis strategy. There are various preset analysis strategies, and two of them are described below as examples.

Analysis strategy one:

Determine the grammar structure information and semantic information of the text to be repeated according to word frequency.

The server can determine the word frequency of each sub-text to be repeated contained in the text to be repeated based on a preset reference text set. The reference text set may be obtained from a scenario that is the same as or similar to the repetition scenario. For example, for a question-and-answer scenario, the reference text set may be obtained from an associated question-and-answer application or question-and-answer applet. The reference text set may also be collected in advance by the target client, and the like, which is not specifically limited.

After obtaining the preset reference text set, the server performs word segmentation processing on each reference text in the reference text set to obtain a plurality of reference sub-texts corresponding to each reference text. The server then counts the number of occurrences of each distinct reference sub-text among the obtained reference sub-texts to obtain a first mapping relation between reference sub-texts and word frequencies. The server may compute the first mapping relation in advance, when its resource usage is low, so that the first mapping relation can be read directly when needed; the server may also compute the first mapping relation in real time when it is needed, which ensures that the first mapping relation is up to date. This is not specifically limited.

After obtaining the text to be repeated, the server can perform word segmentation processing on the text to be repeated to obtain corresponding sub-texts to be repeated. After obtaining the first mapping relation, the server can respectively determine the respective word frequency of each sub-text to be repeated based on the first mapping relation. After obtaining the respective word frequency of each sub-text to be repeated, the server can screen out target sub-texts of which the word frequency meets the preset word frequency screening condition from each sub-text to be repeated. The preset word frequency screening condition is, for example, a condition greater than the specified word frequency, or a condition that the word frequency of each sub-text to be repeated is ranked before the specified sequence number in the sequencing result from large to small, or the like, which is not particularly limited.

After each target sub-text is obtained, the server can analyze the obtained target sub-text to obtain the grammar structure information of the text to be repeated. The grammar structure information can be used for representing the frame structure of the expression mode corresponding to the text to be repeated, and can also be used for representing the connection mode of each sub-text in the expression mode corresponding to the text to be repeated, and the like.

For example, in a question-and-answer scenario, words related to asking a question, such as "why", "how", "solve", or "how", occur relatively frequently. In the reference text set associated with the question-and-answer scenario, these words are high-frequency words, that is, words whose word frequency is greater than the specified word frequency. The server can screen out the target sub-texts with higher word frequency in the text to be repeated and obtain the grammar structure information of the text to be repeated from them. For example, the grammar structure information obtained from the target sub-texts may be "modifier + verb + noun + how", and the like.

After obtaining each target sub-text, the server can further obtain sub-texts to be repeated except the screened target sub-text in each sub-text to be repeated, and analyze the sub-texts to be repeated except the screened target sub-text to obtain semantic information of the text to be repeated.

For example, in a question-and-answer scenario, words that are not related to asking a question, such as "password", "spelling", "sewer" or "recitation", occur relatively infrequently. In the reference text set associated with the question-and-answer scenario, these words are low-frequency words, that is, words whose word frequency is less than or equal to the specified word frequency. The server can take the sub-texts to be repeated other than the target sub-texts, that is, those with lower word frequency, and obtain the semantic information of the text to be repeated from them. For example, if the text to be repeated contains "password", the obtained semantic information may be "modify password" or "query password".
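As an illustration only, analysis strategy one can be sketched as follows. The sketch assumes whitespace tokenization in place of real word segmentation, a toy reference text set, and a strict "greater than the specified word frequency" screening condition; the function names `build_first_mapping` and `split_by_word_frequency` are hypothetical and do not come from the application.

```python
from collections import Counter

def build_first_mapping(reference_texts, tokenize):
    """Count how often each reference sub-text occurs (the first mapping relation)."""
    counts = Counter()
    for text in reference_texts:
        counts.update(tokenize(text))
    return counts

def split_by_word_frequency(text, first_mapping, tokenize, min_freq):
    """Split sub-texts into high-frequency (structure) and low-frequency (semantic) groups."""
    structure, semantic = [], []
    for token in tokenize(text):
        # Counter returns 0 for tokens that never appear in the reference set.
        if first_mapping[token] > min_freq:
            structure.append(token)
        else:
            semantic.append(token)
    return structure, semantic

# Toy reference text set (assumed data, for illustration only).
reference_texts = [
    "how to modify password",
    "how to query balance",
    "how to reset password",
]
tokenize = str.split  # stand-in for a real word segmenter
first_mapping = build_first_mapping(reference_texts, tokenize)
structure_tokens, semantic_tokens = split_by_word_frequency(
    "how to modify password", first_mapping, tokenize, min_freq=2
)
```

Here the high-frequency sub-texts would feed the grammar structure information and the remaining sub-texts would feed the semantic information.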

Analysis strategy two:

Determine the grammar structure information and semantic information of the text to be repeated according to part of speech.

After the server performs word segmentation processing on the text to be repeated to obtain corresponding sub-texts to be repeated, the server can analyze the part-of-speech information of each sub-text to be repeated and the association relation of each sub-text to be repeated in the text to be repeated to obtain grammar structure information and semantic information of the text to be repeated. The part-of-speech information is used to characterize the part-of-speech of the sub-text to be repeated, e.g. in chinese, the part-of-speech comprises nouns, verbs, adjectives or adverbs etc. The association relation of each sub-text to be repeated in the text to be repeated, for example, in Chinese, adjectives modify nouns, and then the adjectives and the nouns have modification relation, namely association relation; for another example, in chinese, an adverb modifies a verb, and then there is a modified relationship, i.e., an association relationship, between the adverb and the verb.

After obtaining the part-of-speech information and the association relation of each sub-text to be repeated, the server can determine the sub-text level corresponding to the part-of-speech information of each sub-text to be repeated based on a preset second mapping relation between part-of-speech information and sub-text levels. The server may divide the parts of speech into different levels according to their importance in a sentence. For example, a verb represents the action performed by the subject and has high importance, so verbs may be assigned to a first sub-text level; a noun represents the subject performing the action or the object of the action and is second in importance only to the verb, so nouns may be assigned to a second sub-text level; and so on, which is not described again here. The second mapping relation between part-of-speech information and sub-text levels is obtained in this way.

As an embodiment, the server may construct the second mapping relationship as a tree structure, with the first sub-text level as a root, the second sub-text level as a first branch of the root, the third sub-text level as a second branch of the first branch, and so on, without limitation.

After obtaining the second mapping relation, the server can screen out, from the sub-texts to be repeated and based on their association relations in the text to be repeated, the target sub-texts whose sub-text level is within a preset level range and which have no association relation with a sub-text of a specified sub-text level. The preset level range may be a set of pre-designated sub-text levels, for example, the first sub-text level and the second sub-text level, which have higher importance, or, for example, the root and the first branch in the tree structure, which is not specifically limited. The specified sub-text level may be a level corresponding to sub-texts that do not affect the sentence structure information, for example, the sub-text level corresponding to prepositions or the sub-text level corresponding to interjections, which is not specifically limited. In this way, the screened target sub-texts are the sub-texts to be repeated that can represent the structure information of the text to be repeated.

After the server obtains the target sub-text, the obtained target sub-text may be parsed to obtain the grammar structure information of the text to be repeated, and the description may be referred to in the foregoing, which is not repeated herein.

After the server obtains the target sub-text, the server can correspondingly obtain the sub-text to be repeated except the screened target sub-text in each sub-text to be repeated. The server may analyze the sub-texts to be repeated except the screened target sub-text in each sub-text to be repeated to obtain semantic information of the text to be repeated, and the description of the text to be repeated may be referred to in the foregoing, and will not be repeated here.
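The screening in analysis strategy two can be sketched as below. This is a minimal illustration under stated assumptions: the sub-texts are already tagged with parts of speech, association relations are given as index pairs, and the levels assigned to parts of speech other than verb and noun are invented for the example. The names `SECOND_MAPPING` and `screen_target_sub_texts` are hypothetical.

```python
# Assumed second mapping relation: part of speech -> sub-text level.
# Only "verb" -> 1 and "noun" -> 2 follow the description; the rest are illustrative.
SECOND_MAPPING = {
    "verb": 1,
    "noun": 2,
    "adjective": 3,
    "adverb": 3,
    "preposition": 4,
    "interjection": 4,
}

def screen_target_sub_texts(tagged, associations, level_range, specified_levels):
    """Return (target sub-texts, remaining sub-texts).

    tagged: list of (sub_text, part_of_speech) in sentence order.
    associations: set of (i, j) index pairs that have an association relation.
    level_range: set of levels that qualify a sub-text as a target.
    specified_levels: levels whose association disqualifies a sub-text.
    """
    levels = [SECOND_MAPPING[pos] for _, pos in tagged]
    targets, rest = [], []
    for i, (sub_text, _) in enumerate(tagged):
        # Levels of every sub-text associated with sub-text i.
        linked = {
            levels[b] if a == i else levels[a]
            for a, b in associations
            if i in (a, b)
        }
        if levels[i] in level_range and not (linked & specified_levels):
            targets.append(sub_text)
        else:
            rest.append(sub_text)
    return targets, rest

tagged = [("student", "noun"), ("quickly", "adverb"), ("enter", "verb"), ("oh", "interjection")]
associations = {(1, 2)}  # the adverb modifies the verb
targets, rest = screen_target_sub_texts(tagged, associations, {1, 2}, {4})
```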

S303, the server determines a text composition template of the text to be repeated based on the grammar structure information, and screens out at least one target text template meeting a preset matching condition with the text composition template from a pre-stored candidate text template set.

If the candidate text template set is pre-stored, the server can obtain the candidate text template set directly. Among the candidate text templates contained in the candidate text template set, candidate text templates that are in a repeating relationship with one another carry the same repetition mark. In some cases, if there is no candidate text template set, the server may obtain a candidate text template set based on a pre-stored sample text set. In the sample text set, sample texts that are in a repeating relationship with one another carry the same repetition mark.

Referring to fig. 5, the candidate text template set may be obtained from the sample text set as follows. After obtaining the sample text set, the server determines the sample text template of each sample text based on the grammar structure information of that sample text. The server then builds the candidate text template set from the sample text template and the repetition mark of each sample text.

For example, in the candidate text template set, if the candidate text template "how ……" and the candidate text template "what ……" are in a repeating relationship, the two candidate text templates may carry the same repetition mark, such as "0001". In the candidate text template set, all candidate text templates carrying the mark "0001" are then in a repeating relationship with one another.

Please refer to table 1, which is a storage form of a candidate text template set.

Table 1

Candidate text template	Candidate text template in a repeating relationship	Repetition mark
___ How	___ Good use	01
___ To ___ aircraft	Flights ___ to ___	02
___ Notes	___ Need to pay attention to what	03
___ Why is	___ How to get back	04
___ Which is better	___ Where best	05

For another example, in the sample text set, the sample text "how to modify the password" and the sample text "how to modify the password" are in a repeating relationship with each other, so that the two sample texts may have the same repeating label, such as "0001", and then in the sample text set, the sample text having the label "0001" is in a repeating relationship with each other. Thus, among the set of candidate text templates determined based on the sample text set, the candidate text templates that are in a repeating relationship with each other also have the same repeating labels.
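One possible in-memory form of the candidate text template set, keyed by repetition mark, is sketched below. The template extraction here is a deliberately crude stand-in: it keeps words from an assumed list of structure words and collapses everything else into a "___" slot. `STRUCTURE_WORDS`, `to_template`, `build_candidate_template_set`, and the sample data are all hypothetical.

```python
from collections import defaultdict

# Assumed structure words; in practice these would come from the analysis strategy.
STRUCTURE_WORDS = {"how", "to", "the", "way", "why", "cannot"}

def to_template(text):
    """Keep structure words and collapse each run of other words into one '___' slot."""
    parts = []
    for token in text.split():
        if token in STRUCTURE_WORDS:
            parts.append(token)
        elif not parts or parts[-1] != "___":
            parts.append("___")
    return " ".join(parts)

def build_candidate_template_set(samples):
    """Group sample text templates by the repetition mark of their sample text."""
    by_mark = defaultdict(set)
    for text, mark in samples:
        by_mark[mark].add(to_template(text))
    return by_mark

# Sample texts with repetition marks (illustrative data only).
samples = [
    ("how to modify password", "0001"),
    ("the way to modify password", "0001"),
    ("why cannot enter meeting", "0002"),
]
template_set = build_candidate_template_set(samples)
```

Templates derived from sample texts with the same repetition mark end up in the same group, so looking up one template's mark yields all templates in a repeating relationship with it.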

After obtaining the grammar structure information of the text to be repeated, the server can determine a text composition template of the text to be repeated based on the grammar structure information. After obtaining the text composition templates of the text to be repeated, the server can match the text to be repeated with each candidate text template, so as to determine at least one target text template. There are various methods for determining the target text template by the server, and two of them will be described as examples.

The first method is as follows:

Match the template feature vector of the text composition template against the template feature vector of each candidate text template.

After obtaining the text composition template of the text to be repeated, the server can extract the template feature vector of the text composition template and the template feature vector of each candidate text template. For example, the server extracts these template feature vectors by means of a trained BERT pre-trained model. The server can extract the template feature vector of each candidate text template in advance, when resource usage is low, so that these vectors can be read directly when they are needed.

After the template feature vector of the text composition template and the template feature vector of each candidate text template are obtained, the server respectively determines the vector similarity between the template feature vector of the text composition template and the template feature vector of each candidate text template, and a corresponding similarity result is obtained. The method for determining the vector similarity by the server may be, without limitation, calculating the euclidean distance between two template feature vectors, calculating the mahalanobis distance between two template feature vectors, calculating the cosine similarity between two template feature vectors, and the like.

After obtaining the similarity results between the template feature vector of the text composition template and the template feature vectors of the candidate text templates, the server can determine the maximum similarity result among them. Based on the candidate text template corresponding to the maximum similarity result, the server screens out, from the candidate text template set, the candidate text templates that carry the same repetition mark as that candidate text template and uses them as target text templates, so that at least one target text template is obtained.

As an embodiment, after obtaining the at least one target text template, the server may update the sample text set and the candidate text template set based on the text to be repeated and the repetition mark of the at least one target text template. The server adds the text to be repeated to the sample text set and gives it the same repetition mark as the sample texts corresponding to the at least one target text template. The server also adds the text composition template of the text to be repeated to the candidate text template set and gives it the same repetition mark as the at least one target text template.
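The selection step of the first method can be sketched as follows, using cosine similarity as the vector similarity. The two-dimensional vectors are toy values standing in for model-produced template feature vectors, and the sketch returns every template sharing the best match's repetition mark, including the best match itself, which is one reading of the description. The function names are hypothetical.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def select_target_templates(query_vector, candidates):
    """candidates: list of (template, repetition_mark, template_feature_vector)."""
    best = max(candidates, key=lambda c: cosine_similarity(query_vector, c[2]))
    best_mark = best[1]
    # All candidate templates sharing the best match's repetition mark.
    return [template for template, mark, _ in candidates if mark == best_mark]

# Toy vectors (assumed); real vectors would come from a feature extractor such as BERT.
candidates = [
    ("___ How", "01", [1.0, 0.0]),
    ("___ Good use", "01", [0.8, 0.6]),
    ("___ Notes", "03", [0.0, 1.0]),
]
target_templates = select_target_templates([0.9, 0.1], candidates)
```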

The second method is as follows:

Match the text feature vector of the text to be repeated against the template feature vector of each candidate text template by using a trained template recognition model.

After the server obtains the text composition template of the text to be repeated, a trained template recognition model can be adopted to fuse the text characteristics of the text to be repeated and the template characteristics of the text composition template, so that the text characteristic vector of the text to be repeated is obtained. And respectively matching the text feature vectors of the text to be repeated with the template feature vectors of the candidate text templates through the trained template recognition model to obtain corresponding matching results. And the server screens out candidate text templates with the matching results meeting preset matching conditions based on the obtained matching results, and the candidate text templates are used as target text templates. The preset matching condition may be a condition that the matching result is larger than a specified matching threshold, or a condition that the matching result is arranged in the order from large to small and then arranged before a specified sequence number, or the like, and is not particularly limited.
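The two example forms of the preset matching condition, a matching threshold and a rank cut-off, can be sketched as a small filter over matching results. The scores below are invented for illustration, and `screen_matches` is a hypothetical helper rather than part of the described model.

```python
def screen_matches(scores, threshold=None, top_k=None):
    """Return templates whose matching result passes the threshold and/or rank cut-off.

    scores: dict mapping candidate text template -> matching result.
    threshold: keep only results strictly greater than this value, if given.
    top_k: keep only the first top_k results in descending order, if given.
    """
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    if threshold is not None:
        ranked = [(t, s) for t, s in ranked if s > threshold]
    if top_k is not None:
        ranked = ranked[:top_k]
    return [t for t, _ in ranked]

# Illustrative matching results (assumed values).
scores = {
    "how to enter ___": 0.91,
    "why ___": 0.40,
    "what is the best way to enter ___": 0.75,
}
above_threshold = screen_matches(scores, threshold=0.5)
best_only = screen_matches(scores, top_k=1)
```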

As an embodiment, the server may obtain the trained template recognition model before using it to determine the at least one target text template. The server may receive the trained template recognition model sent by another device, or may train an untrained template recognition model based on the candidate text template set to obtain the trained template recognition model, which is not specifically limited.

The process of training an untrained template recognition model is described below.

After obtaining the sample text set and the candidate text template set, the server trains the untrained template recognition model based on each sample text. In one training iteration, the server uses the untrained template recognition model to fuse the text features of a sample text with the template features of the candidate text template corresponding to that sample text, obtaining the text feature vector of the sample text. The server then matches the text feature vector of the sample text against the template feature vector of each candidate text template to obtain corresponding matching results.

The server determines the training error of the untrained template recognition model based on the candidate text template whose matching result meets the preset matching condition and the candidate text template that actually corresponds to the sample text. The preset matching condition may be, for example, that the matching result is the maximum, which is not specifically limited. If the training error does not meet a preset convergence condition, the model parameters of the untrained template recognition model are adjusted and the next training iteration is performed, until the obtained training error meets the convergence condition and the trained template recognition model is obtained.

S304, the server respectively combines the semantic information and at least one target text template to obtain at least one target repeated text of the text to be repeated.

After obtaining the at least one target text template, the server may, for each target text template of the at least one target text template, combine the semantic information of the text to be repeated with that target text template. Using the trained text prediction model, the server may determine a plurality of sub-text combination strategies corresponding to the target text template based on the template features of the target text template. A sub-text combination strategy is, for example, a way of combining sub-texts of different parts of speech, or a way of adding different modifiers to different sub-texts, and the like, which is not specifically limited.

And the server combines the semantic information with the target text template through the trained text prediction model by adopting the obtained sub-text combination strategies to obtain corresponding combination results. The server can screen out target combination results, of which the combination results meet preset combination conditions, based on the obtained combination results, and the target combination results are used as target repeated texts. For example, each combination result has a corresponding matching probability value, so that the server may screen the combination result with the matching probability value greater than the specified threshold as the target combination result, or may sort the combination results according to the order of the matching probability values from the higher to the lower, screen the combination result ranked before the specified sequence number as the target combination result, and the like, which is not particularly limited.
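To make the combination step concrete, the sketch below fills the "___" slots of a target text template with semantic sub-texts. It is a rule-based stand-in, not the trained text prediction model described above: it assumes one semantic sub-text per slot, in order. `fill_template` and the example inputs are hypothetical.

```python
def fill_template(template, semantic_sub_texts, slot="___"):
    """Insert semantic sub-texts into the template's slots, in order."""
    parts = template.split(slot)
    if len(parts) - 1 != len(semantic_sub_texts):
        raise ValueError("number of slots and semantic sub-texts must match")
    result = parts[0]
    for filler, tail in zip(semantic_sub_texts, parts[1:]):
        result += filler + tail
    return result

# Target text templates and semantic information (illustrative values).
target_templates = ["how to enter ___", "what is the best way to enter ___"]
repeated_texts = [fill_template(t, ["the classroom"]) for t in target_templates]
```

In the described method each combination result would additionally carry a matching probability value, and only the results passing the preset combination condition would be kept as target repeated texts.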

As an embodiment, the server may obtain the trained text prediction model before using it to determine the at least one target repeated text. The server may receive the trained text prediction model sent by another device, or may train an untrained text prediction model based on the candidate text template set and the sample text set to obtain the trained text prediction model, which is not specifically limited.

The process of training an untrained text prediction model is described below.

After obtaining the sample text set and the candidate text template set, the server may train the untrained text prediction model based on each candidate text template. In one training iteration, the server uses the untrained text prediction model to determine a plurality of sub-text combination strategies corresponding to a candidate text template based on the template features of that candidate text template. Through the untrained text prediction model, the server applies each obtained sub-text combination strategy to combine the semantic information of the sample text with the candidate text template, obtaining corresponding combination results. Based on the obtained combination results, the server can screen out the target combination results that meet a preset combination condition and use them as training repeated texts.

The server determines the training error of the untrained text prediction model based on the training repeated texts and the sample text corresponding to the candidate text template. If the training error does not meet a preset convergence condition, the model parameters of the untrained text prediction model are adjusted and the next training iteration is performed, until the obtained training error meets the convergence condition and the trained text prediction model is obtained.

As an embodiment, when the obtained sub-text combination strategies are used to combine the semantic information with the target text template, the server may also look up, in a pre-stored reference semantic set, the reference semantic sub-texts whose similarity to the semantic information is greater than a similarity threshold, and use each obtained sub-text combination strategy to combine those reference semantic sub-texts with the target text template. In this way, various reasonable and common expressions of the text to be repeated are obtained from the semantic perspective, which further improves the accuracy and diversity of the repeated texts.

As an embodiment, the process of determining the target text templates and the target repeated texts can be completed by beam search. Beam search is a heuristic graph search algorithm: at each step of a breadth-first expansion, only a limited number of the higher-quality nodes are retained. It thereby strikes a balance between the time and space cost of exhaustive search and the tendency of greedy search to settle on a local optimum.
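A generic beam search can be sketched as follows. The transition table `NEXT`, its probabilities, and the start and end markers are toy assumptions used only to show how the beam width limits the nodes kept at each step; they do not reflect the models in the application.

```python
import heapq
import math

def beam_search(start, expand, is_final, beam_width, max_steps):
    """Keep the beam_width highest-scoring partial states at every step.

    expand(state) yields (next_state, log_probability) pairs.
    Returns finished (score, state) pairs, best first.
    """
    beam = [(0.0, start)]
    finished = []
    for _ in range(max_steps):
        candidates = [
            (score + log_p, nxt)
            for score, state in beam
            for nxt, log_p in expand(state)
        ]
        top = heapq.nlargest(beam_width, candidates, key=lambda c: c[0])
        finished.extend(c for c in top if is_final(c[1]))
        beam = [c for c in top if not is_final(c[1])]
        if not beam:
            break
    return sorted(finished, key=lambda c: c[0], reverse=True)

# Toy next-token table (assumed): last token -> [(next token, probability)].
NEXT = {
    "<s>": [("how", 0.6), ("what", 0.4)],
    "how": [("to", 1.0)],
    "what": [("way", 1.0)],
    "to": [("enter", 1.0)],
    "way": [("enter", 1.0)],
    "enter": [("</s>", 1.0)],
}

def expand(state):
    return [(state + (tok,), math.log(p)) for tok, p in NEXT[state[-1]]]

def is_final(state):
    return state[-1] == "</s>"

wide = beam_search(("<s>",), expand, is_final, beam_width=2, max_steps=5)
narrow = beam_search(("<s>",), expand, is_final, beam_width=1, max_steps=5)
```

With a beam width of 1 the procedure degenerates to greedy search and returns a single sequence; a wider beam keeps alternative sequences alive at the cost of more computation.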

In the embodiments of the present application, after the target text template is obtained, the candidate text template set can be expanded based on the text composition template of the text to be repeated, and the sample text set can be expanded based on the text to be repeated. When the template recognition model and the text prediction model are then trained on the expanded candidate text template set and sample text set, more accurate trained models can be obtained and their robustness is improved. In addition, because the semantic information of the text to be repeated is combined with other grammar structures that express the same meaning as the grammar structure of the text to be repeated, the target repeated text differs more from the text to be repeated in wording while differing less from it in meaning.

The method for replying the text provided by the embodiment of the application can be used for expanding the question-answer index knowledge base and ensuring the dynamic expansion of the question-answer index knowledge base, thereby improving the reply accuracy, the recommendation accuracy of related questions and the like of a question-answer system constructed based on the question-answer index knowledge base. By expanding the question and answer index knowledge base, a question and answer system constructed based on the question and answer index knowledge base can accurately answer, recommend related questions or dynamically expand under the condition of cold start.

In the following, a scenario involving a question-and-answer index knowledge base is described, taking the case in which the template recognition model and the text prediction model are built on a Transformer model as an example.

Referring to fig. 6a, when the method for replying text provided by the embodiment of the present application is not used, after the server obtains, in response to an input operation of the target client, the input question "the student does not enter the classroom and is forbidden to enter", the server cannot determine a solution corresponding to this question based on the pre-stored correspondence between questions and solutions. The server therefore feeds back a preset output text, "I am not well learned and do not know what you are saying at all", to the target client.

When the method for replying text provided by the embodiment of the present application is used, after obtaining the text to be repeated, namely "the student does not enter the classroom and is forbidden to enter", the server analyzes the text to be repeated to obtain the sub-texts to be repeated, including "students", "do not enter into a classroom", "forbidden", "enter into a" and "enter into a".

The server screens out the target sub-texts among the sub-texts to be repeated, including "forbidden" and "entering", and analyzes the target sub-texts to obtain the grammar structure information of the text to be repeated. The server analyzes the sub-texts to be repeated other than the target sub-texts to obtain the semantic information of the text to be repeated.

Based on the grammar structure information of the text to be repeated, the server determines the text composition template of the text to be repeated, namely "subject+prohibited entry+object", and screens out, from the pre-stored candidate text template set, at least one target text template that meets the preset matching condition with the text composition template. The target text templates may be screened out by using the trained template recognition model.

Referring to fig. 6b, taking the case in which the trained template recognition model is a Transformer model as an example, the Transformer model includes a coding sub-model and a decoding sub-model, and the coding sub-model includes a coding sub-model for the text to be repeated and a text composition template coding sub-model.

The positions in the text composition template that stand for semantic information may be occupied by the special character "[MASK]". For example, the text composition template "subject+prohibited entry+object" may be expressed as "[MASK] prohibited entry [MASK]".
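The conversion to the masked form can be sketched as below, assuming the text composition template is written with "+" separators and that the caller knows which components are fixed structure words. `to_masked_template` and its parameters are hypothetical names introduced for illustration.

```python
def to_masked_template(composition, fixed_parts, sep="+", mask="[MASK]"):
    """Replace every component that is not a fixed structure part with the mask token."""
    components = [part.strip() for part in composition.split(sep)]
    return " ".join(part if part in fixed_parts else mask for part in components)

masked = to_masked_template("subject+prohibited entry+object", {"prohibited entry"})
```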

The server inputs the text to be repeated, "the student does not enter the classroom and is forbidden to enter", into the coding sub-model for the text to be repeated to obtain the text features of the text to be repeated, and inputs the text composition template of the text to be repeated, "[MASK] prohibited entry [MASK]", into the coding sub-model for the text composition template to obtain the template features of the text composition template. The text feature vector of the text to be repeated is obtained by fusing the text features of the text to be repeated with the template features of the text composition template. The server inputs the text feature vector into the decoding sub-model to obtain a matching result between the text feature vector and the template feature vector of each candidate text template, and outputs at least one target text template whose matching result satisfies the preset matching condition, such as "how to enter ___", "what is the best way to enter ___", and the like.
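The fusion and matching steps can be sketched with plain vectors. This is only an illustration under stated assumptions: the embodiment uses trained coding and decoding sub-models, whereas the sketch below substitutes an element-wise mean for the fusion, cosine similarity for the matching result, and a fixed threshold for the preset matching condition. All feature values, the threshold, and the candidate "how to pay for ___" are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def fuse(text_features, template_features):
    """Assumed fusion: element-wise mean of the two feature vectors."""
    return [(a + b) / 2 for a, b in zip(text_features, template_features)]

def screen_templates(text_vector, candidates, threshold=0.8):
    """Return candidate templates whose similarity reaches the threshold, best first."""
    scored = [(name, cosine(text_vector, vec)) for name, vec in candidates.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return [name for name, score in scored if score >= threshold]

# Hypothetical features produced by the two coding sub-models.
text_features = [0.9, 0.1, 0.3]
template_features = [0.7, 0.3, 0.1]
text_vector = fuse(text_features, template_features)

# Hypothetical template feature vectors of the candidate text templates.
candidates = {
    "how to enter ___": [0.8, 0.1, 0.2],
    "what is the best way to enter ___": [0.7, 0.3, 0.2],
    "how to pay for ___": [0.1, 0.9, 0.1],
}
targets = screen_templates(text_vector, candidates)
print(targets)
```

With these toy values the two "enter" templates pass the threshold and the unrelated template is filtered out.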

After obtaining the at least one target text template, the server may determine the target repeated text using a trained text prediction model. Referring to fig. 6c, the text prediction model includes a coding sub-model and a decoding sub-model, and the coding sub-model includes a coding sub-model for the text to be repeated and a coding sub-model for the target text template.

The server inputs the text to be repeated, "the student does not enter the classroom and is forbidden to enter", into the coding sub-model for the text to be repeated to obtain the semantic information of the text to be repeated, and inputs a target text template among the at least one target text template, "how to enter ___", into the coding sub-model for the target text template to obtain a plurality of sub-text combination strategies, including "subject+how to enter+object", "subject+how to+object+enter+object", "how to enter+object", and the like. For each sub-text combination strategy, the semantic information is combined with the target text template to obtain a corresponding target repeated text, for example, "how the student enters the classroom", "how the student rapidly enters the classroom", "how to enter the classroom", and the like.
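The combination step can be sketched as simple slot filling. In the embodiment the strategies come from a trained text prediction model; here they are written out by hand, and the function name `apply_strategy` and the slot dictionary are hypothetical. The output keeps the literal word order of each strategy, so it is not polished English.

```python
def apply_strategy(strategy, semantic_slots):
    """Render a strategy such as 'subject+how to enter+object'.

    Parts that name a semantic slot are replaced by the slot's value;
    all other parts are kept as template wording.
    """
    parts = [part.strip() for part in strategy.split("+")]
    return " ".join(semantic_slots.get(part, part) for part in parts)

# Semantic information of the text to be repeated (hypothetical representation).
semantic_slots = {"subject": "the student", "object": "the classroom"}

strategies = ["subject+how to enter+object", "how to enter+object"]
combined = [apply_strategy(s, semantic_slots) for s in strategies]
for text in combined:
    print(text)
```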

Referring to fig. 6d, after obtaining the target repeated text based on the text to be repeated, the server may determine the solution corresponding to the target repeated text from the preset correspondence between problems and solutions, and may feed the obtained solution back to the target client. Referring to fig. 6e, after "the student does not enter the classroom and is forbidden to enter" is input on the target client, the server may display, through the target client, the solution "the student can log in with a different account and then try to enter the classroom".

Referring to fig. 6f, after a problem such as "the student does not enter the classroom and is forbidden to enter", "the student is not going to the classroom", or "the lesson is not being performed" is input on the target client, the server can find the corresponding solution through the determined target repeated text and feed the solution back to the target client. Meanwhile, the server can also establish correspondences between these input problems and the solution, so as to enrich the knowledge base of the question-and-answer index.
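The lookup and knowledge-base enrichment can be sketched as follows. The dictionary-based knowledge base, the function name `find_solution`, and the fallback string are assumptions for illustration; the embodiment does not specify how the correspondence is stored.

```python
FALLBACK = "no matching solution"

def find_solution(input_text, target_repeated_texts, knowledge_base):
    """Look up a solution via the input text or any of its target repeated texts."""
    for candidate in [input_text, *target_repeated_texts]:
        solution = knowledge_base.get(candidate)
        if solution is not None:
            # Enrich the knowledge base: map the original wording to the solution too.
            knowledge_base.setdefault(input_text, solution)
            return solution
    return FALLBACK

# Hypothetical correspondence between problems and solutions.
knowledge_base = {
    "how to enter the classroom":
        "the student can log in with a different account and then try to enter the classroom",
}

question = "the student does not enter the classroom and is forbidden to enter"
repeated = ["how the student enters the classroom", "how to enter the classroom"]
solution = find_solution(question, repeated, knowledge_base)
print(solution)
```

After the first successful lookup, the original wording is stored as a key, so the same input can later be answered directly.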

Based on the same inventive concept, the embodiment of the application provides a device for repeating text, which is equivalent to the server discussed above and can implement the functions corresponding to the method for repeating text. Referring to fig. 7, the device includes an acquisition module 701 and a processing module 702, where:

The acquisition module 701 is configured to: obtain a text to be repeated;

The processing module 702 is configured to: parse the text to be repeated based on a preset parsing strategy to obtain grammar structure information and semantic information of the text to be repeated;

The processing module 702 is further configured to: determine a text composition template of the text to be repeated based on the grammar structure information, and screen out, from a pre-stored candidate text template set, at least one target text template satisfying a preset matching condition with the text composition template;

The processing module 702 is further configured to: combine the semantic information with the at least one target text template respectively to obtain at least one target repeated text of the text to be repeated.

In one possible embodiment, the acquisition module 701 is specifically configured to:

For at least one piece of input information, perform the following operations:

if a piece of input information is multimedia information, perform text recognition processing on the multimedia information to obtain at least one input sub-text;

In one possible embodiment, the processing module 702 is further configured to:

Respectively carrying out word segmentation processing on each reference text in the reference text set to obtain a plurality of reference sub-texts corresponding to each reference text;

In one possible embodiment, the processing module 702 is specifically configured to:

Based on the first mapping relation, determining the respective word frequency of each sub-text to be repeated;

screening target sub-texts with word frequency meeting preset word frequency screening conditions from each sub-text to be repeated, and analyzing the obtained target sub-texts to obtain grammar structure information of the text to be repeated;

And analyzing the sub-texts to be repeated except the screened target sub-text in each sub-text to be repeated to obtain semantic information of the text to be repeated.

analyzing the part-of-speech information of each sub-text to be repeated and the association relation of each sub-text to be repeated in the text to be repeated to obtain the grammar structure information and the semantic information of the text to be repeated.

In one possible embodiment, the processing module 702 is further configured to:

before screening out, from the pre-stored candidate text template set, at least one target text template satisfying the preset matching condition with the text composition template, obtaining a pre-stored sample text set, wherein the sample text set includes a plurality of sample texts that are repetitions of one another;

extracting the template feature vector of the text composition template and the template feature vector of each candidate text template;

Respectively determining the vector similarity between the template feature vector of the text composition template and the template feature vector of each candidate text template to obtain corresponding similarity results;

In one possible embodiment, the processing module 702 is further configured to:

After the semantic information is combined with the at least one target text template respectively to obtain at least one target repeated text of the text to be repeated, updating the sample text set and the candidate text template set based on the text to be repeated and the at least one target text template.

Adopting a trained template recognition model, and fusing the text features of the text to be repeated with the template features of the text composition template to obtain the text feature vector of the text to be repeated;

Respectively matching text feature vectors of the text to be repeated with template feature vectors of each candidate text template to obtain corresponding matching results;

Determining a plurality of sub-text combination strategies corresponding to the target text template based on template characteristics of the target text template by adopting a trained text prediction model;

Combining semantic information and a target text template by adopting the obtained sub-text combination strategies to obtain corresponding combination results;

Screening out, from the obtained combination results, target combination results that satisfy a preset combination condition, and taking the target combination results as target repeated texts.

Acquiring a pre-stored reference semantic sub-text set;

and combining the obtained sub-text combination strategies with at least one reference semantic sub-text and the target text template respectively to obtain corresponding combination results.

Based on the same inventive concept, embodiments of the present application provide a computer device, and the computer device 800 is described below.

Referring to fig. 8, the above device for repeating text may run on a computer device 800, on which a current version and historical versions of a data storage program, together with application software corresponding to the data storage program, may be installed. The computer device 800 includes a display unit 840, a processor 880 and a memory 820, where the display unit 840 includes a display panel 841 for displaying an interface for user interaction and the like.

In one possible embodiment, the display panel 841 may be configured in the form of a liquid crystal display (LCD), an organic light-emitting diode (OLED) display, or the like.

The processor 880 is configured to read a computer program and then execute the method defined by the computer program. For example, the processor 880 reads a data storage program or a file, thereby running the data storage program on the computer device 800 and displaying a corresponding interface on the display unit 840. The processor 880 may include one or more general-purpose processors, and may further include one or more digital signal processors (DSPs) for performing related operations to implement the technical solutions provided by the embodiments of the present application.

The memory 820 typically includes internal memory and external memory. The internal memory may be random access memory (RAM), read-only memory (ROM), cache, etc. The external memory may be a hard disk, an optical disk, a USB disk, a floppy disk, a tape drive, etc. The memory 820 is used to store computer programs, including application programs corresponding to the respective clients, and other data, which may include data generated after the operating system or the application programs are run, including system data (e.g., configuration parameters of the operating system) and user data. In embodiments of the present application, program instructions are stored in the memory 820, and the processor 880 executes the program instructions stored in the memory 820 to implement any of the methods for repeating text discussed above.

The above-described display unit 840 is used to receive input digital information, character information, or touch operations/non-contact gestures, and to generate signal inputs related to user settings and function control of the computer device 800. Specifically, in an embodiment of the present application, the display unit 840 may include a display panel 841. The display panel 841, for example a touch screen, may collect touch operations by a user on or near it (such as operations performed by the user on or near the display panel 841 using any suitable object or accessory such as a finger or a stylus), and drive the corresponding connection device according to a predetermined program.

In one possible embodiment, the display panel 841 may include two parts, a touch detection device and a touch controller. The touch detection device detects the touch position of the user, detects the signal produced by the touch operation, and transmits the signal to the touch controller; the touch controller receives touch information from the touch detection device, converts it into touch point coordinates, and sends the coordinates to the processor 880, and it can also receive and execute commands from the processor 880.

The display panel 841 may be implemented in various types, such as resistive, capacitive, infrared, and surface acoustic wave. In addition to the display unit 840, the computer device 800 may also include an input unit 830, where the input unit 830 may include a graphical input device 831 and other input devices 832, and the other input devices 832 may include, but are not limited to, one or more of a physical keyboard, function keys (e.g., volume control keys, switch keys, etc.), a trackball, a mouse, a joystick, etc.

In addition to the above, the computer device 800 may also include a power supply 890 for powering other modules, as well as an audio circuit 860, a near field communication module 870, and an RF circuit 810. The computer device 800 may also include one or more sensors 850, such as acceleration sensors, light sensors, pressure sensors, and the like. The audio circuit 860 specifically includes a speaker 861, a microphone 862, and the like; for example, the computer device 800 can collect the user's voice through the microphone 862 and perform corresponding operations.

The number of processors 880 may be one or more, and the processors 880 and memory 820 may be coupled or may be relatively independent.

As an example, the processor 880 of fig. 8 may be used to implement the functionality of the acquisition module 701 and the processing module 702 of fig. 7.

As an example, the processor 880 of fig. 8 may be configured to implement the functions corresponding to the server 102 discussed above.

Those of ordinary skill in the art will appreciate that all or part of the steps for implementing the above method embodiments may be implemented by hardware related to program instructions. The foregoing program may be stored in a computer-readable storage medium, and when executed, the program performs the steps of the above method embodiments. The aforementioned storage medium includes various media that can store program code, such as a removable storage device, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.

Alternatively, the above-described integrated units of the invention may be stored in a computer-readable storage medium if implemented in the form of software functional modules and sold or used as separate products. Based on such understanding, the technical solutions of the embodiments of the present invention, in essence or in the part contributing to the prior art, may be embodied in the form of a software product stored in a storage medium and including several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the methods described in the embodiments of the present invention. The aforementioned storage medium includes a removable storage device, ROM, RAM, a magnetic or optical disk, or another medium capable of storing program code.

It will be apparent to those skilled in the art that various modifications and variations can be made to the present application without departing from the spirit or scope of the application. Thus, it is intended that the present application also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

Claims

1. A method for repeating text, comprising:

obtaining a text to be repeated;

Respectively combining the semantic information with the at least one target text template to obtain at least one target repeated text of the text to be repeated;

The analyzing the text to be repeated based on the preset analysis strategy to obtain grammar structure information and semantic information of the text to be repeated comprises the following steps:

Based on a first mapping relation, determining the respective word frequency of each sub-text to be repeated; wherein the first mapping relationship characterizes: mapping relation between each reference sub-text and each word frequency; the reference sub-text is obtained from a reference text contained in a preset reference text set; the word frequency is as follows: the frequency of occurrence of the corresponding reference sub-text in the reference text set;

2. The method of claim 1, wherein obtaining the text to be repeated comprises:

for the at least one input information, the following operations are performed:

3. The method according to claim 1, wherein before parsing the text to be repeated based on a preset parsing policy to obtain syntax structure information and semantic information of the text to be repeated, further comprising:

Acquiring a preset reference text set;

4. The method of claim 1, wherein parsing the text to be repeated based on a preset parsing policy to obtain syntax structure information and semantic information of the text to be repeated comprises:

5. The method of claim 4, wherein analyzing the part-of-speech information of each sub-text to be repeated and the association relation of each sub-text to be repeated in the text to be repeated to obtain the grammar structure information and the semantic information of the text to be repeated comprises:

6. The method according to any one of claims 1 to 5, further comprising, before screening at least one target text template satisfying a preset matching condition with the text composition template from a pre-stored candidate text template set:

Acquiring a pre-stored sample text set, wherein, among the sample texts included in the sample text set, sample texts that are repetitions of one another carry the same repetition mark;

7. The method of claim 6, wherein screening at least one target text template satisfying a preset matching condition with the text composition template from a pre-stored set of candidate text templates, comprises:

8. The method of claim 6, further comprising, after combining the semantic information with the at least one target text template, respectively, to obtain at least one target repeated text of the text to be repeated:

Updating the sample text set and the candidate text template set based on the text to be repeated and the at least one target text template.

9. The method according to any one of claims 1 to 5, wherein screening at least one target text template satisfying a preset matching condition with the text composition template from a pre-stored candidate text template set comprises:

Respectively matching the text feature vectors of the text to be repeated with the template feature vectors of each candidate text template contained in the candidate text template set to obtain corresponding matching results;

10. The method according to any one of claims 1 to 5, wherein combining the semantic information with the at least one target text template, respectively, to obtain at least one target repeated text of the text to be repeated includes:

11. The method of claim 10, wherein combining the semantic information with the target text template to obtain a corresponding combined result using the obtained respective sub-text combining policies, respectively, comprises:

Acquiring a pre-stored reference semantic sub-text set;

12. An apparatus for repeating text, comprising:

an acquisition module, configured to: obtain a text to be repeated;

The processing module is further configured to: respectively combining the semantic information with the at least one target text template to obtain at least one target repeated text of the text to be repeated;

the processing module is specifically configured to:

Based on a first mapping relation, determining the respective word frequency of each sub-text to be repeated; the first mapping relation represents: mapping relation between each reference sub-text and each word frequency; the reference sub-text is obtained from a reference text contained in a preset reference text set; the word frequency is as follows: the frequency of occurrence of the corresponding reference sub-text in the preset reference text set;

13. A computer device, comprising:

A memory for storing program instructions;

A processor for invoking program instructions stored in the memory and executing the method according to any of claims 1-11 in accordance with the obtained program instructions.

14. A computer-readable storage medium storing computer-executable instructions for causing a computer to perform the method of any one of claims 1-11.