CN113822034A - Method and device for repeating text, computer equipment and storage medium

Info

Publication number: CN113822034A
Application number: CN202110630068.8A
Authority: CN
Inventors: 闫昭; 刘昊岩; 周辉阳
Original assignee: Tencent Technology Shenzhen Co Ltd
Current assignee: Tencent Technology Shenzhen Co Ltd
Priority date: 2021-06-07
Filing date: 2021-06-07
Publication date: 2021-12-21
Anticipated expiration: 2041-06-07
Also published as: CN113822034B

Abstract

The application provides a method and a device for rephrasing a text, a computer device and a storage medium, which can be applied to the field of cloud computing or the field of artificial intelligence and are used for solving the problem of low accuracy of rephrased text. The method comprises: obtaining a text to be repeated; analyzing the text to be repeated based on a preset analysis strategy to obtain grammatical structure information and semantic information of the text to be repeated; determining a text composition template of the text to be repeated based on the grammatical structure information, and screening out, from a pre-stored candidate text template set, at least one target text template that meets a preset matching condition with the text composition template; and combining the semantic information with each of the at least one target text template to obtain at least one target repeat text of the text to be repeated.

Description

Method and device for repeating text, computer equipment and storage medium

Technical Field

The present application relates to the field of computer technologies, and in particular, to a method and an apparatus for rephrasing a text, a computer device, and a storage medium.

Background

With the continuous development of science and technology, devices can perform more and more intelligent tasks. For example, a device may generate an output text in response to a received input text. However, because language can be expressed in many ways, the input text received by the device may take various forms of expression, and for some of them the device may fail to recognize the semantics of the input text and therefore cannot reply accordingly.

To enable the device to recognize more forms of expression, the device can rephrase the input text after obtaining it, identify among the resulting rephrased texts one that the device can recognize, and then generate the output text and reply accordingly.

The rephrased text obtained by conventional rephrasing methods may not be a common expression. For example, replacing keywords in the input text with synonyms cannot change the grammatical structure of the input text, and the substituted synonyms may not collocate naturally with the other words in the input text, so the resulting rephrased text is not a common expression. As can be seen, the accuracy of the rephrased text is low.

Disclosure of Invention

The embodiment of the application provides a method and a device for rephrasing a text, computer equipment and a storage medium, which are used for solving the problem of low accuracy of the rephrased text.

In a first aspect, a method for rephrasing a text is provided, including:

obtaining a text to be repeated;

analyzing the text to be repeated based on a preset analysis strategy to obtain grammatical structure information and semantic information of the text to be repeated;

determining a text composition template of the text to be repeated based on the syntactic structure information, and screening out at least one target text template which meets preset matching conditions with the text composition template from a pre-stored candidate text template set;

and respectively combining the semantic information and the at least one target text template to obtain at least one target repeat text of the text to be repeated.

In a second aspect, an apparatus for rephrasing a text is provided, comprising:

an acquisition module, configured to obtain a text to be repeated;

a processing module, configured to analyze the text to be repeated based on a preset analysis strategy to obtain grammatical structure information and semantic information of the text to be repeated;

the processing module is further configured to: determining a text composition template of the text to be repeated based on the syntactic structure information, and screening out at least one target text template which meets preset matching conditions with the text composition template from a pre-stored candidate text template set;

the processing module is further configured to: and respectively combining the semantic information and the at least one target text template to obtain at least one target repeat text of the text to be repeated.

Optionally, the acquisition module is specifically configured to:

responding to the input operation of the target client, and obtaining at least one piece of input information;

for the at least one input information, the following operations are respectively performed:

if one input information in the at least one input information is text information, performing word segmentation processing on the text information to obtain a plurality of corresponding input sub-texts;

if the input information is multimedia information, performing text recognition processing on the multimedia information to obtain at least one input sub-text;

and combining the obtained input sub texts according to a preset grammar rule to obtain the text to be repeated.

Optionally, the processing module is further configured to:

before the text to be repeated is analyzed based on the preset analysis strategy to obtain the grammatical structure information and semantic information of the text to be repeated, obtaining a preset reference text set;

performing word segmentation processing on each reference text in the reference text set respectively to obtain a plurality of reference sub-texts corresponding to each reference text;

and respectively counting the occurrence frequency of different reference sub-texts in each obtained reference sub-text to obtain a first mapping relation between the reference sub-text and the word frequency.
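The first mapping relation described above can be illustrated with a minimal sketch. It assumes whitespace splitting as a stand-in for real word segmentation; the function name `build_word_frequency_map` and the sample reference texts are hypothetical and do not come from the application.

```python
from collections import Counter

def build_word_frequency_map(reference_texts):
    """Count how often each reference sub-text (token) occurs in the set."""
    counts = Counter()
    for text in reference_texts:
        # Assumption: whitespace splitting stands in for a real word segmenter.
        counts.update(text.lower().split())
    return dict(counts)

# Hypothetical reference text set, for illustration only.
reference_texts = [
    "how do i reset my password",
    "how do i change my email",
    "where do i find my invoice",
]
first_mapping = build_word_frequency_map(reference_texts)
```

In this toy set, structural words such as "do" and "my" end up with higher counts than content words such as "password", which is the property the later screening step relies on.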

Optionally, the processing module is specifically configured to:

performing word segmentation processing on the text to be repeated to obtain corresponding sub-texts to be repeated;

respectively determining the respective word frequency of each sub text to be repeated based on the first mapping relation;

screening out target sub-texts with the word frequencies meeting preset word frequency screening conditions from the sub-texts to be repeated, and analyzing the obtained target sub-texts to obtain grammatical structure information of the texts to be repeated;

and analyzing the sub-texts to be repeated except the screened target sub-text in each sub-text to be repeated to obtain semantic information of the text to be repeated.
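As a hedged illustration of the word-frequency screening above, the sketch below treats "word frequency at or above a threshold" as the preset screening condition. That threshold rule, the name `split_by_frequency`, and the sample frequencies are assumptions made for the example; the application does not fix a specific condition.

```python
def split_by_frequency(tokens, word_freq, min_freq=2):
    """Split tokens into structure-bearing and meaning-bearing sub-texts.

    Assumption: a token whose frequency is >= min_freq is treated as a target
    sub-text (grammatical structure); the rest carry the semantic information.
    """
    template_tokens, semantic_tokens = [], []
    for tok in tokens:
        if word_freq.get(tok, 0) >= min_freq:
            template_tokens.append(tok)
        else:
            semantic_tokens.append(tok)
    return template_tokens, semantic_tokens

# Hypothetical first mapping relation (sub-text -> word frequency).
word_freq = {"how": 2, "do": 3, "i": 3, "my": 3, "reset": 1, "password": 1}
tokens = "how do i reset my password".split()
template_tokens, semantic_tokens = split_by_frequency(tokens, word_freq)
```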

Optionally, the processing module is specifically configured to:

and analyzing the part-of-speech information of each sub-text to be repeated and the association relation among the sub-texts to be repeated in the text to be repeated, to obtain the grammatical structure information and the semantic information of the text to be repeated.

Optionally, the processing module is specifically configured to:

determining sub-text levels corresponding to the part-of-speech information of each sub-text to be repeated based on a second mapping relation between preset part-of-speech information and the sub-text levels;

screening out target sub-texts of which the sub-text levels are within a preset level range and have no association with the designated sub-text levels based on the association relationship of each sub-text to be repeated in the text to be repeated, and analyzing the obtained target sub-texts to obtain grammatical structure information of the text to be repeated;

Optionally, the processing module is further configured to:

before screening out, from the pre-stored candidate text template set, at least one target text template that meets the preset matching condition with the text composition template, obtaining a pre-stored sample text set, wherein, among the sample texts included in the sample text set, sample texts that are in a rephrase relationship with one another carry the same rephrase mark;

determining a respective sample text template of each sample text based on respective syntactic structure information of each sample text;

and establishing a candidate text template set based on the respective sample text templates of the sample texts and the respective rephrase marks of the sample texts, wherein, among the candidate text templates contained in the candidate text template set, candidate text templates that are in a rephrase relationship with one another carry the same rephrase mark.
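One possible way to picture the construction of the candidate text template set is sketched below. It assumes that a sample text template is obtained by replacing every non-structural word with a placeholder `[X]`, and that rephrase marks are plain string labels; `to_template`, `build_candidate_templates`, and the sample data are hypothetical.

```python
from collections import defaultdict

def to_template(text, function_words):
    """Assumption: keep structural words, replace all other words with [X]."""
    return " ".join(
        tok if tok in function_words else "[X]" for tok in text.lower().split()
    )

def build_candidate_templates(samples, function_words):
    """Group sample text templates by the rephrase mark of their sample text."""
    groups = defaultdict(set)
    for text, mark in samples:
        groups[mark].add(to_template(text, function_words))
    return {mark: sorted(templates) for mark, templates in groups.items()}

# Hypothetical sample texts; texts that paraphrase each other share a mark.
function_words = {"how", "do", "i", "my", "what", "is", "the", "way", "to"}
samples = [
    ("how do i reset my password", "p1"),
    ("what is the way to reset my password", "p1"),
    ("how do i change my email", "p2"),
]
candidate_set = build_candidate_templates(samples, function_words)
```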

Optionally, the processing module is specifically configured to:

extracting the template feature vector of the text composition template and the respective template feature vectors of the candidate text templates;

respectively determining the vector similarity between the template feature vector of the text composition template and the template feature vector of each candidate text template, to obtain corresponding similarity results;

and screening out, based on the obtained similarity results, the candidate text templates in the candidate text template set that carry the same rephrase mark as the candidate text template corresponding to the maximum similarity result, as target text templates.
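The similarity-based screening can be sketched as follows. Cosine similarity is used here as one plausible choice of vector similarity; the application does not name a specific measure. The feature vectors, the dictionary layout of a candidate, and the function names are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def select_target_templates(query_vec, candidates):
    """Find the most similar candidate, then return every candidate template
    that carries the same rephrase mark as that best match."""
    best = max(candidates, key=lambda c: cosine(query_vec, c["vector"]))
    return [c["template"] for c in candidates if c["mark"] == best["mark"]]

# Hypothetical candidate set with made-up feature vectors and marks.
candidates = [
    {"template": "how do i [X] my [Y]", "vector": [1.0, 0.0, 0.2], "mark": "g1"},
    {"template": "what is the way to [X] my [Y]", "vector": [0.1, 1.0, 0.0], "mark": "g1"},
    {"template": "where is my [Y]", "vector": [0.0, 0.1, 1.0], "mark": "g2"},
]
targets = select_target_templates([0.9, 0.1, 0.1], candidates)
```

Note that the second template is returned even though its own vector is not close to the query: it is selected through the shared rephrase mark, not through its similarity score.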

Optionally, the processing module is further configured to:

after the semantic information and the at least one target text template are respectively combined to obtain at least one target repeat text of the text to be repeated, updating the sample text set and the candidate text template set based on the text to be repeated and the at least one target text template.

Optionally, the processing module is specifically configured to:

adopting a trained template recognition model to fuse the text features of the text to be repeated with the template features of the text composition template, to obtain the text feature vector of the text to be repeated;

respectively matching the text characteristic vectors of the text to be repeated with the template characteristic vectors of the candidate text templates to obtain corresponding matching results;

and screening out candidate text templates with matching results meeting preset matching conditions as target text templates based on the obtained matching results.

Optionally, the processing module is specifically configured to:

for each target text template of the at least one target text template, respectively performing the following operations:

determining a plurality of sub-text combination strategies corresponding to a target text template by adopting a trained text prediction model based on the template characteristics of the target text template;

combining the semantic information and the target text template respectively by adopting the obtained sub-text combination strategies to obtain corresponding combination results;

and screening out, based on the obtained combination results, target combination results that meet preset combination conditions as the target repeat texts.

Optionally, the processing module is specifically configured to:

acquiring a pre-stored reference semantic sub-text set;

screening out at least one reference semantic sub-text which meets preset semantic matching conditions with the semantic information from the reference semantic sub-text set on the basis of the semantic information;

and combining the at least one reference semantic sub-text and the target text template respectively, using the obtained sub-text combination strategies, to obtain corresponding combination results.

In a third aspect, a computer device is provided, comprising:

a memory for storing program instructions;

a processor for calling the program instructions stored in the memory and executing the method according to the first aspect according to the obtained program instructions.

In a fourth aspect, there is provided a computer-readable storage medium having stored thereon computer-executable instructions for causing a computer to perform the method of the first aspect.

In the embodiments of the application, the text to be repeated is analyzed to obtain its grammatical structure information and semantic information, and at least one target text template is screened out from a candidate text template set based on the grammatical structure information. This yields various reasonable, commonly used expressions of the text to be repeated at the level of grammatical structure, rather than simple synonym replacement or word-order transformation. After the target text templates are obtained, the semantic information is combined with each target text template so that the semantic information fits the template. The resulting target repeat text therefore conforms to the current language environment, avoiding word collocation errors caused by synonym replacement and expressions that do not fit the current language environment caused by word-order transformation. Because the target repeat text is determined from both grammatical structure and semantics, the process simulates how a person actually rephrases a text, and the accuracy of the rephrased text is greatly improved.

Drawings

Fig. 1 is a first schematic diagram of the principle of a method for rephrasing a text provided in an embodiment of the present application;

Fig. 2 is an application scenario of a method for rephrasing a text according to an embodiment of the present application;

Fig. 3 is a schematic flowchart of a method for rephrasing a text provided by an embodiment of the present application;

Fig. 4a is a second schematic diagram of the principle of a method for rephrasing a text according to an embodiment of the present application;

Fig. 4b is a third schematic diagram of the principle of a method for rephrasing a text according to an embodiment of the present application;

Fig. 5 is a fourth schematic diagram of the principle of a method for rephrasing a text according to an embodiment of the present application;

Fig. 6a is a fifth schematic diagram of the principle of a method for rephrasing a text according to an embodiment of the present application;

Fig. 6b is a sixth schematic diagram of the principle of a method for rephrasing a text according to an embodiment of the present application;

Fig. 6c is a seventh schematic diagram of the principle of a method for rephrasing a text according to an embodiment of the present application;

Fig. 6d is an eighth schematic diagram of the principle of a method for rephrasing a text according to an embodiment of the present application;

Fig. 6e is a ninth schematic diagram of the principle of a method for rephrasing a text according to an embodiment of the present application;

Fig. 6f is a tenth schematic diagram of the principle of a method for rephrasing a text according to an embodiment of the present application;

Fig. 7 is a first schematic structural diagram of an apparatus for rephrasing a text according to an embodiment of the present application;

Fig. 8 is a second schematic structural diagram of an apparatus for rephrasing a text according to an embodiment of the present application.

Detailed Description

In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application.

Some terms in the embodiments of the present application are explained below to facilitate understanding by those skilled in the art.

(1) Repeat (paraphrase generation):

Rephrasing means that, given a natural language text, another natural language text expressed in a different way is automatically generated, and the newly generated text has the same or similar semantics as the original text.

(2) Template (semantic structural template):

A template refers to structured information capable of embodying the expression form of a natural language text, and may be a sub-text that contains the structural framework of the natural language text. Combining the same template with different words yields natural language texts with different semantics.

Embodiments of the present application relate to cloud technology and artificial intelligence (AI). The design is based on cloud computing and cloud storage in cloud technology, and on computer vision (CV), speech technology, natural language processing (NLP), machine learning (ML), and the like in artificial intelligence technology.

Cloud technology is a hosting technology that unifies hardware, software, network and other resources in a wide area network or a local area network to realize the computation, storage, processing and sharing of data. Cloud technology is the general term for the network technology, information technology, integration technology, management platform technology, application technology and the like that are applied under the cloud computing business model; these can form a resource pool that is used on demand and is flexible and convenient. Cloud computing technology will become an important support. Background services of technical network systems, such as video websites, picture websites and portal websites, require a large amount of computing and storage resources. With the rapid development and application of the internet industry, each item may have its own identification mark that needs to be transmitted to a background system for logical processing; data at different levels are processed separately, and all kinds of industry data require strong back-end system support, which can only be realized through cloud computing.

Cloud computing is a computing model that distributes computing tasks over a resource pool formed by a large number of computers, enabling various application systems to obtain computing power, storage space, and information services as needed. The network that provides the resources is referred to as the "cloud". Resources in the "cloud" appear to the user to be infinitely expandable; they can be obtained at any time, used on demand, expanded at any time, and paid for according to use.

As a basic capability provider of cloud computing, a cloud computing resource pool (generally referred to as an Infrastructure as a Service (IaaS) platform) is established, and multiple types of virtual resources are deployed in the resource pool for external clients to select and use.

According to the logical function division, a Platform as a Service (PaaS) layer can be deployed on the IaaS layer, and a Software as a Service (SaaS) layer can be deployed on the PaaS layer; the SaaS layer can also be deployed directly on the IaaS layer. PaaS is a platform on which software runs, such as a database or a web container. SaaS is various kinds of business software, such as a web portal or a bulk SMS sender. Generally speaking, SaaS and PaaS are upper layers relative to IaaS.

Cloud storage is a new concept extended and developed from a cloud computing concept, and a distributed cloud storage system (hereinafter referred to as a storage system) refers to a storage system which integrates a large number of storage devices (storage devices are also referred to as storage nodes) of different types in a network through application software or application interfaces to cooperatively work through functions of cluster application, a grid technology, a distributed storage file system and the like, and provides data storage and service access functions to the outside.

At present, a storage method of a storage system is as follows: logical volumes are created, and when created, each logical volume is allocated physical storage space, which may be the disk composition of a certain storage device or of several storage devices. The client stores data on a certain logical volume, that is, the data is stored on a file system, the file system divides the data into a plurality of parts, each part is an object, the object not only contains the data but also contains additional information such as data Identification (ID), the file system writes each object into a physical storage space of the logical volume, and the file system records storage location information of each object, so that when the client requests to access the data, the file system can allow the client to access the data according to the storage location information of each object.

The process of allocating physical storage space for the logical volume by the storage system specifically includes: physical storage space is divided in advance into stripes according to a group of capacity measures of objects stored in a logical volume (the measures often have a large margin with respect to the capacity of the actual objects to be stored) and Redundant Array of Independent Disks (RAID), and one logical volume can be understood as one stripe, thereby allocating physical storage space to the logical volume.

Artificial intelligence is a theory, method, technique and application system that uses a digital computer or a machine controlled by a digital computer to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge and use the knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive technique of computer science that attempts to understand the essence of intelligence and produce a new intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence is the research of the design principle and the realization method of various intelligent machines, so that the machines have the functions of perception, reasoning and decision making.

Artificial intelligence technology is a comprehensive discipline that covers a wide range of fields, including both hardware-level and software-level technologies. The artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, and mechatronics. Artificial intelligence software technology mainly comprises computer vision, speech processing, natural language processing, machine learning/deep learning, and the like.

Computer vision (CV) is a science that studies how to make a machine "see". More specifically, it uses cameras and computers in place of human eyes to identify, track and measure targets, and further performs image processing so that the result is an image more suitable for human observation or for transmission to an instrument for detection. As a scientific discipline, computer vision studies related theories and techniques in an attempt to build artificial intelligence systems that can obtain information from images or multidimensional data. Computer vision technologies generally include image processing, image recognition, image semantic understanding, image retrieval, optical character recognition (OCR), video processing, video semantic understanding, video content/behavior recognition, three-dimensional object reconstruction, 3D technologies, virtual reality, augmented reality, simultaneous localization and mapping, and other technologies, and also include common biometric technologies such as face recognition and fingerprint recognition.

The key technologies of speech technology are automatic speech recognition (ASR), speech synthesis (text-to-speech, TTS), and voiceprint recognition. Enabling computers to listen, see, speak and feel is the development direction of future human-computer interaction, and voice is expected to become one of the most promising interaction modes.

Natural language processing is an important direction in the fields of computer science and artificial intelligence. It studies various theories and methods that enable effective communication between humans and computers using natural language. Natural language processing is a science integrating linguistics, computer science and mathematics. Research in this field involves natural language, that is, the language people use every day, so it is closely related to the study of linguistics. Natural language processing techniques typically include text processing, semantic understanding, machine translation, question answering, knowledge graphs, and the like.

Machine learning is an interdisciplinary field that involves probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory and other disciplines. It specifically studies how a computer simulates or implements human learning behavior in order to acquire new knowledge or skills, and how it reorganizes existing knowledge structures to continuously improve its own performance. Machine learning is the core of artificial intelligence, is the fundamental way to make computers intelligent, and is applied in all fields of artificial intelligence. Machine learning and deep learning generally include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and learning from demonstration.

The following briefly introduces the application field of the method for rephrasing a text provided by the embodiment of the present application.

To enable the device to recognize more forms of expression, the device can rephrase the input text after obtaining it, identify among the resulting rephrased texts one that the device can recognize, and then generate the output text and reply accordingly. The device can also rephrase the output text after generating it, identify among the resulting rephrased texts one that is close to the user's usual way of expression, and reply with that text.

The rephrased text obtained by conventional rephrasing methods may not be a common expression. For example, replacing keywords in the input text with synonyms cannot change the grammatical structure of the input text, and the substituted synonyms may not collocate naturally with the other words in the input text, so the resulting rephrased text is not a common expression.

As another example, the input text may be translated from the original language into another language and then translated back into the original language. Because the grammatical structure of the other language differs from that of the original language, grammatical or semantic errors may appear after the text is translated back.

As another example, the word order of the input text may be changed, for example by converting the active voice into the passive voice. However, not every word order is commonly used in every language environment or language, so the rephrased text obtained by changing the word order tends to be hard to understand.

It can be seen that the accuracy of rephrased text obtained by conventional rephrasing methods is low.

In order to solve the problem of low accuracy of rephrased text, the application provides a method for rephrasing a text. Referring to fig. 1, after obtaining a text to be repeated, the method analyzes the text to be repeated based on a preset analysis strategy to obtain grammatical structure information and semantic information of the text to be repeated. A text composition template of the text to be repeated is then determined based on the grammatical structure information, and at least one target text template that meets a preset matching condition with the text composition template is screened out from a pre-stored candidate text template set. Finally, the semantic information is combined with each of the at least one target text template to obtain at least one target repeat text of the text to be repeated.
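The overall flow described above can be sketched end to end in a few lines. This is a toy illustration under strong assumptions, not the implementation of the application: structural words are given as a fixed set, the matching condition is reduced to exact template equality, and templates use Python format placeholders. All names and data below are hypothetical.

```python
def parse(text, function_words):
    """Split a text into a composition template and its semantic slots."""
    slots = []      # semantic information: content words, in order
    template = []   # structural words plus numbered placeholders
    for tok in text.lower().split():
        if tok in function_words:
            template.append(tok)
        else:
            template.append("{%d}" % len(slots))
            slots.append(tok)
    return " ".join(template), slots

def rephrase(text, function_words, template_groups):
    """Return the rephrasings produced by sibling templates of the match.

    Assumption: the "preset matching condition" is exact template equality.
    """
    template, slots = parse(text, function_words)
    for group in template_groups:
        if template in group:
            return [t.format(*slots) for t in group if t != template]
    return []

# Hypothetical structural words and one group of mutually rephrasing templates.
function_words = {"how", "do", "i", "my", "what", "is", "the", "way", "to"}
template_groups = [
    ["how do i {0} my {1}", "what is the way to {0} my {1}"],
]
result = rephrase("How do I reset my password", function_words, template_groups)
```

Because the content words are carried over unchanged and only the structural frame is swapped, the sketch changes the grammatical structure without introducing new word collocations.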

An application scenario of the method for rephrasing a text provided by the present application is described below.

Please refer to fig. 2, which is an application scenario of the method for rephrasing a text according to the embodiment of the present application. The application scenario includes a client 101 and a server 102. The client 101 and the server 102 may communicate with each other using a wired communication technology, for example over a network cable or a serial port; they may also communicate using a wireless communication technology, for example Bluetooth or wireless fidelity (Wi-Fi), which is not specifically limited.

The client 101 generally refers to a device that can provide the text to be repeated to the server 102, for example, a terminal device, a third-party application accessible by the terminal device, or a web page accessible by the terminal device. For example, the terminal device includes, but is not limited to, a mobile phone, a computer, a smart voice interaction device, a smart home appliance, a vehicle-mounted terminal, and the like. The server 102 generally refers to a device, such as a terminal device or a server, which can rephrase a text to be rephrased. The server 102 may be a cloud server associated with the client 101, a local server of the client 101, or a third-party server associated with the client 101, for example, the server 102 is a local server of the intelligent voice interaction device; for example, the server 102 is a cloud server of a vehicle-mounted terminal. The client 101 and the server 102 can both adopt cloud computing to reduce the occupation of local computing resources; cloud storage can also be adopted to reduce the occupation of local storage resources.

As an embodiment, the client 101 and the server 102 may be the same device, which is not specifically limited. In the embodiment of the present application, the client 101 and the server 102 are described as different devices by way of example.

Referring to fig. 2, the method for rephrasing a text provided in the embodiment of the present application is specifically described below, taking the client 101 as the target client, the server 102 as the server, and the storage 103 as the storage cluster as an example.

Please refer to fig. 3, which is a flowchart illustrating a method for rephrasing a text according to an embodiment of the present application.

S301, the server obtains the text to be repeated.

The server may receive the text to be repeated sent by another device, or may receive information sent by another device and determine the text to be repeated from the received information, which is not specifically limited. There are various ways for the server to obtain the text to be repeated; two of them are described below as examples.

Method one:

At least one piece of input information is obtained in response to an input operation of the target client. If the at least one piece of input information includes text information, word segmentation processing is performed on the text information to obtain a plurality of corresponding input sub-texts. The obtained input sub-texts are combined according to a preset grammar rule to obtain the text to be repeated.

At least one piece of input information is obtained in response to an input operation of the target client. If the server obtains input information which is text information, the text information can be used as a text to be repeated; the server can also perform word segmentation processing on the text information to obtain each input sub-text, and combine each obtained input sub-text and each obtained preset sub-text according to a preset grammar rule to obtain a text to be repeated; the server can also combine the obtained partial input sub-text and the preset sub-text according to a preset grammar rule to obtain the text to be repeated.

If the server obtains a plurality of pieces of input information and all of them are text information, the server may perform word segmentation processing on each piece of text information to obtain the input sub-texts corresponding to each of them. The server may combine the obtained input sub-texts according to a preset grammar rule to obtain the text to be repeated. The server may also combine each obtained input sub-text with a preset sub-text according to the preset grammar rule to obtain the text to be repeated, or may combine only part of the obtained input sub-texts according to the preset grammar rule to obtain the text to be repeated.

As an embodiment, the preset sub-text may be a sub-text preset according to the repeat scene, and is used to improve the completeness of the combined text to be repeated. The preset grammar rule may be a grammar rule of the current language. For example, if the current language is Chinese, the corresponding grammar rules include "subject + predicate + object" and the like, which is not specifically limited.

For example, in a smart customer service scenario, the preset sub-texts may be "how", "modify" or "query". If the obtained input sub-texts include "change" and "password", the server may combine "how", "modify" and "password" to obtain the text to be repeated "how to modify the password".

Method two:

At least one piece of input information is obtained in response to an input operation of the target client. If the at least one piece of input information includes multimedia information, text recognition processing is performed on the multimedia information to obtain at least one input sub-text. The obtained input sub-texts are combined according to a preset grammar rule to obtain the text to be repeated.

At least one piece of input information is obtained in response to an input operation of the target client. If the input information obtained by the server comprises the multimedia information, the server can perform text recognition processing on the multimedia information to obtain at least one input sub-text. For example, referring to fig. 4a (1), the input information is a screenshot of prompt information that a video conference cannot be entered, and then referring to fig. 4a (2), after the target client sends the screenshot to the server, the server may perform text recognition processing on the screenshot to obtain at least one input sub-text included in the screenshot, that is, "cannot", "enter", and "video conference".

The server can combine the obtained input sub-texts according to a preset grammar rule to obtain a text to be repeated; the server can also combine each obtained input sub-text with a preset sub-text according to a preset grammar rule to obtain a text to be repeated; the server may also combine the obtained partial input sub-text and the preset sub-text according to a preset grammar rule to obtain a text to be repeated, and the like, which is not limited specifically. The preset sub-text and the preset grammar rule are introduced with reference to the foregoing description, and are not described in detail herein. For example, referring to fig. 4b, if the preset sub-text is "why", after the input sub-texts "cannot", "enter", and "video conference" are obtained, the preset sub-text and the respective input sub-texts may be combined to obtain the text to be repeated "why the video conference cannot be entered".

S302, the server analyzes the text to be repeated based on a preset analysis strategy to obtain the grammatical structure information and semantic information of the text to be repeated.

After obtaining the text to be repeated, the server may analyze the text to be repeated based on a preset analysis strategy. There are multiple preset analysis strategies, and two of them are described below as examples.

Analysis strategy one:

The grammatical structure information and semantic information of the text to be repeated are determined according to word frequency.

The server may determine, based on a preset reference text set, the word frequency of each sub-text to be repeated contained in the text to be repeated. The reference text set may be obtained from a scene that is the same as or similar to the repeat scene. For example, for a question-and-answer scene, the reference text set may be obtained from a related question-and-answer application or question-and-answer applet. The reference text set may also be collected in advance by the target client, and the like, which is not specifically limited.

After obtaining the preset reference text set, the server performs word segmentation processing on each reference text in the reference text set to obtain a plurality of reference sub-texts corresponding to each reference text. The server counts the number of occurrences of each distinct reference sub-text among the obtained reference sub-texts, obtaining a first mapping relationship between reference sub-texts and word frequencies. The server may compute the first mapping relationship in advance, when resource occupation is low, so that it can be read directly when needed; the server may also compute the first mapping relationship in real time when it is needed, so as to ensure its accuracy, which is not specifically limited.

After obtaining the text to be repeated, the server may perform word segmentation processing on the text to be repeated to obtain corresponding sub-texts to be repeated. After obtaining the first mapping relationship, the server may determine respective word frequencies of the respective sub-texts to be repeated based on the first mapping relationship. After the server obtains the respective word frequency of each sub-text to be repeated, the server can screen out the target sub-text of which the word frequency meets the preset word frequency screening condition from each sub-text to be repeated. The preset word frequency screening condition is, for example, a condition that the word frequency is greater than the specified word frequency, or a condition that the word frequency of each sub-text to be repeated is arranged before the specified sequence number in the sorting result of the order from large to small, and the like, and is not limited specifically.

After the server obtains each target sub-text, the server can analyze the obtained target sub-text to obtain grammatical structure information of the text to be repeated. The grammar structure information may be used to represent a framework structure of an expression mode corresponding to the text to be repeated, and may also be used to represent a connection mode of each sub-text in the expression mode corresponding to the text to be repeated.

For example, in a question-and-answer scenario, words related to asking a question, such as "why", "how to do", "solution" or "how", occur relatively frequently. In the reference text set associated with the question-and-answer scenario, these words are high-frequency words, that is, words whose frequency is greater than the specified word frequency. The server may screen out the target sub-texts with a higher word frequency from the text to be repeated and obtain the grammatical structure information of the text to be repeated. For example, for the target sub-text "how to do", the obtained grammatical structure information may be "modifier + verb + noun + how to do", and the like.

After obtaining each target sub-text, the server may further obtain the sub-text to be repeated except the screened target sub-text in each sub-text to be repeated, and analyze the sub-text to be repeated except the screened target sub-text to obtain semantic information of the text to be repeated.

For example, in a question-and-answer scenario, words that are not related to asking a question, such as "password", "spelling", "sewer" or "recitation", occur relatively infrequently. In the reference text set associated with the question-and-answer scenario, these words are low-frequency words, that is, words whose frequency is less than or equal to the specified word frequency. The server may screen out, from the text to be repeated, the low-frequency sub-texts to be repeated other than the target sub-texts, and obtain the semantic information of the text to be repeated. For example, for the sub-text to be repeated "password", the obtained semantic information may be "modify password" or "query password", and the like.
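Analysis strategy one may be illustrated with the following minimal sketch. It is not the implementation of the embodiment: the whitespace split standing in for word segmentation, the function names `build_word_frequency` and `split_by_frequency`, the sample reference texts and the threshold value are all assumptions introduced for illustration.

```python
from collections import Counter

def build_word_frequency(reference_texts):
    """First mapping relationship: reference sub-text -> number of occurrences."""
    counts = Counter()
    for text in reference_texts:
        # Assumption: a whitespace split stands in for real word segmentation.
        counts.update(text.lower().split())
    return counts

def split_by_frequency(text, word_frequency, min_frequency):
    """Split the text to be repeated into high-frequency target sub-texts
    (used for grammatical structure) and the remaining sub-texts (used for semantics)."""
    targets, remainder = [], []
    for sub_text in text.lower().split():
        if word_frequency[sub_text] > min_frequency:
            targets.append(sub_text)
        else:
            remainder.append(sub_text)
    return targets, remainder

# Hypothetical reference text set for a question-and-answer scene.
reference_texts = [
    "how to modify the password",
    "how to query the balance",
    "how to reset the router",
]
freq = build_word_frequency(reference_texts)
structure_words, semantic_words = split_by_frequency(
    "how to modify the password", freq, min_frequency=2
)
```

With this toy data, the question-related words end up in `structure_words` and the topic words in `semantic_words`. The "word frequency greater than a specified value" condition is used here; a top-N ranking condition could be substituted in the same place.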

Analysis strategy two:

The grammatical structure information and semantic information of the text to be repeated are determined according to part of speech.

After performing word segmentation processing on the text to be repeated to obtain the corresponding sub-texts to be repeated, the server may analyze the part-of-speech information of each sub-text to be repeated and the association relationships among the sub-texts to be repeated within the text to be repeated, so as to obtain the grammatical structure information and semantic information of the text to be repeated. The part-of-speech information characterizes the part of speech of a sub-text to be repeated; for example, in Chinese, parts of speech include nouns, verbs, adjectives and adverbs. The association relationship describes how the sub-texts to be repeated relate to each other within the text to be repeated. For example, in Chinese, when an adjective modifies a noun, there is a modification relationship, that is, an association relationship, between the adjective and the noun; likewise, when an adverb modifies a verb, there is a modification relationship, that is, an association relationship, between the adverb and the verb.

After obtaining the part-of-speech information and the association relationships of the sub-texts to be repeated, the server may determine the sub-text level corresponding to the part-of-speech information of each sub-text to be repeated, based on a preset second mapping relationship between part-of-speech information and sub-text levels. The server may divide parts of speech into different levels according to their importance in a sentence. For example, a verb indicates the action performed by a subject and is of higher importance, so verbs may be assigned to the first sub-text level; a noun indicates the subject performing the action or the object of the action and is second in importance to the verb, so nouns may be assigned to the second sub-text level; and so on. In this way, the second mapping relationship between part-of-speech information and sub-text levels is obtained.

As an embodiment, the server may construct the second mapping relationship as a tree structure, where the first sub-text level serves as a root, the second sub-text level serves as a first branch of the root, and the third sub-text level serves as a second branch of the first branch, and the like, which is not limited in particular.

After obtaining the second mapping relationship, the server may screen out, from the sub-texts to be repeated and based on their association relationships within the text to be repeated, the target sub-texts whose sub-text level is within a preset level range and which have no association relationship with a specified sub-text level. The preset level range may be a set of pre-designated sub-text levels, such as the first sub-text level and the second sub-text level, which are of higher importance, or, for example, the root and the first branch of the tree structure, which is not specifically limited. The specified sub-text level may be a level corresponding to sub-texts that do not affect the grammatical structure information, for example, the sub-text level corresponding to prepositions or the sub-text level corresponding to interjections, which is not specifically limited. Therefore, the screened target sub-texts are all sub-texts to be repeated that can be used to represent the structural information of the text to be repeated.

After obtaining the target sub-text, the server may analyze the obtained target sub-text to obtain grammatical structure information of the text to be repeated, which may specifically refer to the foregoing description and is not described herein again.

After the server obtains the target sub-texts, the server can correspondingly obtain the sub-texts to be repeated in each sub-text to be repeated except the screened target sub-text. The server may analyze the sub-texts to be repeated except for the screened target sub-text in each sub-text to be repeated to obtain semantic information of the text to be repeated, which may specifically refer to the foregoing description and is not described herein again.
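The screening step of analysis strategy two may be sketched as follows. This is only an illustration under stated assumptions: the part-of-speech tags and association relationships are supplied by hand instead of by a real tagger or parser, and the `POS_LEVEL` table, the level numbers and the function name `screen_targets` are hypothetical.

```python
# Hypothetical second mapping relationship: part of speech -> sub-text level
# (1 is the most important level).
POS_LEVEL = {
    "verb": 1,
    "noun": 2,
    "adjective": 3,
    "adverb": 3,
    "preposition": 4,
    "interjection": 4,
}

def screen_targets(tagged, relations, level_range=(1, 2), excluded_levels=(4,)):
    """tagged: list of (sub-text, part of speech).
    relations: set of (i, j) index pairs that have an association relationship.
    Returns (target sub-texts, remaining sub-texts)."""
    levels = [POS_LEVEL[pos] for _, pos in tagged]

    def associated_levels(i):
        # Levels of every sub-text associated with sub-text i.
        found = []
        for a, b in relations:
            if a == i:
                found.append(levels[b])
            elif b == i:
                found.append(levels[a])
        return found

    targets, remainder = [], []
    for i, (sub_text, _) in enumerate(tagged):
        in_range = level_range[0] <= levels[i] <= level_range[1]
        touches_excluded = any(lv in excluded_levels for lv in associated_levels(i))
        if in_range and not touches_excluded:
            targets.append(sub_text)
        else:
            remainder.append(sub_text)
    return targets, remainder

# Hand-tagged toy input: "student" relates to "enter", "in" relates to "classroom".
tagged = [("student", "noun"), ("enter", "verb"), ("in", "preposition"), ("classroom", "noun")]
relations = {(0, 1), (2, 3)}
targets, remainder = screen_targets(tagged, relations)
```

In this toy input, "classroom" is a noun within the preset level range, but it is associated with a preposition at the specified (excluded) level, so it is not kept as a target sub-text.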

S303, the server determines a text composition template of the text to be repeated based on the grammatical structure information, and screens out at least one target text template which meets preset matching conditions with the text composition template from a pre-stored candidate text template set.

If the candidate text template set is pre-stored, the server can obtain it directly. Among the candidate text templates contained in the candidate text template set, candidate text templates that are in a repeat relationship with each other have the same repeat mark. In some cases, if no candidate text template set exists, the server may obtain a candidate text template set based on a pre-stored sample text set. Among the sample texts contained in the sample text set, sample texts that are in a repeat relationship with each other have the same repeat mark.

Referring to fig. 5, a process of obtaining the candidate text template set based on the sample text set may be that, after obtaining the sample text set, the server determines a sample text template for each sample text based on the respective syntactic structure information of each sample text. Thus, the server may establish a set of candidate text templates based on the respective sample text templates of the respective sample texts and the respective repeat tags of the respective sample texts.

For example, if the candidate text template "how … …" and the candidate text template "how … …" in the candidate text template set are in a repeat relationship with each other, the two candidate text templates may have the same repeat mark, for example "0001"; the candidate text templates marked "0001" in the candidate text template set are then in a repeat relationship with each other.

Please refer to table 1, which is a storage form of a candidate text template set.

TABLE 1

Candidate text templates	Candidate text templates	Repeat mark
___ how much	___ good for doing	01
___ to ___ aircraft	Flight from ___ to ___	02
___ notes on	___ what needs to be noted	03
___ what reason is	___ how to get back	04
___ which family is better	Preferably, ___	05

For another example, in the sample text set, the sample text "how to modify the password" and the sample text "how to modify the password" are in a repeated relationship with each other, then the two sample texts may have the same repeated mark, such as "0001", and then in the sample text set, the sample texts with the mark "0001" are in a repeated relationship with each other. Therefore, in the candidate text template set determined based on the sample text set, the candidate text templates having a mutual repeating relationship also have the same repeating mark.
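The relationship between the sample text set, the candidate text template set and the repeat marks may be sketched as follows. The dictionary layout, the function names and the `TEMPLATE_OF` lookup (a stand-in for deriving a template from grammatical structure information) are assumptions made for illustration; the embodiment does not prescribe a storage format beyond Table 1.

```python
from collections import defaultdict

def build_template_set(samples, to_template):
    """samples: list of (sample text, repeat mark).
    to_template: callable mapping a sample text to its sample text template.
    Returns {repeat mark: [candidate text templates]}."""
    template_set = defaultdict(list)
    for text, mark in samples:
        template = to_template(text)
        if template not in template_set[mark]:
            template_set[mark].append(template)
    return dict(template_set)

def templates_in_repeat_relation(template_set, template):
    """Return the repeat mark of `template` and the other templates sharing it."""
    for mark, templates in template_set.items():
        if template in templates:
            return mark, [t for t in templates if t != template]
    return None, []

# Hypothetical sample texts; texts with the same mark repeat each other.
samples = [
    ("how to modify the password", "0001"),
    ("what is the way to modify the password", "0001"),
    ("flight from Beijing to Shanghai", "0002"),
]
# Assumption: a lookup table stands in for template extraction.
TEMPLATE_OF = {
    "how to modify the password": "how to ___",
    "what is the way to modify the password": "what is the way to ___",
    "flight from Beijing to Shanghai": "flight from ___ to ___",
}
template_set = build_template_set(samples, TEMPLATE_OF.get)
mark, others = templates_in_repeat_relation(template_set, "how to ___")
```

Grouping by repeat mark makes "find all templates that repeat this one" a single dictionary lookup, which is the operation the later matching steps rely on.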

After obtaining the grammatical structure information of the text to be repeated, the server may determine the text composition template of the text to be repeated based on the grammatical structure information. After obtaining the text composition template of the text to be repeated, the server may match the text composition template against each candidate text template, thereby determining at least one target text template. There are several ways for the server to determine the target text template, two of which are described below as examples.

Method one:

The template feature vector of the text composition template is matched against the respective template feature vectors of the candidate text templates.

After obtaining the text composition template of the text to be repeated, the server may extract the template feature vector of the text composition template and the respective template feature vectors of the candidate text templates. For example, the server extracts these template feature vectors through a trained BERT pre-training model. The server may extract the template feature vector of each candidate text template in advance, when resource occupation is low, so that these vectors can be read directly when needed.

After obtaining the template feature vector of the text composition template and the respective template feature vectors of the candidate text templates, the server determines the vector similarity between the template feature vector of the text composition template and the template feature vector of each candidate text template, obtaining the corresponding similarity results. To determine the vector similarity, the server may calculate the Euclidean distance, the Mahalanobis distance or the cosine similarity between two template feature vectors, and the like, which is not specifically limited.

After obtaining the similarity results between the template feature vector of the text composition template and the template feature vectors of the candidate text templates, the server may determine the maximum similarity result among them. Based on the candidate text template corresponding to the maximum similarity result, the server screens out, from the candidate text template set, the candidate text templates having the same repeat mark as that candidate text template and takes them as target text templates, thereby obtaining at least one target text template.
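Method one may be sketched as follows, using cosine similarity as one of the similarity measures mentioned above. The feature vectors here are made-up three-dimensional values rather than outputs of a BERT model, and the function names are hypothetical. Whether the best-matching template itself is included among the target text templates is not stated; this sketch includes it.

```python
import math

def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def select_target_templates(query_vector, candidates):
    """candidates: list of (template, repeat mark, template feature vector).
    Find the most similar candidate, then return every candidate template
    carrying the same repeat mark."""
    best = max(candidates, key=lambda c: cosine_similarity(query_vector, c[2]))
    best_mark = best[1]
    return [template for template, mark, _ in candidates if mark == best_mark]

# Toy vectors; a real system would obtain these from a feature extractor.
candidates = [
    ("how to ___", "01", [0.9, 0.1, 0.0]),
    ("what is the best way to ___", "01", [0.7, 0.3, 0.1]),
    ("flight from ___ to ___", "02", [0.0, 0.2, 0.9]),
]
query = [1.0, 0.2, 0.0]  # feature vector of the text composition template
target_templates = select_target_templates(query, candidates)
```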

As an embodiment, after obtaining the at least one target text template, the server may update the sample text set and the candidate text template set based on the text to be repeated and the repeat mark of the at least one target text template. The server adds the text to be repeated to the sample text set and gives it the same repeat mark as the sample texts corresponding to the at least one target text template. The server also adds the text composition template of the text to be repeated to the candidate text template set and gives it the same repeat mark as the at least one target text template.

Method two:

A trained template recognition model is used to match the text feature vector of the text to be repeated against the respective template feature vectors of the candidate text templates.

After obtaining the text composition template of the text to be repeated, the server may use the trained template recognition model to fuse the text features of the text to be repeated with the template features of the text composition template, obtaining the text feature vector of the text to be repeated. Through the trained template recognition model, the server matches the text feature vector of the text to be repeated against the respective template feature vectors of the candidate text templates, obtaining the corresponding matching results. Based on the obtained matching results, the server screens out the candidate text templates whose matching results meet the preset matching condition as target text templates. The preset matching condition may be that the matching result is greater than a specified matching threshold, or that the matching result ranks before a specified sequence number when the matching results are sorted in descending order, and the like, which is not specifically limited.
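The two example preset matching conditions may be sketched as follows. The matching results are given as a plain dictionary of made-up scores; in the embodiment they would come from the trained template recognition model. The function names and values are illustrative assumptions.

```python
def screen_by_threshold(match_results, threshold):
    """Keep templates whose matching result is greater than the specified threshold."""
    return [template for template, score in match_results.items() if score > threshold]

def screen_top_k(match_results, k):
    """Keep templates ranked before the specified sequence number (descending order)."""
    ranked = sorted(match_results.items(), key=lambda item: item[1], reverse=True)
    return [template for template, _ in ranked[:k]]

# Hypothetical matching results for three candidate text templates.
match_results = {
    "how to enter ___": 0.62,
    "what is the best way to enter ___": 0.21,
    "flight from ___ to ___": 0.02,
}
above_threshold = screen_by_threshold(match_results, threshold=0.2)
top_one = screen_top_k(match_results, k=1)
```

A threshold yields a variable number of target text templates, while a top-k rule always yields at most k; which is preferable depends on how the later combination step handles many or few templates.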

In one embodiment, before determining the at least one target text template using the trained template recognition model, the server may first obtain the trained template recognition model. The server may receive a trained template recognition model sent by another device, or may train an untrained template recognition model based on the candidate text template set, and the like, which is not specifically limited.

The process of training the untrained template recognition model is described below.

After obtaining the sample text set and the candidate text template set, the server trains the untrained template recognition model based on each sample text. In one training iteration, the server uses the untrained template recognition model to fuse the text features of a sample text with the template features of the candidate text template corresponding to the sample text, obtaining the text feature vector of the sample text. The server matches the text feature vector of the sample text against the respective template feature vectors of the candidate text templates to obtain the corresponding matching results.

The server then determines the training error of the untrained template recognition model based on the candidate text template whose matching result meets the preset matching condition and the candidate text template corresponding to the sample text. The preset matching condition may be, for example, that the matching result is the maximum value, which is not specifically limited. If the training error does not meet the preset convergence condition, the server adjusts the model parameters of the untrained template recognition model and proceeds to the next training iteration, until the obtained training error meets the convergence condition and the trained template recognition model is obtained.
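The iterate-until-convergence procedure may be sketched with a deliberately tiny stand-in model. Everything specific here is an assumption: the weighted dot-product scorer, the softmax cross-entropy used as the training error, the learning rate and the tolerance are chosen only so that the loop is runnable, and do not reflect the model of the embodiment.

```python
import math

def match_probabilities(weights, text_vec, template_vecs):
    """Matching results: softmax over weighted dot products (toy scorer)."""
    scores = [sum(w * x * t for w, x, t in zip(weights, text_vec, tv))
              for tv in template_vecs]
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict(weights, text_vec, template_vecs):
    """Index of the template whose matching result is the maximum."""
    probs = match_probabilities(weights, text_vec, template_vecs)
    return probs.index(max(probs))

def train(samples, template_vecs, lr=1.0, tolerance=0.2, max_rounds=100):
    """samples: list of (text feature vector, index of the corresponding template).
    Repeats training rounds until the mean error meets the convergence condition."""
    weights = [0.0] * len(template_vecs[0])
    error = float("inf")
    for _ in range(max_rounds):
        error = 0.0
        for text_vec, label in samples:
            probs = match_probabilities(weights, text_vec, template_vecs)
            error += -math.log(probs[label]) / len(samples)
            # Adjust model parameters along the negative gradient of the error.
            for i in range(len(weights)):
                grad = sum((p - (k == label)) * text_vec[i] * template_vecs[k][i]
                           for k, p in enumerate(probs))
                weights[i] -= lr * grad
        if error < tolerance:  # preset convergence condition (assumed form)
            break
    return weights, error

# Toy feature vectors: two templates, two sample texts.
template_vecs = [[1.0, 0.0], [0.0, 1.0]]
samples = [([1.0, 0.0], 0), ([0.0, 1.0], 1)]
weights, final_error = train(samples, template_vecs)
```

After training, `predict` returns the template corresponding to each sample text, which mirrors the comparison between the best-matching template and the template associated with the sample.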

S304, the server respectively combines the semantic information with at least one target text template to obtain at least one target repeat text of the text to be repeated.

After obtaining the at least one target text template, the server may, for each target text template in the at least one target text template, combine the semantic information of the text to be repeated with that target text template. Using the trained text prediction model, the server may determine a plurality of sub-text combination strategies corresponding to the target text template based on the template features of the target text template. A sub-text combination strategy is, for example, a particular way of combining sub-texts of different parts of speech, or a way of adding different modifiers to different sub-texts, and the like, which is not specifically limited.

Through the trained text prediction model, the server uses each of the obtained sub-text combination strategies to combine the semantic information with the target text template, obtaining the corresponding combination results. Based on the obtained combination results, the server may screen out the target combination results that meet a preset combination condition as target repeat texts. For example, each combination result has a corresponding matching probability value; the server may screen out the combination results whose matching probability value is greater than a specified threshold as target combination results, or may sort the combination results in descending order of matching probability value and screen out those ranked before a specified sequence number as target combination results, and the like, which is not specifically limited.
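Combining semantic information with a target text template and then screening the combination results may be sketched as follows. Treating each ordering of the semantic sub-texts as one "combination strategy" is a simplifying assumption, as are the `___` slot marker (borrowed from Table 1), the function names and the lookup table that stands in for the matching probability produced by the text prediction model.

```python
from itertools import permutations

def combine(template, semantic_sub_texts, slot="___"):
    """Fill the template slots with every ordering of the semantic sub-texts."""
    n_slots = template.count(slot)
    results = []
    for ordering in permutations(semantic_sub_texts, n_slots):
        text = template
        for sub_text in ordering:
            text = text.replace(slot, sub_text, 1)  # fill the leftmost open slot
        results.append(text)
    return results

def screen(results, probability, threshold):
    """Keep combination results whose matching probability exceeds the threshold."""
    return [text for text in results if probability(text) > threshold]

combination_results = combine("flight from ___ to ___", ["Beijing", "Shanghai"])

# Hypothetical probabilities; a trained model would supply these values.
toy_probability = {
    "flight from Beijing to Shanghai": 0.9,
    "flight from Shanghai to Beijing": 0.1,
}
target_repeat_texts = screen(combination_results, toy_probability.get, threshold=0.5)
```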

In one embodiment, before determining the at least one target repeat text using the trained text prediction model, the server may first obtain the trained text prediction model. The server may receive a trained text prediction model sent by another device, or may train an untrained text prediction model based on the candidate text template set and the sample text set, and the like, which is not specifically limited.

The process of training an untrained text prediction model is described below.

After obtaining the sample text set and the candidate text template set, the server may train the untrained text prediction model based on each candidate text template. In one training iteration, the server uses the untrained text prediction model to determine a plurality of sub-text combination strategies corresponding to a candidate text template based on the template features of the candidate text template. Through the untrained text prediction model, the server uses each of the obtained sub-text combination strategies to combine the semantic information of the sample text with the candidate text template, obtaining the corresponding combination results. Based on the obtained combination results, the server may screen out the target combination result that meets the preset combination condition as the training repeat text.

The server then determines the training error of the untrained text prediction model based on the training repeat text and the sample text corresponding to the candidate text template. If the training error does not meet the preset convergence condition, the server adjusts the model parameters of the untrained text prediction model and proceeds to the next training iteration, until the obtained training error meets the convergence condition and the trained text prediction model is obtained.

As an embodiment, when the obtained sub-text combination strategies are used to combine the semantic information with the target text template, the server may also select, from a pre-stored reference semantic set, the reference semantic sub-texts that are similar to the semantic information, and use the obtained sub-text combination strategies to combine each such reference semantic sub-text with the target text template. In this way, multiple reasonable and commonly used expressions of the text to be repeated can be obtained from the semantic perspective, which further improves the accuracy and diversity of the repeated text.

As an embodiment, the process of determining the target text template and the target repeat text may be completed by beam search. Beam search is a heuristic graph search algorithm: at each step of a breadth-first search, only a subset of higher-quality nodes is retained, which balances the time and space cost of an exhaustive search against the tendency of greedy search to fall into local optima.
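A generic beam search may be sketched as follows. The transition table, token names and probabilities are invented for illustration and are unrelated to any model in the embodiment; the sketch only shows how keeping more than one partial result per step can find a sequence that a greedy search (beam width 1) misses.

```python
import math

def beam_search(start, expand, beam_width, max_steps):
    """expand(sequence) returns a list of (next token, probability);
    an empty list means the sequence is complete.
    Returns the complete or partial sequence with the highest total log probability."""
    beams = [(0.0, [start])]  # (log probability, sequence)
    finished = []
    for _ in range(max_steps):
        candidates = []
        for log_prob, seq in beams:
            options = expand(seq)
            if not options:
                finished.append((log_prob, seq))
            for token, prob in options:
                candidates.append((log_prob + math.log(prob), seq + [token]))
        # Retain only the `beam_width` highest-scoring partial sequences.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = candidates[:beam_width]
        if not beams:
            break
    finished.extend(beams)  # sequences cut off by max_steps
    return max(finished, key=lambda c: c[0])[1]

# Hypothetical next-token probabilities keyed by the last token.
TRANSITIONS = {
    "<s>": [("how", 0.6), ("what", 0.4)],
    "how": [("to", 0.5), ("can", 0.5)],
    "what": [("way", 1.0)],
}

def expand(seq):
    return TRANSITIONS.get(seq[-1], [])

greedy_result = beam_search("<s>", expand, beam_width=1, max_steps=5)
beam_result = beam_search("<s>", expand, beam_width=2, max_steps=5)
```

With a beam width of 1 the search commits to "how" (probability 0.6) and ends with a total probability of 0.3, whereas a beam width of 2 also keeps "what" and reaches a sequence with total probability 0.4.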

In the embodiment of the application, after the target text template is obtained, the candidate text template set can be expanded based on the text composition template of the text to be repeated, and the sample text set can be expanded based on the text to be repeated. Consequently, when the template recognition model and the text prediction model are trained based on the candidate text template set and the sample text set, more accurate trained models can be obtained, and the robustness of the trained template recognition model and the trained text prediction model is improved. In addition, the grammatical structure of the text to be repeated is replaced with other grammatical structures that express the same meaning, and these are combined with the semantic information of the text to be repeated, so that the difference in wording between the obtained target repeat text and the text to be repeated is increased while the difference in meaning between them is reduced.

The method for rephrasing a text can be used to expand a question-and-answer index knowledge base and to keep the knowledge base dynamically expandable, thereby improving the reply accuracy of a question-and-answer system built on the knowledge base, the accuracy of its related-question recommendations, and the like. By expanding the question-and-answer index knowledge base, a question-and-answer system built on it can reply accurately, recommend related questions, or expand dynamically even under cold-start conditions.

The following description takes a question-and-answer index knowledge base scenario as an example, with the template recognition model and the text prediction model both constructed based on a Transformer model.

Referring to fig. 6a, when the method for rephrasing a text provided in the embodiment of the present application is not used, after the server obtains the input text "the student does not go to the classroom and is prohibited from entering" in response to an input operation of the target client, the server cannot determine a solution corresponding to this input text from the pre-stored correspondence between problems and solutions. The server therefore feeds back the preset output text "I still need to learn more and do not understand what you said" to the target client.

When the method for rephrasing a text provided by the embodiment of the application is used, after the server obtains the text to be repeated "the student does not go to the classroom and is prohibited from entering", the server analyzes the text to be repeated and obtains the sub-texts to be repeated, which include "student", "classroom", "prohibited", "entering" and "prohibition".

The server screens out the target sub-texts, including "prohibited" and "entering", from the sub-texts to be repeated, and analyzes the target sub-texts to obtain the grammatical structure information of the text to be repeated. The server analyzes the sub-texts to be repeated other than the screened target sub-texts to obtain the semantic information of the text to be repeated.

Based on the grammatical structure information of the text to be repeated, the server determines the text composition template of the text to be repeated, "subject + prohibited from entering + object", and screens out, from a pre-stored candidate text template set, at least one target text template that satisfies a preset matching condition with the text composition template. The screening of the target text template may be performed using a trained template recognition model.

Referring to fig. 6b, taking the case where the trained template recognition model is a Transformer model as an example, the Transformer model includes an encoding sub-model and a decoding sub-model, and the encoding sub-model includes an encoding sub-model for the text to be repeated and an encoding sub-model for the text composition template.

A position in the text composition template that represents semantic information may be occupied by the special token "[MASK]". For example, the text composition template "subject + prohibited from entering + object" may be represented as "[MASK] prohibited from entering [MASK]".
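As an illustration only, the masking step described above can be sketched in Python. The slot names in SEMANTIC_SLOTS and the helper mask_template are hypothetical and are not part of the embodiment; the sketch assumes that a template is written as parts joined by "+".

```python
MASK = "[MASK]"

# Hypothetical slot names that stand for semantic information (assumption).
SEMANTIC_SLOTS = {"subject", "object", "adverbial"}


def mask_template(template: str) -> str:
    """Replace every semantic slot of an 'a + b + c' template with MASK."""
    parts = [part.strip() for part in template.split("+")]
    masked = [MASK if part in SEMANTIC_SLOTS else part for part in parts]
    return " ".join(masked)


masked_template = mask_template("subject + prohibited from entering + object")
print(masked_template)  # [MASK] prohibited from entering [MASK]
```

Parts that are not semantic slots, such as "prohibited from entering", are kept verbatim because they carry the grammatical structure of the template.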

The server inputs the text to be repeated, "the student does not go to the classroom and is prohibited from entering", into the encoding sub-model for the text to be repeated to obtain the text features of the text to be repeated, and inputs the text composition template of the text to be repeated, "[MASK] prohibited from entering [MASK]", into the encoding sub-model for the text composition template to obtain the template features of the text composition template. The text feature vector of the text to be repeated is obtained by fusing the text features of the text to be repeated with the template features of the text composition template. The server inputs the text feature vector into the decoding sub-model, obtains the matching results between the text feature vector and the template feature vectors of the candidate text templates, and outputs at least one target text template whose matching result satisfies the preset matching condition, for example "how to enter ___", "what is the best way to enter ___" or "how should I enter ___".
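The fuse-then-match flow above can be sketched as follows. This is a minimal sketch under stated assumptions and not the Transformer-based model of the embodiment: hand-written toy vectors stand in for the encoder outputs, fusion is assumed to be an element-wise average, and the preset matching condition is assumed to be a cosine-similarity threshold. The candidate "how do I reset ___" and all numeric values are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def fuse(text_features, template_features):
    """Fuse two feature vectors by element-wise averaging (assumption)."""
    return [(t + p) / 2 for t, p in zip(text_features, template_features)]


def match_templates(text_vector, candidates, threshold=0.8):
    """Return candidate templates whose similarity reaches the threshold,
    ordered from the best match to the worst."""
    scored = [(cosine(text_vector, vec), name) for name, vec in candidates.items()]
    return [name for score, name in sorted(scored, reverse=True) if score >= threshold]


# Toy stand-ins for encoder outputs (hypothetical values).
text_features = [0.9, 0.1, 0.4]
template_features = [0.7, 0.3, 0.2]
candidates = {
    "how to enter ___": [0.8, 0.2, 0.3],
    "what is the best way to enter ___": [0.75, 0.25, 0.35],
    "how do I reset ___": [0.1, 0.9, 0.2],
}

targets = match_templates(fuse(text_features, template_features), candidates)
print(targets)
```

In this toy setting the two "enter" templates pass the threshold and the unrelated template is filtered out; a trained model would learn the features and the matching function instead.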

After obtaining the at least one target text template, the server may determine the target repeat text using the trained text prediction model. Referring to fig. 6c, the text prediction model includes an encoding sub-model and a decoding sub-model, and the encoding sub-model includes an encoding sub-model for the text to be repeated and an encoding sub-model for the target text template.

The server inputs the semantic information of the text to be repeated, "the student does not go to the classroom and is prohibited from entering", into the encoding sub-model for the text to be repeated, and inputs the target text template "how to enter ___", taken from the at least one target text template, into the encoding sub-model for the target text template, to obtain a plurality of sub-text combination strategies, including "subject + how to enter + object", "subject + how + adverbial + enter + object", "how to enter + object", and the like. For each sub-text combination strategy, the semantic information and the target text template are combined to obtain a corresponding target repeat text, such as "how does the student enter the classroom", "how does the student quickly enter the classroom", "how to enter the classroom", and the like.
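The combination step above can be illustrated with a small slot-filling sketch. It assumes that each sub-text combination strategy is given as an ordered list whose items are either slot names or literal template words; this representation, the helper combine, and the adverbial value "quickly" (which does not occur in the input text and would have to come from elsewhere, for example a reference set) are assumptions for illustration. In the embodiment the strategies are predicted by the trained text prediction model.

```python
# Semantic information extracted from the text to be repeated.
# The adverbial value is a hypothetical addition for illustration.
semantic_info = {
    "subject": "the student",
    "adverbial": "quickly",
    "object": "the classroom",
}

# Each strategy is an ordered list of slot names and literal template words.
strategies = [
    ["how does", "subject", "enter", "object"],
    ["how does", "subject", "adverbial", "enter", "object"],
    ["how to enter", "object"],
]


def combine(strategy, semantic_info):
    """Fill slot names from semantic_info and keep literal words as they are."""
    return " ".join(semantic_info.get(item, item) for item in strategy)


repeat_texts = [combine(strategy, semantic_info) for strategy in strategies]
for text in repeat_texts:
    print(text)
```

Keeping the strategies as data rather than code makes it straightforward to add or remove orderings without touching the combination logic.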

Referring to fig. 6d, after obtaining the target repeat text based on the text to be repeated, the server may determine the solution corresponding to the target repeat text in the preset correspondence between problems and solutions, and may feed the obtained solution back to the target client. Referring to fig. 6e, after "the student does not go to the classroom and is prohibited from entering" is input on the target client, the server may display, through the target client, the solution "the student may switch to another account to log in and try to enter the classroom".

Referring to fig. 6f, after problems such as "the student does not go to the classroom and is prohibited from entering", "does not go to the classroom", "does not let enter the classroom" or "cannot go to the classroom" are input on the target client, the server can find the corresponding solution through the determined target repeat text and feed the solution back to the target client. Meanwhile, a correspondence between each of these input problems and the solution can be established, thereby enriching the question-and-answer knowledge base.
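The lookup-and-enrich behaviour described above can be sketched as follows. The dictionary-based knowledge base, the function answer, and the stub rephrase function are hypothetical stand-ins; in the embodiment the target repeat texts come from the rephrasing method rather than from a fixed table.

```python
SOLUTION = "the student may switch to another account to log in and try to enter the classroom"

# Pre-stored correspondence between problems and solutions.
knowledge_base = {"how does the student enter the classroom": SOLUTION}

# Stub for the rephrasing method (assumption): maps an input text to target repeat texts.
REPHRASE_TABLE = {
    "the student does not go to the classroom and is prohibited from entering": [
        "how does the student enter the classroom",
    ],
    "cannot go to the classroom": ["how does the student enter the classroom"],
}


def rephrase(text):
    return REPHRASE_TABLE.get(text, [])


def answer(question, knowledge_base, rephrase):
    """Look up a solution directly or through target repeat texts, and store
    the new question in the knowledge base when a solution is found."""
    if question in knowledge_base:
        return knowledge_base[question]
    for candidate in rephrase(question):
        if candidate in knowledge_base:
            solution = knowledge_base[candidate]
            knowledge_base[question] = solution  # enrich the knowledge base
            return solution
    return None  # the caller may fall back to a preset output text


reply = answer("cannot go to the classroom", knowledge_base, rephrase)
print(reply)
```

After the first successful lookup the original wording is itself a key of the knowledge base, so later identical questions no longer need to be rephrased.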

Based on the same inventive concept, the embodiment of the present application provides an apparatus for rephrasing text, which corresponds to the server discussed above and can implement the functions of the method for rephrasing text. Referring to fig. 7, the apparatus includes an obtaining module 701 and a processing module 702, wherein:

the obtaining module 701 is configured to obtain a text to be repeated;

the processing module 702 is configured to parse the text to be repeated based on a preset parsing strategy to obtain grammatical structure information and semantic information of the text to be repeated;

the processing module 702 is further configured to: determining a text composition template of a text to be repeated based on the grammatical structure information, and screening out at least one target text template which meets preset matching conditions with the text composition template from a pre-stored candidate text template set;

the processing module 702 is further configured to: combine the semantic information with each of the at least one target text template to obtain at least one target repeat text of the text to be repeated.

In a possible embodiment, the obtaining module 701 is specifically configured to:

for at least one piece of input information, the following operations are respectively performed:

if a piece of input information is multimedia information, performing text recognition processing on the multimedia information to obtain at least one input sub-text;

In a possible embodiment, the processing module 702 is further configured to:

before parsing the text to be repeated based on the preset parsing strategy to obtain the grammatical structure information and semantic information of the text to be repeated, obtaining a preset reference text set;

In a possible embodiment, the processing module 702 is specifically configured to:

screening out target sub-texts with the word frequencies meeting preset word frequency screening conditions from each sub-text to be repeated, and analyzing the obtained target sub-texts to obtain grammatical structure information of the text to be repeated;

determining a sub-text level corresponding to the part of speech information of each sub-text to be repeated based on a second mapping relation between the preset part of speech information and the sub-text level;

screening out target sub-texts of which the sub-text levels are within a preset level range and have no association with the designated sub-text levels based on the association relation of each sub-text to be repeated in the text to be repeated, and analyzing the obtained target sub-texts to obtain grammatical structure information of the text to be repeated;
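The word-frequency screening listed above can be sketched as follows. The sketch assumes that word frequencies are counted over a pre-segmented reference text set and that the preset word frequency screening condition is a minimum count; the function name, the threshold, and the toy reference texts are hypothetical.

```python
from collections import Counter


def screen_target_subtexts(sub_texts, reference_texts, min_frequency=2):
    """Keep the sub-texts whose frequency in the reference texts reaches
    min_frequency (the screening condition assumed here)."""
    counts = Counter(word for text in reference_texts for word in text)
    return [sub_text for sub_text in sub_texts if counts[sub_text] >= min_frequency]


# Hypothetical, pre-segmented reference text set.
reference_texts = [
    ["teacher", "prohibited", "entering", "meeting"],
    ["guest", "prohibited", "entering", "library"],
    ["student", "likes", "classroom"],
]

sub_texts = ["student", "classroom", "prohibited", "entering"]
target_sub_texts = screen_target_subtexts(sub_texts, reference_texts)
print(target_sub_texts)  # ['prohibited', 'entering']
```

Sub-texts that recur across many reference texts tend to be structural words, so they are treated as target sub-texts, while the remaining sub-texts carry the semantic information.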

In a possible embodiment, the processing module 702 is further configured to:

before screening out, from the pre-stored candidate text template set, at least one target text template that satisfies the preset matching condition with the text composition template, obtaining a pre-stored sample text set, wherein, among the sample texts included in the sample text set, sample texts that are in a repeat relationship with one another carry the same repeat mark;

and establishing the candidate text template set based on the respective sample text templates of the sample texts and the respective repeat marks of the sample texts, wherein, among the candidate text templates included in the candidate text template set, candidate text templates that are in a repeat relationship with one another carry the same repeat mark.

extracting the template feature vector of the text composition template and the respective template feature vectors of the candidate text templates;

and screening out, based on the obtained similarity results, the candidate text templates in the candidate text template set that have the same repeat mark as the candidate text template corresponding to the maximum similarity result, as the target text templates.
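The selection by repeat mark described above can be sketched as follows. The similarity values, the mark labels "R1" and "R2", and the function name are hypothetical; the sketch also assumes that the best-matching template itself is included in the result, which the text does not state explicitly.

```python
def select_by_repeat_mark(similarities, repeat_marks):
    """Find the candidate with the maximum similarity, then return every
    candidate template that carries the same repeat mark."""
    best = max(similarities, key=similarities.get)
    mark = repeat_marks[best]
    return [template for template, m in repeat_marks.items() if m == mark]


# Hypothetical repeat marks: templates that rephrase one another share a mark.
repeat_marks = {
    "how to enter ___": "R1",
    "what is the best way to enter ___": "R1",
    "how should I enter ___": "R1",
    "how to pay for ___": "R2",
}

# Hypothetical similarity results against the text composition template.
similarities = {
    "how to enter ___": 0.91,
    "what is the best way to enter ___": 0.74,
    "how should I enter ___": 0.69,
    "how to pay for ___": 0.35,
}

target_templates = select_by_repeat_mark(similarities, repeat_marks)
print(target_templates)
```

Only one similarity comparison has to succeed; the repeat marks then bring in the remaining templates of the same group even if their own similarity results are lower.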

In a possible embodiment, the processing module 702 is further configured to:

and after the semantic information and the at least one target text template are respectively combined to obtain at least one target repeated text of the text to be repeated, updating the sample text set and the candidate text template set based on the text to be repeated and the at least one target text template.

using a trained template recognition model, fusing the text features of the text to be repeated with the template features of the text composition template to obtain a text feature vector of the text to be repeated;

matching the text feature vector of the text to be repeated against the respective template feature vectors of the candidate text templates to obtain corresponding matching results;

using a trained text prediction model, determining a plurality of sub-text combination strategies corresponding to the target text template based on the template features of the target text template;

combining the semantic information and the target text template using each of the obtained sub-text combination strategies to obtain corresponding combination results;

and screening out, from the obtained combination results, the target combination results that satisfy a preset combination condition, as the target repeat texts.

acquiring a pre-stored reference semantic sub-text set;

Based on the same inventive concept, the embodiment of the present application provides a computer device, and the computer device 800 is described below.

Referring to fig. 8, the apparatus for rephrasing text may run on a computer device 800, on which a current version and historical versions of a data storage program, as well as application software corresponding to the data storage program, may be installed. The computer device 800 includes a display unit 840, a processor 880 and a memory 820, wherein the display unit 840 includes a display panel 841 for displaying an interface for user interaction, and the like.

In one possible embodiment, the display panel 841 may be configured in the form of a Liquid Crystal Display (LCD), an Organic Light-Emitting Diode (OLED) display, or the like.

The processor 880 is configured to read a computer program and then execute the method defined by the computer program. For example, the processor 880 reads a data storage program, a file, or the like, so as to run the data storage program on the computer device 800 and display a corresponding interface on the display unit 840. The processor 880 may include one or more general-purpose processors, and may further include one or more Digital Signal Processors (DSPs), for performing the relevant operations to implement the technical solutions provided in the embodiments of the present application.

The memory 820 typically includes an internal memory and an external memory. The internal memory may be a Random Access Memory (RAM), a Read-Only Memory (ROM), a cache memory (CACHE), or the like. The external memory may be a hard disk, an optical disk, a USB disk, a floppy disk, a tape drive, or the like. The memory 820 is used for storing computer programs, including the application programs corresponding to the respective clients, and other data, which may include data generated after the operating system or the application programs are run, including system data (e.g., configuration parameters of the operating system) and user data. Program instructions in the embodiments of the present application are stored in the memory 820, and the processor 880 executes the program instructions stored in the memory 820 to implement any of the methods for rephrasing text discussed above with reference to the preceding figures.

The display unit 840 is configured to receive input numerical information, character information, or contact touch operations/non-contact gestures, and to generate signal inputs related to user settings and function control of the computer device 800. Specifically, in the embodiment of the present application, the display unit 840 may include a display panel 841. The display panel 841, such as a touch screen, may collect touch operations of a user on or near it (e.g., operations performed by the user on or near the display panel 841 using a finger, a stylus, or any other suitable object or accessory), and drive a corresponding connection device according to a preset program.

In one possible embodiment, the display panel 841 may include two parts: a touch detection device and a touch controller. The touch detection device detects the touch orientation of the user, detects the signal produced by the touch operation, and transmits the signal to the touch controller; the touch controller receives the touch information from the touch detection device, converts it into touch point coordinates, sends the touch point coordinates to the processor 880, and can receive and execute commands from the processor 880.

The display panel 841 may be implemented in various types, such as resistive, capacitive, infrared, and surface acoustic wave types. In addition to the display unit 840, the computer device 800 may also include an input unit 830, which may include a graphical input device 831 and other input devices 832, wherein the other input devices 832 may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control keys, switch keys, etc.), a trackball, a mouse, a joystick, and the like.

In addition to the above, computer device 800 may also include a power supply 890 for powering the other modules, audio circuitry 860, near field communication module 870, and RF circuitry 810. The computer device 800 may also include one or more sensors 850, such as acceleration sensors, light sensors, pressure sensors, and the like. The audio circuit 860 specifically includes a speaker 861, a microphone 862, and the like, for example, the computer device 800 may collect the sound of the user through the microphone 862 and perform corresponding operations.

In one embodiment, the number of processors 880 may be one or more, and the processor 880 and the memory 820 may be coupled to each other or may be relatively independent.

Processor 880 of fig. 8 may be used to implement the functionality of acquisition module 701 and processing module 702 of fig. 7, as an example.

As an example, the processor 880 in fig. 8 may be used to implement the corresponding functions of the server 102 discussed above.

Those of ordinary skill in the art will understand that all or part of the steps for implementing the method embodiments may be implemented by hardware related to program instructions. The program may be stored in a computer-readable storage medium and, when executed, performs the steps of the method embodiments. The aforementioned storage medium includes a removable storage device, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or various other media capable of storing program code.

Alternatively, the integrated unit of the present invention may be stored in a computer-readable storage medium if it is implemented in the form of a software functional module and sold or used as a separate product. Based on such understanding, the technical solutions of the embodiments of the present invention may be essentially implemented or a part contributing to the prior art may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the methods described in the embodiments of the present invention. And the aforementioned storage medium includes: a removable storage device, a ROM, a RAM, a magnetic or optical disk, or various other media that can store program code.

It will be apparent to those skilled in the art that various changes and modifications may be made in the present application without departing from the spirit and scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims of the present application and their equivalents, the present application is intended to include such modifications and variations as well.

Claims

1. A method for rephrasing text, comprising:

obtaining a text to be repeated;

2. The method of claim 1, wherein obtaining text to be repeated comprises:

3. The method according to claim 1, before parsing the text to be repeated based on a preset parsing policy to obtain syntactic structure information and semantic information of the text to be repeated, further comprising:

acquiring a preset reference text set;

4. The method according to claim 3, wherein parsing the text to be repeated based on a preset parsing strategy to obtain syntactic structure information and semantic information of the text to be repeated comprises:

5. The method according to claim 1, wherein parsing the text to be repeated based on a preset parsing strategy to obtain syntactic structure information and semantic information of the text to be repeated comprises:

6. The method according to claim 5, wherein analyzing respective part-of-speech information of each of the sub-texts to be repeated and an association relationship of each of the sub-texts to be repeated in the text to be repeated to obtain syntactic structure information and semantic information of the text to be repeated comprises:

7. The method according to any one of claims 1 to 6, before screening out at least one target text template satisfying a preset matching condition with the text composition template from a pre-stored candidate text template set, further comprising:

obtaining a pre-stored sample text set, wherein the sample text set comprises a plurality of sample texts which are in a repeating relationship with each other and have the same repeating marks;

8. The method according to claim 7, wherein screening out at least one target text template satisfying a preset matching condition with the text composition template from a pre-stored candidate text template set comprises:

9. The method according to claim 7, further comprising, after respectively combining the semantic information and the at least one target text template to obtain at least one target repeat text of the text to be repeated:

updating the sample text set and the candidate text template set based on the text to be repeated and the at least one target text template.

10. The method according to any one of claims 1 to 6, wherein screening out at least one target text template satisfying a preset matching condition with the text composition template from a pre-stored candidate text template set comprises:

11. The method according to any one of claims 1 to 6, wherein respectively combining the semantic information and the at least one target text template to obtain at least one target repeat text of the text to be repeated comprises:

12. The method according to claim 11, wherein the combining the semantic information and the target text template with the obtained sub-text combination strategy respectively to obtain a corresponding combination result comprises:

acquiring a pre-stored reference semantic sub-text set;

13. An apparatus for rephrasing text, comprising:

14. A computer device, comprising:

a memory for storing program instructions;

a processor for calling the program instructions stored in the memory and executing the method according to any one of claims 1 to 12 according to the obtained program instructions.

15. A computer-readable storage medium having stored thereon computer-executable instructions for causing a computer to perform the method of any one of claims 1 to 12.