CN111144093A - Intelligent text processing method and device, electronic equipment and storage medium


Info

Publication number
CN111144093A
Authority
CN
China
Prior art keywords
text
word
neural network
processing model
level
Prior art date
Legal status
Pending
Application number
CN201911367547.4A
Other languages
Chinese (zh)
Inventor
田植良
Current Assignee
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority claimed from CN201911367547.4A
Publication of CN111144093A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048 Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0484 Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range
    • G06F3/04842 Selection of displayed objects or displayed text elements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Machine Translation (AREA)

Abstract

The invention provides an intelligent text processing method, which comprises the following steps: acquiring chapter-level text content corresponding to an operation on a touch screen; extracting a feature vector matched with the chapter-level text content; determining a sentence vector corresponding to the text content; determining at least one word-level hidden variable corresponding to the text content; generating candidate words corresponding to the word-level hidden variables and the selection probability of each candidate word; and selecting at least one candidate word to form a target text corresponding to the text content according to the selection probabilities of the candidate words and the sentence vector corresponding to the text content. The invention also provides a text processing device, an electronic device, and a storage medium. The method and device can process the corresponding chapter-level sentences on the touch screen while the user is selecting text, improving the accuracy and readability of the generated text.

Description

Intelligent text processing method and device, electronic equipment and storage medium
Technical Field
The present invention relates to information processing technologies, and in particular, to an intelligent text processing method and apparatus, an electronic device, and a storage medium.
Background
In the conventional technology, when text information is selected on a touch-screen electronic device (a mobile phone, an iPad, etc.), the limited operable area of the touch screen and users' habit of one-handed operation make it difficult to select text accurately by manually controlling a cursor; users often find they cannot select the intended text, which affects both the speed and the accuracy of their text selection. Artificial intelligence is the theory, method, and technology of using a digital computer, or a machine controlled by a digital computer, to simulate, extend, and expand human intelligence: to perceive the environment, acquire knowledge, and use that knowledge to obtain the best result. Applied artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that machines have the functions of perception, reasoning, and decision-making; in the field of speech processing, recognition of text information is realized by a digital computer or a machine controlled by a digital computer.
Disclosure of Invention
In view of this, embodiments of the present invention provide an intelligent text processing method, an intelligent text processing apparatus, an electronic device, and a storage medium. The technical solution of the embodiments of the present invention is implemented as follows:
the embodiment of the invention provides an intelligent text processing method, which comprises the following steps:
acquiring text content at chapter level corresponding to the operation in the touch screen;
extracting a characteristic vector matched with the text content at the chapter level through a first neural network of a text processing model;
determining a sentence vector corresponding to the text content according to the feature vector through the first neural network of the text processing model;
determining at least one word-level hidden variable corresponding to the text content according to the feature vector through a second neural network of the text processing model;
generating, by a second neural network of the text processing model, candidate words corresponding to the word-level hidden variables and a selected probability of the candidate words according to the at least one word-level hidden variable;
selecting at least one candidate word to form a target text corresponding to the text content according to the selection probability of the candidate word and the sentence vector corresponding to the text content;
and displaying the target text in a display mode corresponding to the selected operation in the touch screen.
In the above scheme, the method further comprises:
acquiring a first training sample set, wherein the first training sample set comprises at least one group of sentence samples at chapter level input through a corresponding touch screen;
denoising the first training sample set to form a corresponding second training sample set;
processing the second training sample set through a text processing model to determine initial parameters of a first neural network and initial parameters of a second neural network in the text processing model;
responding to the initial parameters of the first neural network and the initial parameters of the second neural network, processing the second training sample set through the text processing model, and determining updating parameters corresponding to different neural networks of the text processing model;
and respectively iteratively updating the parameters of the first neural network and the parameters of the second neural network of the text processing model through the second training sample set according to the updating parameters of the text processing model corresponding to different neural networks of the text processing model, so as to realize processing of text contents at corresponding chapter levels in the touch screen through the text processing model.
In the foregoing solution, the performing denoising processing on the first training sample set to form a corresponding second training sample set includes:
determining a dynamic noise threshold value matched with the use environment of the text processing model;
and denoising the first training sample set according to the dynamic noise threshold value to form a second training sample set matched with the dynamic noise threshold value.
In the foregoing solution, the performing denoising processing on the first training sample set to form a corresponding second training sample set includes:
determining a fixed noise threshold corresponding to a usage environment of the text processing model;
and denoising the first training sample set according to the fixed noise threshold value to form a second training sample set matched with the fixed noise threshold value.
In the above scheme, the method further comprises:
determining a set of attention parameters for the second set of training samples in response to a set of training sample dictionaries for the text processing model;
and performing weighting processing on the second training sample set according to the training sample dictionary set and the attention parameter set of the second training sample set so as to realize the adaptation of the second training sample set and the training sample dictionary set of the text processing model.
In the above scheme, the method further comprises:
negative example processing is carried out on the first training sample set to form a negative example sample set corresponding to the first training sample set, wherein the negative example sample set is used for adjusting the parameters of an encoder and a decoder of the first neural network or adjusting the parameters of the encoder and the decoder of a second neural network;
and determining a corresponding evaluation research value according to the negative example sample set, wherein the evaluation research value is used for evaluating a text processing result of the text processing model as a supervision parameter.
In the foregoing solution, the performing negative example processing on the first training sample set includes:
randomly combining the sentences to be output generated by the text processing model to form a negative example sample set corresponding to the first training sample set; or,
and carrying out random deletion processing or replacement processing on the sentence to be output generated by the text processing model to form a negative example sample set corresponding to the first training sample set.
In the foregoing solution, the iteratively updating, according to the update parameters of the text processing model corresponding to different neural networks of the text processing model, the parameters of the first neural network and the parameters of the second neural network of the text processing model through the second training sample set respectively includes:
determining a second noise parameter matched with the second training sample set, wherein the second noise parameter is used for representing the noise value of the parallel sentence samples in the second training sample set;
iteratively updating the parameters of the encoder and the decoder of the first neural network according to the noise value of the second noise parameter until a loss function corresponding to a self-encoding network formed by the encoder and the decoder of the first neural network meets a corresponding convergence condition;
and iteratively updating the parameters of the encoder and the decoder of the second neural network according to the noise value of the second noise parameter until a loss function corresponding to a self-encoding network formed by the encoder and the decoder of the second neural network meets a corresponding convergence condition.
The embodiment of the invention also provides an intelligent text processing device, which comprises:
the information processing module is used for acquiring text contents at chapter levels corresponding to operations in the touch screen;
the information processing module is used for extracting a characteristic vector matched with the text content at the chapter level through a first neural network of the text processing model;
the information processing module is used for determining a sentence vector corresponding to the text content according to the feature vector through the first neural network of the text processing model;
the information processing module is used for determining at least one word-level hidden variable corresponding to the text content according to the feature vector through a second neural network of the text processing model;
the information processing module is used for generating candidate words corresponding to the hidden variables of the word level and the selected probability of the candidate words according to the hidden variables of the at least one word level through a second neural network of the text processing model;
the information processing module is used for selecting at least one candidate word to form a target text corresponding to the text content at the chapter level according to the selection probability of the candidate word and the sentence vector corresponding to the text content;
and the information transmission module is used for displaying the target text in a display mode corresponding to the selected operation in the touch screen.
In the above scheme,
the information processing module is used for triggering the corresponding word segmentation library according to text parameter information carried by text content at chapter level displayed in the touch screen;
the information processing module is used for carrying out word segmentation processing on the chapter-level text content displayed in the touch screen through the word dictionary of the triggered word segmentation library to form different word-level feature vectors;
and the information processing module is used for denoising the different word-level feature vectors to form a word-level feature vector set matched with the text content at the chapter level.
In the above scheme,
the information processing module is used for determining a dynamic noise threshold value matched with the use environment of the text processing model;
the information processing module is used for carrying out denoising processing on the different word-level feature vectors according to the dynamic noise threshold value and triggering a dynamic word segmentation strategy matched with the dynamic noise threshold value;
and the information processing module is used for performing word segmentation processing on text contents at chapter level displayed in the touch screen according to a dynamic word segmentation strategy matched with the dynamic noise threshold value to form a corresponding dynamic word level feature vector set.
In the above scheme,
the information processing module is used for determining a fixed noise threshold value corresponding to the use environment of the text processing model;
the information processing module is used for denoising the different word-level feature vectors according to the fixed noise threshold and triggering a fixed word segmentation strategy matched with the fixed noise threshold;
and the information processing module is used for performing word segmentation processing on the chapter-level target text displayed in the touch screen according to a fixed word segmentation strategy matched with the fixed noise threshold value to form a corresponding fixed word-level feature vector set.
In the above scheme,
the information processing module is configured to select at least one candidate word to form a target text corresponding to the text content according to the selected probability of the candidate word and the sentence vector corresponding to the text content, and includes:
the information processing module is used for matching the sentence vector corresponding to the text content with the text content displayed in the touch screen;
the information processing module is used for carrying out fusion processing on the selected probability of the candidate words according to the matching result of the sentence vector corresponding to the text content and the text content at the chapter level displayed in the touch screen;
and the information processing module is used for selecting at least one candidate word to form a target text corresponding to the text content at the chapter level according to the fusion processing result of the selected probability of the candidate word.
In the above scheme,
the information transmission module is used for acquiring a first training sample set, wherein the first training sample set comprises at least one group of sentence samples at chapter level input through a corresponding touch screen;
the information processing module is used for carrying out denoising processing on the first training sample set to form a corresponding second training sample set;
the information processing module is used for processing the second training sample set through a text processing model so as to determine initial parameters of a first neural network and initial parameters of a second neural network in the text processing model;
the information processing module is used for responding to the initial parameters of the first neural network and the initial parameters of the second neural network, processing the second training sample set through the text processing model, and determining the updating parameters corresponding to different neural networks of the text processing model;
the information processing module is used for respectively carrying out iterative updating on the parameters of the first neural network and the parameters of the second neural network of the text processing model through the second training sample set according to the updating parameters of the text processing model corresponding to different neural networks of the text processing model, so as to realize processing of text contents at corresponding chapter levels in the touch screen through the text processing model.
In the above scheme,
the information processing module is used for determining a dynamic noise threshold value matched with the use environment of the text processing model;
and the information processing module is used for carrying out denoising processing on the first training sample set according to the dynamic noise threshold value so as to form a second training sample set matched with the dynamic noise threshold value.
In the above scheme,
the information processing module is used for determining a fixed noise threshold value corresponding to the use environment of the text processing model;
and the information processing module is used for carrying out denoising processing on the first training sample set according to the fixed noise threshold value so as to form a second training sample set matched with the fixed noise threshold value.
In the above scheme,
the information processing module is used for substituting different sentence samples in the second training sample set into a loss function corresponding to a self-coding network formed by an encoder and a decoder of a first neural network of the text processing model;
the information processing module is used for determining parameters of an encoder and corresponding decoder corresponding to the first neural network when the loss function meets a first convergence condition as updating parameters of the first neural network;
the information processing module is used for substituting different sentence samples in the second training sample set into a loss function corresponding to a self-coding network formed by an encoder and a decoder of a second neural network of the text processing model;
and the information processing module is used for determining parameters of an encoder and corresponding decoder corresponding to the second neural network when the loss function meets a second convergence condition as update parameters of the second neural network.
In the above scheme,
the information processing module is configured to determine a second noise parameter matched with the second training sample set, where the second noise parameter is used to characterize the noise value of the parallel sentence samples in the second training sample set;
the information processing module is configured to iteratively update the encoder parameter and the decoder parameter of the first neural network according to the noise value of the second noise parameter until a loss function corresponding to a self-encoding network formed by an encoder and a decoder of the first neural network meets a corresponding convergence condition;
and the information processing module is used for iteratively updating the parameters of the encoder and the decoder of the second neural network according to the noise value of the second noise parameter until a loss function corresponding to a self-encoding network formed by the encoder and the decoder of the second neural network meets a corresponding convergence condition.
In the above scheme,
the information processing module is used for responding to a training sample dictionary set of the text processing model and determining an attention parameter set of the second training sample set;
and the information processing module is used for performing weighting processing on the second training sample set according to the training sample dictionary set and the attention parameter set of the second training sample set so as to realize the adaptation of the second training sample set and the training sample dictionary set of the text processing model.
In the above scheme,
the information processing module is configured to apply negative example processing to the first training sample set to form a negative example sample set corresponding to the first training sample set, where the negative example sample set is used to adjust encoder parameters and decoder parameters of the first neural network or adjust encoder parameters and decoder parameters of a second neural network;
and the information processing module is used for determining a corresponding evaluation research value according to the negative example sample set, wherein the evaluation research value is used as a supervision parameter for evaluating a text processing result of the text processing model.
In the above scheme,
the information processing module is used for randomly combining the sentences to be output, which are generated by the text processing model, to form a negative example sample set corresponding to the first training sample set;
the information processing module is used for carrying out random deletion processing or replacement processing on the sentences to be output generated by the text processing model to form a negative example sample set corresponding to the first training sample set.
An embodiment of the present invention further provides an electronic device, where the electronic device includes:
a memory for storing executable instructions;
and a processor, configured to implement the foregoing intelligent text processing method when executing the executable instructions stored in the memory.
The embodiment of the invention also provides a computer-readable storage medium storing executable instructions which, when executed by a processor, implement the foregoing intelligent text processing method.
The embodiment of the invention has the following beneficial effects:
the method comprises the steps of obtaining a first training sample set, and carrying out denoising processing on the first training sample set to form a corresponding second training sample set; processing the second training sample set through a text processing model to determine initial parameters of a first neural network and initial parameters of a second neural network in the text processing model; responding to the initial parameters of the first neural network and the initial parameters of the second neural network, processing the second training sample set through the text processing model, and determining updating parameters corresponding to different neural networks of the text processing model; according to the update parameters of the text processing model corresponding to the different neural networks of the text processing model, the parameters of the first neural network and the parameters of the second neural network of the text processing model are respectively updated iteratively through the second training sample set, so that the generalization capability of the text processing model is stronger, the training precision and the training speed of the text processing model are improved, meanwhile, the text processing model can adapt to different use scenes, the influence of environmental noise on the text processing model is avoided, the text processing model can generate high-quality text processing results, and the capability of processing corresponding chapters-level sentences in the touch screen through the text processing model is improved.
Drawings
Fig. 1 is a schematic view of a usage scenario of a text processing model training method according to an embodiment of the present invention;
fig. 2 is a schematic structural diagram of an electronic device according to an embodiment of the present invention;
FIG. 3 is a diagram illustrating a text processing result generated by a Seq2Seq model based on RNN in the prior art;
FIG. 4 is a schematic flow chart illustrating an alternative method for training a text processing model according to an embodiment of the present invention;
FIG. 5 is an alternative schematic diagram of a second neural network according to an embodiment of the present invention;
FIG. 6 is a diagram illustrating an alternative machine-readable representation of a second neural network in accordance with an embodiment of the present invention;
FIG. 7 is an alternative schematic diagram of an encoder in a second neural network in accordance with embodiments of the present invention;
FIG. 8 is a schematic diagram of vector stitching of an encoder in a second neural network according to an embodiment of the present invention;
FIG. 9 is a diagram illustrating an encoding process of an encoder in a second neural network according to an embodiment of the present invention;
FIG. 10 is a diagram illustrating a decoding process of a decoder in a second neural network according to an embodiment of the present invention;
FIG. 11 is a diagram illustrating a decoding process of a decoder in a second neural network according to an embodiment of the present invention;
FIG. 12 is a diagram illustrating a decoding process of a decoder in a second neural network according to an embodiment of the present invention;
FIG. 13 is a diagram illustrating an alternative sentence-level text processing for the second neural network in accordance with an embodiment of the present invention;
FIG. 14 is a schematic flow chart illustrating an alternative method for training a text processing model according to an embodiment of the present invention;
FIG. 15 is a diagram illustrating an application environment for text selection according to the related art in an embodiment of the present invention;
FIG. 16 is a diagram illustrating an application environment for text selection according to the related art in an embodiment of the present invention;
FIG. 17 is a diagram illustrating an application environment for text selection according to the related art in an embodiment of the present invention;
FIG. 18 is a diagram illustrating a training process of a text processing model according to an embodiment of the present invention;
FIG. 19 is a diagram illustrating a data structure of a text processing model according to an embodiment of the present invention;
FIG. 20A is a diagram illustrating a text selection of a text processing model according to an embodiment of the present invention;
fig. 20B is a schematic diagram illustrating a usage process of the text processing model according to the embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions, and advantages of the present invention clearer, the present invention is described in further detail below with reference to the accompanying drawings. The described embodiments should not be construed as limiting the present invention; all other embodiments obtained by a person of ordinary skill in the art without creative effort fall within the protection scope of the present invention.
In the following description, reference is made to "some embodiments" which describe a subset of all possible embodiments, but it is understood that "some embodiments" may be the same subset or different subsets of all possible embodiments, and may be combined with each other without conflict.
Before further detailed description of the embodiments of the present invention, the terms and expressions used in the embodiments are explained; the following interpretations apply.
1) "In response to": indicates the condition or state on which a performed operation depends. When the condition or state is satisfied, the one or more operations performed may occur in real time or with a set delay; unless otherwise specified, there is no restriction on the order in which the operations are performed.
2) Word segmentation: segmenting a Chinese text with a Chinese word segmentation tool to obtain a set of fine-grained words. Stop words: words that contribute nothing, or negligibly, to the semantics of a text. Cosine similarity: the cosine similarity between two texts after each is represented as a vector.
3) Word segmentation library: embodies a specific word segmentation method; each word segmentation library has a corresponding word dictionary, and word segmentation processing can be performed on the corresponding text information according to the word dictionary of that library.
4) Consistency: means that the data accessed across different server accesses is always the same.
5) Convolutional Neural Network (CNN): a class of feedforward neural networks that involve convolution computations and have a deep structure; one of the representative algorithms of deep learning. A convolutional neural network has representation learning capability and can perform shift-invariant classification of input information according to its hierarchical structure.
6) Model training: multi-classification learning on an image data set. The model can be built with deep learning frameworks such as TensorFlow or Torch, combining multiple neural network layers such as CNN layers into a multi-classification model. The model's input is a three-channel or original-channel matrix formed by reading an image with a tool such as OpenCV; its output is multi-class probabilities, and the webpage category is finally output through an algorithm such as softmax. During training, the model is pushed toward the correct result through an objective function such as cross entropy.
7) Neural Network (NN): an Artificial Neural Network (ANN), called a neural network for short, is a mathematical or computational model used in machine learning and cognitive science that imitates the structure and function of biological neural networks (the central nervous system of animals, especially the brain) and is used to estimate or approximate functions.
8) Encoder-decoder architecture: a network architecture commonly used in machine translation, in which the decoder receives the encoder's output as its input and outputs a corresponding text sequence in another language.
9) BERT (Bidirectional Encoder Representations from Transformers): a bidirectional attention neural network model proposed by Google.
10) Token: a word unit. Before any actual processing of the input text, it needs to be divided into language units such as words, punctuation, numbers, or pure alphanumeric strings; these units are called word units (tokens).
11) Softmax: the normalized exponential function, a generalization of the logistic function. It "compresses" a K-dimensional vector of arbitrary real numbers into another K-dimensional real vector, such that each element lies in the range [0, 1] and all elements sum to 1 (a minimal code sketch follows the end of this list).
12) Transformer: a network architecture that relies on an attention mechanism, replacing the traditional encoder-decoder built on other neural network patterns. 13) Word vector: a distributed vector of fixed dimension representing a single word. 14) Compound word: a coarser-grained keyword composed of fine-grained keywords; its semantics are richer and more complete than those of the fine-grained keywords.
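To make item 11) concrete, the following is a minimal NumPy sketch of the softmax function; the example scores are arbitrary and not taken from the patent:

```python
import numpy as np

def softmax(x: np.ndarray) -> np.ndarray:
    # Subtracting the maximum is for numerical stability only; it does not
    # change the result because softmax is shift-invariant.
    e = np.exp(x - np.max(x))
    return e / e.sum()

scores = np.array([2.0, 1.0, 0.1])
probs = softmax(scores)
print(probs)         # approx. [0.659 0.242 0.099]
print(probs.sum())   # 1.0: every element lies in [0, 1] and they sum to 1
```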
Fig. 1 is a schematic view of a usage scenario of an intelligent text processing method according to an embodiment of the present invention. Referring to fig. 1, terminals (including a terminal 10-1 and a terminal 10-2) are provided with corresponding clients capable of executing different functions, through which the terminals acquire different text information for browsing from a corresponding server 200 through a network 300. The network 300 may be a wide area network, a local area network, or a combination of the two, with data transmission over wireless links. The types of text information that the terminals acquire from the server 200 may differ: for example, a terminal may acquire any type of text information from the server 200 through the network 300, or may acquire for browsing only text information matching a corresponding retrieval instruction. The server 200 may store text information, or corresponding inverted indexes built by performing word segmentation processing with different word segmentation libraries. In some embodiments of the invention, the different types of text information maintained in the server 200 may be written in software code environments of different programming languages, and code objects may be different types of code entities: in C-language software code a code object may be a function; in JAVA software code a code object may be a class; in the Objective-C (OC) language of an iOS terminal a code object may be object code; and in C++ software code a code object may be a class or a function that executes text processing instructions from different terminals. The sources of the text information to be processed by the text processing model are not further distinguished in the present application.
While transmitting the different types of text information to the terminals (the terminal 10-1 and/or the terminal 10-2) through the network 300, the server 200 needs to determine and monitor the text information selected by the user. As an example, the server 200 is configured to: acquire chapter-level text content corresponding to a selection operation on the touch screen; extract a feature vector matched with the chapter-level text content through a first neural network of a text processing model; determine a sentence vector corresponding to the text content according to the feature vector through the first neural network; determine at least one word-level hidden variable corresponding to the text content according to the feature vector through a second neural network of the text processing model; generate, through the second neural network, candidate words corresponding to the word-level hidden variables and the selection probability of each candidate word; select at least one candidate word to form a target text corresponding to the text content according to the selection probabilities of the candidate words and the sentence vector corresponding to the text content; and display the target text in a display mode corresponding to the selection operation on the touch screen.
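The enumerated server-side steps can be summarized in a short Python sketch. Every method name below (extract_features, sentence_vector, hidden_variables, candidate_words, fuse) is a hypothetical stand-in for the corresponding step, not an actual API, and the 0.5 fusion cutoff is likewise an assumption:

```python
from typing import List, Tuple

def process_selection(chapter_text: str, first_nn, second_nn) -> str:
    """Sketch of the server-side flow described above; all interfaces are illustrative."""
    # First neural network: feature vector matched with the chapter-level text.
    feature_vec = first_nn.extract_features(chapter_text)
    # First neural network: sentence vector determined from the feature vector.
    sentence_vec = first_nn.sentence_vector(feature_vec)
    # Second neural network: word-level hidden variables from the feature vector.
    hidden_vars = second_nn.hidden_variables(feature_vec)
    # Second neural network: candidate words and their selection probabilities.
    candidates: List[Tuple[str, float]] = second_nn.candidate_words(hidden_vars)
    # Fuse each selection probability with the sentence vector and keep the
    # candidates that pass, forming the target text to display.
    kept = [w for w, p in candidates if second_nn.fuse(p, sentence_vec) > 0.5]
    return "".join(kept)
```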
Of course, before the sentence samples at chapter level are processed by the text processing model, the text processing model also needs to be trained, which specifically includes:
acquiring a first training sample set, wherein the first training sample set comprises at least one group of chapter-level sentence samples input through a corresponding touch screen; denoising the first training sample set to form a corresponding second training sample set; processing the second training sample set through the text processing model to determine initial parameters of a first neural network and initial parameters of a second neural network in the text processing model; in response to the initial parameters of the first neural network and the initial parameters of the second neural network, processing the second training sample set through the text processing model and determining update parameters corresponding to the different neural networks of the text processing model; and iteratively updating the parameters of the first neural network and the parameters of the second neural network through the second training sample set according to those update parameters, so that the corresponding chapter-level text content in the touch screen can be processed through the text processing model.
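A minimal, hedged sketch of this training loop, assuming PyTorch and Adam optimizers; the reconstruction_loss helper is a hypothetical stand-in for each self-encoding network's loss function, and a fixed epoch count stands in for the convergence test described in the claims:

```python
import torch

def train(first_net, second_net, second_sample_set, epochs=10):
    """Sketch of the iterative parameter update described above (assumptions noted)."""
    opt1 = torch.optim.Adam(first_net.parameters())
    opt2 = torch.optim.Adam(second_net.parameters())
    for _ in range(epochs):
        for batch in second_sample_set:
            # Update the first neural network's encoder/decoder parameters.
            loss1 = first_net.reconstruction_loss(batch)   # hypothetical helper
            opt1.zero_grad(); loss1.backward(); opt1.step()
            # Update the second neural network's encoder/decoder parameters.
            loss2 = second_net.reconstruction_loss(batch)  # hypothetical helper
            opt2.zero_grad(); loss2.backward(); opt2.step()
```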
The electronic device according to the embodiment of the present invention may be implemented in various forms, such as a dedicated terminal with a text processing function or an electronic device with a text processing function, for example the server in fig. 1. Fig. 2 is a schematic diagram of the constituent structure of an electronic device according to an embodiment of the present invention. It is understood that fig. 2 shows only an exemplary structure, not the whole structure, and that part or all of the structure shown in fig. 2 may be implemented as needed.
The electronic equipment provided by the embodiment of the invention comprises: at least one processor 201, memory 202, user interface 203, and at least one network interface 204. The various components in the electronic device are coupled together by a bus system 205. It will be appreciated that the bus system 205 is used to enable communications among the components. The bus system 205 includes a power bus, a control bus, and a status signal bus in addition to a data bus. For clarity of illustration, however, the various buses are labeled as bus system 205 in fig. 2.
The user interface 203 may include, among other things, a display, a keyboard, a mouse, a trackball, a click wheel, a key, a button, a touch pad, or a touch screen.
It will be appreciated that the memory 202 can be either volatile memory or nonvolatile memory, and can include both volatile and nonvolatile memory. The memory 202 in embodiments of the present invention is capable of storing data to support operation of the terminal (e.g., 10-1). Examples of such data include: any computer program, such as an operating system and application programs, for operating on a terminal (e.g., 10-1). The operating system includes various system programs, such as a framework layer, a core library layer, a driver layer, and the like, and is used for implementing various basic services and processing hardware-based tasks. The application program may include various application programs.
In some embodiments, the intelligent text processing apparatus provided in the embodiments of the present invention may be implemented by a combination of hardware and software, and by way of example, the intelligent text processing apparatus provided in the embodiments of the present invention may be a processor in the form of a hardware decoding processor, which is programmed to execute the intelligent text processing method provided in the embodiments of the present invention. For example, a processor in the form of a hardware decoding processor may employ one or more Application Specific Integrated Circuits (ASICs), DSPs, Programmable Logic Devices (PLDs), Complex Programmable Logic Devices (CPLDs), Field Programmable Gate Arrays (FPGAs), or other electronic components.
As an example of the intelligent text processing apparatus provided by the embodiment of the present invention implemented by combining software and hardware, the intelligent text processing apparatus provided by the embodiment of the present invention may be directly embodied as a combination of software modules executed by the processor 201, where the software modules may be located in a storage medium, the storage medium is located in the memory 202, and the processor 201 reads executable instructions included in the software modules in the memory 202, and completes the intelligent text processing method provided by the embodiment of the present invention in combination with necessary hardware (for example, including the processor 201 and other components connected to the bus 205).
By way of example, the processor 201 may be an integrated circuit chip with signal processing capabilities, such as a general-purpose processor, a Digital Signal Processor (DSP), another programmable logic device, discrete gate or transistor logic, or discrete hardware components, where the general-purpose processor may be a microprocessor or any conventional processor.
As an example of the intelligent text processing apparatus provided by the embodiment of the present invention implemented by hardware, the apparatus provided by the embodiment of the present invention may be implemented by directly using the processor 201 in the form of a hardware decoding processor, for example, by one or more Application Specific Integrated Circuits (ASICs), DSPs, Programmable Logic Devices (PLDs), Complex Programmable Logic Devices (CPLDs), Field Programmable Gate Arrays (FPGAs), or other electronic components, to implement the intelligent text processing method provided by the embodiment of the present invention.
The memory 202 in embodiments of the present invention is used to store various types of data to support the operation of the electronic device. Examples of such data include: any executable instructions for operating on an electronic device, such as executable instructions, a program implementing the smart text processing method of an embodiment of the present invention may be embodied in the executable instructions.
In other embodiments, the smart text processing apparatus provided by the embodiment of the present invention may be implemented by software, and fig. 2 shows the smart text processing apparatus 2020 stored in the memory 202, which may be software in the form of programs, plug-ins, and the like, and includes a series of modules, and as an example of the programs stored in the memory 202, the smart text processing apparatus 2020 may include the following software modules: an information transmission module 2081 and an information processing module 2082. When the software modules in the intelligent text processing apparatus 2020 are read into the RAM by the processor 201 and executed, the functions of the software modules in the intelligent text processing apparatus 2020 are described as follows:
the information transmission module 2081 is configured to obtain text content at chapter level corresponding to an operation in the touch screen.
The information processing module 2082 is configured to extract a feature vector matched with the text content at the chapter level through the first neural network of the text processing model.
The information processing module 2082 is configured to determine, according to the feature vector, a sentence vector corresponding to the text content through the first neural network of the text processing model;
the information processing module 2082 is configured to determine, according to the feature vector, at least one word-level hidden variable corresponding to the text content through a second neural network of the text processing model.
The information processing module 2082 is configured to generate, according to the at least one word-level hidden variable, a candidate word corresponding to the word-level hidden variable and a selection probability of the candidate word through a second neural network of the text processing model.
The information processing module 2082 is configured to select at least one candidate word to form a target text corresponding to the text content according to the selection probabilities of the candidate words and the sentence vector corresponding to the text content.
The information transmission module 2081 is configured to display the target text in a display manner corresponding to the selected operation on the touch screen.
Before describing the intelligent text processing method provided by the embodiment of the present invention, and with reference to the intelligent text processing apparatus 2020 shown in fig. 2, consider first how a text processing model generates a corresponding text processing result (target text) from chapter-level text content. Fig. 3 is a schematic diagram of generating a text processing result in a conventional scheme, in which the Seq2Seq model is an architecture built from an encoder (Encoder) and a decoder (Decoder): it generates an output sequence Y from an input sequence X. In the Seq2Seq model, the encoder converts the input sequence into a vector of fixed length, and the decoder decodes that fixed-length vector into the output sequence. As shown in fig. 3, the encoder encodes an input sentence to be processed to obtain the text features of the sentence; the decoder decodes the text features and outputs the result to generate the corresponding text processing result. The encoder and the decoder are in one-to-one correspondence.
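For concreteness, here is a minimal sketch of such an RNN-based Seq2Seq model in PyTorch; the vocabulary and layer sizes are illustrative, not taken from the patent:

```python
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    """Minimal RNN encoder-decoder of the kind shown in Fig. 3 (sizes illustrative)."""
    def __init__(self, vocab=10000, emb=128, hid=256):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.encoder = nn.GRU(emb, hid, batch_first=True)
        self.decoder = nn.GRU(emb, hid, batch_first=True)
        self.out = nn.Linear(hid, vocab)

    def forward(self, src, tgt):
        # The encoder compresses the input sequence X into a fixed-length state.
        _, state = self.encoder(self.embed(src))
        # The decoder unrolls that single state into the output sequence Y.
        dec_out, _ = self.decoder(self.embed(tgt), state)
        return self.out(dec_out)  # per-position vocabulary logits

logits = Seq2Seq()(torch.randint(0, 10000, (2, 12)), torch.randint(0, 10000, (2, 8)))
print(logits.shape)  # torch.Size([2, 8, 10000])
```

The fixed-length state is exactly the bottleneck criticized below: a single vector must carry an entire chapter-level input.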
It can be seen that the Seq2Seq-based text processing model of the related art shown in fig. 3 has a disadvantage: it only establishes a one-to-one relationship with the target text y of the training data, while in many practical scenarios the same chapter-level text content may admit many word segmentation modes. Because its encoder and decoder are in one-to-one correspondence, the existing Seq2Seq model cannot effectively process chapter-level text; it is easily disturbed by noise information, triggers useless recognition or selection, and gives a poor user experience.
To solve the drawbacks of the related art, referring to fig. 4, fig. 4 is an optional flowchart of a text processing model training method provided in the embodiment of the present invention, and it can be understood that the steps shown in fig. 4 may be executed by various electronic devices operating an intelligent text processing apparatus, such as a dedicated terminal with a text processing function, a server with a text processing model training function, or a server cluster. The following is a description of the steps shown in fig. 4.
Step 401: the intelligent text processing device obtains a first set of training samples.
The first training sample set comprises at least one group of chapter-level sentence samples input through a corresponding touch screen. In some specific scenarios, such as novel reading and news reading, the data to be processed through the touch screen is often an entire chapter-level text, in contrast to the short texts of WeChat chat or browser search scenarios; a chapter-level text can form hundreds of word units after word segmentation processing. Note that "word segmentation" carries both a verb sense and a noun sense. Each participle is a word or a phrase, i.e., the smallest semantic unit with a definite meaning; for different users, or different usage environments of the text processing model, the smallest semantic units to be divided differ and must be adjusted in time, and this dividing process is word segmentation in the verb sense. On the other hand, the smallest semantic unit obtained after division is also often called a participle, i.e., the word obtained once segmentation has been performed. To distinguish the two meanings, the smallest semantic unit in the latter sense is sometimes called the segmentation object (Term); this application uses the term segmentation object, which corresponds to the keyword used as the index basis in an inverted list. For Chinese, because words as smallest semantic units are composed of varying numbers of characters, and there are no natural separators between words such as the blank spaces of alphabetic writing, accurately performing word segmentation to obtain reasonable segmentation objects is an important step.
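As an illustration of how one Chinese text admits several segmentation granularities, here is a sketch using the open-source jieba segmenter; the patent does not name a specific word segmentation library, and the sample sentence is arbitrary:

```python
# pip install jieba
import jieba

text = "南京市长江大桥"  # a classically ambiguous segmentation example
print(jieba.lcut(text))                 # precise mode, e.g. ['南京市', '长江大桥']
print(jieba.lcut(text, cut_all=True))   # full mode: every candidate word
print(jieba.lcut_for_search(text))      # search-engine mode: finer index terms
```

The search-engine mode's finer-grained units correspond to the segmentation objects used as index keywords in an inverted list.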
Step 402: and the intelligent text processing device carries out denoising processing on the first training sample set to form a corresponding second training sample set.
In some embodiments of the present invention, denoising the first training sample set to form a corresponding second training sample set may be implemented by:
determining a dynamic noise threshold matched with the usage environment of the text processing model; and denoising the first training sample set according to the dynamic noise threshold to form a second training sample set matched with the dynamic noise threshold. Because the usage environments of the text processing model differ, the matching dynamic noise threshold differs as well: for example, in a usage environment of plain text processing (academic paper display or news information text), the dynamic noise threshold needs to be smaller than in a text advertisement environment. The training samples are derived from different data sources, with data from various types of application scenarios serving as the data sources of the corresponding training samples. For example, the text processing model provided by the invention can be packaged as a software module in vehicle-mounted electronic equipment, packaged in different smart home devices (including but not limited to a sound box, a television, a refrigerator, an air conditioner, a washing machine, and a kitchen range), or solidified in the hardware of an intelligent robot; according to the different usage scenarios of the text processing model, it can be trained specifically with the corresponding training samples.
In some embodiments of the present invention, denoising the first training sample set to form a corresponding second training sample set may be implemented by:
determining a fixed noise threshold corresponding to the usage environment of the text processing model; and denoising the first training sample set according to the fixed noise threshold to form a second training sample set matched with the fixed noise threshold. When the text processing model is solidified in a corresponding hardware mechanism, for example when information text is read in a vehicle-mounted environment, the noise is relatively uniform; fixing the noise threshold corresponding to the text processing model can effectively improve the training speed of the model and reduce the user's waiting time.
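A sketch of the threshold-based denoising in both variants above, assuming a simple character-level noise measure; the patent does not specify how the noise value of a sample is computed, so noise_score is an illustrative proxy only:

```python
import re

def noise_score(sample: str) -> float:
    # Illustrative proxy only: the fraction of characters that are neither
    # word characters nor CJK ideographs.
    junk = len(re.findall(r"[^\w\u4e00-\u9fff]", sample))
    return junk / max(len(sample), 1)

def denoise(first_set, threshold: float):
    """Form the second training sample set by dropping samples whose noise
    value exceeds the (dynamic or fixed) threshold."""
    return [s for s in first_set if noise_score(s) <= threshold]

samples = ["正文内容", "@@##广告$$"]
print(denoise(samples, threshold=0.3))  # dynamic, looser: keeps the clean sample
print(denoise(samples, threshold=0.1))  # fixed, e.g. vehicle-mounted reading
```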
Step 403: the intelligent text processing device processes the second training sample set through a text processing model to determine initial parameters of a first neural network and initial parameters of a second neural network in the text processing model.
The text processing model provided by the application comprises a first neural network and a second neural network, wherein the first neural network may be a Recurrent Neural Network (RNN) and the second neural network may be a bidirectional attention neural network model (BERT, Bidirectional Encoder Representations from Transformers).
With continuing reference to fig. 5, fig. 5 is an alternative structural schematic diagram of the second neural network in an embodiment of the present invention, in which the Encoder comprises N = 6 identical layers, each containing two sub-layers: a multi-head attention layer followed by a simple fully connected layer. A residual connection and normalization are added around each sub-layer.
The Decoder comprises N = 6 identical layers, which are not identical to the encoder layers: each decoder layer contains three sub-layers, namely a self-attention layer, an encoder-decoder attention layer, and finally a fully connected layer. The first two sub-layers are both based on the multi-head attention mechanism.
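A minimal PyTorch sketch of one such encoder layer; the feed-forward width of 2048 follows the common Transformer default and is an assumption, since the patent only fixes N = 6 and the two sub-layers:

```python
import torch
import torch.nn as nn

class EncoderLayer(nn.Module):
    """One of the N = 6 encoder layers: multi-head self-attention, then a
    feed-forward sublayer, each wrapped in residual connection + LayerNorm."""
    def __init__(self, d_model=512, heads=8, d_ff=2048):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, heads, batch_first=True)
        self.ff = nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(),
                                nn.Linear(d_ff, d_model))
        self.norm1, self.norm2 = nn.LayerNorm(d_model), nn.LayerNorm(d_model)

    def forward(self, x):
        a, _ = self.attn(x, x, x)       # multi-head self-attention sublayer
        x = self.norm1(x + a)           # residual connection + normalization
        x = self.norm2(x + self.ff(x))  # fully connected sublayer, same wrapping
        return x

x = torch.randn(2, 16, 512)        # (batch, tokens, d_model)
print(EncoderLayer()(x).shape)     # torch.Size([2, 16, 512])
```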
With continued reference to FIG. 6, FIG. 6 is an alternative word-level machine-reading schematic of a second neural network in an embodiment of the present invention, in which the encoder and decoder portions each contain 6 encoders and decoders. The input to the first encoder combines word embedding and positional embedding; after passing through the 6 encoders, the result is output to each decoder of the decoder portion. The chapter-level input text consists of lines from Li Bai's poem "Bring in the Wine" ("Do you not see the Yellow River's water coming down from heaven … call the boy to bring out and exchange for fine wine, and with you dispel the sorrow of ten thousand ages"); processed through the second neural network, the machine-reading result output at word level is the segmented text: "jun/unseen/yellow river/water/heaven/coming … call the boy/bring out/exchange/fine wine/with you/dispel/worries of all ages".
With continuing reference to FIG. 7, FIG. 7 is an alternative structural schematic diagram of an encoder in a second neural network in an embodiment of the present invention, where the input consists of queries (Q) of dimension d, keys (K), and values (V) of dimension d; the dot product of the query with all keys is computed, and the softmax function is applied to obtain the weights on the values.
With continued reference to FIG. 7, FIG. 7 shows a vector schematic of an encoder in a second neural network in an embodiment of the present invention, where Q, K and V are obtained by multiplying the input vector x of the encoder by W^Q, W^K and W^V. In the article, the dimensions of W^Q, W^K and W^V are (512, 64); supposing then that the dimension of the input is (m, 512), where m represents the number of words, the Q, K and V obtained after multiplying the input vector by W^Q, W^K and W^V each have dimension (m, 64).
With continued reference to FIG. 8, FIG. 8 illustrates a vector splicing diagram of the encoder in the second neural network in an embodiment of the present invention, wherein Z_0 to Z_7 are the corresponding 8 parallel heads (each of dimension (m, 64)); concatenating these 8 heads yields dimension (m, 512). After the final multiplication with W^O, an output matrix of dimension (m, 512) is obtained, consistent with the dimension entering the next encoder.
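A minimal NumPy sketch of this multi-head computation follows, using the (512, 64) projections and 8 heads described above. The 1/√d scaling inside each head is the standard formulation rather than something stated in the text, and the random weights are placeholders.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(x, W_Q, W_K, W_V, W_O, d_head=64):
    heads = []
    for h in range(len(W_Q)):                          # 8 parallel heads Z_0 .. Z_7
        Q, K, V = x @ W_Q[h], x @ W_K[h], x @ W_V[h]   # each (m, 64)
        scores = Q @ K.T / np.sqrt(d_head)             # dot products of Q with all K
        heads.append(softmax(scores) @ V)              # softmax weights applied to V
    Z = np.concatenate(heads, axis=-1)                 # concat -> (m, 512)
    return Z @ W_O                                     # (m, 512), same as the input

m, d_model, n_heads = 10, 512, 8
rng = np.random.default_rng(0)
x   = rng.normal(size=(m, d_model))
W_Q = [rng.normal(size=(d_model, 64)) for _ in range(n_heads)]
W_K = [rng.normal(size=(d_model, 64)) for _ in range(n_heads)]
W_V = [rng.normal(size=(d_model, 64)) for _ in range(n_heads)]
W_O = rng.normal(size=(d_model, d_model))
assert multi_head_attention(x, W_Q, W_K, W_V, W_O).shape == (m, d_model)
```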
With continued reference to fig. 9, fig. 9 is a schematic diagram of the encoding process of the encoder in the second neural network according to the embodiment of the present invention, in which x1 passes through self-attention to reach the state z1; the tensor that has passed through self-attention then goes through the residual network and Layer Norm, and the fully connected feed-forward network that follows goes through the same operations: residual processing and normalization. The tensor finally output can enter the next encoder; this is iterated 6 times, and the result of the iterative processing enters the decoder.
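A sketch of this residual-plus-normalization flow follows; the identity "attention" and tanh feed-forward are toy stand-ins, assumed only to show the data flow through the stack.

```python
import numpy as np

def layer_norm(x, eps=1e-6):
    mu = x.mean(axis=-1, keepdims=True)
    sd = x.std(axis=-1, keepdims=True)
    return (x - mu) / (sd + eps)

def encoder_layer(x, self_attn, ffn):
    z = layer_norm(x + self_attn(x))  # sub-layer 1: self-attention, residual, norm
    return layer_norm(z + ffn(z))     # sub-layer 2: feed-forward, residual, norm

def encode(x, layers):
    for self_attn, ffn in layers:     # iterate through the N = 6 identical layers
        x = encoder_layer(x, self_attn, ffn)
    return x                          # final tensor, handed on to the decoder

rng = np.random.default_rng(1)
x = rng.normal(size=(5, 512))
layers = [(lambda t: t, np.tanh)] * 6
assert encode(x, layers).shape == x.shape
```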
With continuing reference to fig. 10, fig. 10 is a schematic diagram of the decoding process of a decoder in a second neural network according to an embodiment of the present invention, showing the decoder's input, output and decoding process:
Output: the probability distribution of the output word corresponding to position i;
Input: the output of the encoder & the output of the decoder at position i-1. The middle attention is therefore not self-attention: its K and V come from the encoder, and its Q comes from the output of the decoder at the previous position.
With continuing reference to fig. 11 and 12, fig. 11 is a schematic diagram of the decoding process of a decoder in a second neural network according to an embodiment of the present invention, in which the vector output by the last decoder of the decoder network passes through a Linear layer and a softmax layer. Fig. 12 is a schematic diagram of the same decoding process, where the Linear layer maps the vector from the decoder portion into a logits vector, the softmax layer then converts the logits vector into probability values, and finally the position of the maximum probability value is found, completing the output of the decoder.
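A minimal sketch of this final Linear-plus-softmax step; the toy vocabulary and random weights are assumptions.

```python
import numpy as np

def decoder_output(dec_vec, W_linear, b, vocab):
    logits = dec_vec @ W_linear + b          # Linear layer: decoder vector -> logits
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()                     # softmax: logits -> probability values
    return vocab[int(np.argmax(probs))]      # position of the maximum probability

rng = np.random.default_rng(3)
vocab = ["yellow river", "water", "heaven", "wine"]
word = decoder_output(rng.normal(size=512),
                      rng.normal(size=(512, len(vocab))),
                      np.zeros(len(vocab)), vocab)
print(word)
```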
In some embodiments of the invention, the second neural network may be a bidirectional attention neural network model (BERT, Bidirectional Encoder Representations from Transformers); its encoder and decoder follow the structure already described with reference to fig. 5, with N = 6 identical layers on each side and residual connections and normalization around every sub-layer.
With continuing reference to FIG. 13, FIG. 13 is an alternative sentence-level machine-reading schematic diagram of a second neural network in an embodiment of the present invention, in which the encoder and decoder portions each contain 6 encoders and decoders. The input to the first encoder combines word embedding and positional embedding; after passing through the 6 encoders, the result is output to each decoder of the decoder portion. The input target is the English "When you are old and grey and full of sleep, And nodding by the fire, take down this book", and the output machine-reading result is: "When/you/are/old/And/grey/And/full of sleep/And/nodding/by the fire/take down/this book".
Certainly, the BERT model in the present invention may also be replaced by a bidirectional long short-term memory network model (Bi-LSTM, Bi-directional Long Short-Term Memory), a gated recurrent unit network model (GRU, Gated Recurrent Unit), a deep contextualized word representation network model (ELMo, Embeddings from Language Models), a GPT model, or a GPT-2 model, which are not described in detail herein.
Step 404: and the intelligent text processing device responds to the initial parameters of the first neural network and the initial parameters of the second neural network, processes the second training sample set through the text processing model, and determines the updating parameters corresponding to different neural networks of the text processing model.
In some embodiments of the present invention, in response to the initial parameters of the first neural network and the initial parameters of the second neural network, processing the second training sample set by the text processing model to determine the updated parameters corresponding to different neural networks of the text processing model may be implemented by:
substituting different sentence samples in the second training sample set into a loss function corresponding to a self-coding network formed by an encoder and a decoder of a first neural network of the text processing model; determining parameters of an encoder and corresponding decoder parameters corresponding to the first neural network when the loss function meets a first convergence condition as updating parameters of the first neural network; substituting different sentence samples in the second training sample set into a loss function corresponding to a self-coding network formed by an encoder and a decoder of a second neural network of the text processing model; and determining parameters of an encoder and corresponding decoder parameters corresponding to the second neural network when the loss function meets a second convergence condition as the update parameters of the second neural network.
Therefore, the text content at the corresponding chapter level in the touch screen can be processed through the text processing model.
Step 405: and the intelligent text processing device respectively carries out iterative updating on the parameters of the first neural network and the parameters of the second neural network of the text processing model through the second training sample set according to the updating parameters corresponding to different neural networks of the text processing model.
In some embodiments of the present invention, iteratively updating the parameters of the first neural network and the parameters of the second neural network of the text processing model through the second training sample set, according to the update parameters corresponding to the different neural networks of the text processing model, may be implemented as follows:
determining a second noise parameter matched with the second training sample set, wherein the second noise parameter is used for representing the noise value of the parallel statement samples in the second training sample set; iteratively updating the parameters of the encoder and the decoder of the first neural network according to the noise value of the second noise parameter until a loss function corresponding to a self-encoding network formed by the encoder and the decoder of the first neural network meets a corresponding convergence condition; and iteratively updating the parameters of the encoder and the decoder of the second neural network according to the noise value of the second noise parameter until a loss function corresponding to a self-encoding network formed by the encoder and the decoder of the second neural network meets a corresponding convergence condition.
In some embodiments of the present invention, the loss function of the encoder network is expressed as:

loss_A = Σ (decoder_A(encoder(warp(x1))) - x1)²

where decoder_A is decoder A, warp is the function applied to the sentence to be recognized, x1 is the sentence to be recognized, and encoder is the encoder.
In the iterative training process, the sentence to be recognized is substituted into the loss function of the encoder network, the parameters of encoder A and decoder A are solved for as the loss function descends along the gradient (for example, the steepest gradient), and when the loss function converges (i.e., when the word-level hidden variables corresponding to the sentence to be recognized can be formed), the training ends.
In the training process of the self-encoding network formed with decoder B, the loss function is expressed as:

loss_B = Σ (decoder_B(encoder(warp(x2))) - x2)²

where decoder_B is decoder B, warp is the function applied to the sentence to be recognized, x2 is the sentence to be recognized, and encoder is the encoder.
In the iterative training process, the sentence to be recognized is substituted into this loss function, and the parameters of encoder B and decoder B are solved for as the loss function descends along the gradient (for example, the steepest gradient); when the loss function converges (i.e., when decoding yields the selected probability of the text processing result corresponding to the sentence to be recognized), the adjustment and training end.
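The two losses above can be sketched directly; the toy linear maps and noise-style warp below are assumptions made only so the expressions are computable, with gradient descent driving them toward convergence in the real training.

```python
import numpy as np

rng = np.random.default_rng(4)
d = 8
W_e, W_a, W_b = (rng.normal(size=(d, d)) for _ in range(3))
warp      = lambda x: x + 0.01 * rng.normal(size=x.shape)  # perturbs the input
encoder   = lambda x: x @ W_e                              # shared encoder
decoder_A = lambda h: h @ W_a                              # decoder A
decoder_B = lambda h: h @ W_b                              # decoder B

def loss_A(x1):
    # loss_A = Σ (decoder_A(encoder(warp(x1))) - x1)^2
    return np.sum((decoder_A(encoder(warp(x1))) - x1) ** 2)

def loss_B(x2):
    # loss_B = Σ (decoder_B(encoder(warp(x2))) - x2)^2
    return np.sum((decoder_B(encoder(warp(x2))) - x2) ** 2)

x = rng.normal(size=d)
print(loss_A(x), loss_B(x))
```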
In some embodiments of the present invention, the intelligent text processing method provided by the present application further includes:
determining a set of attention parameters for the second set of training samples in response to a set of training sample dictionaries for the text processing model; and performing weighting processing on the second training sample set according to the training sample dictionary set and the attention parameter set of the second training sample set so as to realize the adaptation of the second training sample set and the training sample dictionary set of the text processing model.
In some embodiments of the invention, the method further comprises:
negative example processing is carried out on the first training sample set to form a negative example sample set corresponding to the first training sample set, wherein the negative example sample set is used for adjusting the parameters of the encoder and decoder of the first neural network, or for adjusting the parameters of the encoder and decoder of the second neural network; and a corresponding evaluation reference value is determined according to the negative example sample set, wherein the evaluation reference value serves as a supervision parameter for evaluating the text processing result of the text processing model.
In some embodiments of the present invention, negative example processing on the first training sample set may be implemented by:
randomly combining the sentences to be output generated by the text processing model to form a negative example sample set corresponding to the first training sample set; alternatively,
and carrying out random deletion processing or replacement processing on the sentence to be output generated by the text processing model to form a negative example sample set corresponding to the first training sample set.
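Both variants can be sketched as below; the tokenised inputs and random seed are assumptions, and the three modes correspond to random recombination, random deletion and random replacement.

```python
import random

def negative_examples(sentences, seed=0):
    rng = random.Random(seed)
    negatives = []
    for words in sentences:
        words = list(words)
        mode = rng.choice(("shuffle", "delete", "replace"))
        if mode == "shuffle":                       # random recombination
            rng.shuffle(words)
        elif mode == "delete" and len(words) > 1:   # random deletion
            words.pop(rng.randrange(len(words)))
        else:                                       # random replacement
            donor = rng.choice(sentences)
            words[rng.randrange(len(words))] = rng.choice(donor)
        negatives.append(words)
    return negatives

print(negative_examples([["my", "favorite", "restaurant"],
                         ["take", "down", "this", "book"]]))
```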
With continuing reference to fig. 14, fig. 14 is an alternative flowchart of the text processing model training method according to the embodiment of the present invention, and it can be understood that the steps shown in fig. 14 can be executed by various electronic devices operating an intelligent text processing apparatus, such as a dedicated terminal with a text processing model training function, a server with a text processing model training function, or a server cluster. The following is a description of the steps shown in fig. 14.
Step 1401: the intelligent text processing device acquires text content at chapter level corresponding to operation in the touch screen;
step 1402: and extracting the characteristic vector matched with the text content at the chapter level through a first neural network of the text processing model.
Step 1403: determining, by a first neural network of the text processing model, a sentence vector corresponding to the text content according to the feature vector.
Step 1404: determining, by a second neural network of the text processing model, at least one word-level hidden variable corresponding to the text content from the feature vector.
Step 1405: generating, by a second neural network of the text processing model, candidate words corresponding to the word-level hidden variables and a selected probability of the candidate words according to the at least one word-level hidden variable;
step 1406: selecting at least one candidate word to form a target text corresponding to the text content at the chapter level according to the selected probability of the candidate word and the sentence vector corresponding to the text content;
step 1407: and displaying the target text in a display mode corresponding to the selected operation in the touch screen.
In some embodiments of the present invention, extracting feature vectors matching the text content at the chapter level through the first neural network of the text processing model may be implemented by:
triggering a corresponding word segmentation library according to text parameter information carried by the chapter-level text content displayed in the touch screen; performing word segmentation processing on the chapter-level text content displayed in the touch screen through the triggered word segmentation library's word dictionary to form different word-level feature vectors; and denoising the different word-level feature vectors to form a word-level feature vector set matched with the chapter-level text content. Here, "word segmentation" carries both a verb sense and a noun sense. Each word segment is a word or a phrase, i.e., the minimal semantic unit with a definite meaning. For different users, or for different use environments of the text processing model, the minimal semantic units to be divided differ and must be adjusted in time; this dividing process is the verb sense of word segmentation, i.e., the process of dividing text into minimal semantic units. On the other hand, the minimal semantic unit obtained after division is also often called a word segment, i.e., the noun sense: a word obtained after word segmentation has been performed. To distinguish the two meanings, the minimal semantic unit referred to by the latter is sometimes called the segmentation object (Term); this application uses the term segmentation object, which corresponds to a keyword serving as an index basis in an inverted list. For Chinese, because words as minimal semantic units are composed of varying numbers of characters, and there are no natural delimiters between words such as the blank spaces of alphabetic writing, accurately performing word segmentation to obtain reasonable segmentation objects is an important step in processing Chinese.
In some embodiments of the present invention, denoising the different word-level feature vectors to form a word-level feature vector set matching the text content at the chapter level may be implemented as follows:
determining a dynamic noise threshold value matched with the use environment of the text processing model; denoising the different word-level feature vectors according to the dynamic noise threshold, and triggering a dynamic word segmentation strategy matched with the dynamic noise threshold; and performing word segmentation processing on the chapter-level text content displayed in the touch screen according to the dynamic word segmentation strategy matched with the dynamic noise threshold to form a corresponding dynamic word-level feature vector set. For example, in an academic translation use environment, where the text information displayed by the terminal includes only academic paper text, the dynamic noise threshold matched with the use environment of the text processing model needs to be smaller than in an entertainment information reading environment.
In some embodiments of the present invention, denoising the different word-level feature vectors to form a word-level feature vector set matching the text content at the chapter level may be implemented as follows:
determining a fixed noise threshold corresponding to a use environment of the text processing model; denoising the different word-level feature vectors according to the fixed noise threshold, and triggering a fixed word segmentation strategy matched with the fixed noise threshold; and performing word segmentation processing on the chapter-level target text displayed in the touch screen according to the fixed word segmentation strategy matched with the fixed noise threshold to form a corresponding fixed word-level feature vector set. When the text processing model is solidified in a corresponding hardware mechanism, such as a multi-modal terminal or an intelligent medical system, and the use environment involves professional-term text information (or text information in a certain field), the noise is relatively uniform, so fixing the noise threshold corresponding to the text processing model can effectively improve the processing speed of the model, reduce the user's waiting time, and improve the user experience.
In some embodiments of the present invention, word segmentation processing may further be performed on the chapter-level text content displayed in the touch screen to form a word segmentation result; in response to the word segmentation result, stop-word removal is performed on the text content to form text keywords matched with the text content; and, according to the text keywords matched with the text content, a part-of-speech tagging result matched with the text content is determined, forming a part-of-speech feature vector set corresponding to the text content. The text processed by the text processing model includes not only single-language text but possibly complex multi-language text (for example, a Chinese-English academic paper as the text information). Unlike English, which uses spaces as the intervals between words, Chinese text correspondingly requires word segmentation, because words in Chinese contain the complete information. Correspondingly, the Chinese word segmentation tool Jieba can be used to segment Chinese text. In addition, stop-word removal needs to be applied to the segmented keyword set, because words like "yes" and "can" provide no information for the corresponding category labelling task. For example, for the text "Yes, my favorite restaurant", word segmentation and stop-word removal yield the set of two keywords "favorite/restaurant" (using / as the separator, likewise hereinafter), which can effectively increase the processing speed of the text processing model.
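As a sketch of this segmentation-plus-stop-word step with the Jieba tool named above (the stop-word list is a toy assumption, and Jieba's exact segmentation of any given sentence may differ):

```python
import jieba

STOP_WORDS = {"是", "啊", "的", "我", "最"}   # tiny assumed stop-word list

def keywords(text: str) -> list:
    # Segment the Chinese text with Jieba, then drop stop words and
    # punctuation that carry no information for the labelling task.
    return [w for w in jieba.cut(text)
            if w.strip() and w not in STOP_WORDS and w not in "，。！？"]

# For "是啊，我最喜欢的餐厅" ("Yes, my favorite restaurant") this yields
# roughly the keyword set ["喜欢", "餐厅"], i.e. "favorite/restaurant".
print(keywords("是啊，我最喜欢的餐厅"))
```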
In some embodiments of the present invention, selecting at least one candidate word to compose a target text corresponding to the text content according to the selected probability of the candidate word and the sentence vector corresponding to the text content may be implemented by:
matching the sentence vector corresponding to the text content with the text content displayed in the touch screen; performing fusion processing on the selected probability of the candidate words according to the matching result of the sentence vector corresponding to the text content and the text content at the chapter level displayed in the touch screen; and selecting at least one candidate word to form a target text corresponding to the text content at the chapter level according to the fusion processing result of the selected probability of the candidate word.
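A sketch of this fusion step follows; cosine similarity as the matching function and the multiplicative fusion are assumptions, since the text does not fix the fusion formula.

```python
import numpy as np

def fuse_and_select(candidates, sent_vec, word_vecs, k=2):
    def cosine(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))
    # Fuse each candidate's selected probability with the sentence-vector
    # match, then keep the top-k words to compose the target text.
    fused = sorted(((w, p * (1.0 + cosine(sent_vec, word_vecs[w])))
                    for w, p in candidates),
                   key=lambda wp: wp[1], reverse=True)
    return [w for w, _ in fused[:k]]

rng = np.random.default_rng(6)
vocab = ["favorite", "restaurant", "yes"]
word_vecs = {w: rng.normal(size=16) for w in vocab}
print(fuse_and_select([("favorite", 0.5), ("restaurant", 0.3), ("yes", 0.2)],
                      rng.normal(size=16), word_vecs))
```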
The text processing model training method provided by the embodiment of the present invention is described below, taking text information processing in medical hardware equipment as an example. Fig. 15 is a schematic diagram of the application environment of text selection in the related art, in which text can be selected only by touch-screen sensing as shown in fig. 15. Specifically, during use of that scheme, the terminal receives the signal of a finger touch through the touch screen, positions a cursor according to the touch position, and the user selects the boundaries of the corresponding text with the cursor. In this process the text can be selected only at the position where the touch screen senses the finger, which makes the target text inconvenient to select when the font is small; moreover, because one-handed operation makes it easy to select the wrong text or to miss characters, the user may have to select the target text many times, wasting operation time.
Further, fig. 16 is a schematic diagram of the application environment of text selection in the related art, in which, as shown in fig. 16, text may be selected by touch-screen sensing while punctuation marks are taken into account. Specifically, on the basis of the scheme shown in fig. 15, when the text information of the text to be selected is determined, ordinary characters and punctuation marks are distinguished: because a punctuation mark is the boundary of a sentence or half-sentence, when the user's cursor moves near a punctuation mark, the system uses the punctuation mark as the boundary of the target text with higher probability. In this process, however, although punctuation is considered in text selection, it can only be used for auxiliary selection; for text within a sentence that the user wants to select, such as the words in the sentence shown in fig. 16, no auxiliary selection can be performed. Meanwhile, personalized recommendation cannot be made according to users' historical habits (some users need the selected text to include punctuation, and some do not), which is unhelpful for different users.
Further, fig. 17 is a schematic diagram of the application environment of text selection in the related art: as shown in fig. 17, the corresponding target text is selected by touch-screen sensing, and word segmentation processing is performed on the target text in consideration of word segmentation information. However, in this process only word information can be used for auxiliary selection, and a longer phrase or short sentence cannot be selected; meanwhile, this process cannot make personalized recommendations according to users' historical habits (different users and different fields call for different word segmentation).
Fig. 18 is a schematic diagram of a training process of a text processing model provided in an embodiment of the present invention, and fig. 19 is a schematic diagram of a data structure of the text processing model provided in the embodiment of the present invention, and in combination with fig. 18, the method specifically includes the following steps:
step 1801: a first set of training samples in a medical data source is acquired.
Wherein the first training sample set comprises at least one set of chapter-level sentence samples input through the corresponding touch screen. In particular, usage data for all users (for different text content) may be recorded. First, the user touches the screen one or more times to select text, then performs further operations on the selected target text (e.g., copy, send, etc.). The text selected by the user immediately before the next operation can be regarded as the user's real target text. Meanwhile, the position of the finger when the user touches the screen is recorded. Of course, to distinguish different users, each user is configured with a unique ID when the training data is recorded, and each sample indicates the user ID.
Step 1802: denoising the first training sample set to form a corresponding second training sample set;
step 1803: processing the second training sample set through a text processing model to determine initial parameters of a structured RNN network and initial parameters of a bidirectional attention neural network in the text processing model;
step 1804: responding to initial parameters of the structured RNN and initial parameters of the bidirectional attention neural network, processing the second training sample set through the text processing model, and determining updating parameters corresponding to different neural networks of the text processing model;
step 1805: and respectively carrying out iterative updating on the parameters of the structured RNN of the text processing model and the parameters of the bidirectional attention neural network through the second training sample set according to the updating parameters corresponding to different neural networks of the text processing model.
With reference to fig. 19, fig. 20A is a schematic diagram of text selection of a text processing model provided in the embodiment of the present invention, and fig. 20B is a schematic diagram of a using process of the text processing model provided in the embodiment of the present invention, which specifically includes the following steps:
step 2001: acquiring text content at chapter level corresponding to the operation in the touch screen;
the acquired long text sentence can correspond to the position touched by the finger in the touch screen.
Step 2002: and extracting a feature vector matched with the text content through a structured RNN (radio network), and determining a statement vector corresponding to the text content according to the feature vector.
Step 2003: generating candidate words corresponding to the hidden variables of the word level and the selected probability of the candidate words according to the hidden variables of the at least one word level through a bidirectional attention neural network.
Step 2004: and selecting at least one candidate word according to the selection probability of the candidate word through a bidirectional attention neural network, and combining a statement vector corresponding to the text content to form a target text corresponding to the text content.
Step 2005: and displaying the target text in a display mode corresponding to the selected operation in the touch screen. Therefore, the text processing model can be used for processing corresponding sentence at chapter level in the touch screen, and the prediction of user operation is realized.
The beneficial technical effects are as follows:
Compared with the traditional technology for processing text in a touch screen, the technical scheme provided by the application triggers the text processing model by detecting the user's operation on the touch screen and predicts the text the user intends to select, improving the accuracy of the user's selection. The text processing model can generate a high-quality text processing result, which strengthens the ability to process corresponding chapter-level sentences in the touch screen, makes use smoother, and effectively improves the user experience.
The above description is only exemplary of the present invention and should not be taken as limiting the scope of the present invention, and any modifications, equivalents, improvements, etc. made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (10)

1. An intelligent text processing method, characterized in that the method comprises:
acquiring text content at chapter level corresponding to the operation in the touch screen;
extracting a characteristic vector matched with the text content at the chapter level through a first neural network of a text processing model;
determining a statement vector corresponding to the text content according to the feature vector through a first neural network of the text processing model;
determining at least one word-level hidden variable corresponding to the text content according to the feature vector through a second neural network of the text processing model;
generating, by a second neural network of the text processing model, candidate words corresponding to the word-level hidden variables and a selected probability of the candidate words according to the at least one word-level hidden variable;
selecting at least one candidate word to form a target text corresponding to the text content at the chapter level according to the selected probability of the candidate word and the sentence vector corresponding to the text content;
and displaying the target text in a display mode corresponding to the selected operation in the touch screen.
2. The method of claim 1, wherein extracting feature vectors matching the text content at the chapter level through the first neural network of the text processing model comprises:
triggering a corresponding word segmentation library according to text parameter information carried by text content at chapter level displayed in the touch screen;
performing word segmentation processing on text content at chapter level displayed in the touch screen through the triggered word segmentation library word dictionary to form different word level feature vectors;
and denoising the different word-level feature vectors to form a word-level feature vector set matched with the text content at the chapter level.
3. The method of claim 2, wherein the denoising the different word-level feature vectors to form a set of word-level feature vectors matching the text content at the chapter level comprises:
determining a dynamic noise threshold value matched with the use environment of the text processing model;
denoising the different word-level feature vectors according to the dynamic noise threshold, and triggering a dynamic word segmentation strategy matched with the dynamic noise threshold;
and performing word segmentation processing on the text content at the chapter level displayed in the touch screen according to a dynamic word segmentation strategy matched with the dynamic noise threshold value to form a corresponding dynamic word level feature vector set.
4. The method of claim 2, wherein the denoising the different word-level feature vectors to form a set of word-level feature vectors matching the text content at the chapter level comprises:
determining a fixed noise threshold corresponding to a use environment of the text processing model;
denoising the different word-level feature vectors according to the fixed noise threshold, and triggering a fixed word segmentation strategy matched with the fixed noise threshold;
and performing word segmentation processing on the target text at chapter level displayed in the touch screen according to a fixed word segmentation strategy matched with the fixed noise threshold value to form a corresponding fixed word level feature vector set.
5. The method of claim 1, wherein selecting at least one candidate word to compose a target text corresponding to the text content according to the selected probability of the candidate word and a sentence vector corresponding to the text content comprises:
matching the sentence vector corresponding to the text content with the text content displayed in the touch screen;
performing fusion processing on the selected probability of the candidate words according to the matching result of the sentence vector corresponding to the text content and the text content at the chapter level displayed in the touch screen;
and selecting at least one candidate word to form a target text corresponding to the text content at the chapter level according to the fusion processing result of the selected probability of the candidate word.
6. The method of claim 1, further comprising:
acquiring a first training sample set, wherein the first training sample set comprises at least one group of sentence samples at chapter level input through a corresponding touch screen;
denoising the first training sample set to form a corresponding second training sample set;
processing the second training sample set through a text processing model to determine initial parameters of a first neural network and initial parameters of a second neural network in the text processing model;
responding to the initial parameters of the first neural network and the initial parameters of the second neural network, processing the second training sample set through the text processing model, and determining updating parameters corresponding to different neural networks of the text processing model;
and respectively iteratively updating the parameters of the first neural network and the parameters of the second neural network of the text processing model through the second training sample set according to the updating parameters of the text processing model corresponding to different neural networks of the text processing model, so as to realize processing of text contents at corresponding chapter levels in the touch screen through the text processing model.
7. The method of claim 6, wherein the processing the second set of training samples by the text processing model in response to the initial parameters of the first neural network and the initial parameters of the second neural network to determine updated parameters corresponding to different neural networks of the text processing model comprises:
substituting different sentence samples in the second training sample set into a loss function corresponding to a self-coding network formed by an encoder and a decoder of a first neural network of the text processing model;
determining parameters of an encoder and corresponding decoder parameters corresponding to the first neural network when the loss function meets a first convergence condition as updating parameters of the first neural network;
substituting different sentence samples in the second training sample set into a loss function corresponding to a self-coding network formed by an encoder and a decoder of a second neural network of the text processing model;
and determining parameters of an encoder and corresponding decoder parameters corresponding to the second neural network when the loss function meets a second convergence condition as the update parameters of the second neural network.
8. An intelligent text processing apparatus, characterized in that the apparatus comprises:
the information transmission module is used for acquiring text contents at chapter levels corresponding to operations in the touch screen;
the information processing module is used for extracting a characteristic vector matched with the text content at the chapter level through a first neural network of the text processing model;
the information processing module is used for determining a statement vector corresponding to the text content according to the feature vector through a first neural network of the text processing model;
the information processing module is used for determining at least one word-level hidden variable corresponding to the text content according to the feature vector through a second neural network of the text processing model;
the information processing module is used for generating candidate words corresponding to the hidden variables of the word level and the selected probability of the candidate words according to the hidden variables of the at least one word level through a second neural network of the text processing model;
the information processing module is used for selecting at least one candidate word to form a target text corresponding to the text content at the chapter level according to the selection probability of the candidate word and the sentence vector corresponding to the text content;
and the information transmission module is used for displaying the target text in a display mode corresponding to the selected operation in the touch screen.
9. An electronic device, characterized in that the electronic device comprises:
a memory for storing executable instructions;
a processor for implementing the intelligent text processing method of any one of claims 1 to 7 when executing the executable instructions stored by the memory.
10. A computer-readable storage medium storing executable instructions, wherein the executable instructions, when executed by a processor, implement the intelligent text processing method of any one of claims 1 to 7.
CN201911367547.4A 2019-12-26 2019-12-26 Intelligent text processing method and device, electronic equipment and storage medium Pending CN111144093A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911367547.4A CN111144093A (en) 2019-12-26 2019-12-26 Intelligent text processing method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911367547.4A CN111144093A (en) 2019-12-26 2019-12-26 Intelligent text processing method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN111144093A true CN111144093A (en) 2020-05-12

Family

ID=70520781

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911367547.4A Pending CN111144093A (en) 2019-12-26 2019-12-26 Intelligent text processing method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111144093A (en)


Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112270190A (en) * 2020-11-13 2021-01-26 浩鲸云计算科技股份有限公司 Attention mechanism-based database field translation method and system
CN112560500A (en) * 2020-12-17 2021-03-26 中科讯飞互联(北京)信息科技有限公司 Text processing method, device, equipment and storage medium
CN112699688A (en) * 2021-01-08 2021-04-23 北京理工大学 Text generation method and system with controllable discourse relation
CN113192520A (en) * 2021-07-01 2021-07-30 腾讯科技(深圳)有限公司 Audio information processing method and device, electronic equipment and storage medium
CN113192520B (en) * 2021-07-01 2021-09-24 腾讯科技(深圳)有限公司 Audio information processing method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination