WO2020232864A1 - Data processing method and related apparatus

Info

Publication number: WO2020232864A1
Application number: PCT/CN2019/102348
Authority: WO
Inventors: 郭鸿程
Original assignee: 平安科技（深圳）有限公司
Priority date: 2019-05-20
Filing date: 2019-08-23
Publication date: 2020-11-26
Also published as: CN110222168B; CN110222168A

Abstract

The present application relates to the field of intelligent decisions. Provided are a data processing method and a related apparatus. The data processing method comprises: acquiring image data, of a book, sent by a terminal; carrying out character identification processing on the image data to obtain text data corresponding to the image data; carrying out text type detection on the text data to determine whether a text type of the text data satisfies a preset text type; when the text type satisfies the preset text type, inputting the text data into a neural network encoder to obtain an abstract vector of the text data; inputting the abstract vector of the text data into a neural network decoder to obtain an abstract of the text data; extracting N keywords in the abstract of the text data; combining the N keywords to obtain a question of the text data; and determining an answer corresponding to the question of the text data by means of a neural network semantic representation model. The technical solution of the embodiments of the present application improves the efficiency of checking the reading effect.

Description

Method and related device for data processing

This application claims priority to Chinese patent application No. 2019104203915, entitled "A method and related device for data processing", filed with the Chinese Patent Office on May 20, 2019, the entire content of which is incorporated herein by reference.

Technical field

This application relates to the field of intelligent decision-making, and in particular to a data processing method and related devices.

Background art

At present, after children or students finish reading a book, parents or teachers check the reading effect through homework. For example, for articles in textbooks, children or students often need to do after-class exercises after reading, and parents or teachers use these exercises to test the effect of reading.

However, the books that children or students read sometimes have no corresponding homework or exercises. To test the reading effect, parents or teachers must first read the book and understand its content before they can test the reading effect of the children or students. This costs them the time needed to read the book, and if the book is very long, the efficiency of checking the reading effect is low.

Summary of the invention

The embodiments of the present application provide a data processing method and related devices to improve the efficiency of checking reading effects.

The first aspect of this application provides a data processing method, including:

Acquiring the image data of the book sent by the terminal;

Performing character recognition processing on the image data to obtain text data corresponding to the image data;

Performing text type detection on the text data to determine whether the text type of the text data meets the preset text type;

When the text type satisfies the preset text type, inputting the text data into a neural network encoder to obtain a summary vector of the text data, wherein the neural network encoder is used to compress and encode the text data;

Inputting the summary vector of the text data into a neural network decoder to obtain a summary of the text data, wherein the neural network decoder is used to perform prediction on the summary vector of the text data through a neural network to obtain a plurality of predicted words, and the plurality of predicted words are connected to form the summary of the text data;

Perform word segmentation processing on the abstract of the text data, and extract N keywords in the abstract of the text data in the order of word frequency from large to small, where N is a positive integer;

Classifying the N keywords by part of speech, and combining the N keywords, according to their parts of speech, in a preset question sentence order to obtain a question of the text data;

A neural network semantic representation model is used to calculate the degree of semantic relevance between the question of the text data and the text in the text data, and the text with the highest degree of semantic relevance is determined as the answer corresponding to the question of the text data.

The second aspect of the present application provides a data processing device, including:

The acquisition module is used to acquire the image data of the book sent by the terminal;

A character recognition module for performing character recognition processing on the image data to obtain text data corresponding to the image data;

The detection module is configured to perform text type detection on the text data to determine whether the text type of the text data meets the preset text type;

The encoding module is used to input the text data into a neural network encoder to obtain a summary vector of the text data when the text type meets the preset text type, wherein the neural network encoder is used to compress and encode the text data;

The decoding module is configured to input the summary vector of the text data into a neural network decoder to obtain a summary of the text data, wherein the neural network decoder is used to perform prediction on the summary vector of the text data through a neural network to obtain a plurality of predicted words, and the plurality of predicted words are connected to form the summary of the text data;

The extraction module is configured to perform word segmentation processing on the abstract of the text data, and extract N keywords in the abstract of the text data in the order of word frequency from large to small, where N is a positive integer;

A combination module, configured to classify the N keywords by part of speech, and combine the N keywords, according to their parts of speech, in a preset question sentence order to obtain a question of the text data;

The processing module is used to calculate, through the neural network semantic representation model, the degree of semantic relevance between the question of the text data and the text in the text data, and to determine the text with the highest degree of semantic relevance as the answer corresponding to the question of the text data.

A third aspect of the present application provides an electronic device for data processing. The electronic device includes a processor, a memory, a communication interface, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the processor, and the programs include instructions for executing the steps in any method of the first aspect of the present application.

The fourth aspect of the present application provides a computer-readable storage medium. The computer-readable storage medium stores a computer program, and the computer program is executed by a processor to implement some or all of the steps described in any method of the first aspect of the present application.

It can be seen that, when there is no corresponding homework or exercise at the back of the book read by children or students, the above technical solutions yield the abstract of the book, a question, and the answer corresponding to the question. The parent or teacher can understand the content of the book based on the abstract and test the reading effect of the children or students through the question and its answer, which spares parents or teachers from spending a lot of time reading the book and improves the efficiency of checking the reading effect.

Description of the drawings

In order to more clearly describe the technical solutions in the embodiments of the present application, the following will briefly introduce the drawings needed in the embodiments. Obviously, the drawings in the following description are only some embodiments of the present application. For those of ordinary skill in the art, without creative work, other drawings can be obtained based on these drawings.

FIG. 1 is a flowchart of a data processing method provided by an embodiment of this application;

FIG. 2 is a flowchart of another data processing method provided by an embodiment of the application;

FIG. 3 is a flowchart of another data processing method provided by an embodiment of the application;

FIG. 4 is a schematic diagram of a system structure provided by an embodiment of this application;

FIG. 5 is a schematic diagram of performing character recognition processing on image data according to an embodiment of the application;

FIG. 6 is a schematic diagram of a data processing device provided by an embodiment of this application;

FIG. 7 is a schematic structural diagram of an electronic device in a hardware operating environment involved in an embodiment of the application.

Detailed description of the embodiments

The data processing method and related devices provided in the embodiments of the present application can improve the efficiency of checking the reading effect.

In order to enable those skilled in the art to better understand the solution of the application, the technical solutions in the embodiments of the application will be clearly and completely described below in conjunction with the drawings in the embodiments of the application. Obviously, the described embodiments are only a part of the embodiments of this application, not all of the embodiments. Based on the embodiments in this application, all other embodiments obtained by those of ordinary skill in the art without creative work should fall within the protection scope of this application.

Detailed descriptions are given below.

The terms "first", "second", "third", "fourth", etc. in the specification and claims of this application and in the above-mentioned drawings are used to distinguish different objects, rather than to describe a specific sequence. In addition, the terms "including" and "having" and any variations thereof are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that includes a series of steps or units is not limited to the listed steps or units, but optionally also includes unlisted steps or units, or optionally also includes other steps or units inherent to such a process, method, product, or device.

In the embodiments of the present application, the artificial intelligence server obtains the image data sent by the terminal, processes the image data to obtain text data corresponding to the image data, and then processes the text data to obtain a summary of the text data, a question of the text data, and an answer to the question, which are returned to the terminal.

First, refer to FIG. 1. FIG. 1 is a flowchart of a data processing method according to an embodiment of the application. Wherein, as shown in FIG. 1, a data processing method provided by an embodiment of the present application may include:

101. Acquire image data of a book sent by a terminal.

Among them, the terminal can be a mobile phone, a tablet computer, a notebook computer, a palmtop computer, a mobile Internet device, or other types of terminals.

If the books read by children or students are paper books, the paper books are scanned first to obtain scanned images of the paper books, and then the terminal sends the scanned images to the artificial intelligence server.

102. Perform character recognition processing on the image data to obtain text data corresponding to the image data.

Optionally, when the image data is a scanned image generated by a scanning tool, some parts may be unscanned, the scan may be unclear, or the scan may be skewed. Therefore, before character recognition processing is performed on the image data, the image data needs to be standardized. The method for standardizing the image data can be:

When the inclination of the image data exceeds the preset inclination threshold, the image data is processed by an image correction algorithm, where the image correction algorithm includes any one of Radon algorithm, Hough transform, and linear regression algorithm. Or, when the definition of the image data is lower than the preset definition threshold, the image data is processed by an image enhancement algorithm, where the image enhancement algorithm includes any one of histogram equalization, image smoothing, and image sharpening. Or, when the inclination of the image data exceeds the preset inclination threshold and the definition of the image data is lower than the preset definition threshold, the image data is processed through an image correction algorithm and an image enhancement algorithm.
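
The three cases above reduce to two independent threshold checks. The following is a minimal Python sketch of that decision, not the implementation of the application: the function names, the way inclination and definition are measured, and the `deskew` callable are all assumptions made for illustration. Histogram equalization, one of the listed enhancement algorithms, is written out for an 8-bit grayscale image stored as a list of rows.

```python
from typing import Callable, List

Image = List[List[int]]  # grayscale pixels in 0..255 (assumed representation)


def equalize_histogram(image: Image) -> Image:
    """Histogram equalization for an 8-bit grayscale image."""
    pixels = [p for row in image for p in row]
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    # Cumulative distribution of gray levels.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    total = len(pixels)
    if total == cdf_min:  # single gray level: nothing to stretch
        return [row[:] for row in image]
    scale = 255 / (total - cdf_min)
    return [[round((cdf[p] - cdf_min) * scale) for p in row] for row in image]


def standardize(
    image: Image,
    inclination: float,           # measured skew, hypothetical input
    definition: float,            # measured sharpness, hypothetical input
    inclination_threshold: float,
    definition_threshold: float,
    deskew: Callable[[Image], Image],  # e.g. a Hough-based corrector (assumed)
    enhance: Callable[[Image], Image] = equalize_histogram,
) -> Image:
    """Apply correction and/or enhancement only when a threshold is crossed."""
    if abs(inclination) > inclination_threshold:
        image = deskew(image)
    if definition < definition_threshold:
        image = enhance(image)
    return image
```

When both thresholds are crossed, both algorithms run, which corresponds to the third case in the paragraph above.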

When the image data is a scanned image, since the scanned image cannot be directly recognized, an artificial intelligence server is required to perform character recognition processing on the image data to obtain text data corresponding to the image data, and the text data can be directly recognized. The method for the artificial intelligence server to perform character recognition processing on the image data to obtain the text data corresponding to the image data may be:

Character cutting is performed on the image data to obtain M characters, where M is a positive integer. Perform feature extraction on the M characters to obtain M character features, where the M characters correspond to the M character features one-to-one. Compare the M character features with a character feature database to identify M text characters corresponding to the M character features, where the M character features correspond to the M text characters one-to-one. The comparison methods include the Euclidean-space comparison method, the relaxation comparison method (Relaxation), the dynamic programming method (Dynamic Programming, DP), neural-network-based database establishment and comparison, the hidden Markov model (HMM), and other methods.

Combine M text characters to obtain text data corresponding to the image data.
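
As an illustration of the Euclidean-space comparison mentioned above, the sketch below matches each character feature to the nearest entry of a feature database and joins the results. It assumes that features are fixed-length numeric vectors and that the database is a plain dictionary from text character to reference vector; both are assumptions, since the application does not specify the representation.

```python
import math
from typing import Dict, List, Sequence


def recognize_characters(
    features: Sequence[Sequence[float]],
    feature_db: Dict[str, Sequence[float]],
) -> str:
    """Map each of the M feature vectors to its nearest database character."""
    recognized: List[str] = []
    for feat in features:
        # Nearest neighbour under Euclidean distance.
        best = min(feature_db, key=lambda ch: math.dist(feat, feature_db[ch]))
        recognized.append(best)
    # Combine the M text characters into the text data.
    return "".join(recognized)
```

For example, with a toy database `{"A": [1.0, 0.0], "B": [0.0, 1.0]}`, the features `[[0.9, 0.1], [0.2, 0.8]]` are recognized as `"AB"`.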

103. Perform text type detection on the text data to determine whether the text type of the text data meets the preset text type.

Optionally, the text type includes language type and style type. The language type includes Chinese, English, Japanese, etc., and the style type includes modern style (including novel, prose, fairy tale, narrative, explanatory, argumentative, etc.) and ancient style (including poems, words, songs, fu, etc.).

The method for the artificial intelligence server to perform text type detection on the text data to determine whether the text type of the text data meets the preset text type may be:

Perform language type detection on the text data to obtain the language type of the text data, and perform style type detection on the text data to obtain the style type of the text data. When the language type of the text data satisfies the preset language type and the style type of the text data satisfies the preset style type, it is determined that the text type of the text data satisfies the preset text type, where the preset language type includes Chinese and the preset style type includes modern style. When the language type of the text data does not meet the preset language type, or the style type of the text data does not meet the preset style type, or both, it is determined that the text type of the text data does not meet the preset text type.

Further optionally, after the artificial intelligence server determines that the text type of the text data does not meet the preset text type, the method includes:

When the language type of the text data does not meet the preset language type, the artificial intelligence server sends a language type error message to the terminal, where the language type error message is used to instruct the terminal to generate a pop-up window or interface prompting that the language type of the book is wrong. For example, if the artificial intelligence server recognizes that the language type of the text data sent by the terminal is English, the artificial intelligence server sends a language type error message to the terminal, and when the terminal receives the language type error message, it generates a pop-up window or interface prompting that the language type of the book cannot be English.

When the style type of the text data does not meet the preset style type, a style type error message is sent to the terminal, where the style type error message is used to instruct the terminal to generate a pop-up window or interface prompting that the style type of the book is wrong. For example, if the artificial intelligence server recognizes that the style type of the text data sent by the terminal is ancient style, the artificial intelligence server sends a style type error message to the terminal, and when the terminal receives the style type error message, it generates a pop-up window or interface prompting that the style type of the book cannot be ancient style.

When the language type of the text data does not meet the preset language type and the style type of the text data does not meet the preset style type, a language and style type error message is sent to the terminal, where the language and style type error message is used to instruct the terminal to generate a pop-up window or interface prompting that the language type and style type of the book are wrong. For example, if the artificial intelligence server recognizes that the language type of the text data sent by the terminal is Japanese and the style type of the text data is ancient style, the artificial intelligence server sends a language and style type error message to the terminal, and when the terminal receives the message, it generates a pop-up window or interface prompting that the language type of the book cannot be Japanese and the style type of the book cannot be ancient style.
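
The text type check and the three error cases can be sketched as one small function. This is an illustrative sketch only: the message identifiers and the string labels for language and style are hypothetical, and the detection of the language type and style type themselves is assumed to have happened already.

```python
from typing import Optional

# Preset types as described in this embodiment.
PRESET_LANGUAGES = {"Chinese"}
PRESET_STYLES = {"modern"}


def text_type_error(language: str, style: str) -> Optional[str]:
    """Return the error message to send to the terminal, or None if the
    text type satisfies the preset text type."""
    language_ok = language in PRESET_LANGUAGES
    style_ok = style in PRESET_STYLES
    if language_ok and style_ok:
        return None
    if not language_ok and not style_ok:
        return "language_and_style_type_error"  # hypothetical identifier
    if not language_ok:
        return "language_type_error"            # hypothetical identifier
    return "style_type_error"                   # hypothetical identifier
```

A return value of `None` means processing continues with the encoder step; any other value would be sent to the terminal, which then shows the corresponding pop-up window or interface.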

104. When the text type meets the preset text type, input the text data into the neural network encoder to obtain a summary vector of the text data.

In a possible example, the neural network encoder includes the first recurrent neural network, and the method of inputting text data into the neural network encoder to obtain the summary vector of the text data may be:

At the current moment, the first text in the text data is input into the first recurrent neural network to obtain the first encoding vector, and the first encoding vector is passed to the next moment. At the next moment, the first encoding vector and the second text in the text data are input into the first recurrent neural network to obtain the second encoding vector, and the second encoding vector is passed to the next moment. This continues until all the text in the text data has been input into the first recurrent neural network, and the final encoding vector is determined to be the summary vector of the text data.

Specifically, the neural network encoder is used to compress and encode the text data and is implemented by a recurrent neural network (RNN). The neural network encoder receives the input text data. At the first moment, a word of the original text data is input into the neural network and compressed into a vector, and the compressed vector is passed to the next moment. At the next moment, the vector compressed at the previous moment and the next word of the original text data are input into the neural network, and the newly compressed vector is again passed to the next moment. The encoding vector obtained after all the text data has been compressed is the summary vector of the text data.
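
The recurrence described above can be sketched with a plain (Elman-style) RNN cell in pure Python. The cell form `tanh(W_in x + W_state h)`, the zero initial state, and the assumption that each word has already been turned into a numeric vector are illustrative choices; the application does not fix the cell type or the weights.

```python
import math
from typing import List, Sequence

Vector = List[float]
Matrix = List[List[float]]


def matvec(m: Matrix, v: Sequence[float]) -> Vector:
    """Matrix-vector product for small dense lists."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def encode(inputs: Sequence[Sequence[float]], w_in: Matrix, w_state: Matrix) -> Vector:
    """Fold a sequence of word vectors into one summary vector."""
    state = [0.0] * len(w_state)  # initial encoding vector (assumed zero)
    for x in inputs:
        a = matvec(w_in, x)         # contribution of the current word
        b = matvec(w_state, state)  # contribution of the previous moment
        state = [math.tanh(p + q) for p, q in zip(a, b)]
    return state                    # final encoding vector
```

The vector returned after the last word plays the role of the summary vector that is handed to the decoder.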

105. Input the summary vector of the text data into a neural network decoder to obtain a summary of the text data.

In a possible example, the neural network decoder includes a second recurrent neural network, and the method of inputting the summary vector of the text data into the neural network decoder to obtain the summary of the text data may be: input the summary vector of the text data into the second recurrent neural network to predict the first output text, and pass the first output text to the next moment; at the next moment, input the first output text and the summary vector of the text data into the second recurrent neural network to predict the second output text, and pass the second output text to the next moment; continue until the second recurrent neural network has finished predicting on the summary vector of the text data, and determine the combination of all the output texts as the summary of the text data.

Specifically, the neural network decoder is used to decode the summary vector of the text data and is also implemented by a recurrent neural network (RNN). After the summary vector of the text data is input into the neural network decoder, the decoder performs prediction on the summary vector to obtain the output word at one moment, and then predicts the output word at the next moment according to that output word and the summary vector, and so on. The output word at each moment thus affects the next output word, and all the output words obtained by the neural network decoder are connected together to form the summary of the text data.
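
The decoding loop can be sketched independently of any particular network. In the sketch below, `step` stands in for one moment of the second recurrent neural network; its signature, the `<end>` stop token, and the `max_words` cap are assumptions introduced for illustration, since the application only states that prediction continues until the network has finished.

```python
from typing import Callable, List, Optional, Sequence, Tuple

END = "<end>"  # hypothetical end-of-summary token

# step(summary_vector, previous_word, state) -> (next_word, new_state)
Step = Callable[[Sequence[float], Optional[str], object], Tuple[str, object]]


def decode(summary_vector: Sequence[float], step: Step, max_words: int = 50) -> List[str]:
    """Predict output words one moment at a time, feeding each word back in."""
    words: List[str] = []
    previous: Optional[str] = None
    state: object = None
    for _ in range(max_words):
        word, state = step(summary_vector, previous, state)
        if word == END:
            break
        words.append(word)
        previous = word  # the output at this moment affects the next one
    return words
```

Joining the returned words, for example with `"".join(words)` for Chinese text, gives the summary of the text data.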

106. Perform word segmentation processing on the abstract of the text data, and extract N keywords in the abstract of the text data according to the order of word frequency from large to small, where N is a positive integer.

Optionally, performing word segmentation processing on the abstract of the text data, and extracting the N keywords in the abstract of the text data in the order of word frequency in descending order may be:

Perform word segmentation processing on the summary of the text data to obtain K segmented words corresponding to the summary, where K is a positive integer greater than N. Calculate K word frequencies corresponding to the K segmented words, where the K segmented words correspond to the K word frequencies one-to-one. Determine N of the K segmented words in descending order of word frequency, and extract these N segmented words as the keywords.
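
A minimal sketch of the frequency-based extraction, assuming the summary has already been segmented into a list of words. The function name is hypothetical, and ties between equally frequent words are broken by first occurrence, a detail the application does not specify.

```python
from collections import Counter
from typing import List, Sequence


def top_keywords(segmented_words: Sequence[str], n: int) -> List[str]:
    """Return the n most frequent words, highest frequency first."""
    counts = Counter(segmented_words)  # word -> word frequency
    return [word for word, _ in counts.most_common(n)]
```

For the segmented words `["fox", "dog", "fox", "river", "dog", "fox"]` and N = 2, the result is `["fox", "dog"]`.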

Among them, the word segmentation method for the abstract of the text data includes a word segmentation method based on string matching, a word segmentation method based on understanding, and a word segmentation method based on statistics.

The word segmentation method based on string matching matches the Chinese character string to be segmented against the entries in a dictionary according to a certain strategy. If a string is found in the dictionary, the match succeeds, that is, a word is recognized. The word segmentation method based on understanding recognizes words by having the computer simulate a human's understanding of the sentence. The statistics-based word segmentation method uses a basic word segmentation dictionary for string-matching segmentation and at the same time uses statistical methods to identify new words. That is, it combines string frequency statistics with string matching, which retains the speed and efficiency of matching-based segmentation while also exploiting the ability of dictionary-free segmentation to identify new words from context and to resolve ambiguity automatically.
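
One common strategy for string-matching segmentation is forward maximum matching, sketched below. The application does not name a specific strategy, so this choice, the function name, and the toy dictionary are assumptions for illustration.

```python
from typing import Iterable, List, Optional


def forward_max_match(text: str, dictionary: Iterable[str],
                      max_len: Optional[int] = None) -> List[str]:
    """Greedy left-to-right segmentation: at each position take the longest
    dictionary entry, falling back to a single character."""
    entries = set(dictionary)
    if max_len is None:
        max_len = max((len(w) for w in entries), default=1)
    words: List[str] = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in entries:
                words.append(candidate)
                i += size
                break
    return words
```

With the dictionary `{"研究", "研究生", "生命", "起源"}`, the string `"研究生命起源"` is segmented as `["研究生", "命", "起源"]`. The greedy match picks "研究生" over "研究", which illustrates the kind of ambiguity that the statistics-based method is meant to reduce.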

107. Classify the N keywords by part of speech, and combine the N keywords, according to their parts of speech, in a preset question sentence order to obtain a question of the text data.
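
Step 107 can be illustrated with a very small sketch. The part-of-speech labels, the preset order, and the trailing question mark are all hypothetical; the application states only that keywords are grouped by part of speech and combined in a preset question sentence order. The part-of-speech tag of each keyword is assumed to come from an external tagger.

```python
from typing import Dict, List, Sequence


def build_question(keywords: Sequence[str], pos_tags: Dict[str, str],
                   order: Sequence[str], sep: str = " ") -> str:
    """Group keywords by part of speech, then emit them in the preset order."""
    by_pos: Dict[str, List[str]] = {}
    for word in keywords:
        by_pos.setdefault(pos_tags[word], []).append(word)
    ordered = [w for pos in order for w in by_pos.get(pos, [])]
    return sep.join(ordered) + "?"
```

For example, the keywords `["fox", "where", "lives"]` tagged as noun, interrogative, and verb, combined in the order `("interrogative", "noun", "verb")`, give `"where fox lives?"`. For Chinese text, `sep=""` would be the natural choice.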

108. Calculate the degree of semantic relevance between the question of the text data and the text in the text data through the neural network semantic representation model, and determine the text with the highest degree of semantic relevance as the answer corresponding to the question of the text data.

Calculating, through the neural network semantic representation model, the degree of semantic relevance between the question of the text data and the text in the text data includes:

Input the question of the text data and the text in the text data into the neural network semantic representation model, use the neural network to encode the question and the text, and obtain their vector representations through semantic mining. Finally, the degree of semantic relevance is obtained by calculating the similarity between the semantic vector of the question and the semantic vector of the text. The method for calculating the degree of semantic relevance between the question of the text data and the text in the text data may be a vocabulary overlap method, a string method, a cosine similarity method, or a maximum common subsequence method.

The specific process is as follows. Search the text data for Q segments of text matching the N keywords, where Q is a positive integer. Calculate Q degrees of semantic relevance between the question of the text data and the Q segments of text, where the Q segments of text correspond to the Q degrees of semantic relevance one-to-one. Obtain the highest degree, the first semantic relevance degree, among the Q degrees, and determine the text corresponding to the first semantic relevance degree as the answer corresponding to the question of the text data.
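
The selection process can be sketched with the cosine similarity method. In this sketch, bag-of-words count vectors stand in for the semantic vectors that the neural network semantic representation model would produce; that substitution, the function names, and the rule that a segment "matches" when it contains at least one keyword are assumptions for illustration.

```python
import math
from collections import Counter
from typing import List, Optional, Sequence


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def best_answer(question: Sequence[str], segments: Sequence[Sequence[str]],
                keywords: Sequence[str]) -> Optional[Sequence[str]]:
    """Pick the keyword-matching segment most similar to the question."""
    # The Q candidate segments: those containing at least one keyword.
    candidates: List[Sequence[str]] = [
        seg for seg in segments if any(k in seg for k in keywords)
    ]
    if not candidates:
        return None
    q_vec = Counter(question)
    # Highest of the Q relevance degrees wins.
    return max(candidates, key=lambda seg: cosine(q_vec, Counter(seg)))
```

Replacing `Counter(...)` with vectors from a trained encoder would keep the same selection logic while using real semantic representations.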

Refer to FIG. 2, which is a flowchart of another data processing method provided by another embodiment of the application. Wherein, as shown in FIG. 2, another data processing method provided by another embodiment of the present application may include:

201. The terminal sends the image data of the book to the artificial intelligence server.

202. The artificial intelligence server performs character recognition processing on the image data to obtain text data corresponding to the image data.

When the inclination of the image data exceeds the preset inclination threshold, the image data is processed by an image correction algorithm, where the image correction algorithm includes any one of Radon algorithm, Hough transform, and linear regression algorithm.

Or, when the definition of the image data is lower than the preset definition threshold, the image data is processed by an image enhancement algorithm, where the image enhancement algorithm includes any one of histogram equalization, image smoothing, and image sharpening.

Or, when the inclination of the image data exceeds the preset inclination threshold and the definition of the image data is lower than the preset definition threshold, the image data is processed through an image correction algorithm and an image enhancement algorithm.

When the image data is a scanned image, since the scanned image cannot be directly recognized, an artificial intelligence server is required to perform character recognition processing on the image data to obtain text data corresponding to the image data, and the text data can be directly recognized.

The method for the artificial intelligence server to perform character recognition processing on the image data to obtain the text data corresponding to the image data may be:

Character cutting is performed on the image data to obtain M characters, where M is a positive integer.

Perform feature extraction on M characters to obtain M character features, where M characters correspond to M character features one-to-one.

Compare the M character features with a character feature database to identify M text characters corresponding to the M character features, where the M character features correspond to the M text characters one-to-one. The comparison methods include the Euclidean-space comparison method, the relaxation comparison method (Relaxation), the dynamic programming method (Dynamic Programming, DP), neural-network-based database establishment and comparison, the hidden Markov model (HMM), and other methods.

Combine M text characters to obtain text data corresponding to the image data.

203. The artificial intelligence server recognizes whether the language type of the text data meets the preset language type.

Among them, the language types include Chinese, English, Japanese, etc., and the preset language types include Chinese.

204. When the language type of the text data does not meet the preset language type, the artificial intelligence server recognizes whether the style type of the text data meets the preset style type.

Among them, the stylistic types include modern styles (including novels, prose, fairy tales, narratives, explanatory essays, argumentative essays, etc.) and ancient styles (including poems, words, songs, fu, etc.), and the preset styles include modern styles.

205. When the style type of the text data does not meet the preset style type, the artificial intelligence server sends a language and style type error message to the terminal.

206. The terminal generates a pop-up window or interface prompting that the language and style of the book are wrong.

For example, if the artificial intelligence server recognizes that the language type of the text data is Japanese and the style type is ancient style, the artificial intelligence server sends a language and style type error message to the terminal, and when the terminal receives the message, it generates a pop-up window or interface prompting that the language type of the book cannot be Japanese and the style type cannot be ancient style.

Refer to Fig. 3, which is a flowchart of another data processing method provided by another embodiment of the application. Wherein, as shown in FIG. 3, another data processing method provided by another embodiment of the present application may include:

301. The terminal sends the image data of the book to the artificial intelligence server.

The books read by children or students are paper books. The paper books are scanned through the terminal to obtain scanned images of the paper books, and then the terminal sends the scanned images to the artificial intelligence server.

302. When the inclination of the image data exceeds a preset inclination threshold, the artificial intelligence server processes the image data by using an image correction algorithm.

When the image data is a scanned image generated by a scanning tool, some parts may be unscanned, the scan may be unclear, or the scan may be skewed. Therefore, the image data needs to be processed by an image correction algorithm, where the image correction algorithm includes any one of the Radon algorithm, the Hough transform, and a linear regression algorithm.

303. When the definition of the image data is lower than the preset definition threshold, the artificial intelligence server processes the image data by using an image enhancement algorithm.

Among them, the image enhancement algorithm includes any of histogram equalization, image smoothing, and image sharpening.

304. The artificial intelligence server performs character cutting on the image data to obtain M characters, where M is a positive integer.

305. The artificial intelligence server performs feature extraction on the M characters to obtain M character features.

The M characters correspond to the M character features one-to-one. Feature extraction can be divided into two categories. The first is statistical features: the ratio of the number of black points or the number of white points in the character area of the image data is obtained, and when the character area is divided into several regions, the black-point or white-point ratios of the regions are combined into a numerical vector. The second is structural features: after the characters of the image data are thinned, the number and positions of the stroke endpoints and intersections of the characters are obtained.
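
The statistical feature described above can be sketched as a zoning computation. The sketch assumes that a character is a binarized image stored as a list of rows with 1 for a black point and 0 for a white point, and that the character area is split into a regular grid; the grid size and function name are illustrative assumptions.

```python
from typing import List, Sequence


def zone_black_ratios(char_image: Sequence[Sequence[int]],
                      rows: int, cols: int) -> List[float]:
    """Black-point ratio of each grid region, concatenated into one vector."""
    height, width = len(char_image), len(char_image[0])
    features: List[float] = []
    for r in range(rows):
        top, bottom = r * height // rows, (r + 1) * height // rows
        for c in range(cols):
            left, right = c * width // cols, (c + 1) * width // cols
            cells = [char_image[y][x]
                     for y in range(top, bottom)
                     for x in range(left, right)]
            # Ratio of black points in this region (0.0 for an empty region).
            features.append(sum(cells) / len(cells) if cells else 0.0)
    return features
```

For a 4x4 character image split into a 2x2 grid, the result is a four-element vector, one ratio per region, which can then be compared against the character feature database.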

306. The artificial intelligence server compares the M character features with the character feature database to identify M text characters corresponding to the M character features.

The M character features correspond one-to-one to the M text characters. The comparison methods include the Euclidean space comparison method, the relaxation comparison method (Relaxation), the dynamic programming comparison method (Dynamic Programming, DP), the neural network database establishment and comparison method, the HMM (Hidden Markov Model) method, and other methods.
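Of the listed methods, the Euclidean space comparison is the simplest to illustrate. The sketch below assumes the character feature database is a dictionary mapping each text character to one stored feature vector; this layout and the function names are assumptions made for the example only.

```python
import math

def recognize_character(feature, feature_database):
    """Return the text character whose stored feature is nearest to `feature`."""
    return min(
        feature_database,
        key=lambda char: math.dist(feature, feature_database[char]),
    )

def recognize_text(features, feature_database):
    """Recognize each of the M features and join the M text characters."""
    return "".join(recognize_character(f, feature_database) for f in features)
```

Here `math.dist` (Python 3.8 or later) computes the Euclidean distance between two equal-length vectors.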

307. The artificial intelligence server combines M text characters to obtain text data corresponding to the image data.

308. The artificial intelligence server performs text type detection on the text data to determine whether the text type of the text data meets the preset text type.

Optionally, the text type includes a language type and a style type. The language type includes Chinese, English, Japanese, etc. The style type includes modern style (including novel, prose, fairy tale, narrative, explanatory, argumentative, etc.) and ancient style (including poems, ci (lyrics), qu (songs), fu, etc.).

Performing language type detection on the text data to obtain the language type of the text data, and performing style type detection on the text data to obtain the style type of the text data.

When the language type of the text data satisfies the preset language type and the style type of the text data satisfies the preset style type, it is determined that the text type of the text data satisfies the preset text type, wherein the preset language type includes Chinese and the preset style type includes modern style.

When the language type of the text data does not meet the preset language type, or the style type of the text data does not meet the preset style type, or both, it is determined that the text type of the text data does not meet the preset text type.
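The decision logic of step 308 can be summarized in a few lines. The sketch assumes the language type and style type have already been detected and are given as strings; the function name and the error codes are hypothetical labels for the three error cases.

```python
def check_text_type(language, style,
                    preset_languages=("Chinese",),
                    preset_styles=("modern",)):
    """Return (satisfied, error_code) for the detected language and style types."""
    language_ok = language in preset_languages
    style_ok = style in preset_styles
    if language_ok and style_ok:
        return True, None
    if not language_ok and not style_ok:
        return False, "language_and_style_type_error"
    if not language_ok:
        return False, "language_type_error"
    return False, "style_type_error"
```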

309. When the text type meets the preset text type, input the text data into the neural network encoder to obtain a summary vector of the text data.

The neural network encoder is used to compress and encode the text data and is implemented by a recurrent neural network (RNN). The neural network encoder receives the input text data. At the first moment, the first word of the original text data is input into the neural network and compressed into a vector, and the compressed vector is passed to the next moment. At the next moment, the vector compressed at the previous moment and the next word of the original text data are input into the neural network, and the newly compressed vector is again passed to the next moment. The encoding vector obtained after all the text data has been compressed is the summary vector of the text data.
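The recurrence described above can be sketched with a plain tanh RNN cell. This is a toy under stated assumptions: the words are already mapped to numeric vectors, the weight matrices `w_hidden` and `w_input` stand in for trained parameters, and the application does not specify the cell type, so the tanh cell is only one possible choice.

```python
import math

def rnn_step(hidden, word_vec, w_hidden, w_input):
    """One moment: new_hidden = tanh(W_h * hidden + W_x * word_vec)."""
    size = len(hidden)
    return [
        math.tanh(
            sum(w_hidden[i][j] * hidden[j] for j in range(size))
            + sum(w_input[i][k] * word_vec[k] for k in range(len(word_vec)))
        )
        for i in range(size)
    ]

def encode(word_vectors, w_hidden, w_input, hidden_size):
    """Feed the words in order; the final hidden state is the summary vector."""
    hidden = [0.0] * hidden_size
    for word_vec in word_vectors:
        # The vector from the previous moment is combined with the next word.
        hidden = rnn_step(hidden, word_vec, w_hidden, w_input)
    return hidden
```

Because each step consumes the previous vector, the same words in a different order generally produce a different summary vector.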

310. Input the summary vector of the text data into a neural network decoder to obtain a summary of the text data.

The neural network decoder is used to decode the summary vector of the text data and is also implemented by a recurrent neural network (RNN). After the summary vector of the text data is input into the neural network decoder, the decoder uses the summary vector to predict the output word at the first moment. The decoder then predicts the output word at the next moment according to the output word at the current moment and the summary vector, and so on, so that the output word at each moment affects the next output word. Finally, all the output words obtained by the neural network decoder are connected to form the summary of the text data.
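The decoding loop can be sketched independently of the network itself. In the sketch below, `predict_next` is a hypothetical callable standing in for the trained recurrent decoder; the end token and the maximum length are assumed stopping conditions, since the application does not state how decoding terminates.

```python
def decode(summary_vector, predict_next, end_token="<end>", max_words=50):
    """Predict words one moment at a time and return them in order.

    `predict_next(summary_vector, previous_word)` returns the next word;
    `previous_word` is None at the first moment.
    """
    words = []
    previous = None
    for _ in range(max_words):
        word = predict_next(summary_vector, previous)
        if word == end_token:
            break
        words.append(word)
        previous = word  # the current output word influences the next one
    return words
```

Joining the returned words yields the summary of the text data.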

311. Extract N keywords in the abstract of the text data, where N is a positive integer.

Optionally, the method for extracting the N keywords in the abstract of the text data may be:

Perform word segmentation processing on the summary of the text data to obtain K segmented words corresponding to the summary of the text data, where K is a positive integer greater than N.

Calculate the K word frequencies corresponding to the K segmented words, where the K segmented words correspond one-to-one to the K word frequencies.

Determine N segmented words among the K segmented words in descending order of word frequency, and extract the N segmented words.
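The frequency-based selection can be sketched with the standard library. The sketch assumes word segmentation has already produced a list of words; the function name is hypothetical.

```python
from collections import Counter

def top_keywords(segmented_words, n):
    """Return the n most frequent words, in descending order of frequency."""
    counts = Counter(segmented_words)
    return [word for word, _ in counts.most_common(n)]
```

Words with equal frequency are returned in the order in which they first appear in the summary.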

312. Combine the N keywords to obtain the question of the text data.
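As described for the combination module, the keywords are classified by part of speech and arranged in a preset question order. The sketch below assumes the keywords arrive already tagged with a part of speech; the tag names, the particular order, and the trailing question mark are hypothetical, since the application does not disclose the preset order.

```python
def build_question(tagged_keywords, order=("pronoun", "verb", "noun")):
    """Arrange (word, part_of_speech) pairs according to a preset order."""
    slots = {pos: [] for pos in order}
    for word, pos in tagged_keywords:
        if pos in slots:
            slots[pos].append(word)
    # Emit the words slot by slot, following the preset question order.
    words = [word for pos in order for word in slots[pos]]
    return " ".join(words) + "?"
```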

313. Process the question of the text data and the text data through the neural network semantic representation model to obtain the answer corresponding to the question of the text data.

The specific process is to search for Q segments of text matching the N keywords in the text data, where Q is a positive integer.

Calculate the Q semantic relevance degrees between the question of the text data and the Q segments of text, where the Q segments of text correspond one-to-one to the Q semantic relevance degrees.

Obtain the highest first semantic relevance degree among the Q semantic relevance degrees, and determine that the text corresponding to the first semantic relevance degree is the answer corresponding to the question of the text data.
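The selection procedure can be sketched as follows. Note the substitution: a bag-of-words cosine similarity is used here in place of the neural network semantic representation model, purely so that the example is self-contained. The sketch also assumes that a segment "matches" when it contains at least one keyword, which the application does not specify; all names are hypothetical.

```python
import math
from collections import Counter

def cosine_similarity(words_a, words_b):
    """Bag-of-words cosine similarity (a stand-in for the semantic model)."""
    a, b = Counter(words_a), Counter(words_b)
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

def select_answer(question_words, keywords, segments):
    """Pick the keyword-matching segment most relevant to the question.

    Each segment is a list of words. Returns None when no segment matches.
    """
    candidates = [s for s in segments if any(k in s for k in keywords)]
    if not candidates:
        return None
    return max(candidates, key=lambda s: cosine_similarity(question_words, s))
```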

Refer to FIG. 4, which is a schematic diagram of a system structure provided by an embodiment of this application. As shown in FIG. 4, the system includes an artificial intelligence server and a terminal, and the artificial intelligence server communicates with the terminal. The terminal includes a mobile phone and a computer, and the user accesses the artificial intelligence server through the terminal. When the terminal is a mobile phone, the user can use the mobile phone to take photos of the book to be processed and send the photos to the artificial intelligence server; the artificial intelligence server processes the photos, obtains the processing result, and returns the processing result to the user's mobile phone. When the terminal is a computer, the user can scan the book through a scanning device connected to the computer, such as a printer, and then send the scanned image to the artificial intelligence server; the artificial intelligence server processes the scanned image to obtain the processing result, and then returns the processing result to the user's computer.

Referring to FIG. 5, FIG. 5 is a schematic diagram of performing character recognition processing on image data according to an embodiment of the application. As shown in FIG. 5, the image data is displayed as ABCDE. First, character cutting is performed on the image data to obtain five characters, namely A, B, C, D, and E. Feature extraction is then performed on the obtained characters to obtain the features of the five characters, namely feature a, feature b, feature c, feature d, and feature e. After the features are obtained, they are compared and recognized to determine the text characters corresponding to the features, namely text character A, text character B, text character C, text character D, and text character E. Finally, all the text characters are combined to obtain the text ABCDE.

Referring to FIG. 6, FIG. 6 is a schematic diagram of a data processing apparatus provided by another embodiment of the application. Wherein, as shown in FIG. 6, a data processing apparatus provided by another embodiment of the present application may include:

The obtaining module 601 is used to obtain image data of books sent by the terminal;

The character recognition module 602 is configured to perform character recognition processing on the image data to obtain text data corresponding to the image data;

The detection module 603 is configured to perform text type detection on the text data to determine whether the text type of the text data meets the preset text type;

The encoding module 604 is configured to input the text data into a neural network encoder to obtain a summary vector of the text data when the text type meets the preset text type, wherein the neural network encoder is used to compress and encode the text data;

The decoding module 605 is configured to input the summary vector of the text data into a neural network decoder to obtain a summary of the text data, wherein the neural network decoder is used to perform prediction on the summary vector of the text data through a neural network to obtain a plurality of predicted words, and the plurality of predicted words are connected to form the summary of the text data;

The extraction module 606 is configured to perform word segmentation processing on the abstract of the text data, and extract N keywords in the abstract of the text data in the order of word frequency from large to small, where N is a positive integer;

The combination module 607 is configured to classify the N keywords by part of speech, and combine the N keywords according to the part of speech of the N keywords according to a preset question order to obtain the text data question;

The processing module 608 is configured to calculate the degree of semantic relevance between the question of the text data and the text in the text data through the neural network semantic representation model, and determine the text with the highest degree of semantic relevance as the answer corresponding to the question of the text data.

For the specific implementation of the data processing device of the present application, please refer to the various embodiments of the above data processing method, which will not be repeated here.

Referring to FIG. 7, FIG. 7 is a schematic structural diagram of an electronic device in a hardware operating environment involved in an embodiment of the application. Wherein, as shown in FIG. 7, the electronic device of the hardware operating environment involved in the embodiment of the present application may include:

The processor 701 is, for example, a CPU.

The memory 702 may optionally be a high-speed RAM memory or a stable memory, such as a disk memory.

The communication interface 703 is used to implement connection and communication between the processor 701 and the memory 702.

Those skilled in the art can understand that the structure of the data processing electronic device shown in FIG. 7 does not constitute a limitation on the data processing electronic device, which may include more or fewer components than shown in the figure, a combination of certain components, or a different arrangement of components.

As shown in FIG. 7, the memory 702 may include an operating system, a network communication module, and a data processing program. The operating system is a program that manages and controls the hardware and software resources of the data processing electronic device and supports the running of the data processing program and other software or programs. The network communication module is used to implement communication between the components in the memory 702, and communication with other hardware and software in the data processing electronic device.

In the data processing electronic device shown in FIG. 7, the processor 701 is configured to execute the data processing program stored in the memory 702, and implement the following steps:

Acquiring the image data of the book sent by the terminal;

A neural network semantic representation model is used to calculate the degree of semantic correlation between the question of the text data and the text in the text data, and the text with the highest degree of semantic correlation is determined as the answer corresponding to the question of the text data.

For the specific implementation of the electronic device for data processing in this application, please refer to each embodiment of the above data processing method, which will not be repeated here.

Another embodiment of the present application provides a computer-readable storage medium. The computer-readable storage medium may be a non-volatile computer-readable storage medium. The computer-readable storage medium stores a computer program, and the computer program is executed by a processor to implement the following steps:

Acquiring the image data of the book sent by the terminal;

For the specific implementation of the computer-readable storage medium of the present application, please refer to the various embodiments of the foregoing data processing method, which will not be repeated here.

It should also be noted that, for simplicity of description, the foregoing method embodiments are all expressed as a series of action combinations, but those skilled in the art should know that this application is not limited by the described sequence of actions, because according to this application some steps can be performed in another order or simultaneously. Secondly, those skilled in the art should also know that the embodiments described in the specification are all preferred embodiments, and the actions and modules involved are not necessarily required by this application. In the above-mentioned embodiments, the description of each embodiment has its own focus. For parts that are not described in detail in an embodiment, reference may be made to related descriptions of other embodiments.

As mentioned above, the above embodiments are only used to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions recorded in the foregoing embodiments can still be modified, or some of the technical features can be equivalently replaced, and that these modifications or replacements do not cause the essence of the corresponding technical solutions to deviate from the scope of the technical solutions of the embodiments of the present application.

Claims

A data processing method, characterized in that it comprises:

Acquiring the image data of the book sent by the terminal;

Performing character recognition processing on the image data to obtain text data corresponding to the image data;

Performing text type detection on the text data to determine whether the text type of the text data meets the preset text type;

When the text type satisfies the preset text type, inputting the text data into a neural network encoder to obtain a summary vector of the text data, wherein the neural network encoder is used to compress and encode the text data;

Inputting the summary vector of the text data into a neural network decoder to obtain a summary of the text data, wherein the neural network decoder is used to perform prediction on the summary vector of the text data through a neural network to obtain a plurality of predicted words, and the plurality of predicted words are connected to form the summary of the text data;

Perform word segmentation processing on the abstract of the text data, and extract N keywords in the abstract of the text data in the order of word frequency from large to small, where N is a positive integer;

Classify the N keywords by part of speech, and combine the N keywords according to the part of speech of the N keywords according to a preset question sentence order to obtain the text data question;

A neural network semantic representation model is used to calculate the degree of semantic relevance between the question of the text data and the text in the text data, and the text with the highest degree of semantic relevance is determined as the answer corresponding to the question of the text data.
The method according to claim 1, wherein before performing character recognition processing on the image data to obtain text data corresponding to the image data, the method comprises:

When the inclination of the image data exceeds a preset inclination threshold, processing the image data by an image correction algorithm, wherein the image correction algorithm includes any one of the Radon algorithm, the Hough transform, and a linear regression algorithm;

Or, when the definition of the image data is lower than a preset definition threshold, processing the image data by an image enhancement algorithm, wherein the image enhancement algorithm includes any one of histogram equalization, image smoothing, and image sharpening;

Or, when the inclination of the image data exceeds the preset inclination threshold and the definition of the image data is lower than the preset definition threshold, processing the image data by the image correction algorithm and the image enhancement algorithm.
The method according to claim 2, wherein said performing character recognition processing on said image data to obtain text data corresponding to said image data comprises:

Perform character cutting on the image data to obtain M characters, where M is a positive integer;

Performing feature extraction on the M characters to obtain M character features, wherein the M characters correspond to the M character features one-to-one;

Comparing the M character features with a character feature database to identify the M text characters corresponding to the M character features, wherein the M character features correspond to the M text characters one to one;

The M text characters are combined to obtain text data corresponding to the image data.
The method according to claim 1, wherein the text type includes a language type and a style type, and the performing text type detection on the text data to determine whether the text type of the text data meets a preset text type comprises:

Performing language type detection on the text data to obtain the language type of the text data;

Performing stylistic type detection on the text data to obtain the stylistic type of the text data;

When the language type meets the preset language type and the style type meets the preset style type, determining that the text type meets the preset text type;

When the language type does not meet the preset language type, or the style type does not meet the preset style type, or the language type does not meet the preset language type and the style type does not meet the preset style type, determining that the text type does not satisfy the preset text type.
The method according to claim 4, wherein after the determining that the text type does not satisfy the preset text type, the method comprises:

When the language type does not satisfy the preset language type, sending a language type error message to the terminal, wherein the language type error message is used to instruct the terminal to generate a pop-up window or interface indicating that the language type of the book is wrong;

When the style type does not meet the preset style type, sending a style type error message to the terminal, wherein the style type error message is used to instruct the terminal to generate a pop-up window or interface indicating that the style type of the book is wrong;

When the language type does not meet the preset language type and the style type does not meet the preset style type, sending a language and style type error message to the terminal, wherein the language and style type error message is used to instruct the terminal to generate a pop-up window or interface indicating that the language type and style type of the book are wrong.
The method according to claim 1, wherein the neural network encoder comprises a first recurrent neural network, and the inputting the text data into the neural network encoder to obtain a summary vector of the text data comprises:

Inputting the first text in the text data into the first recurrent neural network at the current moment to obtain a first encoding vector;

Pass the first coding vector into the next moment;

Input the first code vector and the second text in the text data into the first recurrent neural network at the next moment to obtain a second code vector;

The second coding vector is passed into the next moment until all the text in the text data is input into the first recurrent neural network, and it is determined that the finally obtained coding vector is the summary vector of the text data.
The method according to claim 6, wherein the neural network decoder comprises a second recurrent neural network, and the inputting the summary vector of the text data into the neural network decoder to obtain the summary of the text data comprises:

Inputting the summary vector of the text data into the second recurrent neural network at the current moment to predict the first output text;

Pass the first output text to the next moment;

Input the first output text and the summary vector of the text data into the second recurrent neural network at the next moment to predict the second output text;

The second output text is passed into the next moment until the second recurrent neural network predicts the summary vector of the text data, and it is determined that the combination of all the output texts finally obtained is the summary of the text data.
The method according to claim 1, wherein the performing word segmentation processing on the abstract of the text data, and extracting the N keywords in the abstract of the text data according to the order of word frequency from large to small, comprises:

Performing word segmentation processing on the abstract of the text data to obtain K segmented words corresponding to the abstract of the text data, where K is a positive integer greater than N;

Calculating the K word frequencies corresponding to the K segmented words, wherein the K segmented words correspond one-to-one to the K word frequencies;

Determining N segmented words among the K segmented words in descending order of word frequency;

Extracting the N segmented words.
The method according to claim 8, wherein the calculating, through the neural network semantic representation model, the degree of semantic relevance between the question of the text data and the text in the text data, and determining the text with the highest degree of semantic relevance as the answer corresponding to the question of the text data, comprises:

Searching for Q segments of text matching the N keywords in the text data, where Q is a positive integer;

Calculating the Q semantic relevance degrees between the question of the text data and the Q segments of text, wherein the Q segments of text correspond one-to-one to the Q semantic relevance degrees;

Obtaining the highest first semantic relevance degree among the Q semantic relevance degrees;

It is determined that the text corresponding to the first degree of semantic relevance is the answer corresponding to the question of the text data.
A data processing device, characterized in that the device includes:

The acquisition module is used to acquire the image data of the book sent by the terminal;

A character recognition module for performing character recognition processing on the image data to obtain text data corresponding to the image data;

The detection module is configured to perform text type detection on the text data to determine whether the text type of the text data meets the preset text type;

The encoding module is used to input the text data into a neural network encoder to obtain a summary vector of the text data when the text type meets the preset text type, wherein the neural network encoder is used to compress and encode the text data;

The decoding module is configured to input the summary vector of the text data into a neural network decoder to obtain a summary of the text data, wherein the neural network decoder is used to perform prediction on the summary vector of the text data through a neural network to obtain a plurality of predicted words, and the plurality of predicted words are connected to form the summary of the text data;

The extraction module is configured to perform word segmentation processing on the abstract of the text data, and extract N keywords in the abstract of the text data in the order of word frequency from large to small, where N is a positive integer;

A combination module, configured to classify the N keywords by part of speech, and combine the N keywords according to the part of speech of the N keywords in a preset question sentence order to obtain the text data question;

The processing module is used to calculate the semantic correlation degree between the text data question and the text in the text data through the neural network semantic representation model, and determine the text with the highest semantic correlation degree as the answer corresponding to the text data question.
The device according to claim 10, wherein the device further comprises an image processing module, and the image processing module is configured to:

When the inclination of the image data exceeds a preset inclination threshold, process the image data by an image correction algorithm, wherein the image correction algorithm includes any one of the Radon algorithm, the Hough transform, and a linear regression algorithm;

Or, when the definition of the image data is lower than a preset definition threshold, process the image data by an image enhancement algorithm, wherein the image enhancement algorithm includes any one of histogram equalization, image smoothing, and image sharpening;

Or, when the inclination of the image data exceeds the preset inclination threshold and the definition of the image data is lower than the preset definition threshold, process the image data by the image correction algorithm and the image enhancement algorithm.
The device according to claim 11, wherein the character recognition module is specifically configured to:

Perform character cutting on the image data to obtain M characters, where M is a positive integer;

Performing feature extraction on the M characters to obtain M character features, wherein the M characters correspond to the M character features one-to-one;

Comparing the M character features with a character feature database to identify the M text characters corresponding to the M character features, wherein the M character features correspond to the M text characters one to one;

The M text characters are combined to obtain text data corresponding to the image data.
The device according to claim 10, wherein the text type includes a language type and a style type, and the detection module is specifically configured to:

Performing language type detection on the text data to obtain the language type of the text data;

Performing stylistic type detection on the text data to obtain the stylistic type of the text data;

When the language type meets the preset language type and the style type meets the preset style type, determining that the text type meets the preset text type;

When the language type does not meet the preset language type, or the style type does not meet the preset style type, or the language type does not meet the preset language type and the style type does not meet the preset style type, determining that the text type does not satisfy the preset text type.
The device according to claim 13, wherein the device further comprises a prompt module, and the prompt module is specifically configured to:

When the language type does not satisfy the preset language type, send a language type error message to the terminal, wherein the language type error message is used to instruct the terminal to generate a pop-up window or interface indicating that the language type of the book is wrong;

When the style type does not meet the preset style type, send a style type error message to the terminal, wherein the style type error message is used to instruct the terminal to generate a pop-up window or interface indicating that the style type of the book is wrong;

When the language type does not meet the preset language type and the style type does not meet the preset style type, send a language and style type error message to the terminal, wherein the language and style type error message is used to instruct the terminal to generate a pop-up window or interface indicating that the language type and style type of the book are wrong.
The apparatus according to claim 10, wherein the neural network encoder comprises a first recurrent neural network, and the encoding module is specifically configured to:

Inputting the first text in the text data into the first recurrent neural network at the current moment to obtain a first encoding vector;

Pass the first coding vector into the next moment;

Input the first code vector and the second text in the text data into the first recurrent neural network at the next moment to obtain a second code vector;

The second coding vector is passed into the next moment until all the text in the text data is input into the first recurrent neural network, and it is determined that the finally obtained coding vector is the summary vector of the text data.
The apparatus according to claim 15, wherein the neural network decoder comprises a second recurrent neural network, and the decoding module is specifically configured to:

Inputting the summary vector of the text data into the second recurrent neural network at the current moment to predict the first output text;

Pass the first output text to the next moment;

Input the first output text and the summary vector of the text data into the second recurrent neural network at the next moment to predict the second output text;

The second output text is passed into the next moment until the second recurrent neural network predicts the summary vector of the text data, and it is determined that the combination of all the output texts finally obtained is the summary of the text data.
The device according to claim 10, wherein the extraction module is specifically configured to:

Performing word segmentation processing on the abstract of the text data to obtain K segmented words corresponding to the abstract of the text data, where K is a positive integer greater than N;

Calculating the K word frequencies corresponding to the K segmented words, wherein the K segmented words correspond one-to-one to the K word frequencies;

Determining N segmented words among the K segmented words in descending order of word frequency;

Extracting the N segmented words.
The device according to claim 17, wherein the processing module is specifically configured to:

Searching for Q segments of text matching the N keywords in the text data, where Q is a positive integer;

Calculating the Q semantic relevance degrees between the question of the text data and the Q segments of text, wherein the Q segments of text correspond one-to-one to the Q semantic relevance degrees;

Obtaining the highest first semantic relevance degree among the Q semantic relevance degrees;

It is determined that the text corresponding to the first degree of semantic relevance is the answer corresponding to the question of the text data.
An electronic device for data processing, characterized in that the electronic device includes a processor, a memory, a communication interface, and one or more programs, wherein the one or more programs are stored in the memory and are configured to be executed by the processor, and the programs include instructions for executing the steps in the method of any one of claims 1 to 9.
A computer-readable storage medium, wherein the computer-readable storage medium stores a computer program, and the computer program is executed by a processor to implement the method according to any one of claims 1 to 9.