CN110750624A

CN110750624A - Information output method and device

Info

Publication number: CN110750624A
Application number: CN201911043157.1A
Authority: CN
Inventors: 林毅隆
Original assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Current assignee: Baidu Online Network Technology Beijing Co Ltd; Beijing Baidu Netcom Science and Technology Co Ltd
Priority date: 2019-10-30
Filing date: 2019-10-30
Publication date: 2020-02-04

Abstract

The embodiments of the present application disclose an information output method. The method comprises: first, in response to receiving an image acquisition request from a user, acquiring an image that includes a question to be solved; second, extracting the text content in the image and performing structured classification on it, wherein the structurally classified text content includes a question stem; and then, based on the structurally classified text content, obtaining an answer text of the question to be solved through a pre-trained answer model. Solving questions through the pre-trained answer model addresses the inaccurate question analysis caused by an incomplete question bank and untimely question bank updates in a database, and improves the accuracy of intelligent question solving.

Description

Information output method and device

Technical Field

The embodiments of the present application relate to the field of computer technology, and in particular to an information output method and device.

Background

In the era of the mobile internet, with the rapid development of artificial intelligence, more and more applications have appeared that search for and answer questions from a photograph. At present, question recognition and search in the industry generally relies on question bank retrieval: after a matching question is found in the question bank, the answer stored in the question bank is displayed on a result card.

Answering through a question bank depends on how the question bank is built; an incomplete question bank or untimely question bank updates lead to inaccurate question analysis.

Disclosure of Invention

The embodiment of the application provides an information output method and device.

In a first aspect, an embodiment of the present application provides an information output method, where the method includes: in response to receiving an image acquisition request of a user, acquiring an image comprising a to-be-solved question; extracting text contents in the image, and performing structured classification on the text contents, wherein the text contents after structured classification comprise question stems; and outputting an answer text of the question to be solved through a pre-trained answer model based on the text content after structured classification.

In some embodiments, the structurally classified text content further includes a text to be solved corresponding to the question stem, the text to be solved representing the text information for which an answer text needs to be output according to the question stem; and outputting the answer text of the question to be solved through the pre-trained answer model based on the structurally classified text content includes: obtaining, according to the question stem, an answer text corresponding to the text to be solved through the pre-trained answer model.

In some embodiments, the method further includes: obtaining, according to the question stem, an answer text corresponding to the text to be solved through a preset database; and setting, as the final answer text of the question to be solved, whichever of the answer text obtained through the pre-trained answer model and the answer text obtained through the preset database has the highest matching degree with the text to be solved.

In some embodiments, the answer model is trained by:

acquiring a training sample set corresponding to the question type of the question to be solved, wherein the training sample set includes a text set to be solved and an answer text set, and the answer texts in the answer text set correspond one-to-one to the texts to be solved in the text set to be solved; and training an initial answer model, with a text to be solved in the text set to be solved as the input and the corresponding answer text in the answer text set as the target output, to obtain the answer model.

In some embodiments, after outputting the answer text of the question to be solved through the pre-trained answer model based on the structurally classified text content, the method further includes: displaying the answer text corresponding to the question to be solved in the answer area of the question to be solved in the image.

In some embodiments, displaying the answer text corresponding to the question to be solved in the answer area of the question to be solved in the image includes: acquiring the answer area in the image; detecting the display direction of the question to be solved; scaling the answer text proportionally according to the font of the question to be solved and the answer area, wherein proportional scaling means that the length and the width of the characters of the answer text are scaled by the same ratio; and displaying the proportionally scaled answer text corresponding to the question to be solved in the answer area of the question to be solved in the image, in the display direction of the question to be solved.

In some embodiments, the method further includes: in response to receiving an error correction request from the user, acquiring an error correction text input by the user, writing the error correction text into the answer text set, and writing the text to be solved corresponding to the error correction text into the text set to be solved.

In a second aspect, an embodiment of the present application provides an information output apparatus, where the apparatus includes: an image acquisition unit configured to acquire an image including a question to be solved in response to receiving an image acquisition request of a user; a text classification unit configured to extract text content in the image and perform structured classification on the text content, wherein the structurally classified text content includes a question stem; and an answer obtaining unit configured to output an answer text of the question to be solved through a pre-trained answer model based on the structurally classified text content.

In some embodiments, the structurally classified text content further includes a text to be solved corresponding to the question stem, the text to be solved representing the text information for which an answer text needs to be output according to the question stem; and the answer obtaining unit is further configured to obtain, according to the question stem, an answer text corresponding to the text to be solved through the pre-trained answer model.

In some embodiments, the answer obtaining unit is further configured to: obtain, according to the question stem, an answer text corresponding to the text to be solved through a preset database; and set, as the final answer text of the question to be solved, whichever of the answer text obtained through the pre-trained answer model and the answer text obtained through the preset database has the highest matching degree with the text to be solved.

In some embodiments, the answer model is trained in the same manner as described in the first aspect.

In some embodiments, the apparatus further includes: an answer fitting unit configured to display the answer text corresponding to the question to be solved in the answer area of the question to be solved in the image.

In some embodiments, the answer fitting unit is further configured to: acquire the answer area in the image; detect the display direction of the question to be solved; scale the answer text proportionally according to the font of the question to be solved and the answer area, wherein proportional scaling means that the length and the width of the characters of the answer text are scaled by the same ratio; and display the proportionally scaled answer text corresponding to the question to be solved in the answer area of the question to be solved in the image, in the display direction of the question to be solved.

In some embodiments, the apparatus further includes: an error correction unit configured to, in response to receiving an error correction request from the user, acquire an error correction text input by the user, write the error correction text into the answer text set, and write the text to be solved corresponding to the error correction text into the text set to be solved.

In a third aspect, the present application provides a computer-readable medium, on which a computer program is stored, where the program, when executed by a processor, implements the method as described in any implementation manner of the first aspect.

In a fourth aspect, an embodiment of the present application provides an electronic device, including: one or more processors; a storage device having one or more programs stored thereon, which when executed by one or more processors, cause the one or more processors to implement a method as described in any implementation of the first aspect.

According to the information output method and device provided by the embodiments of the present application, first, an image including a question to be solved is acquired in response to receiving an image acquisition request from a user; second, the text content in the image is extracted and structurally classified, wherein the structurally classified text content includes a question stem; and then, based on the structurally classified text content, an answer text of the question to be solved is output through a pre-trained answer model. Solving questions through the pre-trained answer model addresses the inaccurate question analysis caused by an incomplete question bank and untimely question bank updates in a database, and improves the accuracy of intelligent question solving.

Drawings

Other features, objects and advantages of the present application will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, made with reference to the accompanying drawings in which:

FIG. 1 is an exemplary system architecture diagram in which one embodiment of the present application may be applied;

FIG. 2 is a flow diagram of one embodiment of an information output method according to the present application;

FIG. 3 is a schematic diagram of an application scenario of the information output method according to the present embodiment;

FIG. 4 is a flow chart of yet another embodiment of an information output method according to the present application;

FIG. 5 is an interface flow diagram of yet another embodiment of an information output method according to the present application;

FIG. 6 is a block diagram of one embodiment of an information output device according to the present application;

FIG. 7 is a block diagram of a computer system suitable for use in implementing embodiments of the present application.

Detailed Description

The present application is described in further detail below with reference to the drawings and embodiments. It is to be understood that the specific embodiments described herein merely illustrate the relevant invention and do not limit it. It should also be noted that, for convenience of description, the drawings show only the portions relevant to the invention.

It should be noted that the embodiments in the present application, and the features of those embodiments, may be combined with each other where no conflict arises. The present application is described in detail below with reference to the attached drawings and in conjunction with the embodiments.

FIG. 1 shows an exemplary architecture 100 to which the information output method and apparatus of the present application can be applied.

As shown in FIG. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as the medium providing communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired or wireless communication links, or fiber optic cables.

The terminal devices 101, 102, 103 may be hardware devices or software that support network connections for data interaction and data processing. When the terminal devices 101, 102, 103 are hardware, they may be various electronic devices supporting functions such as photographing, information interaction, and network connection, including but not limited to smart phones, tablet computers, e-book readers, laptop computers, and desktop computers. When the terminal devices 101, 102, 103 are software, they can be installed in the electronic devices listed above, and may be implemented, for example, as multiple pieces of software or software modules that provide distributed services, or as a single piece of software or software module. No particular limitation is imposed here.

The server 105 may be a server that provides various services, such as a server that provides image recognition, data processing, and data matching functions to the terminal devices 101, 102, 103. The server can store or process the various data it receives and feed the processing result back to the terminal devices.

It should be noted that the information output method provided by the embodiments of the present disclosure may be executed by the terminal devices 101, 102, 103, or by the server 105, or partly by the terminal devices 101, 102, 103 and partly by the server 105. Accordingly, the information output apparatus may be provided in the terminal devices 101, 102, 103 or in the server 105. No particular limitation is imposed here.

The server may be hardware or software. When the server is hardware, it may be implemented as a distributed server cluster formed by multiple servers, or as a single server. When the server is software, it may be implemented, for example, as multiple pieces of software or software modules that provide distributed services, or as a single piece of software or software module. No particular limitation is imposed here.

It should be understood that the numbers of terminal devices and servers in FIG. 1 are merely illustrative. There may be any number of terminal devices and servers, as the implementation requires.

With continued reference to FIG. 2, a flow 200 of one embodiment of an information output method according to the present application is shown, comprising the steps of:

Step 201, in response to receiving an image acquisition request of a user, acquiring an image including a question to be solved.

In this embodiment, after receiving an image acquisition request from a user, an execution subject (for example, the terminal device in FIG. 1) may capture an image including a question to be solved through a camera with which it is equipped, or may acquire such an image that is stored locally on the execution subject.

In this embodiment, the user may initiate the image acquisition request through any human-computer interaction means, whether existing or developed in the future. Such means include, but are not limited to: shaking the terminal, tapping a virtual button (for example, one displayed on a display screen), pressing a physical button, gesture recognition, and voice recognition. Taking gesture recognition as an example, the camera of the execution subject captures gesture information of the user and compares it with a predefined operation gesture that corresponds to initiating the image acquisition request. If the user's gesture is recognized as that operation gesture, the image acquisition request is submitted. Correspondingly, the execution subject of this embodiment can receive and recognize image acquisition requests submitted in any of the above ways.

In this embodiment, the questions to be solved may be questions of any discipline and type encountered in education and learning, including but not limited to calculation questions in mathematics, fill-in-the-blank questions in Chinese, and cloze questions in English. The image of the question to be solved may be, for example, an image of an examination paper or of an exercise book.

In some optional implementations of this embodiment, the user's image acquisition request may instruct the execution subject to continuously shoot a preset number of images including the question to be solved within a preset time period. In response to the request, the execution subject shoots the preset number of images within the preset time period, and selects the image with the highest definition among them as the image on which text extraction is subsequently performed.
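The text does not say how definition is measured. As a minimal sketch, assuming the variance of a 4-neighbour Laplacian is used as the sharpness score (a common heuristic, not one named in the application), the selection could look like the following. The function names and the tiny test frames are hypothetical.

```python
from typing import List, Sequence

Gray = Sequence[Sequence[float]]  # grayscale image as rows of pixel intensities


def laplacian_variance(img: Gray) -> float:
    """Variance of a 4-neighbour Laplacian; higher values suggest a sharper image."""
    h, w = len(img), len(img[0])
    responses: List[float] = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            lap = (img[y - 1][x] + img[y + 1][x] + img[y][x - 1] + img[y][x + 1]
                   - 4 * img[y][x])
            responses.append(lap)
    if not responses:
        return 0.0
    mean = sum(responses) / len(responses)
    return sum((r - mean) ** 2 for r in responses) / len(responses)


def pick_sharpest(burst: Sequence[Gray]) -> int:
    """Return the index of the frame with the highest sharpness score."""
    return max(range(len(burst)), key=lambda i: laplacian_variance(burst[i]))


# Hypothetical 4x4 frames: a flat (featureless) one and one with a hard edge.
blurry = [[100, 100, 100, 100]] * 4
sharp = [[0, 0, 255, 255]] * 4
best = pick_sharpest([blurry, sharp])  # index of the frame passed on to text extraction
```

In practice the score would be computed on full-resolution frames, and any other focus measure could be substituted without changing the selection logic.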

Step 202, extracting text content in the image, and performing structured classification on the text content.

In this embodiment, the text content in the image may be extracted by OCR (Optical Character Recognition). In some optional implementations, the execution subject first performs image processing on the image including the question to be solved, and then performs text recognition on the processed image to extract the text content. The image processing proceeds as follows. First, grayscale processing is applied to the image including the question to be solved, converting the color image into a grayscale image. Next, a black-and-white image containing only black and white is generated from the grayscale image. Next, noise in the black-and-white image is removed: because the quality of the image to be recognized is limited by the input device, the environment, and the print quality of the document, the image must be denoised according to the characteristics of the noise before its characters are recognized, which improves recognition accuracy. Next, tilt correction is applied to the denoised image so that the text content is displayed horizontally. Finally, the corrected image is cut horizontally and vertically to isolate the individual characters of the text content.
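Three of the steps above (grayscale conversion, binarization, and vertical cutting) can be sketched in a few lines. This is an illustrative sketch only: the BT.601 luma weights, the fixed threshold of 128, and the function names are assumptions, and denoising and tilt correction are omitted for brevity.

```python
from typing import List, Tuple

RGB = Tuple[int, int, int]


def to_gray(img: List[List[RGB]]) -> List[List[int]]:
    """Convert RGB pixels to gray using the common ITU-R BT.601 luma weights."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in row] for row in img]


def binarize(gray: List[List[int]], threshold: int = 128) -> List[List[int]]:
    """Map dark pixels (ink) to 1 and light pixels (paper) to 0."""
    return [[1 if p < threshold else 0 for p in row] for row in gray]


def split_columns(bw: List[List[int]]) -> List[Tuple[int, int]]:
    """Vertical cut: return [start, end) column ranges that contain ink."""
    width = len(bw[0])
    ink = [any(row[x] for row in bw) for x in range(width)]
    spans: List[Tuple[int, int]] = []
    start = None
    for x, has_ink in enumerate(ink):
        if has_ink and start is None:
            start = x                      # a character begins
        elif not has_ink and start is not None:
            spans.append((start, x))       # a character ends at a blank column
            start = None
    if start is not None:
        spans.append((start, width))
    return spans


# Hypothetical two-row text line: black (K) ink on white (W) paper.
K, W = (0, 0, 0), (255, 255, 255)
line = [[K, W, W, K, K],
        [K, W, W, K, W]]
spans = split_columns(binarize(to_gray(line)))
```

Here the blank columns between ink runs act as the cut positions; a horizontal cut works the same way on rows.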

The text recognition proceeds as follows. First, template characters are generated in various fonts for all characters used in daily life and study. Next, each character to be recognized, obtained from the image processing above, is normalized, and its meta information is recorded. Finally, using the normalized character and its meta information, the character to be recognized is matched against the template characters to identify it. Matching methods include, but are not limited to: character pixel matching, projection block matching, nine-grid matching, center matching, and aspect ratio matching.
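Of the matching methods listed, character pixel matching is the simplest to illustrate. The sketch below assumes that every glyph has already been normalized to the same bitmap size and scores a template by the fraction of pixels that agree; the 3x3 templates are toy data, not real font templates.

```python
from typing import Dict, List

Glyph = List[List[int]]  # binary bitmap; all glyphs normalized to the same size


def pixel_similarity(a: Glyph, b: Glyph) -> float:
    """Fraction of positions where two same-sized bitmaps agree."""
    total = len(a) * len(a[0])
    same = sum(pa == pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return same / total


def recognize(glyph: Glyph, templates: Dict[str, Glyph]) -> str:
    """Return the template character whose bitmap is most similar to the glyph."""
    return max(templates, key=lambda ch: pixel_similarity(glyph, templates[ch]))


# Toy 3x3 templates for two characters.
templates = {
    "I": [[0, 1, 0], [0, 1, 0], [0, 1, 0]],
    "L": [[1, 0, 0], [1, 0, 0], [1, 1, 1]],
}
noisy_l = [[1, 0, 0], [1, 0, 0], [1, 1, 0]]  # an "L" with one pixel missing
result = recognize(noisy_l, templates)
```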

In this embodiment, after the execution subject recognizes the text information in the image, the text information needs to be structurally classified. Structured classification classifies the recognized text information based on keywords of the question to be solved, and the structurally classified text information includes the question stem of the question to be solved. For example, if the execution subject recognizes the keyword "antonym of happy" in the text information, it performs structured classification based on that keyword and obtains a question stem indicating that blanks are to be filled in with antonyms. In some optional implementations, the structurally classified text content further includes a text to be solved corresponding to the question stem, the text to be solved representing the text information for which an answer text needs to be output according to the question stem. For example, the structurally classified text content may include the stem information "fill in synonyms" and, under that stem, the text to be solved, namely the words for which synonyms need to be filled in.
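A keyword-driven structured classification of this kind might be sketched as follows. The keyword table, the `StructuredQuestion` type, and the rule that lines following a stem belong to that stem are all hypothetical simplifications introduced for illustration.

```python
import re
from dataclasses import dataclass, field
from typing import List

# Hypothetical keyword table mapping stem keywords to a question type.
STEM_KEYWORDS = {"antonym": "fill_antonym", "synonym": "fill_synonym"}


@dataclass
class StructuredQuestion:
    question_type: str
    stem: str
    items: List[str] = field(default_factory=list)  # texts to be solved under the stem


def classify(lines: List[str]) -> List[StructuredQuestion]:
    """Group recognized lines into question stems and their texts to be solved."""
    questions: List[StructuredQuestion] = []
    for line in lines:
        text = line.strip()
        if not text:
            continue
        qtype = next((t for k, t in STEM_KEYWORDS.items() if k in text.lower()), None)
        if qtype is not None:
            questions.append(StructuredQuestion(qtype, text))  # line is a stem
        elif questions:
            questions[-1].items.extend(re.split(r"[,\s]+", text))  # line lists items
    return questions


ocr_lines = ["1. Fill in the antonym of each word.", "happy, weak",
             "2. Write a synonym.", "quick"]
parsed = classify(ocr_lines)
```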

The execution subject for executing the present step may be a mobile terminal or a server. When the execution main body is the mobile terminal, after the mobile terminal acquires the image comprising the questions to be solved, extracting and structurally classifying the text content of the image; when the execution subject is the server, the server extracts and structurally classifies the text content of the image according to the image which is acquired by the mobile terminal and comprises the question to be answered.

Step 203, outputting an answer text of the question to be solved through a pre-trained answer model based on the text content after structured classification.

In this embodiment, the pre-trained answer model is obtained through machine learning and, based on semantic understanding of the input text content, outputs the answer text of the corresponding question to be solved. The answer model is trained separately by category on questions of different subjects and different question types, so that it can recognize questions of each type and produce the corresponding answer texts.

In some optional implementations, the answer model is trained as follows: a training sample set corresponding to the question type of the question to be solved is acquired, wherein the training sample set includes a text set to be solved and an answer text set, and the answer texts in the answer text set correspond one-to-one to the texts to be solved in the text set to be solved; an initial answer model is then trained, with a text to be solved in the text set to be solved as the input and the corresponding answer text in the answer text set as the target output, to obtain the answer model. Specifically, the execution subject may use a model such as a convolutional neural network, a deep learning model, a Naive Bayesian Model (NBM), or a Support Vector Machine (SVM), take the texts to be solved in the text set to be solved as the input of the model, take the corresponding answer texts in the answer text set as the output of the model, and train the model to obtain the answer model.
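To make the input/target pairing concrete, the sketch below trains a toy word-level Naive Bayes classifier on paired texts to be solved and answer texts. It treats each answer text as a class label, which is a strong simplification chosen for brevity; the sample data and function names are hypothetical, and a real answer model would be far more capable.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, List, Tuple


def train_naive_bayes(samples: List[Tuple[str, str]]):
    """Fit a multinomial Naive Bayes on (text to be solved, answer text) pairs."""
    label_counts: Counter = Counter()
    word_counts: Dict[str, Counter] = defaultdict(Counter)
    vocab = set()
    for text, answer in samples:
        label_counts[answer] += 1
        for word in text.lower().split():
            word_counts[answer][word] += 1
            vocab.add(word)
    return label_counts, word_counts, vocab


def predict(model, text: str) -> str:
    """Return the answer text with the highest log-probability for the input."""
    label_counts, word_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        score = math.log(count / total)  # class prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in text.lower().split():
            score += math.log((word_counts[label][word] + 1) / denom)  # Laplace smoothing
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# One-to-one correspondence between the two sets, as the text requires.
texts_to_solve = ["antonym of happy", "antonym of weak", "synonym of quick"]
answer_texts = ["sad", "strong", "fast"]
model = train_naive_bayes(list(zip(texts_to_solve, answer_texts)))
prediction = predict(model, "antonym of weak")
```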

The execution subject may obtain the answer text of the question to be solved through the pre-trained answer model according to the structurally classified text content. In other optional embodiments, the execution subject may obtain the answer text by combining a preset database with the pre-trained answer model. The two may be combined sequentially: the answer text is first looked up in the preset database, and only if the preset database cannot provide an answer text is the answer obtained through the pre-trained answer model. Alternatively, the answer text may be obtained through the preset database and the pre-trained answer model at the same time. In some optional implementations, this is done as follows: first, according to the question stem, an answer text corresponding to the text to be solved is obtained through the preset database, and, according to the question stem, an answer text corresponding to the text to be solved is obtained through the pre-trained answer model; then, whichever of the two answer texts has the highest matching degree with the text to be solved is set as the final answer text of the question to be solved.

The matching degree of the answer text obtained through the answer model is determined by the confidence that the answer model outputs for that answer text. The matching degree of the answer text obtained through the preset database is determined by the matching degree between the question in the question bank and the question to be solved, which comprises the matching degree of the question stems and the matching degree of the texts to be solved.
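The selection between the two sources can be sketched as a comparison of matching degrees. The `Candidate` type and the numeric values below are hypothetical, and the sketch assumes that the model confidence and the question-bank matching degree are both expressed on a comparable 0-to-1 scale.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Candidate:
    answer: str
    matching_degree: float  # assumed to lie in [0, 1] for both sources
    source: str


def final_answer(model_cand: Optional[Candidate],
                 db_cand: Optional[Candidate]) -> Optional[Candidate]:
    """Pick the candidate with the highest matching degree; tolerate a missing source."""
    candidates = [c for c in (model_cand, db_cand) if c is not None]
    if not candidates:
        return None
    return max(candidates, key=lambda c: c.matching_degree)


from_model = Candidate("sad", 0.92, "model")      # confidence output by the answer model
from_db = Candidate("unhappy", 0.70, "database")  # question-bank matching degree
chosen = final_answer(from_model, from_db)
```

Passing `None` for one source also covers the sequential combination, in which only one of the two sources yields an answer text.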

The matching degree of the question stem and the matching degree of the text to be solved may be determined from the number of matched keywords and the total number of keywords. For example, suppose the question stem of the question to be solved is "fill in the blanks with antonyms" and the text to be solved includes 9 words such as "happy" and "weak". The keyword recognized from the question stem is "fill in the blanks with antonyms", and the keywords recognized from the text to be solved are its 9 words, so the question to be solved has 10 keywords in total. If 7 keywords of a question found in the preset database are the same as keywords of the question to be solved, the matching degree of the answer text of that question is 7/10, that is, 70%. In some optional implementations, different weights may be set for the keywords of the question stem and the keywords of the text to be solved, so as to obtain a more accurate matching degree.

The preset database is a database containing questions of all disciplines and all types together with their corresponding answers. To avoid inaccurate answers caused by an incomplete database, the preset database needs to be updated in a timely manner; updates include, but are not limited to, correcting wrong answers and supplementing data. In some optional embodiments, an update time may be preset, and the preset database is updated in response to the update time being reached. Sources of updated database content include, but are not limited to, academic databases, browsers, and electronic books.
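The keyword-based matching degree from the 7/10 example can be reproduced directly. In the sketch below, the weighted variant is one possible reading of "different weights"; the weight value of 0.5 and the placeholder words `word3` to `word9` are assumptions, since the text names only two of the nine words.

```python
from typing import Iterable, Set


def matching_degree(query_keywords: Iterable[str], bank_keywords: Set[str]) -> float:
    """Matched keywords divided by the total keywords of the question to be solved."""
    query = list(query_keywords)
    if not query:
        return 0.0
    matched = sum(1 for keyword in query if keyword in bank_keywords)
    return matched / len(query)


def weighted_matching_degree(stem_keywords, item_keywords, bank_keywords, stem_weight=0.5):
    """Weighted variant; stem_weight is an illustrative value, not one given in the text."""
    stem_score = matching_degree(stem_keywords, bank_keywords)
    item_score = matching_degree(item_keywords, bank_keywords)
    return stem_weight * stem_score + (1.0 - stem_weight) * item_score


stem = ["fill in the blanks with antonyms"]
# Only "happy" and "weak" are named in the text; the other seven words are placeholders.
items = ["happy", "weak"] + [f"word{i}" for i in range(3, 10)]
bank = {"fill in the blanks with antonyms", "happy", "weak",
        "word3", "word4", "word5", "word6"}

plain = matching_degree(stem + items, bank)           # 7 of the 10 keywords match
weighted = weighted_matching_degree(stem, items, bank)
```

With equal weights, a fully matched stem and 6 of 9 matched words give a higher score than the unweighted 7/10, which shows how weighting can shift emphasis toward the stem.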

The execution subject for executing the present step may be a mobile terminal or a server. When the preset database and the pre-trained answer model are arranged in the mobile terminal, the execution main body of the step is the mobile terminal provided with the preset database and the pre-trained answer model; when the preset database and the pre-trained answer model are arranged in the server, the execution main body of the step is the server provided with the preset database and the pre-trained answer model.

In the embodiment, the execution main body obtains the answer text of the to-be-solved question through the pre-trained answer model, so that the problem that the question analysis is inaccurate due to incomplete database and untimely database updating when the answer text is obtained through the preset database can be solved, and the accuracy of intelligent question solving is improved.

FIG. 3 schematically shows an application scenario of the information output method according to the present embodiment. The user 301 taps a virtual button on the smartphone 302 to initiate an image acquisition request. The smartphone 302 receives the request and photographs the Chinese language test paper 304 that the user 301 is working on, obtaining an image including a question to be solved in the test paper. The smartphone 302 uploads the image to the server 303, which processes it, extracts the text content, and structurally classifies that content; then, based on the structurally classified text content, the server 303 outputs the answer text of the question to be solved through the pre-trained answer model on the server.

With continued reference to FIG. 4, an exemplary flow 400 of another embodiment of an information output method according to the present application is shown, comprising the steps of:

Step 401, in response to receiving an image acquisition request of a user, acquiring an image including a question to be solved.

In this embodiment, step 401 is performed in a manner similar to step 201, and is not described herein again.

Step 402, extracting text content in the image, and performing structured classification on the text content.

In this embodiment, step 402 is performed in a manner similar to step 202, and is not described herein again.

Step 403, outputting an answer text of the question to be solved through a pre-trained answer model based on the text content after structured classification.

In this embodiment, step 403 is performed in a manner similar to step 203, which is not described herein again.

Step 404, displaying an answer text corresponding to the question to be solved in an answer area of the question to be solved in the image.

In this embodiment, after obtaining the answer text corresponding to the question to be solved, the execution subject may display that answer text in the answer area of the question to be solved in the image. In some optional embodiments, the execution subject first acquires the answer area in the image; then detects the display direction of the question to be solved; then scales the answer text proportionally according to the font of the question to be solved and the answer area, wherein proportional scaling means that the length and the width of the characters of the answer text are scaled by the same ratio; and finally displays the proportionally scaled answer text corresponding to the question to be solved in the answer area of the question to be solved in the image, in the display direction of the question to be solved.
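The proportional scaling step amounts to choosing a single ratio for both dimensions. The sketch below assumes that the ratio is the largest one that keeps the answer text inside the answer area without making its characters taller than the font of the question; that rule, the function names, and the pixel values are illustrative assumptions.

```python
from typing import Tuple


def fit_scale(text_w: float, text_h: float,
              area_w: float, area_h: float, question_font_h: float) -> float:
    """Single ratio applied to both length and width of the answer text."""
    return min(area_w / text_w,           # must fit the area horizontally
               area_h / text_h,           # must fit the area vertically
               question_font_h / text_h)  # assumed cap: no taller than the question font


def scaled_size(text_w: float, text_h: float,
                area_w: float, area_h: float, question_font_h: float) -> Tuple[float, float]:
    s = fit_scale(text_w, text_h, area_w, area_h, question_font_h)
    return text_w * s, text_h * s  # same ratio on both axes preserves character shape


# Hypothetical sizes in pixels: rendered answer 200x40, answer area 100x30, question font 24.
size = scaled_size(200, 40, 100, 30, 24)
```

Because one ratio is used for both axes, the aspect ratio of the characters is unchanged after scaling.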

Step 405, in response to receiving an error correction request from the user, acquiring an error correction text input by the user, writing the error correction text into the answer text set, and writing the text to be solved corresponding to the error correction text into the text set to be solved.

In this embodiment, the execution subject may receive and recognize the user's error correction request and obtain the error correction text, input by the user, that corresponds to the text to be solved. The obtained error correction text is written into the answer text set, and the text to be solved corresponding to the error correction text is written into the text set to be solved. This enlarges the training sample set of the answer model, which is used to train the answer model and improve the accuracy of the output answers. In some optional implementations, the execution subject may verify the correctness of the error correction text input by the user against the preset database, write the verified error correction text into the answer text set, and write the corresponding text to be solved into the text set to be solved.
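The write-back of user corrections into the training sample set might look like the following sketch. The `TrainingSampleSet` class, the optional `verify` callback, and the dictionary standing in for the preset database are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class TrainingSampleSet:
    texts_to_solve: List[str] = field(default_factory=list)
    answer_texts: List[str] = field(default_factory=list)

    def add_correction(self, text_to_solve: str, correction: str,
                       verify: Optional[Callable[[str, str], bool]] = None) -> bool:
        """Append a corrected pair; skip it if an optional verifier rejects it."""
        if verify is not None and not verify(text_to_solve, correction):
            return False
        # Appending to both lists at once keeps the one-to-one correspondence.
        self.texts_to_solve.append(text_to_solve)
        self.answer_texts.append(correction)
        return True


reference = {"antonym of happy": "sad"}  # hypothetical stand-in for the preset database


def check(question: str, answer: str) -> bool:
    return reference.get(question) == answer


samples = TrainingSampleSet()
accepted = samples.add_correction("antonym of happy", "sad", check)
rejected = samples.add_correction("antonym of happy", "glad", check)
```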

FIG. 5 is a flow chart of the terminal interfaces corresponding to FIG. 4. The execution main body receives an image acquisition request of a user, acquires an image including a question to be solved, and generates an initial image interface 501; it then recognizes the text content in the image in the interface 501, obtains the answer text of the question to be solved, displays the answer text on the image, and generates an answer display interface 502; finally, in response to receiving an error correction request of the user, it acquires the error correction text input by the user and generates an error correction interface 503. As can be seen from FIG. 4 and FIG. 5, compared with the embodiment corresponding to FIG. 2, the flow 400 of the information output method in this embodiment specifically illustrates that, after the answer text is obtained, the equal-proportion-scaled answer text corresponding to the question to be solved is displayed in the answer area of the question to be solved in the image, in the display direction of the question to be solved, and that the error correction text input by the user is acquired. The user can therefore check the answers directly on the image and correct the answer text, which improves both the user experience and the learning efficiency.

With continuing reference to fig. 6, as an implementation of the methods illustrated in the above-described figures, the present disclosure provides an embodiment of an information output apparatus, which corresponds to the embodiment of the method illustrated in fig. 2, and which may be applied in various electronic devices.

As shown in FIG. 6, the information output apparatus includes: an image acquisition unit 601, a text classification unit 602, an answer obtaining unit 603, an answer fitting unit 604, and an error correction unit 605.

The image acquisition unit 601 is configured to acquire an image including a question to be solved in response to receiving an image acquisition request of a user; the text classification unit 602 is configured to extract text content in the image and perform structured classification on the text content, wherein the structured classified text content includes a question stem; the answer obtaining unit 603 is configured to output an answer text of the question to be solved through a pre-trained answer model based on the structured classified text content; the answer fitting unit 604 is configured to display the answer text corresponding to the question to be solved in an answer area of the question to be solved in the image; and the error correction unit 605 is configured to acquire an error correction text input by the user in response to receiving an error correction request of the user.
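The way the first three units hand data to one another can be sketched as a small pipeline. This is a structural illustration only: the class and field names are hypothetical, and the actual image capture, text recognition, and answer model are injected as plain callables rather than implemented.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class StructuredText:
    """Structured classified text content (hypothetical container)."""
    question_stem: str
    text_to_answer: str


@dataclass
class InformationOutputApparatus:
    # Each field stands in for one unit of FIG. 6; the callables are stubs
    acquire_image: Callable[[], bytes]                 # image acquisition unit 601
    classify_text: Callable[[bytes], StructuredText]   # text classification unit 602
    obtain_answer: Callable[[StructuredText], str]     # answer obtaining unit 603

    def run(self) -> str:
        """Run the units in order and return the answer text."""
        image = self.acquire_image()
        structured = self.classify_text(image)
        return self.obtain_answer(structured)
```

A usage example with stub units: constructing the apparatus with `acquire_image=lambda: b"img"`, a `classify_text` that returns `StructuredText("Compute.", "3 + 4 = ( )")`, and an `obtain_answer` that returns `"7"` makes `run()` return `"7"`.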

In this embodiment, the structured classified text content further includes a text to be answered corresponding to the question stem, where the text to be answered represents the text information for which an answer text needs to be output according to the question stem. The answer obtaining unit 603 is further configured to: obtain, according to the question stem, an answer text corresponding to the text to be answered through the pre-trained answer model; obtain, according to the question stem, an answer text corresponding to the text to be answered through a preset database; and, among the answer text obtained by the pre-trained answer model and the answer texts obtained from the preset database, set the text with the highest matching degree with the text to be answered as the final answer text of the question to be solved.
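The selection of the final answer text can be sketched as follows. The application does not define how the matching degree is computed, so the sketch takes it as a caller-supplied function; `select_final_answer` and `match_score` are hypothetical names.

```python
from typing import Callable, Iterable, Optional


def select_final_answer(
    text_to_answer: str,
    candidates: Iterable[Optional[str]],
    match_score: Callable[[str, str], float],
) -> Optional[str]:
    """Return the candidate with the highest matching degree.

    `candidates` holds the answer text from the answer model and the answer
    texts from the preset database; a source that produced nothing is passed
    as None and skipped. On a tie, the earlier candidate wins.
    """
    best: Optional[str] = None
    best_score = float("-inf")
    for candidate in candidates:
        if candidate is None:
            continue
        score = match_score(text_to_answer, candidate)
        if score > best_score:
            best, best_score = candidate, score
    return best
```

As a toy usage example, a scorer that prefers purely numeric answers for an arithmetic blank, `lambda q, a: 1.0 if a.strip().isdigit() else 0.0`, selects `"7"` from the candidates `["seven", "7"]`. A real scorer would depend on the question type.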

In this embodiment, the answer model is obtained by training as follows: acquiring a training sample set corresponding to the question type of the question to be solved, wherein the training sample set includes a text set to be answered and an answer text set, and the answer texts in the answer text set are in one-to-one correspondence with the texts to be answered in the text set to be answered; and taking each text to be answered in the text set to be answered as the input of an initial answer model, taking the corresponding answer text in the answer text set as the target output, and training the initial answer model to obtain the answer model.
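The data flow of this training procedure can be sketched as follows. The application does not specify the model architecture, so `LookupModel` below is a toy stand-in that merely memorizes the pairs; it is not the answer model of the application, and all names in the sketch are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def build_pairs(
    texts_to_answer: Sequence[str], answer_texts: Sequence[str]
) -> List[Tuple[str, str]]:
    """Pair each text to be answered with its answer text (one-to-one)."""
    if len(texts_to_answer) != len(answer_texts):
        raise ValueError("the two sets must have the same number of elements")
    return list(zip(texts_to_answer, answer_texts))


class LookupModel:
    """Toy stand-in for an initial answer model: memorizes training pairs."""

    def __init__(self) -> None:
        self._table: Dict[str, str] = {}

    def fit(self, inputs: Sequence[str], targets: Sequence[str]) -> None:
        self._table.update(zip(inputs, targets))

    def predict(self, text: str) -> str:
        # Unknown inputs yield an empty answer in this toy model
        return self._table.get(text, "")


def train(initial_model: LookupModel, pairs: List[Tuple[str, str]]) -> LookupModel:
    """Use texts to be answered as inputs and answer texts as target outputs."""
    inputs = [question for question, _ in pairs]
    targets = [answer for _, answer in pairs]
    initial_model.fit(inputs, targets)
    return initial_model
```

In practice the stand-in would be replaced by a learned model exposing comparable `fit` and `predict` operations, with one training sample set per question type.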

In this embodiment, the answer fitting unit 604 is further configured to: acquire the answer area of the image; detect the display direction of the question to be solved; scale the answer text in equal proportion according to the font of the question to be solved and the answer area, where equal-proportion scaling means that the length and the width of the characters representing the answer text are scaled by the same ratio; and display the equal-proportion-scaled answer text corresponding to the question to be solved in the answer area of the question to be solved in the image, in the display direction of the question to be solved.

Referring now to FIG. 7, shown is a block diagram of a computer system 700 suitable for implementing the devices of embodiments of the present application (e.g., devices 101, 102, 103, 105 shown in FIG. 1). The computer system shown in FIG. 7 is only an example and should not impose any limitation on the function or scope of use of the embodiments of the present application.

As shown in FIG. 7, the computer system 700 includes a processor (e.g., a central processing unit, CPU) 701, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 702 or a program loaded from a storage section 708 into a random access memory (RAM) 703. The RAM 703 also stores various programs and data necessary for the operation of the system 700. The processor 701, the ROM 702, and the RAM 703 are connected to each other by a bus 704. An input/output (I/O) interface 705 is also connected to the bus 704.

The following components are connected to the I/O interface 705: an input section 706 including a keyboard, a mouse, and the like; an output section 707 including a display such as a cathode ray tube (CRT) or a liquid crystal display (LCD), and a speaker; a storage section 708 including a hard disk and the like; and a communication section 709 including a network interface card such as a LAN card, a modem, or the like. The communication section 709 performs communication processing via a network such as the Internet. A drive 710 is also connected to the I/O interface 705 as needed. A removable medium 711, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 710 as needed, so that a computer program read from it can be installed into the storage section 708 as needed.

In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program can be downloaded and installed from a network through the communication section 709, and/or installed from the removable medium 711. The computer program, when executed by the processor 701, performs the above-described functions defined in the method of the present application.

It should be noted that the computer readable medium of the present application can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present application, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In this application, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.

Computer program code for carrying out operations for aspects of the present application may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the client computer, partly on the client computer, as a stand-alone software package, partly on the client computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the client computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).

The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

The units described in the embodiments of the present application may be implemented by software or hardware. The described units may also be provided in a processor, which may be described, for example, as: a processor including an image acquisition unit, a text classification unit, an answer obtaining unit, and an answer fitting unit. In some cases the names of these units do not limit the units themselves; for example, the image acquisition unit may also be described as a unit that acquires an image including a question to be solved in response to receiving an image acquisition request of a user.

As another aspect, the present application also provides a computer-readable medium, which may be contained in the apparatus described in the above embodiments, or may exist separately without being incorporated into the apparatus. The computer-readable medium carries one or more programs which, when executed by the apparatus, cause the apparatus to: in response to receiving an image acquisition request of a user, acquire an image including a question to be solved; extract text content in the image, and perform structured classification on the text content, wherein the structured classified text content includes a question stem; and output an answer text of the question to be solved through a pre-trained answer model based on the structured classified text content.

The above description is only a preferred embodiment of the application and is illustrative of the principles of the technology employed. It will be appreciated by those skilled in the art that the scope of the invention herein disclosed is not limited to the particular combination of features described above, but also encompasses other arrangements formed by any combination of the above features or their equivalents without departing from the spirit of the invention. For example, the above features may be replaced with (but not limited to) features having similar functions disclosed in the present application.

Claims

1. An information output method, wherein the method comprises:

in response to receiving an image acquisition request of a user, acquiring an image comprising a to-be-solved question;

extracting text contents in the image, and performing structured classification on the text contents, wherein the text contents after structured classification comprise question stems;

and outputting the answer text of the question to be solved through a pre-trained answer model based on the text content after structured classification.

2. The method according to claim 1, wherein the structured and classified text content further comprises a text to be solved corresponding to the question stem, and the text to be solved is used for representing text information of an answer text needing to be output according to the question stem;

the outputting of the answer text of the question to be solved through a pre-trained answer model based on the text content after structured classification comprises the following steps:

and acquiring an answer text corresponding to the text to be answered through the pre-trained answer model according to the question stem.

3. The method of claim 2, wherein the method further comprises:

according to the question stem, obtaining an answer text corresponding to the text to be answered through a preset database;

and, among the answer text obtained by the pre-trained answer model and the answer texts obtained from the preset database, setting the text with the highest matching degree with the text to be answered as the final answer text of the question to be answered.

4. The method of claim 2, wherein the answer model is trained by:

acquiring a training sample set corresponding to the question type of a question to be answered, wherein the training sample set comprises a text set to be answered and an answer text set, and answer texts in the answer text set correspond to texts to be answered in the text set to be answered one by one;

and taking the text to be answered in the text set to be answered as the input of an initial answer model, taking the answer text corresponding to the text to be answered in the text set to be answered in the answer text set as a target output, and training the initial answer model to obtain the answer model.

5. The method according to claim 1, wherein after outputting the answer text of the question to be solved through a pre-trained answer model based on the structured classified text content, the method further comprises:

and displaying the answer text corresponding to the to-be-solved question in an answer area of the to-be-solved question in the image.

6. The method according to claim 5, wherein displaying the answer text corresponding to the question to be solved in the answer area of the question to be solved in the image comprises:

acquiring the answer area of the image;

detecting the display direction of the question to be solved;

scaling the answer text in equal proportion according to the font of the question to be solved and the answer area, wherein equal-proportion scaling means that the length and the width of the characters representing the answer text are scaled by the same ratio;

and displaying the equal-proportion-scaled answer text corresponding to the question to be solved in the answer area of the question to be solved in the image, in the display direction of the question to be solved.

7. The method of claim 4, wherein the method further comprises:

in response to receiving an error correction request of a user, acquiring an error correction text input by the user, writing the error correction text into the answer text set, and writing the text to be answered corresponding to the error correction text into the text set to be answered.

8. An information output apparatus, wherein the apparatus comprises:

an image acquisition unit configured to acquire an image including a question to be solved in response to receiving an image acquisition request of a user;

a text classification unit configured to extract text content in the image and perform structured classification on the text content, wherein the text content after structured classification comprises a question stem;

and an answer obtaining unit configured to output the answer text of the question to be solved through a pre-trained answer model based on the text content after structured classification.

9. The device according to claim 8, wherein the structured and classified text content further comprises a text to be solved corresponding to the question stem, and the text to be solved is used for representing text information of an answer text needing to be output according to the question stem;

the answer obtaining unit is further configured to obtain an answer text corresponding to the text to be answered through the pre-trained answer model according to the question stem.

10. The apparatus of claim 9, wherein,

the answer obtaining unit is further configured to: obtain, according to the question stem, an answer text corresponding to the text to be answered through a preset database; and, among the answer text obtained by the pre-trained answer model and the answer texts obtained from the preset database, set the text with the highest matching degree with the text to be answered as the final answer text of the question to be answered.

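11. The apparatus of claim 9, wherein the answer model is trained by:

acquiring a training sample set corresponding to the question type of a question to be answered, wherein the training sample set comprises a text set to be answered and an answer text set, and answer texts in the answer text set correspond to texts to be answered in the text set to be answered one by one;

and taking the text to be answered in the text set to be answered as the input of an initial answer model, taking the answer text corresponding to the text to be answered in the text set to be answered in the answer text set as a target output, and training the initial answer model to obtain the answer model.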

12. The apparatus of claim 8, wherein the apparatus further comprises:

an answer fitting unit configured to display the answer text corresponding to the question to be solved in an answer area of the question to be solved in the image.

13. The apparatus of claim 12, wherein,

the answer fitting unit is further configured to: acquire the answer area of the image; detect the display direction of the question to be solved; scale the answer text in equal proportion according to the font of the question to be solved and the answer area, wherein equal-proportion scaling means that the length and the width of the characters representing the answer text are scaled by the same ratio; and display the equal-proportion-scaled answer text corresponding to the question to be solved in the answer area of the question to be solved in the image, in the display direction of the question to be solved.

14. The apparatus of claim 11, wherein the apparatus further comprises:

an error correction unit configured to, in response to receiving an error correction request of a user, acquire an error correction text input by the user, write the error correction text into the answer text set, and write the text to be answered corresponding to the error correction text into the text set to be answered.

15. A computer-readable medium, on which a computer program is stored, wherein the program, when executed by a processor, implements the method of any one of claims 1-7.

16. An electronic device, comprising:

one or more processors;

a storage device having one or more programs stored thereon,

wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-7.