CN111753120A - Method and device for searching questions, electronic equipment and storage medium

Info

Publication number: CN111753120A
Application number: CN202010603635.6A
Authority: CN
Inventors: 何华强
Original assignee: Guangdong Genius Technology Co Ltd
Current assignee: Guangdong Genius Technology Co Ltd
Priority date: 2020-06-29
Filing date: 2020-06-29
Publication date: 2020-10-09
Anticipated expiration: 2040-06-29
Also published as: CN111753120B

Abstract

The embodiment of the invention discloses a method and a device for searching questions, electronic equipment and a storage medium. The method comprises the following steps: receiving a photographing instruction when the electronic equipment is in a finger reading scene, and photographing the carrier by using an image acquisition device to obtain an initial image; identifying the initial image to obtain an internal contour corresponding to each question number in the initial image; setting a label corresponding to each internal contour according to the numerical value and the grade of the question number; receiving and identifying a target question number in a first voice command sent by a user; determining a target label and a target internal contour according to the target question number; determining a target image according to the target internal contour; and performing OCR recognition on the target image, and searching the database for a matched test question by using the recognition result. By implementing the embodiment of the invention, the loss of recognized content caused by fingers blocking the test question content in a finger reading scene can be completely avoided, the completeness of the recognized content of the cropped test question image and the hit rate of the test questions pushed to the user can be improved, and the interactive experience of the user in learning can be improved.

Description

Method and device for searching questions, electronic equipment and storage medium

Technical Field

The invention relates to the technical field of intelligent terminals, in particular to a method and a device for searching questions, electronic equipment and a storage medium.

Background

Many current electronic teaching-aid devices provide a finger reading scene. In an existing finger reading scene, when the user points with a finger at a carrier such as a book, an exercise book or an examination paper, the teaching-aid device photographs the carrier through an image acquisition device and recognizes the position of the finger, determines the user's intention according to the finger position, and then obtains the image corresponding to that intention for use in original-question search and the like. At present, the captured images of the carrier are all taken while the user's finger is pointing, so the finger (or the palm, a held pen and the like) inevitably blocks part of the effective test question data. Part of the data is therefore lost, which degrades both the OCR recognition and the matching of test questions searched from the recognized content, so that the hit rate of the test questions finally pushed to the user is low.

Disclosure of Invention

In view of the above defects, the embodiment of the invention discloses a method and a device for searching questions, electronic equipment and a storage medium, which can prevent fingers from blocking the image of the carrier and improve the hit rate of the test questions pushed to the user.

The first aspect of the embodiments of the present invention discloses a method for searching questions, where the method includes:

receiving a photographing instruction sent by a user when the electronic equipment is in a finger reading scene, and photographing the carrier by using an image acquisition device to obtain an initial image;

identifying the initial image to obtain an internal contour corresponding to each question number in the initial image;

setting a label corresponding to each internal contour according to the numerical value and the grade of the question number;

receiving and identifying a target question number in a first voice command sent by a user;

determining a target label and a target internal contour according to the target question number, wherein the target label is a label matched with the target question number, and the target internal contour is an internal contour associated with the target label;

determining a text contour according to the target internal contour, and segmenting the part of the initial image within the text contour to obtain a target image;

and performing OCR recognition on the target image, and searching a database for matched test questions by using the recognition result.

As an optional implementation manner, in the first aspect of the embodiment of the present invention, identifying the initial image to obtain an internal contour corresponding to each question number in the initial image includes:

inputting the initial image in parallel into a question recognition network model, a text line detection network model and a question number detection network model, each based on deep learning, to determine a question contour, text line contours and question number frames;

creating a blank mask map, wherein the blank mask map has the same size as the initial image;

adding the question contour to the mask map;

determining the upper boundary of each question number line according to the question number frame and the text line contours, and adding the upper boundary to the mask map;

and extending the left end point and the right end point of the upper boundary so as to connect the upper boundary with the question contour, wherein the question contour is divided into a plurality of question areas by the upper boundaries, and each question area forms the internal contour corresponding to one question number.

As an optional implementation manner, in the first aspect of the embodiment of the present invention, setting a label corresponding to each internal contour according to the numerical value and the grade of the question number includes:

obtaining the grade of each question number through a question number classification model, wherein the grade comprises a primary question and a secondary question;

and setting a label for the question number according to the numerical value of the question number and the grade of the question number, wherein the label reflects the numerical value of the question number and the grade of the question number.

As an optional implementation manner, in the first aspect of the embodiment of the present invention, determining a target label and a target internal contour according to the target question number, where the target label is a label matched with the target question number, and the target internal contour is an internal contour associated with the target label, includes:

traversing all the labels according to the target question number, and determining the labels matched with the target question number as target labels;

when the question number corresponding to the target label is a secondary question and there is only one target label, taking the internal contour corresponding to the target label as the target internal contour;

when the question number corresponding to the target label is a primary question, and/or there are multiple target labels, or no target label exists, sending an interaction instruction to the user;

and receiving a second voice instruction sent by the user according to the interaction instruction, and determining a new target question number according to the second voice instruction until the question number corresponding to the determined target label is a secondary question and the number of the target label is only one.
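The decision rule above can be illustrated with a minimal sketch. The `Label` structure, its field names and the exact-string matching rule are assumptions made for illustration only; the source does not specify how a label is compared with the target question number.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Label:
    """Hypothetical label record (not defined in the source)."""
    number: str      # recognized value of the question number, e.g. "2"
    level: int       # 1 = primary question, 2 = secondary question
    contour_id: int  # index of the associated internal contour


def resolve_target(labels: List[Label], target_number: str) -> Optional[Label]:
    """Return the unique secondary-question label matching the target.

    Returns None when the match is a primary question, when several labels
    match, or when nothing matches. In those cases the caller would send an
    interaction instruction and wait for a second voice instruction.
    """
    # Assumed matching rule: exact equality of the recognized number.
    matches = [label for label in labels if label.number == target_number]
    if len(matches) == 1 and matches[0].level == 2:
        return matches[0]
    return None
```

A caller would loop on `resolve_target`, prompting the user again each time it returns `None`, until a single secondary question is selected.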

As an optional implementation manner, in the first aspect of the embodiment of the present invention, the receiving and recognizing a target question number in a first voice instruction issued by a user includes:

receiving a first voice instruction sent by a user, and extracting one or more numeric keywords in the first voice instruction, or one or more numeric keywords together with the words associated with the numeric keywords;

and taking the information corresponding to the numeric keywords, or to the numeric keywords and their associated words, as the target question number.
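As a rough illustration of the keyword extraction, the sketch below assumes the voice instruction has already been converted to text by a speech recognizer, and treats the word immediately preceding a number as its associated word. Both the function name and this pairing rule are hypothetical; the source does not define how associated words are found.

```python
import re
from typing import List, Tuple

# An optional alphabetic word followed by whitespace, then a run of digits.
_NUMBER_WITH_WORD = re.compile(r"(?:([A-Za-z]+)\s+)?(\d+)")


def extract_number_keywords(transcript: str) -> List[Tuple[str, str]]:
    """Return (associated_word, number) pairs found in the transcript.

    The associated word is an empty string when the number stands alone.
    """
    return [
        (match.group(1) or "", match.group(2))
        for match in _NUMBER_WITH_WORD.finditer(transcript)
    ]
```

For a transcript such as "please search question 3 part 2", this yields the pairs ("question", "3") and ("part", "2"), which a later step could combine into a target question number.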

As an optional implementation manner, in the first aspect of the embodiment of the present invention, determining a text contour according to the target internal contour, and segmenting an initial image in the text contour to obtain a target image, includes:

and taking the target internal contour as a text contour, and segmenting the initial image to obtain a target image, wherein the target image is an initial image part in the text contour.

As an alternative implementation manner, in the first aspect of the embodiment of the present invention, performing OCR recognition on the target image, and searching a database for a matching test question using a result of the recognition includes:

performing OCR recognition on the target image to obtain a recognition result;

searching in a database to obtain target test questions, wherein the similarity between the target test questions and the recognition result is greater than or equal to a preset threshold value;

and if the similarity between every test question in the database and the recognition result is smaller than the preset threshold, selecting a preset number of test questions with the highest similarity to the recognition result as the target test questions.
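The threshold-with-fallback search can be sketched as follows. The similarity measure is not specified in the source; `difflib.SequenceMatcher` from the Python standard library is swapped in here purely as a stand-in, and the function name, the default threshold of 0.8 and the default `top_k` of 3 are illustrative assumptions.

```python
from difflib import SequenceMatcher
from typing import List


def search_questions(ocr_text: str, database: List[str],
                     threshold: float = 0.8, top_k: int = 3) -> List[str]:
    """Return questions at or above the threshold, else the top_k closest."""
    # Score every stored question against the OCR result (stand-in metric).
    scored = sorted(
        ((SequenceMatcher(None, ocr_text, question).ratio(), question)
         for question in database),
        key=lambda pair: pair[0],
        reverse=True,
    )
    hits = [question for score, question in scored if score >= threshold]
    if hits:
        return hits
    # Fallback: nothing reached the threshold, so push the closest ones.
    return [question for _, question in scored[:top_k]]
```

A production system would more likely use an indexed text-retrieval backend than a linear scan; the sketch only shows the two-branch selection logic.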

A second aspect of the embodiments of the present invention discloses a device for searching questions, including:

the photographing unit is used for receiving a photographing instruction sent by a user when the electronic equipment is in a finger reading scene, and photographing the carrier by using the image acquisition device to obtain an initial image;

the identification unit is used for identifying the initial image to obtain an internal contour corresponding to each question number in the initial image;

the setting unit is used for setting a label corresponding to each internal contour according to the numerical value and the grade of the question number;

the receiving unit is used for receiving and identifying a target question number in a first voice command sent by a user;

the determining unit is used for determining a target label and a target internal contour according to the target question number, wherein the target label is a label matched with the target question number, and the target internal contour is an internal contour associated with the target label;

the segmentation unit is used for determining a text contour according to the target internal contour and segmenting the part of the initial image within the text contour to obtain a target image;

and the searching unit is used for performing OCR recognition on the target image and searching the matched test questions in a database by using the recognition result.

As an optional implementation manner, in a second aspect of the embodiment of the present invention, the identification unit includes:

the input subunit is used for inputting the initial image in parallel into a question recognition network model, a text line detection network model and a question number detection network model, each based on deep learning, to determine a question contour, text line contours and question number frames;

a creating subunit, configured to create a blank mask map, where the blank mask map has the same size as the initial image;

an adding subunit, configured to add the question contour to the mask map;

the boundary determining subunit is used for determining the upper boundary of each question number line according to the question number frame and the text line contours, and adding the upper boundary to the mask map;

and the extension subunit is used for extending the left end point and the right end point of the upper boundary so as to connect the upper boundary with the question contour, wherein the question contour is divided into a plurality of question areas by the upper boundaries, and each question area forms the internal contour corresponding to one question number.

As an optional implementation manner, in a second aspect of the embodiment of the present invention, the setting unit includes:

the classification subunit is used for acquiring the grade of each question number through the question number classification model, wherein the grade comprises a primary question and a secondary question;

and the label setting subunit is used for setting a label for the question number according to the numerical value of the question number and the grade of the question number, and the label embodies the value corresponding to the question number and the grade of the question number.

As an optional implementation manner, in a second aspect of the embodiment of the present invention, the determining unit includes:

the traversal subunit is used for traversing all the labels according to the target question number, and determining the labels matched with the target question number as target labels;

the judging subunit is configured to, when the question number corresponding to the target label is a secondary question and there is only one target label, take the internal contour corresponding to the target label as the target internal contour;

the feedback subunit is used for sending an interaction instruction to the user when the question number corresponding to the target label is a primary question, and/or there are multiple target labels, or no target label exists; and receiving a second voice instruction sent by the user according to the interaction instruction, and determining a new target question number according to the second voice instruction until the question number corresponding to the determined target label is a secondary question and there is only one target label.

As an optional implementation manner, in a second aspect of the embodiment of the present invention, the receiving unit includes:

the extraction subunit is used for receiving a first voice instruction sent by a user and extracting one or more numeric keywords in the first voice instruction, or one or more numeric keywords together with the words associated with the numeric keywords;

and the target question number determining subunit is used for taking the information corresponding to the numeric keywords, or to the numeric keywords and their associated words, as the target question number.

As an optional implementation manner, in a second aspect of the embodiment of the present invention, the search unit includes:

the OCR recognition subunit is used for performing OCR recognition on the target image to obtain a recognition result;

the calculation subunit is used for searching in the database to obtain target test questions, and the similarity between the target test questions and the recognition results is greater than or equal to a preset threshold value;

and the pushing subunit is used for selecting the preset number of test questions with the highest similarity to the identification result as the target test questions when the similarities of the test questions in the database and the identification result are smaller than a preset threshold value.

A third aspect of the embodiments of the present invention discloses an electronic device, including: a memory storing executable program code; a processor coupled with the memory; the processor calls the executable program code stored in the memory to execute part or all of the steps of the method for searching questions disclosed in the first aspect of the embodiments of the present invention.

A fourth aspect of the embodiments of the present invention discloses a computer-readable storage medium storing a computer program, where the computer program enables a computer to execute part or all of the steps of the method for searching questions disclosed in the first aspect of the embodiments of the present invention.

A fifth aspect of the embodiments of the present invention discloses a computer program product, which, when running on a computer, causes the computer to execute part or all of the steps of the method for searching questions disclosed in the first aspect of the embodiments of the present invention.

A sixth aspect of the embodiments of the present invention discloses an application publishing platform, where the application publishing platform is configured to publish a computer program product, and when the computer program product runs on a computer, the computer is enabled to execute part or all of the steps of the method for searching questions disclosed in the first aspect of the embodiments of the present invention.

Compared with the prior art, the embodiment of the invention has the following beneficial effects:

in the embodiment of the invention, when the electronic equipment is in a finger reading scene, a photographing instruction sent by a user is received, and an image acquisition device is used for photographing the carrier to obtain an initial image; the initial image is identified to obtain an internal contour corresponding to each question number in the initial image; a label corresponding to each internal contour is set according to the numerical value and the grade of the question number; a target question number in a first voice command sent by the user is received and identified; a target label and a target internal contour are determined according to the target question number, wherein the target label is a label matched with the target question number, and the target internal contour is an internal contour associated with the target label; a text contour is determined according to the target internal contour, and the part of the initial image within the text contour is segmented to obtain a target image; and OCR recognition is performed on the target image, and the database is searched for matched test questions by using the recognition result. Therefore, by implementing the embodiment of the invention, the loss of recognized content caused by fingers (or the palm, a held pen and the like) blocking the test question content in a finger reading scene can be completely avoided, so that the completeness of the recognized content of the cropped test question image is improved, the hit rate of the test questions pushed to the user is improved, the intention of the user is met to the maximum extent, and the interactive experience of the user in learning is improved.

Drawings

In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed to be used in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art that other drawings can be obtained according to these drawings without creative efforts.

FIG. 1 is a schematic flow chart illustrating a method for searching questions according to an embodiment of the present invention;

FIG. 2 is a schematic diagram of a preview state of an image capturing device according to an embodiment of the disclosure;

FIG. 3 is a schematic diagram of an internal contour obtaining method according to an embodiment of the disclosure;

FIG. 4 is a schematic diagram of an initial image according to an embodiment of the present invention;

FIG. 5 is a schematic view of a question contour disclosed in an embodiment of the present invention;

FIG. 6 is a schematic view of internal contours disclosed in an embodiment of the present invention;

FIG. 7 is a schematic diagram of a carrier according to an embodiment of the present invention;

FIG. 8 is a schematic diagram of a page structure of another carrier according to an embodiment of the present invention;

FIG. 9 is a schematic diagram of a page structure of another carrier according to an embodiment of the present invention;

FIG. 10 is a schematic structural diagram of an apparatus for searching for a question according to an embodiment of the present invention;

FIG. 11 is a schematic structural diagram of an electronic device according to an embodiment of the disclosure.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

It should be noted that the terms "first", "second", "third", "fourth", and the like in the description and the claims of the present invention are used for distinguishing different objects, and are not used for describing a specific order. The terms "comprises," "comprising," and any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, apparatus, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.

The embodiment of the invention discloses a method, a device, electronic equipment and a storage medium for searching questions, which can completely avoid the loss of recognized content caused by fingers (or the palm, a held pen and the like) blocking the test question content in a finger reading scene, thereby improving the completeness of the recognized content of the cropped test question image, improving the hit rate of the test questions pushed to the user, meeting the intention of the user to the maximum extent and improving the interactive experience of the user in learning. The embodiments are described in detail below with reference to the drawings.

Example one

Referring to FIG. 1, FIG. 1 is a schematic flow chart illustrating a method for searching questions according to an embodiment of the present invention. As shown in FIG. 1, the method for searching questions includes the following steps:

110. Receiving a photographing instruction sent by a user when the electronic equipment is in a finger reading scene, and photographing the carrier by using the image acquisition device to obtain an initial image.

The electronic equipment can be intelligent equipment such as a family education machine, a learning machine, a mobile phone with a learning function or a tablet computer. The finger reading scene may be entered automatically when a corresponding finger reading APP, for example a question search APP or a question collection APP, is started; it may also be entered automatically when the image acquisition device completes its communication connection with the electronic equipment, or when the image acquisition device has completed the communication connection with the electronic equipment and the corresponding finger reading APP is started. The carrier is a paper learning document such as a book, an exercise book, a homework book and the like; an image of the target question is obtained by photographing the carrier and recognizing the user's intention, and the corresponding question is then matched in the database through OCR recognition.

After entering the finger reading scene, as shown in FIG. 2, the image acquisition device may be in a preview mode, and the touch screen of the electronic equipment displays the book placement prompt line 11 in real time (the book is shown with trapezoidal correction). Meanwhile, the electronic equipment may also issue a voice prompt, such as "please place the book properly and do not block the page content of the book".

The photographing instruction sent by the user can be a voice instruction such as "Xiaobu, please take a photo" or "Xiaobu, I want to search a question", or can be a photographing instruction triggered by a mechanical key or a touch key. The image acquisition device is aimed at the carrier and photographs the carrier to obtain an initial image.

120. Identifying the initial image to obtain an internal contour corresponding to each question number in the initial image.

Illustratively, the internal contour corresponding to each question number can be obtained by means of text line detection. Specifically, referring to FIG. 3, the method includes the following steps:

121. Inputting the initial image into a question recognition network model, a text line detection network model and a question number detection network model, each based on deep learning, to determine a question contour, text line contours and question number frames.

The question recognition network model detects the question contour in the initial image, treating the initial image as a whole. The model is trained on samples in which the question contour has been manually annotated, so that the question contour is obtained after the initial image is input into the trained question recognition network model. The question recognition network model can be a deep convolutional neural network, a fully convolutional neural network, and the like.

The text line detection network model mainly detects each text line in the initial image to obtain the text line contours. In the embodiment of the present invention, a text line detection network model based on deep learning is adopted, and the text line detection network model may adopt any deep learning network such as YOLO, CTPN, PSENet, and the like. Illustratively, a PSENet text line detection network model is adopted, so that the detection result is robust to variations in illumination, color, texture, blur and the like.

The question number frames of the initial image can be identified in various ways; illustratively, the question number information is identified through a created and trained YOLO question number detection network model. YOLO (You Only Look Once: Unified, Real-Time Object Detection) is a target detection algorithm based on a single neural network, proposed by Joseph Redmon, Ali Farhadi et al. in 2015, which includes convolutional layers, a target detection layer and an NMS screening layer. The samples for training the YOLO question number detection network model may be text pictures containing question numbers, and the sample labels are the question number frames marked in the text pictures. The initial image is input into the trained YOLO question number detection network model to obtain each question number frame of the initial image, which is called an initial question number frame. Of course, the question number frames can also be identified by other deep learning target detection methods, such as R-CNN, SSD, RetinaNet, AttentionNet, FCOS, and the like.

In order to prevent numbers inside a text line from being recognized as question number frames, in the embodiment of the present invention, the question number frames may be filtered using the text line contours: when the ratio of the intersection area of a question number frame and a certain text line contour to the whole area of the question number frame is greater than a preset threshold, for example 80%, the question number frame is deleted. The areas can be calculated by counting pixel points.
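The filtering rule can be sketched as below. For simplicity the sketch assumes that both the question number frames and the text line contours are axis-aligned rectangles given as `(x1, y1, x2, y2)` in pixel coordinates, so that area in pixels is simply width times height; real text line contours may be arbitrary polygons. The function names and the default ratio of 0.8 (taken from the 80% example) are illustrative.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2), assumed axis-aligned


def intersection_area(a: Box, b: Box) -> int:
    """Pixel area shared by two rectangles (0 when they do not overlap)."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return max(width, 0) * max(height, 0)


def filter_number_boxes(number_boxes: List[Box], text_lines: List[Box],
                        max_ratio: float = 0.8) -> List[Box]:
    """Drop frames whose overlap with any single text line exceeds max_ratio."""
    kept = []
    for box in number_boxes:
        area = (box[2] - box[0]) * (box[3] - box[1])
        if area <= 0:
            continue  # degenerate frame, nothing to keep
        if any(intersection_area(box, line) / area > max_ratio
               for line in text_lines):
            continue  # mostly inside a text line: treated as an ordinary digit
        kept.append(box)
    return kept
```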

122. Creating a blank mask map, which has the same size as the initial image.

The size of the created blank mask map is the same as that of the initial image, so that each obtained contour can be added into the blank mask map. The initial value of each pixel point in the blank mask map is 0, namely the blank mask map is a completely black image.

123. Adding the question contour to the mask map.

The question contour information obtained by the question recognition network model is the set of pixel points forming the question contour. By setting the positions of these pixel points on the mask map to 1, the question contour is added into the mask map. FIG. 4 is a schematic diagram showing an initial image, and FIG. 5 is a schematic diagram showing the question contour 21, recognized from the image of FIG. 4, added to the mask map.
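Steps 122 and 123 can be illustrated with a small sketch that represents the mask map as a list of rows of integers. This representation and the function names are assumptions for illustration; an actual implementation would more likely use an image array from a numerical library.

```python
from typing import Iterable, List, Tuple

Mask = List[List[int]]


def create_blank_mask(width: int, height: int) -> Mask:
    """All-zero (completely black) mask with the same size as the image."""
    return [[0] * width for _ in range(height)]


def add_contour(mask: Mask, contour: Iterable[Tuple[int, int]]) -> None:
    """Set every (x, y) contour pixel to 1, ignoring out-of-range points."""
    height = len(mask)
    width = len(mask[0]) if mask else 0
    for x, y in contour:
        if 0 <= x < width and 0 <= y < height:
            mask[y][x] = 1
```

The same `add_contour` call can be reused for any later pixel set that has to be drawn into the mask map.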

124. Determining the upper boundary of each question number line according to the question number frame and the text line contours, and adding the upper boundary to the mask map.

A target text line contour is determined according to the question number frame and the text line contours, wherein the target text line contour is the text line contour that intersects the question number frame; if the question number frame intersects several text line contours, the text line contour with the largest intersection is selected as the target text line contour. The upper boundary of the target text line contour is selected as the upper boundary of the question number line, and the upper boundary of the question number line is added into the mask map in the same manner as the question contour.
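The selection of the target text line and its upper boundary can be sketched as follows, again under the simplifying assumption that frames and text line contours are axis-aligned rectangles `(x1, y1, x2, y2)`. The function names and the `(x_left, y, x_right)` return format are hypothetical.

```python
from typing import List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2), assumed axis-aligned


def intersection_area(a: Box, b: Box) -> int:
    """Pixel area shared by two rectangles (0 when they do not overlap)."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return max(width, 0) * max(height, 0)


def upper_boundary(number_box: Box,
                   text_lines: List[Box]) -> Optional[Tuple[int, int, int]]:
    """Return (x_left, y, x_right) of the top edge of the target text line.

    The target text line is the one with the largest intersection with the
    question number frame; None is returned when no text line intersects it.
    """
    best_line, best_area = None, 0
    for line in text_lines:
        area = intersection_area(number_box, line)
        if area > best_area:
            best_line, best_area = line, area
    if best_line is None:
        return None
    x1, y1, x2, _ = best_line
    return (x1, y1, x2)
```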

125. Extending the left end point and the right end point of the upper boundary so as to connect the upper boundary with the question contour, wherein the question contour is divided into a plurality of question areas by the upper boundaries, and each question area forms the internal contour corresponding to one question number.

Because the question contour and the upper boundary are obtained by different recognition methods, the two ends of the upper boundary may not meet the question contour. In this case, the left end point and the right end point of the upper boundary are extended, and the pixels along the extension line are all set to 1. The extension may be horizontal: for example, taking the vertical coordinates of the left and right end points as reference, the points on the question contour that have the same vertical coordinate are found on the corresponding sides, and all pixels with that vertical coordinate between each end point and the corresponding contour point are set to 1. Of course, the left and right end points of the upper boundary may also lie outside the question contour; in this case, the pixels of the upper boundary beyond the question contour can be set to 0.

Therefore, the upper boundaries divide the question contour into a plurality of question areas, each question area corresponds to one question number, and the question areas become the internal contours. The internal contours are then associated with the question numbers: the internal contour to which a question number belongs is determined according to the intersection between the question number frame and the internal contours, and when the question number frame intersects several internal contours, the internal contour with the largest intersection is selected as the internal contour corresponding to that question number frame. Finally, a mask map in which the question numbers correspond one-to-one with the internal contours is obtained.

FIG. 6 is a schematic view of the internal contours obtained from the initial image of FIG. 4. For the initial image of FIG. 4, question number 1 corresponds to the internal contour 22, question number (1) corresponds to the internal contour 23, question number (2) corresponds to the internal contour 24, and question number (3) corresponds to the internal contour 25.
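The horizontal extension can be sketched on a mask map stored as a list of rows. The sketch assumes that the mask map contains only the question contour when the function is called, and that both end points lie inside the contour; it draws the boundary segment and its extensions in one pass. The case where an end point lies beyond the contour, which the text handles by clearing pixels to 0, is not covered. The function name is hypothetical.

```python
from typing import List

Mask = List[List[int]]


def extend_boundary(mask: Mask, y: int, x_left: int, x_right: int) -> bool:
    """Draw row y from the contour pixel left of x_left to the one right of x_right.

    Returns False, leaving the mask unchanged, when no contour pixel is
    found on one of the two sides of the boundary on this row.
    """
    row = mask[y]
    width = len(row)
    left = x_left
    while left >= 0 and row[left] != 1:   # walk left until the contour is hit
        left -= 1
    right = x_right
    while right < width and row[right] != 1:  # walk right until the contour is hit
        right += 1
    if left < 0 or right >= width:
        return False
    for x in range(left, right + 1):
        row[x] = 1
    return True
```

After all upper boundaries have been drawn this way, the closed regions of the mask map are the question areas that form the internal contours.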

130. Setting a label corresponding to each internal contour according to the numerical value and the grade of the question number.

The purpose of setting the label is, on the one hand, to obtain directly from the label the level of the question number corresponding to the internal contour and, on the other hand, to obtain the associated question numbers from the label, for example the parent question number or the child question numbers of each question number.

In teaching and auxiliary materials such as test papers and exercise books, question numbers are generally divided into two levels. Question numbers written as ordinals ("first, second, third") or in the form 1. 2. 3. denote first-level questions, i.e. large questions, and question numbers in the form 1. 2. 3. or (1) (2) (3) denote second-level questions, i.e. small questions. When second-level questions of the form (1) (2) (3) appear, the first-level questions are generally of the form 1. 2. 3.; when the first-level questions are written as ordinals, the second-level questions are generally of the form 1. 2. 3.

The level of a question number can be determined by a question number classification model, which may be a clustering algorithm or a model obtained by training a deep learning neural network. The classification result may contain only second-level questions. The value of a question number can be obtained by OCR, i.e. by recognizing the characters inside the question number frame obtained in step 120.

Different labels are assigned different label values, and each label value encodes the question number, its level, and the hierarchical relationship between levels. For example, a label value may consist of four decimal digits, where the first two digits correspond to the first-level question and the last two digits to the second-level question. Each level may of course be represented by one digit or by three or more digits, and the order of the levels within the label value may be changed arbitrarily.

Illustratively, the internal contour of an identified first-level question is given the label AB00, where AB is the value of the first-level question and 00 indicates that it is a first-level question. The internal contour of an identified second-level question is given the label CDEF, where CD is the value of the first-level question to which the second-level question belongs and EF is the value of the second-level question. The value, level and membership of a question can therefore all be determined from the label value.
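The four-digit scheme above can be sketched as follows. This is only an illustration of the AB00 / CDEF rule; the function names `make_label` and `parse_label` are hypothetical.

```python
def make_label(primary, secondary=0):
    """Build a four-digit label: two digits per question level.

    secondary=0 marks a first-level question (label AB00).
    """
    if not (0 <= primary <= 99 and 0 <= secondary <= 99):
        raise ValueError("each level must fit in two decimal digits")
    return f"{primary:02d}{secondary:02d}"


def parse_label(label):
    """Split a label into (primary value, secondary value, level)."""
    primary, secondary = int(label[:2]), int(label[2:])
    level = 1 if secondary == 0 else 2
    return primary, secondary, level
```

Under this sketch, second-level question (3) of first-level question 1 receives the label 0103, and its parent question is recovered by replacing the last two digits with 00.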

Based on the above rule, the label values corresponding to the question numbers in the initial image of fig. 4 are shown in fig. 6.

140. Receiving and identifying a target question number in a first voice instruction issued by the user.

After the recognition is completed, the electronic device may also issue an interactive instruction to prompt the user to give the first voice instruction. The interactive instruction may be a voice prompt or a text prompt such as "Which question do you want to solve?".

The first voice instruction may directly contain the target question number, for example "I want to solve question 3". Extracting the target question number may consist of extracting only the one or more numeric keywords in the first voice instruction. In the example above, the keyword, and therefore the target question number, is "3". In other scenarios an associated word may also appear, for example "I want to solve small question 3", where the associated word is "small". The set of associated words can generally be enumerated; associated words related to the question number level include, but are not limited to, "large" and "small". The target question number is formed either by the numeric keywords alone or by the numeric keywords together with the associated words.
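Illustratively, keyword extraction can be sketched as follows, assuming the voice instruction has already been transcribed to text by a speech recognizer (not shown). The function name `extract_target` and the English associated-word list are assumptions made for illustration.

```python
import re

# Assumed, enumerable list of associated words related to the question level.
ASSOCIATED_WORDS = ("large", "small")


def extract_target(text):
    """Return (numeric keywords, associated words) found in the transcript."""
    numbers = [int(n) for n in re.findall(r"\d+", text)]
    lowered = text.lower()
    words = [w for w in ASSOCIATED_WORDS if re.search(rf"\b{w}\b", lowered)]
    return numbers, words
```

The returned pair corresponds to the two forms of target question number described above: numeric keywords alone, or numeric keywords together with associated words.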

150. Determining a target label and a target internal contour according to the target question number, wherein the target label is a label matching the target question number, and the target internal contour is the internal contour associated with the target label.

The labels are traversed according to the target question number to obtain the target label. Each label can be regarded as two groups of digits. If the target question number is only a keyword, both groups of every label are traversed, and any label with a group equal to the keyword is taken as a target label. If the target question number contains both a keyword and an associated word, the associated word determines which of the two groups is traversed. For example, if the target question number is only "3", every label in which either group is 03 is a target label, so if the labels 0203, 0300 and 0303 exist, all of them are target labels. If the target question number is "3" with the associated word "small", only 0203 and 0303 among these labels are target labels.
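Illustratively, the traversal can be sketched as follows. The sketch assumes four-digit labels and the associated words "large" and "small"; mapping "large" to the first group of digits is an assumption, since the text above only gives the "small" case. The function name is hypothetical.

```python
def find_target_labels(labels, number, associated_word=None):
    """Return every label that matches the target question number.

    labels          -- iterable of four-digit label strings such as "0203"
    number          -- numeric keyword extracted from the voice instruction
    associated_word -- None, "small" (second group) or "large" (first group,
                       an assumed mapping)
    """
    key = f"{number:02d}"
    matches = []
    for label in labels:
        first, second = label[:2], label[2:]
        if associated_word == "small":
            hit = second == key
        elif associated_word == "large":
            hit = first == key
        else:
            # Keyword only: both groups of digits are traversed.
            hit = key in (first, second)
        if hit:
            matches.append(label)
    return matches
```

With the labels 0203, 0300 and 0303, the keyword 3 alone matches all three, while the keyword 3 with "small" matches only 0203 and 0303, as in the example above.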

In the embodiment of the present invention, since the purpose is to solve test questions, steps 160 and 170 are performed only when there is exactly one target label and that target label is a second-level question. For example, if the target question number is only "3" and the only matching label among those shown in fig. 6 is 0103, the label with the value 0103 is the target label, and the internal contour 25 corresponding to it is the target internal contour.

If there are multiple target labels, or a first-level question is among the target labels, or no target label exists, the electronic device issues a corresponding interactive instruction to the user. For example, if there are multiple target labels, the interactive instruction may be "Which question 3 do you want to solve?". Following the prompt, the user gives a second voice instruction such as "I want to solve small question 3 of question 2", whose corresponding label is 0203, and the labels are traversed again. If the label 0203 exists, it is taken as the target label; otherwise the electronic device issues another interactive instruction, for example "Small question 3 of question 2 was not found, please choose again".

The electronic device and the user interact one or more times in this way until the determined target label is unique and corresponds to a second-level question, or until the user abandons the interaction.
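The stopping condition of this interaction can be sketched as a small decision function. This is a hedged sketch: it assumes four-digit labels in which a trailing 00 marks a first-level question, and the returned strings are hypothetical stand-ins for the interactive instructions.

```python
def next_action(target_labels):
    """Decide whether to proceed to step 160 or to prompt the user again."""
    if not target_labels:
        return "ask: not found"
    if len(target_labels) > 1:
        return "ask: ambiguous"
    if target_labels[0][2:] == "00":
        # A single first-level question is not specific enough.
        return "ask: first-level"
    return "proceed"
```

The calling code would repeat the voice prompt for every result other than "proceed", and stop when the user gives up.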

The internal contour corresponding to the target label is recorded as the target internal contour, from which the text contour of the target image that the user wants to capture is determined.

160. Determining a text contour according to the target internal contour, and segmenting the initial image within the text contour to obtain a target image.

The target internal contour is taken as the text contour, and the initial image is segmented to obtain the target image, i.e. the part of the initial image inside the text contour. In some scenarios the target image may be used for question recording, for example a wrong-question book function, or for searching for answers, parts of speech, synonyms, antonyms and the like, to implement a question search function. In the embodiment of the invention, the target image is used to search for the original question, i.e. to search a database for a question identical to the one in the target image, preferably stored in text format. The original question found in text format can be recorded so that a wrong-question book can be printed later. Answers and/or solution approaches may also be attached to or associated with the original question, and displayed to the user together with it to provide some guidance for the user's learning.
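The segmentation at the start of this step can be sketched as a crop. This minimal sketch assumes the image is a list of pixel rows and the text contour is a list of `(x, y)` points, and it simplifies the segmentation to the bounding box of the contour; an actual implementation might instead mask pixels outside a non-rectangular contour. The function name is hypothetical.

```python
def crop_to_contour(image, contour_points):
    """Return the part of image inside the bounding box of the contour."""
    xs = [x for x, _ in contour_points]
    ys = [y for _, y in contour_points]
    left, right = min(xs), max(xs)
    top, bottom = min(ys), max(ys)
    # Slice the rows first, then the columns of each remaining row.
    return [row[left:right + 1] for row in image[top:bottom + 1]]
```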

For example, in the embodiment of the present invention, the original question found by the search may be displayed on a touch screen of the electronic device. After the user finishes answering and confirms, the user's answer is corrected against the answer associated with the original question, and if the user answered incorrectly, the solution approach is displayed to the user.

170. Performing OCR recognition on the target image, and searching a database for a matching test question using the recognition result.

The original question is searched for by first performing character recognition on the target image using conventional OCR technology to obtain a character recognition result. Since OCR cannot guarantee a recognition rate and accuracy of 100%, a threshold must be set when matching the recognition result against candidate questions by similarity. The preset threshold is set as needed, and may for example be set to the typical recognition rate of OCR technology, such as 98%. The database may be a teaching resource library created in advance. To reduce search time, a plurality of small databases may be built according to the basic information of the user, such as the grade, subject, and/or the version number used in the region. Corresponding search keywords can be identified from the header and footer information of the carrier, the matching small database is selected by these keywords, and the similarity comparison between the recognition result and the original questions is carried out within that small database.
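Illustratively, threshold-based matching can be sketched as follows. The embodiment does not specify the similarity measure; this sketch substitutes the ratio computed by `difflib.SequenceMatcher` from the Python standard library, and the function names are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Similarity in [0, 1] between two strings (assumed measure)."""
    return SequenceMatcher(None, a, b).ratio()


def find_match(recognized, questions, threshold=0.98):
    """Return the first stored question whose similarity reaches the threshold."""
    for question in questions:
        if similarity(recognized, question) >= threshold:
            return question
    return None
```

A threshold below 1.0 tolerates a small number of OCR errors in the recognition result while still rejecting unrelated questions.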

Illustratively, a header part and a footer part in the initial image are identified, and a search keyword is determined according to the header part and the footer part, wherein the search keyword is a first condition or the first condition and a second condition; the first condition is the grade and subject, and the second condition is one or more of the title, publisher, version number and brand name.

In the carrier image shown in fig. 7, the header part provides grade information 311 (i.e., seventh grade, first volume), subject information 312 (i.e., Chinese), version information 313 (i.e., People's Education edition), and brand name information 314 (i.e., full teaching material). In the carrier image shown in fig. 8, the footer part provides grade information 321 (i.e., sixth grade), subject information 322 (i.e., Chinese), brand name information 323 (i.e., english curriculum), and book name information 324 (i.e., read and refine in "happy reading bar"). In the carrier image shown in fig. 9, the footer part provides grade information 331 (i.e., third grade, second volume), subject information 332 (i.e., mathematics), and version information 333 (i.e., R, which denotes the People's Education edition), as well as brand name information 334 (i.e., an image of a child wearing a doctoral cap, which denotes a Huanggang brand name).

As shown, the header and footer of some carriers contain grade and subject information, so this information is taken as the first condition. Some carriers also contain one or more of the book name, publisher, version number and brand name, which are taken as an auxiliary second condition. When the second condition exists, the query uses both the first and second conditions; when it does not, the query uses the first condition alone.

Specifically, the characters in the header part and/or footer part are recognized, and the grade and subject are screened from them as the first condition. Illustratively, the characters can be recognized with mature OCR (Optical Character Recognition) technology; here the characters are mainly Chinese characters. Because grades and subjects can be enumerated exhaustively, screening is done by building a first search library containing all grade and subject information and traversing the characters in the header part and/or footer part against it to obtain the grade and subject.
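Screening against an exhaustive search library can be sketched as follows. The vocabularies below are hypothetical English stand-ins for the first search library, and the input is assumed to be the text already recognized from the header or footer.

```python
# Hypothetical first search library: all grades and subjects, enumerated.
GRADES = ("grade one", "grade two", "grade three", "grade four", "grade five",
          "grade six", "grade seven", "grade eight", "grade nine")
SUBJECTS = ("chinese", "mathematics", "english")


def screen_first_condition(text):
    """Return (grade, subject) found in the text, or None for a missing part."""
    lowered = text.lower()
    grade = next((g for g in GRADES if g in lowered), None)
    subject = next((s for s in SUBJECTS if s in lowered), None)
    return grade, subject
```

The same traversal could be repeated against a second library of version names, book names and brand names to obtain the second condition.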

It is then detected whether the characters in the header part and/or footer part include one or more of a version number, a book name and a brand name; if so, these are taken as the second condition. As with the first condition, common version names, book names and brand names are stored in a second search library, the characters in the header part and/or footer part are traversed against it, and if a second condition exists, its specific content is obtained. In practice, the version used is generally uniform within a region, so when the user uses a question search application or a wrong-question collection application, the version number can be determined from the basic information entered by the user and treated as known.

Some carriers represent the publisher or brand name with an icon, for example brand name information 334 in fig. 9. In this case it can be detected whether the non-character part of the header part and/or footer part contains one or more of the publisher and the brand name; if so, these are taken as the second condition. This is implemented by running an image search on the non-character part to determine the publisher or brand name: for example, if the similarity exceeds 90%, the corresponding publisher or brand name is considered identified.

The labels of the small databases are traversed using the first condition and/or the second condition to determine the small database corresponding to the carrier, so that the test questions can be matched within that small database, which greatly reduces search time.
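Illustratively, selecting a small database by its labels can be sketched as follows. The sketch assumes each small database carries a dictionary of tags; the tag names (`grade`, `subject`, `edition`) and the function name are hypothetical.

```python
def select_databases(databases, first, second=None):
    """Return the names of the small databases whose tags match the conditions.

    first  -- required tags, e.g. {"grade": 3, "subject": "math"}
    second -- optional auxiliary tags, applied only when present
    """
    selected = []
    for db in databases:
        tags = db["tags"]
        if any(tags.get(key) != value for key, value in first.items()):
            continue
        if second and any(tags.get(key) != value for key, value in second.items()):
            continue
        selected.append(db["name"])
    return selected
```

When the second condition is absent, every small database that satisfies the first condition is kept, which mirrors querying by the first condition alone.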

When the similarity between a test question and the recognition result is greater than or equal to the preset threshold, that test question is the target test question and the search ends. The associated answers, solution approaches and the like can then be determined from the target test question; they may be stored in the corresponding small database or in other databases, and are obtained through a mapping relation or an index lookup.

If the similarity between the recognition result and every test question in the database or the corresponding small database is below the preset threshold, a corresponding search may be performed on the internet. If still no test question with a similarity greater than or equal to the preset threshold is found, or if none is found before the search time across the database, the corresponding small database, the internet and so on reaches a preset duration, the similarities in the search records are sorted in descending order, and a preset number of top-ranked test questions are selected as target test questions and displayed to the user as a hint. The answers and/or solution approaches of these target test questions can likewise be obtained. Of course, if the user considers the target test questions unrelated to the intended question, steps 110 and 160 can be performed again.
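The fallback ranking can be sketched as follows. As an assumption, similarity is again computed with `difflib.SequenceMatcher` from the Python standard library, because the embodiment does not name a measure; the function name and the default values are hypothetical.

```python
from difflib import SequenceMatcher


def top_candidates(recognized, questions, threshold=0.98, count=3):
    """Return the exact match if one exists, else the best `count` candidates."""
    scored = sorted(
        ((SequenceMatcher(None, recognized, q).ratio(), q) for q in questions),
        key=lambda pair: pair[0],
        reverse=True,
    )
    if scored and scored[0][0] >= threshold:
        return [scored[0][1]]
    # No question reached the threshold: fall back to the top-ranked ones.
    return [question for _, question in scored[:count]]
```

The timeout on the search described above is not modeled here; the sketch only shows the descending sort and the selection of a preset number of candidates.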

By implementing the embodiment of the invention, the loss of recognized content caused by fingers (or the palm, a held pen, etc.) occluding the test question content in a finger reading scene can be completely avoided, which improves the completeness of the recognized content of the cropped test question image, improves the accuracy of the test questions pushed to the user, matches the user's intention as closely as possible, and improves the user's interactive learning experience.

Example two

Referring to fig. 10, fig. 10 is a schematic structural diagram of an apparatus for searching for a topic according to an embodiment of the present invention. As shown in fig. 10, the apparatus for searching for a question may include:

the photographing unit 410 is configured to receive a photographing instruction sent by a user when the electronic device is in a finger reading scene, and photograph the carrier by using the image acquisition device to obtain an initial image;

the identification unit 420 is configured to identify the initial image to obtain an internal contour corresponding to each topic number in the initial image;

a setting unit 430, configured to set a label corresponding to each internal contour according to the value and the level of the question mark;

the receiving unit 440 is used for receiving and identifying a target question number in a first voice command sent by a user;

the determining unit 450 is configured to determine a target tag and a target internal contour according to the target question mark, where the target tag is a tag adapted to the target question mark, and the target internal contour is an internal contour associated with the target tag;

a segmentation unit 460, configured to determine a text contour according to the target internal contour, and segment an initial image in the text contour to obtain a target image;

a searching unit 470, configured to perform OCR recognition on the target image, and search a database for a matching test question using the recognition result.

As an optional implementation manner, the identifying unit 420 includes:

an input subunit 421, configured to input the initial image in parallel into a question identification network model, a text line detection network model, and a question number detection network model based on deep learning to determine a question contour, a text line contour, and a question number frame;

a creating subunit 422, configured to create a blank mask map, where the blank mask map has the same size as the initial image;

an adding subunit 423, configured to add the title contour to the mask map;

a boundary determining subunit 424, configured to determine an upper boundary of the question mark line according to the question mark box and the text line profile, and add the upper boundary to the mask map;

and an extension subunit 425 configured to extend the left and right end points of the upper boundary, so that the upper boundary is connected to the topic contour, the upper boundary divides the topic contour into a plurality of topic regions, and each topic region forms an inner contour corresponding to each topic number.

As an optional implementation, the setting unit 430 includes:

a classification subunit 431, configured to obtain, through the question number classification model, a level of each question number, where the level includes a primary question and a secondary question;

the label setting subunit 432 is configured to set a label for the question mark according to the value of the question mark and the level of the question mark, where the label represents the value corresponding to the question mark and the level of the question mark.

As an optional implementation manner, the determining unit 450 includes:

a traversal subunit 451, configured to traverse all the tags according to the target question number, and determine a tag matching the target question number as a target tag;

a determining subunit 452, configured to, when the question number corresponding to the target tag is a secondary question and only one target tag is available, take the internal contour corresponding to the target tag as a target internal contour;

a feedback subunit 453, configured to send an interaction instruction to the user when the topic number corresponding to the target tag is a primary topic, or/and the target tag is multiple, or there is no target tag; and receiving a second voice instruction sent by the user according to the interaction instruction, and determining a new target question number according to the second voice instruction until the question number corresponding to the determined target label is a secondary question and the number of the target label is only one.

As an optional implementation manner, the receiving unit 440 includes:

the extracting subunit 441 is configured to receive a first voice instruction sent by a user, and extract one or more digital keywords in the first voice instruction, or one or more digital keywords and associated words of the digital keywords;

a target question number determining subunit 442, configured to use the digital keyword, or the digital keyword together with its associated word, as the target question number.

As an optional implementation manner, the search unit 470 includes:

an OCR recognition subunit 471, configured to perform OCR recognition on the target image to obtain a recognition result;

the calculation subunit 472 is configured to search in a database to obtain target test questions, where a similarity between the target test questions and the identification result is greater than or equal to a preset threshold;

a pushing subunit 473, configured to select, as the target test question, the test questions with the highest similarity to the identification result if the similarities of the test questions in the database and the identification result are smaller than a preset threshold.

The device for searching for questions shown in fig. 10 can completely avoid the loss of recognized content caused by fingers (or the palm, a held pen, etc.) occluding the test question content in a finger reading scene, which improves the completeness of the recognized content of the cropped test question image, improves the accuracy of the test questions pushed to the user, matches the user's intention as closely as possible, and improves the user's interactive learning experience.

Example three

Referring to fig. 11, fig. 11 is a schematic structural diagram of an electronic device according to an embodiment of the disclosure. As shown in fig. 11, the electronic device may include:

a memory 510 storing executable program code;

a processor 520 coupled to the memory 510;

wherein the processor 520 calls the executable program code stored in the memory 510 to execute part or all of the steps of the method for searching for a question in the first embodiment.

The embodiment of the invention discloses a computer-readable storage medium which stores a computer program, wherein the computer program enables a computer to execute part or all of steps in a method for searching for a question in the first embodiment.

The embodiment of the invention also discloses a computer program product, wherein when the computer program product runs on a computer, the computer is enabled to execute part or all of the steps in the method for searching the problems in the first embodiment.

The embodiment of the invention also discloses an application publishing platform, wherein the application publishing platform is used for publishing the computer program product, and when the computer program product runs on a computer, the computer is enabled to execute part or all of the steps in the method for searching the problems in the first embodiment.

In various embodiments of the present invention, it should be understood that the sequence numbers of the processes do not mean the execution sequence necessarily in order, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation on the implementation process of the embodiments of the present invention.

The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.

In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.

The integrated unit, if implemented as a software functional unit and sold or used as a stand-alone product, may be stored in a computer-accessible memory. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product, which is stored in a memory and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like, and may specifically be a processor in the computer device) to execute part or all of the steps of the method according to the embodiments of the present invention.

In the embodiments provided herein, it should be understood that "B corresponding to A" means that B is associated with A and that B can be determined from A. It should also be understood, however, that determining B from A does not mean determining B from A alone; B may also be determined from A and/or other information.

Those skilled in the art will appreciate that all or part of the steps of the methods of the embodiments may be implemented by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium, including Read-Only Memory (ROM), Random Access Memory (RAM), Programmable Read-Only Memory (PROM), Erasable Programmable Read-Only Memory (EPROM), One-time Programmable Read-Only Memory (OTPROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Compact Disc Read-Only Memory (CD-ROM) or other optical disc storage, magnetic disk storage, magnetic tape storage, or any other computer-readable medium that can be used to carry or store data.

The method, apparatus, electronic device and storage medium for searching for a topic disclosed in the embodiments of the present invention are described in detail above, and a specific example is applied in the present document to explain the principle and implementation manner of the present invention, and the description of the above embodiments is only used to help understanding the method and core ideas of the present invention; meanwhile, for a person skilled in the art, according to the idea of the present invention, there may be variations in the specific embodiments and the application scope, and in summary, the content of the present specification should not be construed as a limitation to the present invention.

Claims

1. A method for searching for a question, comprising:

2. The method of claim 1, wherein identifying the initial image to obtain an inner contour corresponding to each topic number in the initial image comprises:

adding the title contour to the mask map;

3. The method of claim 1, wherein setting the label corresponding to each internal contour according to the value and level of the question mark comprises:

4. The method of claim 3, wherein determining a target label and a target inner contour according to the target question number, the target label being a label matched with the target question number, the target inner contour being an inner contour associated with the target label, comprises:

5. The method according to any one of claims 1-4, wherein receiving and identifying the target topic number in the first voice command issued by the user comprises:

6. The method according to any one of claims 1 to 4, wherein determining a text contour from the target internal contour and segmenting an initial image within the text contour to obtain a target image comprises:

7. The method of any one of claims 1-4, wherein performing OCR recognition on the target image and using the recognition results to search a database for matching questions, comprises:

performing OCR recognition on the target image to obtain a recognition result;

8. An apparatus for searching for a topic, the apparatus comprising:

9. The apparatus of claim 8, wherein the identification unit comprises:

an adding subunit, configured to add the title contour to the mask map;

10. The apparatus of claim 8, wherein the setting unit comprises:

11. The apparatus of claim 10, wherein the determining unit comprises:

12. The apparatus according to any one of claims 8-11, wherein the receiving unit comprises:

13. The apparatus according to any one of claims 8-11, wherein the search unit comprises:

14. An electronic device, comprising: a memory storing executable program code; a processor coupled with the memory; the processor calls the executable program code stored in the memory for executing a method of searching for topics as claimed in any one of claims 1 to 7.

15. A computer-readable storage medium storing a computer program, wherein the computer program causes a computer to perform a method of searching for a subject according to any one of claims 1 to 7.