CN113255852A - Photographing question searching method and device and computer equipment

Info

Publication number: CN113255852A
Application number: CN202110485579.5A
Authority: CN
Inventors: 杨森; 王岩; 蔡红; 安�晟
Original assignee: Zuoyebang Education Technology Beijing Co Ltd
Current assignee: Beijing Baige Feichi Technology Co.,Ltd.
Priority date: 2021-04-30
Filing date: 2021-04-30
Publication date: 2021-08-13

Abstract

The invention belongs to the field of education and provides a photographing question searching method, a photographing question searching device and computer equipment. The method comprises the following steps: acquiring a question image and displaying the question image on a display device of a user terminal; performing single-question detection on the question image, and visually identifying the area of each detected single question on the display device of the user terminal in real time; receiving a selection operation instruction of a user so as to photograph, identify and search the area of the single question selected by the user; and receiving, by the user terminal, the returned question search result. Because single-question detection is performed with a single-question detection model, the method helps the user quickly determine the area to be identified and searched and detect it accurately, enables more intelligent detection processing, simplifies the user's question searching operation, and improves the user experience.

Description

Photographing question searching method and device and computer equipment

Technical Field

The invention belongs to the technical field related to computer vision, is particularly suitable for the field of education, and more particularly relates to a photographing question searching method and device and computer equipment.

Background

At present, many electronic education products provide a photographing question searching function. A user photographs questions on paper through the camera of a terminal; after the terminal finishes photographing and displays the photographed picture, the user determines the question image to be cropped through a selection box displayed in the terminal interface, and the question and its answer are then searched.

In the above process, the user needs to adjust the size of the selection box and move it to a proper position through touch screen operations, so that the selection box just surrounds the question to be searched; only then is the cropping of the question completed and the answer to the question searched. This method is relatively complex to operate and has low operation efficiency.

In addition, existing question searching methods also suffer from low recognition accuracy, poor user experience and the like.

Therefore, it is necessary to provide a photographing question searching method to solve the above problems.

Disclosure of Invention

(I) technical problem to be solved

The invention aims to solve the technical problems of complex operation steps, low operation efficiency, low recognition accuracy, poor user experience and the like of the conventional question searching.

(II) technical scheme

In order to solve the above technical problem, a first aspect of the present invention provides a photographing question searching method, including the following steps: acquiring a question image and displaying the question image on a display device of a user terminal; performing single-question detection on the question image, and visually identifying the area of each detected single question on the display device of the user terminal in real time; receiving a selection operation instruction of a user so as to photograph, identify and search the area of the single question selected by the user; and receiving, by the user terminal, the returned question search result.

According to a preferred embodiment of the present invention, the single-question detection is performed by a single-question detection model, which is a trained image question identification model for detecting the area of each single question contained in an image.

According to a preferred embodiment of the present invention, continuous multi-frame question images are acquired in real time and displayed on the display device of the user terminal in real time; the single-question detection is performed on the frames selected at a predetermined frame interval from the continuous multi-frame images.

According to a preferred embodiment of the present invention, the area of the single question is a quadrangular area containing the single question; the real-time visual identification of the area of each detected single question comprises displaying the quadrangular area in a frame selection manner.

According to a preferred embodiment of the present invention, the frame selection display of the quadrangular area includes: correcting the quadrangle so that the corrected quadrangular area displayed on the user terminal frames all the contents of the single question.

According to a preferred embodiment of the present invention, the receiving of the selection operation instruction of the user includes: receiving the user's click operation on the quadrangular area of a detected single question, or on the corrected quadrangular area, so as to photograph, identify and search the image area of the selected quadrangular area.

According to a preferred embodiment of the present invention, the photographing of the area of the single question selected by the user includes: calling the camera device to take a picture, and cropping the area of each single question selected by the user from the obtained image to obtain images of one or more single-question areas. Optionally, after the selected single-question area is photographed, the method further includes saving the obtained images of the one or more single-question areas. Optionally, the images of the one or more single-question areas are identified and searched in parallel. Further optionally, when the image of a single-question area is recognized, the recognized content corresponding to handwriting in the single-question area is removed from the recognition result based on the difference between printed text and handwriting.

According to a preferred embodiment of the present invention, the identifying and searching of the area of the single question selected by the user includes: determining the question type and the question key information of each single question, and performing matching judgment between the single question and the questions in a preset question bank by using a judgment rule, so as to return a question searching result to the user terminal, wherein the question searching result comprises whether the question exists, a recommended answer corresponding to the question, and other recommended questions of the same type.

A second aspect of the present invention provides a photographing question searching device, comprising: a first receiving module, used for acquiring a question image and displaying the question image on a display device of a user terminal in real time; a detection identification module, used for performing single-question detection on the question image and visually identifying the area of each detected single question on the display device of the user terminal in real time; a second receiving module, used for receiving a selection operation instruction of a user so as to photograph, identify and search the area of the single question selected by the user; and a result processing module, used for receiving, at the user terminal, the returned question search result.

According to a preferred embodiment of the present invention, the device further comprises a correction processing module, configured to perform correction processing on the quadrangle, so that the corrected quadrangular area displayed on the user terminal frames all the contents of the single question.

According to a preferred embodiment of the present invention, the second receiving module is further used for: receiving the user's click operation on the quadrangular area of a detected single question, or on the corrected quadrangular area, so as to photograph, identify and search the image area of the selected quadrangular area.

According to a preferred embodiment of the present invention, the photographing of the area of the single question selected by the user includes: calling the camera device to take a picture, and cropping the area of each single question selected by the user from the obtained image to obtain images of one or more single-question areas. Optionally, after the area of the single question selected by the user is photographed, the obtained images of the one or more single-question areas are saved. Optionally, the images of the one or more single-question areas are identified and searched in parallel. Further optionally, when the image of a single-question area is recognized, the recognized content corresponding to handwriting in the single-question area is removed from the recognition result based on the difference between printed text and handwriting.

According to a preferred embodiment of the present invention, the device further comprises a determining module, configured to determine the question type and the question key information of each single question, and to perform matching judgment with the questions in a preset question bank by using a judgment rule, so as to return a question searching result to the user terminal, where the question searching result includes whether the question exists, a recommended answer corresponding to the question, and other recommended questions of the same type.

A third aspect of the present invention provides a computer device, comprising a processor and a memory, wherein the memory is used for storing a computer executable program, and when the computer program is executed by the processor, the processor executes the photographing question searching method.

A fourth aspect of the present invention provides a computer program product, storing a computer executable program, wherein the computer executable program, when executed, implements any one of the above photographing question searching methods.

(III) advantageous effects

Compared with the prior art, the invention performs single-question detection with a single-question detection model and presents the detected single-question areas to the user for selection. This helps the user quickly determine the area to be identified and searched, without the user having to repeatedly adjust and move a selection box in the terminal interface.

In addition, because the detected single-question areas are presented to the user for selection, the user can select one or more single-question areas to photograph, identify and search. The scheme can therefore search a plurality of single questions through a single photographing search, reduce the number of searches the user performs, simplify the operation steps, and reduce the input of interference information that the search does not need (information outside the area of each single question), thereby optimizing the question searching method and improving the detection precision.

Furthermore, because the corrected quadrangle displayed on the user terminal contains all the contents of the single question, the user does not need to manually adjust the position, the angle and the like of the detection frame, so that the user's question searching operation is simplified and the user experience is improved.

Drawings

FIG. 1 is a flowchart of an example of a photo topic search method according to embodiment 1 of the present invention;

FIG. 2 is a diagram illustrating an example of a frame selection display in a single-question detection process by applying the photo-taking question-searching method according to embodiment 1 of the present invention;

FIG. 3 is a flowchart of another example of the photo topic search method according to embodiment 1 of the present invention;

FIG. 4(a) is a schematic diagram of a test paper in which each question is displayed in a frame selection manner with an uncorrected detection frame during single-question detection;

FIG. 4(b) is a schematic diagram of a test paper in which each question is displayed in a frame selection manner with a corrected detection frame during single-question detection;

FIG. 5 is a flowchart of still another example of the photo topic search method according to embodiment 1 of the present invention;

FIG. 6 is a diagram illustrating an example of a photographing question searching apparatus according to embodiment 2 of the present invention;

FIG. 7 is a diagram illustrating another example of a photo topic searching apparatus according to embodiment 2 of the present invention;

FIG. 8 is a schematic view of still another example of the photographing question searching apparatus according to embodiment 2 of the present invention;

FIG. 9 is a schematic structural diagram of a computer device of one embodiment of the present invention;

FIG. 10 is a schematic diagram of a computer program product of an embodiment of the invention.

Detailed Description

In describing particular embodiments, specific details of structures, properties, effects, or other features are set forth in order to provide a thorough understanding of the embodiments by one skilled in the art. However, it is not excluded that a person skilled in the art may implement the invention in a specific case without the above-described structures, performances, effects or other features.

The flow chart in the drawings is only an exemplary flow demonstration, and does not represent that all the contents, operations and steps in the flow chart are necessarily included in the scheme of the invention, nor does it represent that the execution is necessarily performed in the order shown in the drawings. For example, some operations/steps in the flowcharts may be divided, some operations/steps may be combined or partially combined, and the like, and the execution order shown in the flowcharts may be changed according to actual situations without departing from the gist of the present invention.

The block diagrams in the figures generally represent functional entities and do not necessarily correspond to physically separate entities. I.e. these functional entities may be implemented in the form of software, or in one or more hardware modules or integrated circuits, or in different network and/or processing unit devices and/or microcontroller devices.

The same reference numerals denote the same or similar elements, components, or parts throughout the drawings, and thus, a repetitive description thereof may be omitted hereinafter. It will be further understood that, although the terms first, second, third, etc. may be used herein to describe various elements, components, or sections, these elements, components, or sections should not be limited by these terms. That is, these terms are used only to distinguish one from another. For example, a first device may also be referred to as a second device without departing from the spirit of the present invention. Furthermore, the term "and/or" is intended to include all combinations of any one or more of the listed items.

The invention provides a photographing question searching method, which aims to solve, or at least partially solve, the technical problems of complex operation steps, low operation efficiency, low recognition accuracy, poor user experience and the like of conventional question searching methods, and to further optimize question searching. In the method, single-question detection is performed with a single-question detection model, and the detected single questions are presented on the user terminal for the user to select and confirm, so that one or more single questions can be searched. The method is more intelligent, simplifies the user's operation steps, allows the photographing and searching of multiple single questions in a photographed picture to be completed at one time, and improves the user experience.

In order that the objects, technical solutions and advantages of the present invention will become more apparent, the present invention will be further described in detail with reference to the accompanying drawings in conjunction with the following specific embodiments.

Fig. 1 is a flowchart of an example of a photo topic search method according to embodiment 1 of the present invention.

As shown in fig. 1, the present invention provides a method for searching questions by photographing, which comprises:

Step S101, acquiring a question image and displaying the question image on a display device of a user terminal.

Step S102, performing single-question detection on the question image, and visually identifying the area of each detected single question on the display device of the user terminal in real time.

Step S103, receiving a selection operation instruction of the user so as to photograph, identify and search the area of the single question selected by the user.

Step S104, receiving, by the user terminal, the returned question search result.

First, in step S101, a question image can be acquired in real time and displayed on a display device of a user terminal in real time.

The following description will specifically describe the application scenario of the method in the educational service product as an example.

In this example, the educational service product contains a search question function, which is implemented by the photographing search question method of the present invention. The user terminal includes, but is not limited to, a terminal with a communication function, such as a mobile phone, an IPAD, a notebook computer, and a desktop computer.

Specifically, after the user opens the education service product APP and enables the photographing question searching function, the user terminal turns on its camera device upon receiving the corresponding operation instruction (for example, the user enabling the photographing function through the APP).

It should be noted that in the present invention, the search question refers to a search function using an educational service product, which searches the contents of one or more question stems in an image to obtain all contents of the corresponding question, including but not limited to standardized question stems, answers, parsing, teacher explanation, etc.

Preferably, the question image of the test paper or homework is acquired in real time by using the camera device, and the question image is displayed on a display device of the user terminal in real time, such as a display screen of a mobile phone or a display screen of a tablet computer. It should be noted that the description of "preferably," "optionally," "specifically," "more specifically," "further," "still," "in one example," "in another example," "preferred embodiment according to the invention," and the like in this description is merely illustrative of an alternative or preferred example, merely to facilitate the reader's better understanding of the invention, and is not intended to constitute a limitation on the invention.

Next, in step S102, single-question detection is performed on the question image, and the area of each detected single question is visually identified on the display device of the user terminal in real time.

Specifically, continuous multi-frame question images can be obtained in real time and displayed on the display device of the user terminal in real time. In this example, the single-question detection may be performed on the frames selected at a predetermined frame interval from the continuous multi-frame question images. Further, the predetermined frame interval may be 0, 1, 2 or another number; when it is 0, detection is performed on every frame.
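By way of illustration only, the selection of frames at a predetermined interval can be sketched as follows. The function name `frames_to_detect` and the interpretation of the interval as "number of frames skipped between two detections" are assumptions made for this sketch and are not specified by the embodiment.

```python
from typing import Iterable, Iterator, Tuple, TypeVar

Frame = TypeVar("Frame")


def frames_to_detect(frames: Iterable[Frame], interval: int) -> Iterator[Tuple[int, Frame]]:
    """Yield (index, frame) for the frames on which detection would run.

    `interval` is the assumed number of frames skipped between two
    detections: 0 means every frame, 1 means every second frame, etc.
    """
    if interval < 0:
        raise ValueError("interval must be non-negative")
    step = interval + 1
    for index, frame in enumerate(frames):
        if index % step == 0:
            yield index, frame


# Hypothetical usage: with an interval of 2, frames 0, 3 and 6 are selected.
selected = [index for index, _ in frames_to_detect(range(7), interval=2)]
```

All frames are still displayed to the user; only the detection step is restricted to the selected frames, which is one plausible way to reduce the detection load on the terminal or server.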

Specifically, when an operation instruction of opening the question searching function by the user is received, the user terminal can automatically call a question detection model of the server side and detect the question through the question detection model, wherein the question detection model is a trained image question identification model and is used for detecting the region of the question contained in the image. The area of the detected question may be a quadrangular area including the question, and the areas of the detected questions may be visually identified in real time. As shown in fig. 2 and fig. 4(b), schematic diagrams of exemplary recognition results of the single question are shown.

It should be noted that the single-question detection model may be a deep network model based on technologies such as CNN, Attention and LSTM. The input features are multi-frame images, which include the current frame image and the frame images a specific number of frames before it (the previous frame image or the previous several frame images); the output features are the position coordinates of the four corner points of the frame selection quadrangle of each single question in the current frame image. Specifically, the training data set may include historical pictures containing various question types, test papers and book text, together with the position information of labeled or user-confirmed frame selection quadrangles.

Further, when the trained single-question detection model is used for detecting the single-question, the detected single-question area can be a quadrilateral area, and the detected single-question areas are visually identified in real time. The specific visual identification manner is not limited in this embodiment, and for example, as shown in fig. 2, fig. 4(a), and fig. 4(b), the indication may be performed in the form of a frame selection quadrangle including a single-subject area/content. Of course, other forms are possible, such as adding a highlight background to the single-subject area.

More specifically, for the single-topic detection model of the image to be detected, the following expression can be used for calculation, for example:

$X = (1 - a)\,X_{old} + a\,X_{new}$

wherein $X$ is the prediction result of the image to be detected; $X_{new}$ is the position information of the frame selection quadrangle of the current frame image; $X_{old}$ is the position information of the frame selection quadrangles of the frame images a specific number of frames before the current frame image (referred to as the history images of a specific number of frames for short); and $a$ is a hyper-parameter determined in the model training process. Hyper-parameters need to be optimized in machine learning; hyper-parameter optimization aims to find the hyper-parameters with which the machine learning algorithm performs best on the validation data set.

It should be noted that the single-topic detection model detects all the single topics in the image to be detected, and obtains the prediction result of the image to be detected (i.e. the position information of the framing quadrangles of all the single topics in the image) by performing result fusion on the position information of the framing quadrangles of the current frame image and the framing quadrangle of the history image with a specific frame number.
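As a minimal sketch of the result fusion described above, the weighted combination can be applied to each corner coordinate of a frame selection quadrangle. The sketch assumes that the corners of the historical and current quadrangles are listed in the same order and that the hyper-parameter `a` lies between 0 and 1; neither assumption is stated in the embodiment, and the function name `fuse_quads` is hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Quad = List[Point]  # four corner points, in a consistent order (assumed)


def fuse_quads(x_old: Quad, x_new: Quad, a: float) -> Quad:
    """Compute X = (1 - a) * X_old + a * X_new corner by corner."""
    if not 0.0 <= a <= 1.0:
        raise ValueError("a is assumed to lie in [0, 1]")
    if len(x_old) != 4 or len(x_new) != 4:
        raise ValueError("each quadrangle must have four corner points")
    return [
        ((1 - a) * ox + a * nx, (1 - a) * oy + a * ny)
        for (ox, oy), (nx, ny) in zip(x_old, x_new)
    ]


# Hypothetical usage: the current frame shifted the quadrangle by (2, 2).
history = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
current = [(2.0, 2.0), (12.0, 2.0), (12.0, 12.0), (2.0, 12.0)]
fused = fuse_quads(history, current, a=0.5)
```

A larger `a` makes the displayed quadrangle follow the current frame more closely, while a smaller `a` gives more weight to the history and would tend to reduce jitter between frames.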

Therefore, by using the single-question detection model to detect single questions, the user can be assisted in quickly determining the area to be identified and searched, the question searching method can be optimized, and the detection precision can be improved.

Furthermore, the quadrangular area (namely, the single subject area) can be displayed in a frame selection mode and displayed on a display device of the user terminal, and the user can customize the lines, the colors, the display mode and the like for displaying in the frame selection mode. In this example, for the frame selection display, the frame selection display may be performed using a solid line, a broken line, and a double line.

The single-subject area can also be displayed by quadrangles with different colors, or can be identified by patterns such as cartoon marks and the like. It should be noted that the above description is only given by way of example, and the present invention is not limited thereto.

Next, in step S103, a selection operation instruction of the user is received to photograph, recognize and search the region of the user-selected question.

Specifically, as shown in fig. 2, a frame selection interface for the selection operation may be provided to the user. The interface contains a plurality of quadrangles corresponding to the respective single questions, and the user may click one or more of the quadrangles.

Further, receiving the user's click operation instruction includes receiving the user's click operation on the quadrangular area of a detected single question (or, when the method further includes the correction step, the user's click operation on the corrected quadrangular area).
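One possible way to map a click operation to a detected quadrangular area is a point-in-polygon test, sketched below with the standard ray casting technique. The embodiment does not specify how clicks are resolved; the function names `point_in_quad` and `selected_indices` are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def point_in_quad(point: Point, quad: Sequence[Point]) -> bool:
    """Ray casting test: is `point` inside the polygon given by `quad`?"""
    x, y = point
    inside = False
    n = len(quad)
    for i in range(n):
        x1, y1 = quad[i]
        x2, y2 = quad[(i + 1) % n]
        # Does a horizontal ray from the point cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def selected_indices(taps: Sequence[Point], quads: Sequence[Sequence[Point]]) -> List[int]:
    """Return indices of the quadrangles hit by at least one tap, in index order."""
    hit = set()
    for tap in taps:
        for index, quad in enumerate(quads):
            if point_in_quad(tap, quad):
                hit.add(index)
                break  # a tap selects at most one quadrangle
    return sorted(hit)


# Hypothetical usage: two detected areas, three taps (the last one misses).
quads = [
    [(0, 0), (10, 0), (10, 10), (0, 10)],
    [(20, 0), (30, 0), (30, 10), (20, 10)],
]
chosen = selected_indices([(5, 5), (25, 5), (100, 100)], quads)
```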

In some embodiments, it may be determined whether to photograph one or more topic areas based on the received page information of the clicking operation.

In an example, when the user selects one topic area, the image capturing device may be invoked to capture the entire topic image, and cut out a corresponding topic area portion in the obtained image to obtain the topic area selected by the user, and further perform recognition and search. Alternatively, the image pickup device may be called to photograph the subject area selected by the user, and the photographed subject area may be recognized and searched. The shooting information of the part outside the single subject area selected by the user is directly discarded.

In another example, when the user selects a plurality of single-subject areas, the image capturing device may be invoked to photograph the entire subject image, and cut out a corresponding plurality of single-subject area portions in the obtained image to obtain images of the plurality of single-subject areas selected by the user, and respectively identify and search a corresponding plurality of quadrilateral areas, that is, respectively identify and search the plurality of single-subject areas. Or, the image pickup device may be invoked to photograph the selected multiple thematic areas respectively, so as to obtain images of multiple independent thematic areas selected by the user, and to identify and search the corresponding multiple quadrilateral areas respectively.
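The cropping of the selected single-question areas from a photographed image can be sketched as follows. For simplicity, the sketch represents the image as a list of pixel rows and crops the axis-aligned bounding box of each quadrangle; both choices, and the function names, are assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Image = List[List[int]]  # rows of pixel values (assumed representation)


def crop_region(image: Image, quad: Sequence[Point]) -> Image:
    """Crop the axis-aligned bounding box of `quad`, clamped to the image."""
    height = len(image)
    width = len(image[0]) if image else 0
    xs = [p[0] for p in quad]
    ys = [p[1] for p in quad]
    left = max(0, math.floor(min(xs)))
    right = min(width, math.ceil(max(xs)))
    top = max(0, math.floor(min(ys)))
    bottom = min(height, math.ceil(max(ys)))
    return [row[left:right] for row in image[top:bottom]]


def crop_regions(image: Image, quads: Sequence[Sequence[Point]]) -> List[Image]:
    """Crop every selected area; content outside the areas is discarded."""
    return [crop_region(image, quad) for quad in quads]


# Hypothetical usage on a 4 x 6 image whose pixel value encodes row and column.
image = [[row * 10 + col for col in range(6)] for row in range(4)]
crops = crop_regions(image, [[(1, 1), (3, 1), (3, 3), (1, 3)]])
```

Each cropped image can then be passed on to recognition and search independently of the others.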

Preferably, the photographing of the selected area of the theme by the user may further include: and storing and obtaining the images of the one or more single-subject areas so that the user can view the historical photographing question searching information in a preset time at the user terminal of the user.

For example, when user A clicks the first two quadrangles in fig. 2, the user terminal receives, through the photographing question searching APP, the user's operation instruction selecting the single-question areas corresponding to the first two quadrangles, photographs the selected single-question areas, identifies the photographed picture, and performs a search according to the identified contents of the first two questions.

For another example, when user B clicks a certain quadrangle in fig. 2, the user terminal receives the user's photographing question searching instruction for the single-question area corresponding to that quadrangle, crops the corresponding single-question area, and identifies the question contents to search for the question.

It should be noted that the above description is only given as a preferred example, and the present invention is not limited thereto.

Next, in step S104, a search result returned to the user terminal is received.

Illustratively, the recognition in step S103 further includes determining the question type and the question key information of each single question. For example, the question type and the question key information can be determined according to the question contents identified in step S103, wherein the question types include choice questions, fill-in-the-blank questions and the like of different disciplines, and the question key information includes whether the question contains a figure, a character, a letter, an underline or the like.

More specifically, the method also comprises the step of judging the number of the topics clicked by the user, and under the condition that the number does not exceed the set number, all the topics can be searched simultaneously.

Further, under the condition that the number of the questions exceeds the set number, the exceeding question data is redistributed to the preset image library so as to ensure that the searches can be carried out simultaneously, and the question searching results are returned to the user simultaneously.

Further, according to the identified topic key information, a preset topic library with corresponding resources is provided. Then, the provided preset question bank is searched, so that the searching speed can be increased.

In some examples, the determination rule may be used to perform matching determination with the questions in the preset question bank, so as to return the question searching result to the user terminal.

Specifically, the identified question may be matched with a question in the recommended preset question bank, and the answer or the answer and the analysis may be returned to the user when the identified question is the same as the question in the recommended preset question bank.

Further, under the condition that the identified questions are different from the questions in the recommended preset question bank, the second preset question bank is automatically and continuously provided (the matching degree between the key information of the questions and each question bank can be determined according to the questions), and then matching judgment is carried out until the matching is successful and the search result is obtained.
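The matching against a first recommended question bank, with fallback to further question banks until a match is found, can be sketched as below. The sketch assumes that the banks have already been ordered by their matching degree with the question key information and that matching is an exact comparison of normalized text; the actual judgment rule is not specified in the embodiment, and all names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def normalize(text: str) -> str:
    """Lower-case the text and collapse runs of whitespace."""
    return " ".join(text.lower().split())


def search_banks(
    recognized_text: str,
    banks: List[Tuple[str, Dict[str, str]]],
) -> Dict[str, Optional[object]]:
    """Try each bank in order and return the first matching answer."""
    key = normalize(recognized_text)
    for bank_name, bank in banks:
        for question, answer in bank.items():
            if normalize(question) == key:
                return {"exists": True, "bank": bank_name, "answer": answer}
    # No bank contained the question.
    return {"exists": False, "bank": None, "answer": None}


# Hypothetical usage: the question is missing from the first bank
# and is found in the second one.
banks = [
    ("math", {"1 + 1 = ?": "2"}),
    ("physics", {"Unit of force?": "newton"}),
]
result = search_banks("unit  of FORCE?", banks)
```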

Specifically, the question searching result includes whether the question exists, a recommended answer corresponding to the question, and other recommended questions of the same type.

It should be noted that the above description is only given by way of example, and the present invention is not limited thereto.

The above-described procedure of the subject search method is only for illustrating the present invention, and the order and number of steps are not particularly limited. In addition, the steps in the method may also be split into two (for example, the step S102 is split into S102 and S501, see fig. 5 in particular), three, or some steps may also be combined into one step, which may be adjusted according to practical examples.

Alternatively, when the user selects a plurality of single-question areas for identification and retrieval, the selected single-question areas can be identified and retrieved in parallel. The single question whose retrieval result is returned first is displayed first at the corresponding position on the question page of the terminal, and the retrieval results of all the single questions are then displayed in the order in which they are returned. Alternatively, optimization, balancing, scheduling and the like may be performed according to the computational (recognition/search) complexity of each single-question area, so that the retrieval results of the plurality of single-question areas are returned substantially simultaneously.
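A minimal sketch of parallel retrieval with results handled in the order in which they are returned is given below, using a thread pool from the Python standard library. The callable `search_one` stands in for the recognition and search of one single-question area and is a hypothetical placeholder; the embodiment does not prescribe a concurrency mechanism.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, List, Sequence, Tuple, TypeVar

Region = TypeVar("Region")
Result = TypeVar("Result")


def search_in_parallel(
    regions: Sequence[Region],
    search_one: Callable[[Region], Result],
    max_workers: int = 4,
) -> List[Tuple[int, Result]]:
    """Search all regions concurrently.

    Returns (region index, result) pairs in completion order, so that a
    caller could display each result at its position as soon as it arrives.
    """
    results: List[Tuple[int, Result]] = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {
            pool.submit(search_one, region): index
            for index, region in enumerate(regions)
        }
        for future in as_completed(futures):
            results.append((futures[future], future.result()))
    return results


# Hypothetical usage: `len` stands in for the real recognition and search.
returned = search_in_parallel(["a", "bb", "ccc"], len)
```

The order of `returned` depends on which search finishes first, so the region index is kept with each result to place it at the corresponding position on the page.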

Compared with the prior art, the invention performs single-question detection with a single-question detection model, which detects single questions more accurately, optimizes the question searching method and improves the detection precision. Because the detected quadrangles are displayed on the user terminal, the user does not need to repeatedly adjust the position of the detection frame, the shooting angle and the like manually. More intelligent detection processing is thereby achieved, the user's question searching operation is simplified, and the user experience is improved.

Fig. 4(a) and 4(b) are schematic diagrams showing another example of frame selection display in the single-question detection process by the photo-taking problem-searching method according to embodiment 1 of the present invention, where fig. 4(a) is a schematic diagram showing a test paper frame selection display for each question in the single-question detection process using an uncorrected detection frame, and fig. 4(b) is a schematic diagram showing a test paper frame selection display for each question in the single-question detection process using a corrected detection frame.

As shown in fig. 4(a), each question in the illustrated test paper is framed with a quadrangular detection frame during detection. However, because the test paper is placed improperly (page bending, etc.), the display is inclined, so the entire content of a question cannot be completely framed during detection.

Specifically, when the placement position of the mathematical exercise (homework) is inclined with respect to the position of the imaging device, the display is inclined. As a result, when the area of each detected single question is visually identified in real time, the framed question content is incomplete or the display frame is inclined with respect to the question.

It should be noted that, in the course of detecting the single topic, the area of the detected single topic may also be a parallelogram or other quadrangles, which is described above only as an example and should not be construed as a limitation to the present invention.

In the above case, it is necessary to perform correction processing on the quadrilateral area to obtain a corrected quadrangle, for example a rectangle or a square, that can frame all contents of the corresponding question (see fig. 4(b)). Whether the detected area is a rectangle, a parallelogram or another quadrangle, the corrected quadrangle can frame all the contents of the question without requiring the user to manually adjust the position, angle, etc. of the frame (i.e., the detection frame).

It should be noted that the correction processing takes very little time, and it may be presented on the display device of the user terminal as the following change: a quadrangle that does not frame all the contents of the corresponding single question is corrected into a quadrangle that frames all the contents of the corresponding single question. The correction processing will be specifically described below.

Fig. 3 is a flowchart showing another example of the photo title searching method according to embodiment 1 of the present invention.

As shown in fig. 3, a step S301 of performing correction processing on a quadrangle for framing a single subject is further included.

In step S301, correction processing is performed on a quadrangle for framing a single subject.

Specifically, a correction process is performed on a quadrangle for framing a single subject using a method of homography transformation. The corrected quadrangle may be rectangular, square, etc.

It should be noted that a homography transformation is a mapping from one plane to another plane. Corresponding points between two images lying in the two planes are determined, and a homography, i.e. the transformation matrix that maps one image onto the other, is calculated from these point correspondences.
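As an illustration of how such a matrix can be obtained from four corresponding points, the following sketch solves the standard 8-unknown linear system with NumPy, fixing the bottom-right entry of the matrix to 1. The function names and the example coordinates are assumptions chosen for illustration; the text does not specify how the homography is actually computed.

```python
import numpy as np

def homography_from_points(src, dst):
    """Solve for the 3x3 matrix H (with H[2, 2] fixed to 1) mapping 4 src points to 4 dst points."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h0*x + h1*y + h2) / (h6*x + h7*y + 1), rearranged into a linear equation
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        # v = (h3*x + h4*y + h5) / (h6*x + h7*y + 1)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = np.linalg.solve(np.array(A, dtype=float), np.array(b, dtype=float))
    return np.append(h, 1.0).reshape(3, 3)

def apply_homography(H, pt):
    """Map one 2D point through H using homogeneous coordinates."""
    x, y, w = H @ np.array([pt[0], pt[1], 1.0])
    return (x / w, y / w)

# Hypothetical tilted detection frame and the upright rectangle it is corrected to.
tilted = [(12, 8), (205, 30), (198, 96), (5, 70)]
upright = [(0, 0), (200, 0), (200, 70), (0, 70)]
H = homography_from_points(tilted, upright)
```

With exactly four correspondences, no three of them collinear, the system has a unique solution; a production system would more likely rely on an established library routine than on this hand-written solver.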

Fig. 4(b) is a schematic diagram showing a frame selection display of the examination paper of fig. 4(a) for each question using the corrected detection frame in the single question detection process.

As shown in fig. 4(b), each detection frame corresponds to the area of one question in the test paper, and each corrected detection frame completely frames the whole content of the corresponding question. In this process, detection, framing, correction and display are automated by the single-question detection model and the correction processing, and the user does not need to manually adjust the position, angle, etc. of the detection frame.

Furthermore, either the corrected quadrangle that frames all the contents of the single question is displayed on the corresponding user terminal, or both the correction process and the corrected quadrangular detection frame are displayed on the corresponding user terminal. A selection interface on which the user can make selections is thereby provided, so that more intelligent detection processing can be achieved, the user's question searching operation can be simplified, the user does not need to manually adjust the position, angle, etc. of the detection frame, and the user experience of the client can be improved.

Example 2

Embodiments of the apparatus of the present invention are described below, which may be used to perform method embodiments of the present invention. The details described in the device embodiments of the invention should be regarded as complementary to the above-described method embodiments; reference is made to the above-described method embodiments for details not disclosed in the apparatus embodiments of the invention.

Referring to fig. 6 to 8, a photographing question searching apparatus 400 according to embodiment 2 of the present invention will be described.

According to the second aspect of the present invention, an embodiment of the present invention further provides a device 400 for searching questions by taking a picture, where the device 400 for searching questions by taking a picture includes: a first receiving module 401, configured to acquire a topic image and display the topic image on a display device of a user terminal; a detection identification module 402, configured to perform question detection on the question image, and visually identify areas of each detected question on a display device of the user terminal in real time; a second receiving module 403, configured to receive a selection operation instruction of a user, so as to photograph, identify, and search an area of a single topic selected by the user; and a result processing module 404, configured to return the search result to the user terminal.

Preferably, the detection identification module 402 includes a topic detection model, and topic detection is performed by the topic detection model, which is a trained image topic identification model for detecting a region of a topic contained in an image.

It should be noted that the topic detection model is a deep network model based on technologies such as CNN, Attention, LSTM, etc. Its input features are multi-frame images, comprising the current frame image and a specific number of frame images preceding it (the previous frame image or the previous few frame images); its output features are the position coordinates of the four vertices of the question-framing quadrangle corresponding to the current frame image.

Preferably, continuous multi-frame question images are acquired in real time and displayed on the display device of the user terminal in real time; question detection is performed on frames taken at a predetermined frame interval from the continuous multi-frame images.
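The interval-based detection and the multi-frame model input described above can be sketched as follows. The generator name and the parameters `interval` and `history` are hypothetical, and integers stand in for frame images; the sketch only shows which frames would be handed to the detection model together, not the model itself.

```python
from collections import deque

def frames_for_detection(frames, interval, history):
    """Yield (index, window) for every `interval`-th frame.

    `window` holds the current frame preceded by up to `history` earlier frames,
    mirroring a model whose input is the current frame plus a few previous frames.
    """
    recent = deque(maxlen=history + 1)  # keeps only the newest history + 1 frames
    for i, frame in enumerate(frames):
        recent.append(frame)
        if i % interval == 0:           # run detection only at the chosen interval
            yield i, list(recent)

# Ten dummy frames; detection on every third frame with two frames of history.
picked = list(frames_for_detection(range(10), interval=3, history=2))
```

Every frame is still displayed in real time; only the detection step is thinned out, which is one plausible way to keep the per-frame cost low on a mobile terminal.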

Preferably, the detected region of the single topic is a quadrilateral region; the real-time visual identification of the detected area of each single question comprises the selection and display of the quadrilateral area.

As shown in fig. 7, this embodiment further provides another photographing question searching apparatus 400, which further includes a correction processing module 501. The correction processing module 501 is configured to perform correction processing on the quadrangle, so that the corrected quadrangle, which frames all contents of the question, is displayed on the user terminal.

Further, the second receiving module 403 is further configured to receive the user's click operation on the quadrangular area of a detected single question, or on the corrected quadrangular area, so as to photograph, identify and search the image area of the selected quadrangular area.

Preferably, the receiving of the user's pointing operation includes a pointing operation on a quadrangular region of the detected single topic or a pointing operation on the quadrangular region after correction.
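One way to resolve a click into a selected quadrangular area is a point-in-convex-polygon test, sketched below. It assumes each quadrangle is convex and given as four vertices in consistent order; the function names and sample coordinates are illustrative and not taken from the invention.

```python
def point_in_quad(pt, quad):
    """Return True if pt lies inside or on a convex quadrangle given as 4 ordered vertices."""
    signs = []
    for i in range(4):
        (x1, y1), (x2, y2) = quad[i], quad[(i + 1) % 4]
        # The sign of the cross product tells on which side of the edge the point lies.
        cross = (x2 - x1) * (pt[1] - y1) - (y2 - y1) * (pt[0] - x1)
        if cross != 0:
            signs.append(cross > 0)
    # Inside means the point is on the same side of all four edges.
    return all(signs) or not any(signs)

def pick_region(tap, quads):
    """Return the index of the first quadrangle containing the tap, or None."""
    for idx, quad in enumerate(quads):
        if point_in_quad(tap, quad):
            return idx
    return None

# Two hypothetical detection frames: an upright one and a slightly tilted one.
quads = [
    [(0, 0), (100, 0), (100, 50), (0, 50)],
    [(0, 60), (100, 70), (95, 120), (5, 110)],
]
```

Here `pick_region((50, 90), quads)` selects the second frame, and a tap outside every frame returns `None`, so the terminal can ignore it.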

Preferably, photographing the area of the selected single question includes: calling the camera device to take a picture, and cropping the single-question area selected by the user out of the obtained image to obtain the image of the single-question area. Optionally, after photographing the selected single-question area, the method further includes saving the obtained image of the single-question area.
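The cropping step can be sketched with a NumPy array standing in for the photographed image. This sketch simply cuts out the axis-aligned bounding box of the selected quadrangle; that is an assumption made for brevity, and the function name and sample values are hypothetical.

```python
import numpy as np

def crop_region(image, quad):
    """Cut the axis-aligned bounding box of `quad` out of an H x W (x C) image array."""
    h, w = image.shape[:2]
    xs = [p[0] for p in quad]
    ys = [p[1] for p in quad]
    # Clamp the box to the image so a frame touching the border does not fail.
    x0, x1 = max(0, int(min(xs))), min(w, int(np.ceil(max(xs))))
    y0, y1 = max(0, int(min(ys))), min(h, int(np.ceil(max(ys))))
    return image[y0:y1, x0:x1].copy()  # copy so the crop can be saved independently

# Dummy 100 x 80 "photo" whose pixel values encode their own position.
photo = np.arange(100 * 80).reshape(100, 80)
crop = crop_region(photo, [(10, 20), (50, 22), (48, 60), (8, 58)])
```

The returned array can then be saved or passed on to recognition and search.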

As shown in fig. 8, the apparatus further includes a determining module 601, where the determining module 601 is configured to determine the question type and the question key information of a single question, and perform matching judgment with the questions in the preset question library by using the judgment rule, so as to return a question searching result to the user terminal, where the question searching result includes whether the question exists, a recommended answer corresponding to the question, and other recommended questions of the same type.

In embodiment 2, the same portions as those in embodiment 1 are not described.

Those skilled in the art will appreciate that the modules in the above-described embodiments of the apparatus may be distributed as described in the apparatus, and may be correspondingly modified and distributed in one or more apparatuses other than the above-described embodiments. The modules of the above embodiments may be combined into one module, or further split into multiple sub-modules.

Compared with the prior art, the invention detects single questions more accurately by using the single-question detection model, optimizes the question searching method and improves detection precision. Because the corrected quadrangle is displayed on the user terminal, the user does not need to manually adjust the position, angle, etc. of the detection frame; more intelligent detection processing can thus be realized, the user's question searching operation can be simplified, and the user experience of the client can be improved.

Example 3

In the following, embodiments of the computer apparatus of the present invention are described, which may be seen as specific physical embodiments for the above-described embodiments of the method and apparatus of the present invention. The details described in the computer device embodiment of the invention should be considered as additions to the method or apparatus embodiment described above; for details which are not disclosed in the embodiments of the computer device of the invention, reference may be made to the above-described embodiments of the method or apparatus.

Fig. 9 is a schematic structural diagram of a computer device according to an embodiment of the present invention, the computer device including a processor and a memory, the memory storing a computer-executable program, the processor executing the method of fig. 1 when the computer program is executed by the processor.

As shown in fig. 9, the computer device is in the form of a general purpose computing device. The processor can be one or more and can work together. The invention also does not exclude that distributed processing is performed, i.e. the processors may be distributed over different physical devices. The computer device of the present invention is not limited to a single entity, and may be a sum of a plurality of entity devices.

The memory stores a computer executable program, typically machine readable code. The computer readable program may be executed by the processor to enable a computer device to perform the method of the invention, or at least some of the steps of the method.

The memory may include volatile memory, such as Random Access Memory (RAM) and/or cache memory, and may also be non-volatile memory, such as read-only memory (ROM).

Optionally, in this embodiment, the computer device further includes an I/O interface, which is used for data exchange between the computer device and an external device. The I/O interface may be a local bus representing one or more of several types of bus structures, including a memory unit bus or memory unit controller, a peripheral bus, an accelerated graphics port, a processing unit, and/or a memory storage device using any of a variety of bus architectures.

It should be understood that the computer device shown in fig. 9 is only one example of the present invention, and elements or components not shown in the above examples may also be included in the computer device of the present invention. For example, some computer devices also include display units such as display screens, and some computer devices also include human-computer interaction elements such as buttons, keyboards, and the like. The computer device can be considered to be covered by the present invention as long as the computer device can execute the computer readable program in the memory to implement the method of the present invention or at least part of the steps of the method.

FIG. 10 is a schematic diagram of a computer program product of an embodiment of the invention. As shown in fig. 10, a computer-executable program is stored in the computer program product, and when the computer-executable program is executed, the method of the present invention is implemented. A computer readable signal medium may include a propagated data signal with readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.

Program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., through the internet using an internet service provider).

From the above description of the embodiments, those skilled in the art will readily appreciate that the present invention can be implemented by hardware capable of executing a specific computer program, such as the system of the present invention, and electronic processing units, servers, clients, mobile phones, control units, processors, etc. included in the system. The invention may also be implemented by computer software for performing the method of the invention, e.g. control software executed by a microprocessor, an electronic control unit, a client, a server, etc. It should be noted that the computer software for executing the method of the present invention is not limited to be executed by one or a specific hardware entity, and can also be realized in a distributed manner by non-specific hardware. For computer software, the software product may be stored in a computer readable storage medium (which may be a CD-ROM, a usb disk, a removable hard disk, etc.) or may be distributed over a network, as long as it enables the computer device to perform the method according to the present invention.

While the foregoing detailed description has described the objects, aspects and advantages of the present invention in further detail, it should be appreciated that the present invention is not inherently related to any particular computer, virtual machine, or computer apparatus, as various general purpose devices may implement the present invention. The invention is not to be considered as limited to the specific embodiments described above, but is to be understood as covering all modifications, changes and equivalents that come within the spirit and scope of the invention.

Claims

1. A method for searching questions by photographing is characterized by comprising the following steps:

acquiring a topic image and displaying the topic image on a display device of a user terminal;

performing single-question detection on the question image, and visually identifying the area of each detected single question on a display device of the user terminal in real time;

receiving a selection operation instruction of a user so as to photograph, identify and search the area of the single topic selected by the user;

and the user terminal receives the returned search result.

2. The method for searching questions by taking a picture according to claim 1, wherein the question detection is performed by a question detection model, which is a trained image question recognition model for detecting a region of a question included in an image.

3. The photographing question searching method according to claim 1 or 2, wherein continuous multi-frame question images are obtained in real time and displayed on a display device of the user terminal in real time;

the topic detection is performed based on each frame image of a predetermined number of interval frames in the continuous multi-frame images.

4. The photo title searching method of claim 2,

the area of the single question is a quadrilateral area comprising the single question;

the real-time visual identification of the detected area of each single question comprises the selection and display of the quadrilateral area.

5. The photo title searching method of claim 4, wherein the displaying of the quadrilateral area by frame selection comprises:

and correcting the quadrangle so that all the contents of the single question can be framed and selected by the corrected quadrangle region displayed to the user terminal.

6. The question photographing and searching method of claim 4 or 5, wherein the receiving of the selection operation instruction of the user comprises:

and receiving the clicking operation of the detected quadrangular area of the single subject by the user or the clicking operation of the corrected quadrangular area by the user so as to photograph, identify and search the image area of the selected quadrangular area.

7. The photo-taking question searching method according to claim 1, wherein the photo-taking a region of the user-selected question includes:

calling the camera device to take a picture, and cutting the area of the single topic selected by the user in the obtained image to obtain the image of one or more single topic areas;

optionally, after the photographing the selected area of the single topic, the method further includes: saving and obtaining the images of the one or more thematic areas;

optionally, the images of one or more subject areas are identified and searched in parallel; further optionally, when the image of the single-question area is recognized, based on the difference between the printed form and the written form, the recognition content corresponding to the written form in the single-question area is removed from the recognition result.

8. The photo-taking problem searching method according to claim 1, wherein the identifying and searching the area of the user-selected problem comprises:

determining the question type and the question key information of each single question, and performing matching judgment on the single question and the questions in a preset question bank by using a judgment rule so as to return a question searching result to the user terminal, wherein the question searching result comprises whether the question exists, a recommended answer corresponding to the question and other recommended questions of the same type.

9. A photographing question searching device, characterized by comprising:

the first receiving module is used for acquiring a topic image and displaying the topic image on a display device of the user terminal;

the detection identification module is used for carrying out single-question detection on the question image and visually identifying the area of each detected single question on a display device of the user terminal in real time;

the second receiving module is used for receiving a selection operation instruction of a user so as to photograph, identify and search the area of the single question selected by the user;

and the result processing module is used for receiving the question search result returned to the user terminal.

10. A computer device comprising a processor and a memory, the memory for storing a computer executable program, characterized in that:

the computer program, when executed by the processor, causes the processor to perform the photographing question searching method of any of claims 1-8.

11. A computer program product storing a computer executable program, wherein the computer executable program when executed implements the photo topic searching method of any one of claims 1-8.