CN111753168A - Method and device for searching questions, electronic equipment and storage medium

Info

Publication number
CN111753168A
Authority
CN
China
Prior art keywords
image
target
determining
point
expression
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010581532.4A
Other languages
Chinese (zh)
Inventor
赵华
史云奇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Genius Technology Co Ltd
Original Assignee
Guangdong Genius Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Genius Technology Co Ltd
Priority to CN202010581532.4A
Publication of CN111753168A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F16/5846 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using extracted text
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/22 Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 Detection; Localisation; Normalisation
    • G06V40/166 Detection; Localisation; Normalisation using acquisition arrangements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 Feature extraction; Face representation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/174 Facial expression recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/18 Eye characteristics, e.g. of the iris
    • G06V40/19 Sensors therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/18 Eye characteristics, e.g. of the iris
    • G06V40/193 Preprocessing; Feature extraction

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • General Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Data Mining & Analysis (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Ophthalmology & Optometry (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Library & Information Science (AREA)
  • Image Analysis (AREA)

Abstract

The embodiments of the invention relate to the technical field of intelligent devices and disclose a method and a device for searching questions, an electronic device and a storage medium. The method comprises: in a click-to-read scene, starting a first image acquisition device to acquire a preview image of a carrier, and starting a second image acquisition device to acquire a facial image of a user; inputting the facial image into a pre-trained expression recognition model to obtain expression information; when the expression information is a preset expression, determining the gaze point of the user's eyes on the carrier according to the facial image; acquiring a target question image based on the gaze point and the preview image; and recognizing the text of the target question image through OCR and using the text to search for answers in a resource library or on the Internet. By incorporating behavior analysis, the embodiments determine the target question from the user's eye movement once the preset expression is detected and then perform the corresponding question search, which makes question search more intelligent and improves the user experience.

Description

Method and device for searching questions, electronic equipment and storage medium
Technical Field
The invention relates to the technical field of intelligent equipment, in particular to a method and a device for searching questions, electronic equipment and a storage medium.
Background
At present, intelligent devices such as learning machines and home tutoring machines provide a question search function. Question search modes fall roughly into two categories. In the first, the user photographs the content to be selected on a carrier (such as a book) with the rear camera of the intelligent device. In the second, the front camera of the intelligent device does most of the work: the camera identifies the position of an operating body (such as a finger) on the carrier, and the content on the carrier is then photographed based on that position. The photographed image is recognized through OCR, and the corresponding answers, pronunciation and the like are searched for in a resource library or on the Internet.
The first mode involves a cumbersome process and depends on the user's photographing skill; if the image is too blurry, subsequent operations cannot be performed. The second mode is simple to use but has the following problem: in practice, the operating body appears in the image and may occlude part of the content, so an operation to remove the occlusion is required, which lengthens the response time and limits the degree of intelligence.
Disclosure of Invention
In view of the above defects, the embodiments of the invention disclose a method and a device for searching questions, an electronic device and a storage medium, which can infer the user's intention through behavior analysis and thereby perform the question search.
The first aspect of the embodiments of the present invention discloses a method for searching questions, where the method includes:
under a click-to-read scene, starting a first image acquisition device to acquire a preview image of a bearing body, and starting a second image acquisition device to acquire a face image of a user;
inputting the facial image into a pre-trained expression recognition model to obtain expression information;
when the expression information is a preset expression, determining a fixation point of eyeballs of the user on a bearing body according to the facial image;
acquiring a target question image based on the fixation point and the preview image;
and recognizing character information of the target topic image through OCR, and searching answers in a resource library or the Internet by utilizing the character information.
As an optional implementation manner, in the first aspect of the embodiment of the present invention, when the expression information is a preset expression, determining a gaze point of an eyeball of a user on a carrier according to the facial image includes:
when the expression information is a preset expression and the duration time reaches preset time, determining the position of the pupil center in the face image and the offset of the position of the pupil center relative to a reference point;
and determining the sight line direction of the eyeball and the fixation point on the bearing body based on the offset.
As an alternative implementation, in the first aspect of the embodiment of the present invention, the determining the position of the pupil center in the face image and the offset of the position of the pupil center with respect to the reference point includes:
inputting the facial image into a convolutional neural network trained in advance to determine the characteristic points of the pupil;
determining the position of the pupil center by using the characteristic points of the pupil;
and determining the offset of the pupil center according to the pupil center position and the reference point.
As an optional implementation manner, in the first aspect of the embodiment of the present invention, determining the offset of the pupil center position according to the pupil center position and the reference point includes:
constructing the appearance of the eye according to the characteristic points of the pupil, and taking the position of the pupil center when looking straight as a reference point;
and calculating the offset of the pupil center position relative to the reference point according to the pupil center position and the reference point position.
As an optional implementation manner, in the first aspect of the embodiment of the present invention, acquiring a target topic image based on the gaze point and the preview image includes:
converting the fixation point coordinates into the preview image through affine transformation to obtain target point coordinates in the preview image corresponding to the fixation point coordinates, wherein the face image corresponding to the fixation point coordinates is matched with the preview image corresponding to the target point coordinates;
inputting the preview image into a text detection model to obtain a text outline of each question;
determining a target text outline corresponding to the target point coordinates;
and segmenting the corresponding preview image in the target text outline to obtain a target topic image.
A second aspect of the embodiments of the present invention discloses a device for searching questions, including:
the preview unit is used for starting the first image acquisition device to acquire a preview image of the bearing body and starting the second image acquisition device to acquire a face image of the user in a click-to-read scene;
the recognition unit is used for inputting the facial image into a pre-trained expression recognition model to obtain expression information;
the watching unit is used for determining a watching point of eyeballs of the user on the bearing body according to the facial image when the expression information is a preset expression;
the acquisition unit is used for acquiring a target question image based on the gaze point and the preview image;
and the searching unit is used for identifying the character information of the target topic image through OCR and searching answers in a resource library or the Internet by utilizing the character information.
As an alternative implementation, in a second aspect of the embodiments of the present invention, the gaze unit includes:
the first determining subunit is configured to determine, when the expression information is a preset expression and the duration reaches a preset time, a position of a pupil center in the face image and an offset of the position of the pupil center with respect to a reference point;
a second determining subunit, configured to determine, based on the offset amount, a gaze direction of the eyeball and a gaze point on the carrier.
As an optional implementation manner, in a second aspect of the embodiment of the present invention, the first determining subunit includes:
the first grandchild unit is used for inputting the face image into a convolutional neural network trained in advance to determine the characteristic point of the pupil;
the second grandchild unit is used for determining the position of the center of the pupil by utilizing the characteristic point of the pupil;
the third grandchild unit is used for constructing the appearance of the eye according to the characteristic points of the pupil, and taking the position of the pupil center when looking straight as a reference point;
and a fourth grandchild unit, configured to calculate an offset of the pupil center position with respect to the reference point according to the position of the pupil center and the position of the reference point.
As an optional implementation manner, in a second aspect of the embodiment of the present invention, the obtaining unit includes:
the transformation subunit is used for transforming the fixation point coordinates into the preview image through affine transformation to obtain target point coordinates in the preview image corresponding to the fixation point coordinates, and the face image corresponding to the fixation point coordinates is matched with the preview image corresponding to the target point coordinates;
the outline identification subunit is used for inputting the preview image into a text detection model to obtain a text outline of each question;
the third determining subunit is used for determining a target text contour corresponding to the target point coordinates;
and the segmentation subunit is used for segmenting the preview image corresponding to the target text outline to obtain a target topic image.
A third aspect of an embodiment of the present invention discloses an electronic device, including: a memory storing executable program code; a processor coupled with the memory; the processor calls the executable program code stored in the memory to execute part or all of the steps of the method for searching the topic disclosed by the first aspect of the embodiment of the invention.
A fourth aspect of the present invention discloses a computer-readable storage medium storing a computer program, where the computer program enables a computer to execute part or all of the steps of the method for searching for a topic disclosed in the first aspect of the present invention.
A fifth aspect of the embodiments of the present invention discloses a computer program product, which, when running on a computer, causes the computer to execute part or all of the steps of the method for searching for a topic disclosed in the first aspect of the embodiments of the present invention.
A sixth aspect of the present invention discloses an application publishing platform, where the application publishing platform is configured to publish a computer program product, and when the computer program product runs on a computer, the computer is enabled to execute some or all of the steps of the method for searching for a topic disclosed in the first aspect of the present invention.
Compared with the prior art, the embodiment of the invention has the following beneficial effects:
In the embodiments of the invention, in a click-to-read scene, a first image acquisition device is started to acquire a preview image of a carrier, and a second image acquisition device is started to acquire a facial image of a user; the facial image is input into a pre-trained expression recognition model to obtain expression information; when the expression information is a preset expression, the gaze point of the user's eyes on the carrier is determined according to the facial image; a target question image is acquired based on the gaze point and the preview image; and the text of the target question image is recognized through OCR and used to search for answers in a resource library or on the Internet. The embodiments can therefore incorporate behavior analysis: once the preset expression is detected, the target question is determined from the user's eye movement and the corresponding question search is performed, which makes question search more intelligent and improves the user experience.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed to be used in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art that other drawings can be obtained according to these drawings without creative efforts.
FIG. 1 is a schematic flow chart of a method for searching questions according to an embodiment of the present invention;
FIG. 2 is a schematic flow chart of a method for acquiring a gaze point according to an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of a device for searching questions according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "first", "second", "third", "fourth", and the like in the description and the claims of the present invention are used for distinguishing different objects, and are not used for describing a specific order. The terms "comprises," "comprising," and any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
The embodiments of the invention disclose a method and a device for searching questions, an electronic device and a storage medium. By incorporating behavior analysis, they determine the target question from the user's eye movement once a preset expression is detected and perform the corresponding question search, which makes question search more intelligent and improves the user experience. The embodiments are described in detail below with reference to the drawings.
Example one
110. In a click-to-read scene, starting a first image acquisition device to acquire a preview image of the carrier, and starting a second image acquisition device to acquire a facial image of the user.
The electronic device may be an intelligent device such as a home tutoring machine, a learning machine, or a mobile phone or tablet computer with a learning function. In the click-to-read scene, a corresponding point-reading APP is started, such as a question searching APP or a question receiving APP. In this scene, the image acquisition device may be started automatically to obtain the preview image, or the user may trigger the image acquisition device to obtain the preview image.
The first image acquisition device and the second image acquisition device may be photographing devices integrated in the electronic device, such as a front camera that photographs a carrier placed in front of the electronic device; they may also be discrete components that photograph a carrier placed near the electronic device. Both devices work in preview mode. The carrier, a paper learning document such as a book or an exercise book, is placed in alignment with the first image acquisition device so that a preview image of the carrier can be obtained. The second image acquisition device is aimed at the user's face to acquire the facial image of the user.
120. Inputting the facial image into a pre-trained expression recognition model to obtain expression information.
The expression recognition model may be an existing mature model. However, existing mature models are mostly built for frontal faces, whereas the user's face tilts slightly downward while working on questions, so the acquired facial image is not a standard frontal image and certain recognition errors may occur.
In the embodiment of the invention, an expression recognition model is trained with a large number of facial samples of users working on questions, with expressions as the training labels. Expression information can be grouped into 5 types: pleasure, calmness, aversion, pain and confusion. Other categories may of course be used under other classification methods. The expression recognition model is trained iteratively on the facial samples and their labeling information, and the parameters of the model are determined through a back propagation algorithm. The expression recognition model may be a neural network model such as a convolutional neural network or a recurrent neural network.
The facial image is input into the trained expression recognition model to obtain the expression information corresponding to the facial image. The facial image is a preview image, and the frame rate (FPS) of the image acquisition device, i.e. the number of preview frames obtained per second, is far higher than the rate at which the expression recognition model can produce expression information, so the recognition can be carried out by dropping frames.
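By way of a non-limiting illustration, the frame-dropping approach may be sketched as follows. The names `frame_stride` and `select_frames`, and the parameters `camera_fps` and `model_latency_s`, are hypothetical and do not appear in the embodiment; the sketch simply assumes that only every k-th preview frame is passed to the expression recognition model, where k is chosen so that the model can keep up with the camera.

```python
import math


def frame_stride(camera_fps: float, model_latency_s: float) -> int:
    """Number of preview frames to advance per recognition call (at least 1).

    Assumption: the model needs `model_latency_s` seconds per facial image,
    so roughly camera_fps * model_latency_s frames arrive during one call.
    """
    return max(1, math.ceil(camera_fps * model_latency_s))


def select_frames(frames, camera_fps: float, model_latency_s: float):
    """Keep only the frames the model can keep up with and drop the rest."""
    stride = frame_stride(camera_fps, model_latency_s)
    return frames[::stride]


if __name__ == "__main__":
    # 30 FPS preview, 0.125 s per recognition: every 4th frame is recognized.
    print(select_frames(list(range(10)), 30, 0.125))
```

The stride is rounded up so that the recognition queue does not grow over time; a different policy (for example, always taking the most recent frame) would serve the same purpose.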
130. When the expression information is a preset expression, determining the gaze point of the user's eyes on the carrier according to the facial image.
The preset expression is confusion, which typically shows as frowning, pursing the lips, biting the pen tip and the like. When the recognized expression information is the confused expression, it is determined whether the expression is only momentary, that is, whether the duration of the preset expression reaches the preset time. If the expression is only momentary, subsequent operations are abandoned.
The preset time is set as required. In the embodiment of the invention, the preset time may be determined from the approximate recognition time of the expression recognition model: if the expression information recognized by the model is the preset expression for a preset number of consecutive recognitions, the duration of the preset expression is considered to have reached the preset time.
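As an illustrative sketch only, the duration check based on consecutive recognitions may be written as below. The function name `confused_long_enough`, the label string `"confusion"` and the default count of 3 are assumptions made for the example, not values given in the embodiment.

```python
def confused_long_enough(labels, preset="confusion", required_count=3):
    """Return True once `preset` is recognized `required_count` times in a row.

    `labels` is the sequence of expression labels produced by successive
    calls to the expression recognition model (hypothetical label names).
    """
    streak = 0
    for label in labels:
        # A different expression breaks the streak, so a momentary
        # confused expression never reaches the required count.
        streak = streak + 1 if label == preset else 0
        if streak >= required_count:
            return True
    return False


if __name__ == "__main__":
    print(confused_long_enough(["calmness", "confusion", "confusion", "confusion"]))
    print(confused_long_enough(["confusion", "calmness", "confusion", "confusion"]))
```

Counting consecutive recognitions avoids the need for a separate timer, since each recognition takes roughly the same time.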
Since no operating body such as a finger is present on the carrier, the first image acquisition device can obtain an image of the carrier but cannot determine the specific area of the question that confuses the user, and therefore cannot perform the corresponding question search. In the embodiment of the present invention, the gaze point of the user on the carrier is determined from the facial image obtained by the second image acquisition device, and the world coordinates of the gaze point are then mapped into the preview image in step 140 to obtain the corresponding position in the preview image.
The user's gaze point may also be determined by trained models. For example, a distance recognition model is first trained to determine the distance between the face and the carrier; this model may be trained using the inclination angle of the facial image and distance labels. A gaze point recognition model is then trained based on the facial image and the distance, i.e. according to the facial image, the distance and gaze point coordinate labels. After a facial image is obtained, it can be passed through the distance recognition model and the gaze point recognition model in turn to determine the gaze point coordinates.
Exemplarily, the method for determining the user's gaze point may also be implemented by a shift of the pupil position, which may include the following steps, as shown in fig. 2:
131. Determining the position of the pupil center in the facial image.
There are various ways to determine the pupil center position from the face image. Illustratively, the features of the face image are extracted by a machine learning method such as a convolutional neural network, so as to obtain the position of the pupil center in the face image. The pupil center position can also be determined by training a cascade classifier for human eye recognition based on an Adaboost algorithm and tracking the human eye feature points by combining an ASM algorithm.
132. Determining an offset of the location of the pupil center relative to a reference point.
The reference point here is the position of the pupil center when the user looks straight ahead in the horizontal direction. When the pupil center deviates from the reference point, the line of sight shifts in another direction, and the line-of-sight direction at the shifted position can be determined from the straight-ahead direction and the shifted position.
Illustratively, the eye appearance can be constructed by the feature points of the pupil, the position of the pupil center when looking straight is determined and is recorded as the reference point, the position of the pupil center obtained in step 131 is recorded as the second position, and the variation of the second position relative to the reference point is the offset.
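A minimal sketch of the offset calculation is given below, for illustration only. The embodiment does not specify how the pupil center is derived from the feature points; the sketch assumes, as one simple possibility, that the center is the centroid of the pupil feature points. The function names and the sample coordinates are hypothetical.

```python
def pupil_center(points):
    """Estimate the pupil center as the centroid of the pupil feature points.

    Assumption: `points` is a non-empty list of (x, y) pixel coordinates
    produced by a feature point detector such as a convolutional network.
    """
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def pupil_offset(center, reference):
    """Offset of the current pupil center (second position) from the
    reference point, i.e. the pupil center when looking straight ahead."""
    return (center[0] - reference[0], center[1] - reference[1])


if __name__ == "__main__":
    feature_points = [(10, 20), (14, 20), (12, 18), (12, 22)]
    center = pupil_center(feature_points)
    print(center, pupil_offset(center, (10.0, 20.0)))
```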
133. Determining the line-of-sight direction of the eyeball and the gaze point on the carrier based on the offset.
For example, the mapping relationship between the offset and the gaze direction and the gaze point may be directly established according to the measurement data, the gaze direction may be determined based on the offset, and the gaze point may be determined based on the gaze direction.
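One possible way to build such a mapping from measurement data is sketched below, purely as an illustration. It assumes, which the embodiment does not state, that each gaze point coordinate depends linearly on the corresponding offset component, and fits that line by ordinary least squares. The calibration values and function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def make_gaze_mapper(offsets, gaze_points):
    """Build offset -> gaze point mapping from measured calibration pairs.

    Assumption: the horizontal and vertical axes are fitted independently.
    """
    kx, bx = fit_line([o[0] for o in offsets], [g[0] for g in gaze_points])
    ky, by = fit_line([o[1] for o in offsets], [g[1] for g in gaze_points])
    return lambda offset: (kx * offset[0] + bx, ky * offset[1] + by)


if __name__ == "__main__":
    # Hypothetical calibration: pupil offsets (pixels) and gaze points (mm).
    offsets = [(-2, -2), (0, 0), (2, 2)]
    gaze_points = [(0, 0), (100, 150), (200, 300)]
    mapper = make_gaze_mapper(offsets, gaze_points)
    print(mapper((1, 1)))
```

A real calibration would likely need a richer model (for example one that also takes the face orientation into account), but the structure of measuring pairs and fitting a mapping stays the same.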
For example, training data may be established in advance, training may be performed by using a training sample corresponding to a classifier, and a mapping relationship between the offset and the gaze direction and the gaze point may be established. After that, the classifier can be used for directly classifying to obtain the sight line direction and the fixation point.
The gaze point may also be determined, for example, by means of auxiliary light sources. For example, a plurality of auxiliary light sources such as near-infrared light sources illuminate the human eye, so that Purkinje spots of the auxiliary light sources appear in the facial image. Using the Purkinje spots and the pupil center position, a mapping relation among the facial image, the eye and the carrier is established based on cross-ratio invariance, and the gaze point of the eyeball on the carrier is thereby obtained.
After the gazing point is obtained, the position of the gazing point can be corrected according to the face orientation of the face image, and the correction can be realized by adding the feature of the face orientation in the mapping relation, so that more accurate gazing point information can be obtained.
140. Acquiring a target question image based on the gaze point and the preview image.
After the obtained fixation point coordinates are obtained, the fixation point coordinates cannot be directly combined with the preview image to obtain a fixation area of the target, namely the target topic image. It should be noted that: the preview image corresponding to the coordinates of the target point is matched with the face image corresponding to the gaze point, that is, the adopted preview image and the adopted face image are respectively acquired by the first image acquisition device and the second image acquisition device at the same time point, and if a page turning operation occurs, the first image acquisition device still is the preview image before page turning, certain influence is caused on user experience.
For example, the target point coordinates in the preview image that correspond to the gaze point may be determined by affine transformation. When the first image acquisition device is installed on the electronic device, its position is fixed, so the transformation matrix of the affine transformation can be determined by position marking: a plurality of first coordinate points are marked on the carrier, the corresponding second coordinate points are obtained in the preview image, and the transformation matrix is determined from the first and second coordinate points by the least squares method or an SVM algorithm.
Through the transformation matrix, the coordinates of the gaze point in the preview image can be obtained; these are defined as the target point coordinates. The target point coordinates play the same role as coordinates obtained by recognizing an operating body, and the target topic image can be obtained based on the target point coordinates and the preview image.
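The least-squares variant of this calibration can be sketched as follows. This is a minimal pure-Python sketch, assuming 2D point pairs and a 2x3 affine matrix; the helper names `solve3`, `fit_affine` and `apply_affine` and the marker coordinates are illustrative and do not come from the source.

```python
def solve3(a, b):
    """Solve the 3x3 system a @ x = b by Gauss-Jordan elimination."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        # Partial pivoting: pick the row with the largest entry in this column.
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("marker points are collinear or too few")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][3] / m[i][i] for i in range(3)]


def fit_affine(first_points, second_points):
    """Least-squares 2x3 affine matrix mapping carrier points to image points."""
    rows = [(x, y, 1.0) for x, y in first_points]
    # Normal equations (M^T M) p = M^T t, solved once per output coordinate.
    mtm = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    matrix = []
    for k in range(2):
        mtt = [sum(r[i] * p[k] for r, p in zip(rows, second_points))
               for i in range(3)]
        matrix.append(solve3(mtm, mtt))
    return matrix


def apply_affine(matrix, point):
    """Map a gaze point on the carrier to target point coordinates."""
    x, y = point
    return tuple(a * x + b * y + c for a, b, c in matrix)


# Hypothetical markers: carrier coordinates and their preview-image positions.
carrier_marks = [(0, 0), (10, 0), (0, 10), (10, 10)]
preview_marks = [(100, 50), (300, 50), (100, 250), (300, 250)]
matrix = fit_affine(carrier_marks, preview_marks)
target_point = apply_affine(matrix, (5, 5))
```

Because the camera position is fixed, the matrix only needs to be fitted once and can then be reused for every gaze point.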
Illustratively, text detection is performed on the preview image; for example, the text contour corresponding to each topic in the preview image is identified by a pre-trained Mask R-CNN model or PSENet model. To make the text contours more accurate, they may be filtered using the topic numbers: when a text contour has no IoU (intersection over union) overlap with any topic number contour, that text contour is deleted.
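The topic-number filter can be sketched as below. The sketch assumes, for simplicity, that contours are approximated by axis-aligned boxes `(x1, y1, x2, y2)`; the function names and the sample boxes are hypothetical, and real detector output may be polygons or masks.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def filter_by_topic_numbers(text_boxes, number_boxes):
    """Keep only text boxes that overlap at least one topic number box."""
    return [t for t in text_boxes
            if any(iou(t, n) > 0 for n in number_boxes)]


# Hypothetical detections: three text boxes, two topic number boxes.
text_boxes = [(0, 0, 100, 40), (0, 50, 100, 90), (0, 100, 100, 140)]
number_boxes = [(0, 0, 10, 40), (0, 100, 10, 140)]
kept = filter_by_topic_numbers(text_boxes, number_boxes)
```

Here the middle box is dropped because it overlaps no topic number, which is the deletion condition stated above.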
A target text contour is then determined from the target point coordinates and the text contours. When the target point coordinates fall within a text contour, that text contour is taken as the target text contour. Of course, since the text contours are produced by a trained model, the target point coordinates may fall within no text contour, or within two or more text contours. When the target point coordinates do not fall within any text contour, the text contour closest to the target point coordinates is computed and defined as the target text contour; when the target point falls within two or more text contours, the text contour with the highest confidence is selected as the target text contour, where the confidence of a text contour is output by the corresponding text detection model during contour identification.
Once the target text contour is determined, the subsequent operations are essentially the same as in the prior art: the preview image content within the target text contour is the target topic content, so the preview image is segmented along the target text contour to obtain the target topic image, which is then recognized.
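The three cases above (one hit, no hit, several hits) can be sketched as follows. The `TextBox` type, its box approximation of a contour and the sample values are assumptions made for illustration only.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class TextBox:
    """Hypothetical box approximation of a text contour with its confidence."""
    x1: float
    y1: float
    x2: float
    y2: float
    confidence: float

    def contains(self, x, y):
        return self.x1 <= x <= self.x2 and self.y1 <= y <= self.y2

    def distance_to(self, x, y):
        # Distance from the point to the box edge; 0 if the point is inside.
        dx = max(self.x1 - x, 0.0, x - self.x2)
        dy = max(self.y1 - y, 0.0, y - self.y2)
        return hypot(dx, dy)


def select_target(boxes, target_point):
    """Pick the target text box for the given target point coordinates."""
    if not boxes:
        return None
    x, y = target_point
    hits = [b for b in boxes if b.contains(x, y)]
    if hits:
        # One or more hits: the highest detector confidence wins.
        return max(hits, key=lambda b: b.confidence)
    # No hit: fall back to the nearest box.
    return min(boxes, key=lambda b: b.distance_to(x, y))


box_a = TextBox(0, 0, 100, 40, 0.90)
box_b = TextBox(0, 30, 100, 80, 0.95)
box_c = TextBox(0, 200, 100, 240, 0.80)
boxes = [box_a, box_b, box_c]
```

For example, `select_target(boxes, (50, 35))` lies in both `box_a` and `box_b` and resolves to the more confident `box_b`, while `(50, 190)` lies in no box and resolves to the nearest one, `box_c`.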
150. Recognize the character information of the target topic image through OCR, and search for answers in a resource library or on the Internet using the character information.
The character information in the target topic image can be recognized by a mature OCR model, and the answer is then searched for in the resource library or on the Internet according to the recognized character information. The resource library is a pre-created library of topics and corresponding answers. Preferably, the resource library is searched first; when the topic is not in the resource library, or when repeated searches of the resource library have not returned the answer the user needs, the search is performed on the Internet.
The embodiment of the invention can be combined with behavior analysis: when the preset expression is detected, the target topic is determined from the movement of the eyeballs and the corresponding topic search operation is executed, thereby achieving intelligent topic searching and improving user experience.
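The preferred search order can be sketched as a simple fallback. Everything here is hypothetical: the library is modelled as a dictionary, `web_search` stands in for an Internet search call, and the attempt limit of 2 is an arbitrary illustrative value for "searched many times".

```python
def search_answer(text, library, web_search,
                  library_attempts=0, max_library_attempts=2):
    """Search the resource library first, then fall back to the Internet.

    library_attempts counts earlier library searches for this topic whose
    answers the user rejected (an assumed way to model repeated searches).
    """
    if library_attempts < max_library_attempts:
        answer = library.get(text)
        if answer is not None:
            return "library", answer
    return "internet", web_search(text)


# Hypothetical resource library and Internet search stub.
library = {"1 + 1 = ?": "2"}


def web_search(text):
    return "web result for: " + text


first = search_answer("1 + 1 = ?", library, web_search)
missing = search_answer("2 + 3 = ?", library, web_search)
retried = search_answer("1 + 1 = ?", library, web_search, library_attempts=2)
```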
Example two
Referring to fig. 3, fig. 3 is a schematic structural diagram of a topic searching device according to an embodiment of the present invention. As shown in fig. 3, the topic searching device may include:
a preview unit 210, configured to start a first image acquisition device to acquire a preview image of the carrier and start a second image acquisition device to acquire a face image of the user in a click-to-read scene;
a recognition unit 220, configured to input the face image into a pre-trained expression recognition model to obtain expression information;
a gaze unit 230, configured to determine a gaze point of an eyeball of the user on the carrier according to the face image when the expression information is a preset expression;
an obtaining unit 240, configured to obtain a target topic image based on the gaze point and the preview image;
and a searching unit 250, configured to recognize character information of the target topic image through OCR, and search for answers in a resource library or on the Internet using the character information.
As an alternative embodiment, the gaze unit 230 may include:
a first determining subunit 231, configured to determine, when the expression information is a preset expression and the duration reaches a preset time, a position of a pupil center in the face image and an offset of the position of the pupil center with respect to a reference point;
a second determining subunit 232, configured to determine, based on the offset amount, a gaze direction of the eyeball and a gaze point on the carrier.
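The condition that the preset expression must last for a preset time can be sketched as a small state holder. The class name, the `"confused"` label and the 2-second hold time are hypothetical; the source does not state which expression or duration is used.

```python
class ExpressionTrigger:
    """Fires once the preset expression has been held for long enough."""

    def __init__(self, preset_expression="confused", hold_seconds=2.0):
        self.preset_expression = preset_expression
        self.hold_seconds = hold_seconds
        self._since = None  # time at which the current run started

    def update(self, expression, timestamp):
        """Feed one recognition result; return True when gaze should run."""
        if expression != self.preset_expression:
            self._since = None  # any other expression resets the timer
            return False
        if self._since is None:
            self._since = timestamp
        return timestamp - self._since >= self.hold_seconds


trigger = ExpressionTrigger()
results = [
    trigger.update("neutral", 0.0),
    trigger.update("confused", 1.0),
    trigger.update("confused", 2.5),
    trigger.update("confused", 3.0),
    trigger.update("neutral", 3.1),
]
```

Only the fourth call returns `True`, because the expression has then been held from 1.0 s to 3.0 s without interruption.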
As an optional implementation manner, the first determining subunit 231 may include:
a first grandchild unit 2311, configured to input the face image into a pre-trained convolutional neural network to determine feature points of the pupil;
a second grandchild unit 2312, configured to determine the position of the pupil center by using the feature points of the pupil;
a third grandchild unit 2313, configured to construct the eye appearance according to the feature points of the pupil, and use the position of the pupil center when looking straight ahead as a reference point;
a fourth grandchild unit 2314, configured to calculate an offset of the pupil center position with respect to the reference point according to the position of the pupil center and the position of the reference point.
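The offset calculation performed by these units can be sketched as follows. Taking the pupil center as the centroid of the feature points is an assumption made for illustration; the source does not say how the center is derived from the feature points, and the coordinates are made up.

```python
def pupil_center(feature_points):
    """Estimate the pupil center as the centroid of the feature points."""
    n = len(feature_points)
    if n == 0:
        raise ValueError("no pupil feature points")
    return (sum(x for x, _ in feature_points) / n,
            sum(y for _, y in feature_points) / n)


def pupil_offset(feature_points, reference_point):
    """Offset of the pupil center relative to the straight-ahead reference."""
    cx, cy = pupil_center(feature_points)
    return (cx - reference_point[0], cy - reference_point[1])


# Hypothetical pupil feature points (pixels) and straight-ahead reference.
points = [(10.0, 10.0), (14.0, 10.0), (14.0, 14.0), (10.0, 14.0)]
offset = pupil_offset(points, (11.0, 12.0))
```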
As an optional implementation manner, the obtaining unit 240 may include:
a transformation subunit 241, configured to transform the gaze point coordinates into the preview image through affine transformation, to obtain target point coordinates in the preview image corresponding to the gaze point coordinates, where the face image corresponding to the gaze point coordinates matches the preview image corresponding to the target point coordinates;
a contour identification subunit 242, configured to input the preview image into a text detection model, so as to obtain a text contour of each question;
a third determining subunit 243, configured to determine a target text contour corresponding to the target point coordinates;
and a segmentation subunit 244, configured to segment the preview image corresponding to the target text outline to obtain a target topic image.
The topic searching device shown in fig. 3 can be combined with behavior analysis: when a preset expression is detected, the target topic is determined from the movement of the eyeballs and the corresponding topic search operation is executed, thereby achieving intelligent topic searching and improving user experience.
Example three
Referring to fig. 4, fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the disclosure. As shown in fig. 4, the electronic device may include:
a memory 310 storing executable program code;
a processor 320 coupled to the memory 310;
the processor 320 calls the executable program code stored in the memory 310 to execute part or all of the steps of the method for searching for a topic in the first embodiment or the second embodiment.
The embodiment of the invention also discloses a computer-readable storage medium which stores a computer program, wherein the computer program enables a computer to execute part or all of the steps of the topic searching method in any one of the first embodiment and the second embodiment.
The embodiment of the invention also discloses a computer program product, wherein when the computer program product runs on a computer, the computer is enabled to execute part or all of the steps of the topic searching method in any one of the first embodiment and the second embodiment.
The embodiment of the invention also discloses an application publishing platform, wherein the application publishing platform is used for publishing the computer program product, and when the computer program product runs on a computer, the computer is enabled to execute part or all of the steps of the topic searching method in any one of the first embodiment and the second embodiment.
In the various embodiments of the present invention, it should be understood that the sequence numbers of the processes do not imply a necessary order of execution; the execution order of each process should be determined by its function and inherent logic, and the sequence numbers should not constitute any limitation on the implementation of the embodiments of the present invention.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated units, if implemented as software functional units and sold or used as a stand-alone product, may be stored in a computer-accessible memory. Based on such understanding, the technical solution of the present invention, in essence or in the part contributing to the prior art, or all or part of the technical solution, can be embodied in the form of a software product, which is stored in a memory and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like, and may specifically be a processor in the computer device) to execute part or all of the steps of the method according to the embodiments of the present invention.
In the embodiments provided herein, it should be understood that "B corresponding to A" means that B is associated with A and that B can be determined from A. It should also be understood, however, that determining B from A does not mean determining B from A alone; B may also be determined from A and/or other information.
Those skilled in the art will appreciate that all or some of the steps of the methods of the embodiments may be implemented by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium, such as Read-Only Memory (ROM), Random Access Memory (RAM), Programmable Read-Only Memory (PROM), Erasable Programmable Read-Only Memory (EPROM), One-time Programmable Read-Only Memory (OTPROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Compact Disc Read-Only Memory (CD-ROM) or other optical disc memory, magnetic disk memory, tape memory, or any other computer-readable medium that can be used to carry or store data.
The method, the apparatus, the electronic device and the storage medium for searching for a topic disclosed in the embodiments of the present invention are described in detail above, and a specific example is applied in the present document to explain the principle and the implementation of the present invention, and the description of the above embodiments is only used to help understanding the method and the core idea of the present invention; meanwhile, for a person skilled in the art, according to the idea of the present invention, there may be variations in the specific embodiments and the application scope, and in summary, the content of the present specification should not be construed as a limitation to the present invention.

Claims (11)

1. A method for searching for a topic, comprising:
in a click-to-read scene, starting a first image acquisition device to acquire a preview image of a carrier, and starting a second image acquisition device to acquire a face image of a user;
inputting the face image into a pre-trained expression recognition model to obtain expression information;
when the expression information is a preset expression, determining a gaze point of an eyeball of the user on the carrier according to the face image;
acquiring a target topic image based on the gaze point and the preview image;
and recognizing character information of the target topic image through OCR, and searching for answers in a resource library or on the Internet using the character information.
2. The method of claim 1, wherein when the expression information is a preset expression, determining a gaze point of an eyeball of the user on the carrier according to the face image comprises:
when the expression information is a preset expression and its duration reaches a preset time, determining the position of the pupil center in the face image and the offset of the position of the pupil center relative to a reference point;
and determining the gaze direction of the eyeball and the gaze point on the carrier based on the offset.
3. The method of claim 2, wherein determining the location of the pupil center in the face image and the offset of the location of the pupil center from a reference point comprises:
inputting the face image into a pre-trained convolutional neural network to determine feature points of the pupil;
determining the position of the pupil center by using the feature points of the pupil;
and determining the offset of the pupil center position according to the pupil center position and the reference point.
4. The method of claim 3, wherein determining an offset of the pupil center location from the pupil center location and a reference point comprises:
constructing the appearance of the eye according to the feature points of the pupil, and taking the position of the pupil center when looking straight ahead as the reference point;
and calculating the offset of the pupil center position relative to the reference point according to the pupil center position and the reference point position.
5. The method of any of claims 1-4, wherein acquiring a target topic image based on the gaze point and the preview image comprises:
transforming the gaze point coordinates into the preview image through affine transformation to obtain target point coordinates in the preview image corresponding to the gaze point coordinates, wherein the face image corresponding to the gaze point coordinates matches the preview image corresponding to the target point coordinates;
inputting the preview image into a text detection model to obtain a text contour of each topic;
determining a target text contour corresponding to the target point coordinates;
and segmenting the preview image corresponding to the target text contour to obtain the target topic image.
6. An apparatus for searching for a topic, the apparatus comprising:
a preview unit, configured to start a first image acquisition device to acquire a preview image of a carrier and start a second image acquisition device to acquire a face image of a user in a click-to-read scene;
a recognition unit, configured to input the face image into a pre-trained expression recognition model to obtain expression information;
a gaze unit, configured to determine a gaze point of an eyeball of the user on the carrier according to the face image when the expression information is a preset expression;
an obtaining unit, configured to obtain a target topic image based on the gaze point and the preview image;
and a searching unit, configured to recognize character information of the target topic image through OCR and search for answers in a resource library or on the Internet using the character information.
7. The apparatus of claim 6, wherein the gaze unit comprises:
the first determining subunit is configured to determine, when the expression information is a preset expression and the duration reaches a preset time, a position of a pupil center in the face image and an offset of the position of the pupil center with respect to a reference point;
a second determining subunit, configured to determine, based on the offset amount, a gaze direction of the eyeball and a gaze point on the carrier.
8. The apparatus of claim 7, wherein the first determining subunit comprises:
a first grandchild unit, configured to input the face image into a pre-trained convolutional neural network to determine feature points of the pupil;
a second grandchild unit, configured to determine the position of the pupil center by using the feature points of the pupil;
a third grandchild unit, configured to construct the appearance of the eye according to the feature points of the pupil, and take the position of the pupil center when looking straight ahead as a reference point;
and a fourth grandchild unit, configured to calculate an offset of the pupil center position with respect to the reference point according to the position of the pupil center and the position of the reference point.
9. The apparatus according to any one of claims 6-8, wherein the obtaining unit comprises:
a transformation subunit, configured to transform the gaze point coordinates into the preview image through affine transformation to obtain target point coordinates in the preview image corresponding to the gaze point coordinates, wherein the face image corresponding to the gaze point coordinates matches the preview image corresponding to the target point coordinates;
a contour identification subunit, configured to input the preview image into a text detection model to obtain a text contour of each topic;
a third determining subunit, configured to determine a target text contour corresponding to the target point coordinates;
and a segmentation subunit, configured to segment the preview image corresponding to the target text contour to obtain the target topic image.
10. An electronic device, comprising: a memory storing executable program code; a processor coupled with the memory; said processor calling said executable program code stored in said memory for executing a method of searching for a topic as claimed in any one of claims 1 to 5.
11. A computer-readable storage medium storing a computer program, wherein the computer program causes a computer to perform a method of searching for a topic according to any one of claims 1 to 5.
CN202010581532.4A 2020-06-23 2020-06-23 Method and device for searching questions, electronic equipment and storage medium Pending CN111753168A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010581532.4A CN111753168A (en) 2020-06-23 2020-06-23 Method and device for searching questions, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN111753168A (en) 2020-10-09

Family

ID=72677589

Country Status (1)

Country Link
CN (1) CN111753168A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination