WO2019100888A1 - Target object recognition method and apparatus, storage medium and electronic device - Google Patents

Target object recognition method and apparatus, storage medium and electronic device

Info

Publication number
WO2019100888A1
Authority
WO
WIPO (PCT)
Prior art keywords
prediction information
target object
key point
image
inspected
Prior art date
Application number
PCT/CN2018/111513
Other languages
English (en)
French (fr)
Inventor
李七星
余锋伟
闫俊杰
Original Assignee
北京市商汤科技开发有限公司
Priority date
Filing date
Publication date
Application filed by 北京市商汤科技开发有限公司
Priority to JP2020500847A (patent JP6994101B2)
Priority to SG11202000076WA
Priority to KR1020207000574A (KR20200015728A)
Publication of WO2019100888A1
Priority to US16/734,336 (patent US11182592B2)

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V 40/161 Detection; Localisation; Normalisation
    • G06V 40/167 Detection; Localisation; Normalisation using comparisons between temporally consecutive images
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/25 Fusion techniques
    • G06F 18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/70 Determining position or orientation of objects or cameras
    • G06T 7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/40 Extraction of image or video features
    • G06V 10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/40 Extraction of image or video features
    • G06V 10/46 Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V 10/462 Salient features, e.g. scale invariant feature transforms [SIFT]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V 10/7715 Feature extraction, e.g. by transforming the feature space, e.g. multi-dimensional scaling [MDS]; Mappings, e.g. subspace methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V 40/161 Detection; Localisation; Normalisation
    • G06V 40/165 Detection; Localisation; Normalisation using facial parts and geometric relationships
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V 40/168 Feature extraction; Face representation
    • G06V 40/171 Local features and components; Facial parts; Occluding parts, e.g. glasses; Geometrical relationships
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10016 Video; Image sequence
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/30 Subject of image; Context of image processing
    • G06T 2207/30196 Human being; Person
    • G06T 2207/30201 Face

Definitions

  • Embodiments of the present application relate to, but are not limited to, computer vision technology, and in particular to a target object recognition method and apparatus, a storage medium, and an electronic device.
  • The process of recognizing an object is generally divided into detection and tracking, key point detection and alignment, and feature extraction. Throughout this process, it is desirable to recognize the target object (for example, a face) as accurately as possible while reducing misjudgments, that is, to pursue the highest recognition rate and the lowest false positive rate. However, related technologies still suffer from a high false positive rate, meaning the expected recognition rate is not achieved.
  • An embodiment of the present application provides a target object recognition method, including: performing target object detection on an object in an image to be inspected to obtain target object prediction information of the object, the target object prediction information being confidence information that the detected object is a target object; performing key point detection on the object in the image to be inspected to obtain key point prediction information of the object, the key point prediction information being confidence information that the detected key points of the object are key points of the target object; fusing the target object prediction information and the key point prediction information to obtain comprehensive prediction information of the object; and recognizing the target object according to the comprehensive prediction information.
  • An embodiment of the present application provides a target object recognition apparatus, including: an object detection module configured to perform target object detection on an object in an image to be inspected and obtain target object prediction information of the object, the target object prediction information being confidence information that the detected object is a target object; a key point detection module configured to perform key point detection on the object in the image to be inspected and obtain key point prediction information of the object, the key point prediction information being confidence information that the detected key points of the object are key points of the target object; a prediction information fusion module configured to fuse the target object prediction information obtained by the object detection module and the key point prediction information obtained by the key point detection module to obtain comprehensive prediction information of the object; and an object recognition module configured to recognize the target object according to the comprehensive prediction information obtained by the prediction information fusion module.
  • An embodiment of the present application provides an electronic device, including a processor, a memory, a communication element, and a communication bus, where the processor, the memory, and the communication element communicate with one another through the communication bus; the memory is configured to store at least one executable instruction that causes the processor to perform the operations corresponding to any of the target object recognition methods described above.
  • An embodiment of the present application provides a computer-readable storage medium having computer program instructions stored thereon, where the program instructions, when executed by a processor, implement the steps of any of the target object recognition methods described above.
  • An embodiment of the present application provides a computer program including computer program instructions, where the program instructions, when executed by a processor, implement the steps of any of the target object recognition methods described above.
  • According to the target object recognition solution provided by the embodiments of the present application, target object prediction information of an object can be obtained while performing target object detection on the object in an image to be inspected, key point prediction information of the object can be obtained while performing key point detection on the image, and the target object prediction information and the key point prediction information can be fused to produce a comprehensive prediction evaluation of the object, yielding comprehensive prediction information that indicates the overall image quality of the image for target object recognition; the target object is then recognized according to the comprehensive prediction evaluation result.
  • FIG. 1 is a flowchart illustrating a target object recognition method according to an embodiment of the present application.
  • FIG. 2 is a flowchart illustrating a target object recognition method according to an embodiment of the present application.
  • FIG. 3 is a flowchart illustrating a target object recognition method according to an embodiment of the present application.
  • FIG. 4 is a flowchart illustrating a target object recognition method according to an embodiment of the present application.
  • FIG. 5 is a logic block diagram illustrating a target object recognition apparatus according to an embodiment of the present application.
  • FIG. 6 is a logic block diagram illustrating a target object recognition apparatus according to an embodiment of the present application.
  • FIG. 7 is a logic block diagram illustrating a target object recognition apparatus according to an embodiment of the present application.
  • FIG. 8 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
  • In the present application, "a plurality of" means two or more, and "at least one" means one, two, or more. Any component, data, or structure mentioned in the present application may be understood as one or more such items unless it is explicitly limited to one.
  • FIG. 1 is a flowchart showing a target object recognition method according to an embodiment of the present application.
  • Referring to FIG. 1, in step S110, target object detection is performed on an object in an image to be inspected, and target object prediction information of the object is obtained, the target object prediction information being confidence information that the detected object is a target object.
  • The image to be inspected here is a photo or video frame image capturing one or more physical objects. The image should meet certain resolution requirements so that, at a minimum, the captured objects can be distinguished by the naked eye. The target object here is the kind of object intended to be recognized, including but not limited to a face, a pedestrian, a vehicle, a dog, a cat, an ID card, and the like.
  • Target object detection may be performed on the object in the image to be inspected by any suitable image analysis and processing method, so as to detect from the image an image region that may contain a target object; this image region may be a rectangular-box image region possibly containing the target object, or a region based on the outer contour of the preliminarily detected target object.
  • There may be multiple objects in the image to be inspected, and multiple rectangular-box image regions may be detected when detecting each target object. Therefore, during target object detection, the prediction accuracy of each detected rectangular-box image region is also evaluated to obtain the target object prediction information, which characterizes how accurately the detected object is predicted to be the target object; for example, the target object prediction information characterizes how accurately the detected image region is predicted to be the target object.
  • The target object prediction information includes, but is not limited to, an evaluation score, a prediction probability, or a detection confidence.
  • In step S120, key point detection is performed on the object in the image to be inspected, and key point prediction information of the object is obtained, the key point prediction information being confidence information that the detected key points of the object are key points of the target object.
  • For any target object intended to be detected, the key point locations of the target object are preset. Key point positioning here includes detecting the image coordinates of the target object's key points in the image. For example, five key points are typically set for a face, namely the mouth, the nose, the left eye, the right eye, and the top of the head; for a human body or pedestrian, 14 key points may be set at various key parts of the body.
  • The key points of the target object can be detected from the image to be inspected by any suitable key point positioning method for images. In addition, while detecting the object in the image, the positioning accuracy of the detected key points is also evaluated; this is the key point prediction information, which characterizes the confidence that the detected key points of the object are key points of the target object.
  • The key point prediction information includes, but is not limited to, an evaluation score, a prediction probability, or a detection confidence. When multiple key points are detected, the key point prediction information can be obtained by averaging the evaluation scores of the individual key points.
  • It should be noted that, with existing computer vision techniques, step S120 does not need to rely on the detection result of step S110; that is, key point detection can be performed directly on the object in the image to be inspected even when the target object has not been detected. Therefore, step S110 and step S120 may be performed sequentially in either order, or in parallel.
  • In step S130, the target object prediction information and the key point prediction information are fused to obtain comprehensive prediction information of the object.
  • Based on the target object prediction information indicating target object detection and the key point prediction information indicating key point alignment, the comprehensive prediction information of the detected object can be obtained by fusing the two, for example by averaging, summing, or multiplying them.
  • Because the comprehensive prediction information is obtained by fusing at least two prediction accuracy indicators, the target object prediction information characterizing target object detection accuracy and the key point prediction information characterizing key point positioning accuracy, and both accuracies affect the result of target object recognition, the comprehensive prediction information can be used to indicate the overall image quality of the image to be inspected for target object recognition.
  • In step S140, the target object is recognized based on the comprehensive prediction information.
  • For example, if the obtained comprehensive prediction information meets a predetermined prediction quality threshold, target object recognition of the object in the image continues; otherwise, it can be presumed that the comprehensive prediction quality for target object detection is not high, and target object recognition is not performed on the object, or recognition is performed only after the image to be inspected has been filtered, cropped, enlarged, and brightened.
  • As another example, suppose the image to be inspected is a preview image captured by a camera; if the determined comprehensive prediction information meets the predetermined prediction quality threshold, the target object is recognized from the image according to any applicable target object recognition method.
  • According to the target object recognition method of the embodiments of the present application, target object prediction information of an object can be obtained while performing target object detection on the object in an image to be inspected, key point prediction information can be obtained while performing key point detection on the image, and the two can be fused to produce a comprehensive prediction evaluation of the object, yielding comprehensive prediction information indicating the overall image quality of the image for target object recognition; the target object is then recognized according to the comprehensive prediction evaluation result.
  • FIG. 2 is a flowchart illustrating a target object recognition method according to an embodiment of the present application.
  • Referring to FIG. 2, in step S210, an image region corresponding to the object in the image to be inspected is acquired.
  • An image region that may contain a target object, such as an image region bounding the object's circumscribed rectangle, may be detected by the image analysis method in use.
  • In step S220, target object detection is performed on the image region corresponding to the object in the image to be inspected, and target object prediction information of the object is obtained.
  • After the image region that may contain the target object is acquired, target object detection may be performed on the region by an applicable image analysis method, and the target object prediction information of the object is obtained.
  • In addition, in some embodiments, a pre-trained neural network for object detection, including but not limited to a region proposal network or a convolutional neural network, may be employed to detect the target object from the image region and to obtain target object prediction information indicating the accuracy of the target object detection, thereby improving the recognition rate of object detection.
  • In step S230, key point detection is performed on the image region corresponding to the object in the image to be inspected, and key point prediction information of the object is obtained.
  • Similarly, after the image region that may contain the target object is acquired, key point detection may be performed on the region to obtain the key point prediction information of the object.
  • In step S240, the target object prediction information and the key point prediction information are multiplied to obtain comprehensive prediction information of the object.
  • Here, multiplying the target object prediction information by the key point prediction information highlights images with both high target object prediction accuracy and high key point prediction accuracy, so that images of good overall quality are recalled preferentially in the target object recognition task. At the same time, a higher recognition rate can be ensured by adjusting the selection threshold used for the comprehensive quality assessment.
  • In step S250, the target object is recognized based on the comprehensive prediction information. The processing of this step is similar to that of the foregoing step S140 and is not repeated here.
  • In step S260, any of the following operations may be performed.
  • Operation 1: the aforementioned image to be inspected is a video frame image in a sequence of video frames, and the target object is tracked according to the results of recognizing the target object in multiple video frame images, thereby performing an object tracking task.
  • Operation 2: according to the comprehensive prediction information obtained for each of multiple images to be inspected, the image with the highest comprehensive prediction quality is selected from them as a snapshot image. For example, during shooting, the image with the highest comprehensive prediction quality can be selected as the snapshot from among multiple images (preview images) captured within 2 seconds, stored in memory, and displayed to the user.
  • Operation 3: according to the comprehensive prediction information obtained for each of multiple images to be inspected, a predetermined number of images are selected from them and feature fusion is performed on the selected images; the fused image feature data can be further used for detection or other processing tasks.
  • According to the target object recognition method of this embodiment, the image region corresponding to the object in the image to be inspected is first acquired, target object detection and key point detection are then performed on that region to obtain the target object prediction information and the key point prediction information, and the two are multiplied to obtain the comprehensive prediction information of the object. Moreover, after the target object is recognized according to the comprehensive prediction information, processing such as target object tracking, snapshot image selection, and image feature fusion can be performed, so that other image processing tasks related to the target object can be better executed based on the overall image quality assessment.
  • FIG. 3 is a flowchart illustrating a target object recognition method according to an embodiment of the present application.
  • Referring to FIG. 3, in step S310, an image region corresponding to the object in the image to be inspected is acquired.
  • In step S320, target object detection is performed on the image region corresponding to the object in the image to be inspected, and target object prediction information of the object is obtained.
  • In step S330, key point detection is performed on the object in the image to be inspected using a first neural network model for positioning key points, and key point prediction information of the object is obtained.
  • In some embodiments, a pre-trained first neural network model for positioning key points within object candidate boxes is used to perform key point detection directly on the acquired image region, obtaining the key points of the object and the corresponding key point prediction information.
  • According to another implementation of the present application, a first neural network model that positions key points over the whole image to be inspected is used to obtain the key points of the object and the corresponding key point prediction information from the image itself. That is, the image to be inspected, rather than the image region corresponding to the object, can be used as the input of the first neural network model, and the key points are detected from the image directly.
  • Thereafter, in step S340, deflection angle information of the object is detected from the image region corresponding to the object in the image to be inspected.
  • Usually, the deflection angle of the object is also detected during target object detection; therefore, the deflection angle information of the object can be obtained through the processing of step S340.
  • The deflection angle may include a horizontal deflection angle (yaw angle), a vertical deflection angle (pitch angle), or both.
  • For example, a second neural network model for object classification may be used to detect the object and obtain its deflection angle information from the image region corresponding to the object in the image to be inspected. A second neural network model for detecting the deflection angle information of objects may be pre-trained. The deflection angle information can also be obtained by other image analysis methods.
  • In step S350, the target object prediction information, the key point prediction information, and the deflection angle information are fused to obtain comprehensive prediction information of the object.
  • Because the deflection angle of a non-frontal object usually affects the recognition of the target object, the deflection angle information of the object is also used as one of the indicators for image quality evaluation.
  • Similarly to the processing of the foregoing step S130, the target object prediction information characterizing target object detection accuracy, the key point prediction information characterizing key point positioning accuracy, and the deflection angle information of the object are fused, for example by averaging, summing, or multiplying, to obtain the comprehensive prediction information of the object.
  • In step S360, the target object is recognized based on the comprehensive prediction information. On this basis, in some embodiments, the processing of the foregoing step S260 can then be performed.
  • According to the target object recognition method of any embodiment of the present application, the deflection angle information of the object detected from the image region corresponding to the object is also used as one of the evaluation indicators; the deflection angle information is fused with the aforementioned target object prediction information and key point prediction information to produce a comprehensive quality assessment of the object for target object recognition, and the target object is then recognized according to the comprehensive prediction evaluation result.
  • FIG. 4 is a flowchart illustrating a target object recognition method according to an embodiment of the present application.
  • In this embodiment, the processing of the target object recognition method is described taking a human face as the target object.
  • Referring to FIG. 4, in step S410, face detection is performed on the object in the image to be inspected to obtain target object prediction information of the face.
  • Face detection may be performed on the object in the image to be inspected by any applicable face detection method, and the target object prediction information of the face is obtained.
  • In step S420, key point detection is performed on the object in the image to be inspected using the first neural network model for positioning key points, and key point prediction information of the face is obtained.
  • In step S430, a face pitch angle and/or a face yaw angle (side-turn angle) in the image to be inspected are acquired.
  • The face pitch angle refers to the deflection angle of the face about the horizontal axis, and the face yaw angle refers to the deflection angle of the face about the vertical axis. Usually, both the face pitch angle and the face yaw angle range from -90 degrees to +90 degrees.
  • In some embodiments, the face is detected, and the face pitch angle and/or the face yaw angle are obtained, by the aforementioned second neural network model.
  • In this step, either or both of the face pitch angle and the face yaw angle can be acquired for subsequent processing.
  • In step S440, the face pitch angle and/or the face yaw angle are normalized according to an applicable exponential function.
  • For example, the face pitch angle is normalized by the exponential function exp(-10 × face pitch angle × face pitch angle / 8100); similarly, the face yaw angle is normalized by the exponential function exp(-10 × face yaw angle × face yaw angle / 8100). Alternatively, the formulas |face pitch angle / 90| and |face yaw angle / 90| may simply be used to normalize the face pitch angle and the face yaw angle, respectively. The normalized face pitch angle and face yaw angle may then be fused, for example by multiplying them, to generate angle evaluation information for the target object.
  • In step S450, comprehensive prediction information of the object is obtained by one of the following operations: multiplying the target object prediction information, the key point prediction information, and the normalized face pitch angle; or multiplying the target object prediction information, the key point prediction information, and the normalized face yaw angle; or multiplying the target object prediction information, the key point prediction information, the normalized face pitch angle, and the normalized face yaw angle.
  • That is, according to the needs of the face recognition task, either or both of the normalized face pitch angle and the normalized face yaw angle can be fused with the target object prediction information and the key point prediction information to obtain the comprehensive prediction information of the object.
  • For example, if the obtained comprehensive prediction information meets a predetermined prediction quality threshold, face recognition of the object in the image to be inspected continues by an applicable face recognition method.
  • In addition, any existing network training method can be used to pre-train the neural network for object detection, the first neural network model for positioning key points, and/or the second neural network model for object classification. The aforementioned neural network models may be pre-trained using supervised, unsupervised, reinforcement, or semi-supervised learning methods according to the functions, characteristics, and training requirements to be realized.
  • According to the target object recognition method of this embodiment, building on the foregoing embodiments, key point positioning and deflection angle detection of the face can be performed by pre-trained models to ensure the accuracy of face detection, and the obtained target object prediction information, key point prediction information, and normalized face pitch angle and/or normalized face yaw angle are fused to obtain comprehensive quality data relevant to face recognition; the face is then recognized according to the comprehensive prediction evaluation result.
  • Referring to FIG. 5, a target object recognition apparatus includes:
  • an object detection module 510, configured to perform target object detection on an object in an image to be inspected and obtain target object prediction information of the object, the target object prediction information being confidence information that the detected object is a target object;
  • a key point detection module 520, configured to perform key point detection on the object in the image to be inspected and obtain key point prediction information of the object, the key point prediction information being confidence information that the detected key points of the object are key points of the target object;
  • a prediction information fusion module 530, configured to fuse the target object prediction information obtained by the object detection module 510 and the key point prediction information obtained by the key point detection module 520 to obtain comprehensive prediction information of the object;
  • an object recognition module 540, configured to recognize the target object according to the comprehensive prediction information obtained by the prediction information fusion module.
  • The target object recognition apparatus of this embodiment is used to implement the corresponding target object recognition method of the foregoing method embodiments and has the beneficial effects of the corresponding method embodiments, which are not repeated here.
  • Referring to FIG. 6, the target object recognition apparatus includes an image region acquisition module 550 in addition to the object detection module 510, the key point detection module 520, the prediction information fusion module 530, and the object recognition module 540.
  • The image region acquisition module 550 is configured to acquire the image region corresponding to the object in the image to be inspected. Correspondingly, the object detection module 510 is configured to perform target object detection on the image region acquired by the image region acquisition module 550, and the key point detection module 520 is configured to perform key point detection on that image region.
  • In some embodiments, the prediction information fusion module 530 is configured to multiply the target object prediction information and the key point prediction information to obtain the comprehensive prediction information of the object.
  • In some embodiments, the key point detection module 520 is configured to perform key point detection on the object in the image to be inspected using a neural network model for positioning key points, and to obtain the key point prediction information of the object.
  • In some embodiments, after the image region corresponding to the object is acquired and before the target object prediction information and the key point prediction information are fused, the apparatus further includes a deflection angle detection module 560A configured to detect the deflection angle information of the object from the image region acquired by the image region acquisition module 550. Correspondingly, the prediction information fusion module 530 is configured to fuse the target object prediction information, the key point prediction information, and the deflection angle information to obtain the comprehensive prediction information of the object.
  • In some embodiments, the deflection angle detection module 560A is configured to detect the deflection angle information of the object from the image region using a neural network model for object classification.
  • In some embodiments, the image to be inspected is a video frame image; after the target object is recognized according to the comprehensive prediction information, the apparatus further includes:
  • an object tracking module 570, configured to track the target object according to the results of recognizing the target object in multiple video frame images; or
  • a snapshot image selection module 580, configured to select, according to the comprehensive prediction information obtained for each of multiple video frame images, the video frame image with the highest comprehensive prediction quality as the snapshot image; or
  • a feature fusion module 590, configured to select, according to the comprehensive prediction information obtained for each of multiple video frame images, a predetermined number of video frame images and perform feature fusion on the selected video frame images.
  • According to an embodiment of the present application, the target object may be a human face.
  • Referring to FIG. 7, the target object recognition apparatus includes a face deflection angle detection module 560B in addition to the object detection module 510, the key point detection module 520, the prediction information fusion module 530, the object recognition module 540, and the image region acquisition module 550.
  • Before the prediction information fusion module 530 fuses the target object prediction information and the key point prediction information to obtain the comprehensive prediction information of the object, the face deflection angle detection module 560B is configured to detect the face pitch angle and/or the face yaw angle from the image region acquired by the image region acquisition module 550.
  • Correspondingly, the prediction information fusion module 530 is configured to normalize the face pitch angle and/or the face yaw angle, and to obtain the comprehensive prediction information of the object by multiplying the target object prediction information, the key point prediction information, and the normalized face pitch angle; or by multiplying the target object prediction information, the key point prediction information, and the normalized face yaw angle; or by multiplying the target object prediction information, the key point prediction information, the normalized face pitch angle, and the normalized face yaw angle.
  • In some embodiments, the target object recognition apparatus further includes the object tracking module 570, the snapshot image selection module 580, or the feature fusion module 590.
  • The target object recognition apparatus of this embodiment is used to implement the corresponding target object recognition method of the foregoing method embodiments and has the beneficial effects of the corresponding method embodiments, which are not repeated here.
  • An embodiment of the present application provides a computer-readable storage medium having computer program instructions stored thereon, where the program instructions, when executed by a processor, implement the steps of the target object recognition method described in any of the foregoing embodiments, with the beneficial effects of the corresponding embodiments, which are not repeated here.
  • An embodiment of the present application provides an electronic device, which may be, for example, a mobile terminal, a personal computer (PC), a tablet computer, or a server. FIG. 8 is a schematic structural diagram of an electronic device 800 suitable for implementing a terminal device or server of an embodiment of the present application.
  • As shown in FIG. 8, the electronic device 800 includes one or more processors and communication elements, for example one or more central processing units (CPUs) 801 and/or one or more graphics processing units (GPUs) 813; the processors may perform various appropriate actions and processing according to executable instructions stored in a read-only memory (ROM) 802 or loaded from a storage portion 808 into a random access memory (RAM) 803.
  • The communication elements include a communication component 812 and a communication interface 809. The communication component 812 may include, but is not limited to, a network card, which may include, but is not limited to, an IB (InfiniBand) network card; the communication interface 809 includes a communication interface of a network interface card such as a LAN card or a modem, and performs communication processing via a network such as the Internet.
  • The processor can communicate with the read-only memory 802 and/or the random access memory 803 to execute executable instructions, is connected to the communication component 812 via a bus 804, and communicates with other target devices via the communication component 812, thereby completing the operations corresponding to any method provided by the embodiments of the present application, for example: performing target object detection on an object in an image to be inspected to obtain target object prediction information of the object, the target object prediction information being confidence information that the detected object is a target object; performing key point detection on the object in the image to obtain key point prediction information of the object, the key point prediction information being confidence information that the detected key points of the object are key points of the target object; fusing the target object prediction information and the key point prediction information to obtain comprehensive prediction information of the object; and recognizing the target object according to the comprehensive prediction information.
  • In addition, the RAM 803 can store various programs and data required for the operation of the device. The CPU 801, the ROM 802, and the RAM 803 are connected to one another through the bus 804. Where RAM 803 is present, the ROM 802 is an optional module. The RAM 803 stores executable instructions, or writes executable instructions into the ROM 802 at runtime, and the executable instructions cause the processor 801 to perform the operations corresponding to the above communication method. An input/output (I/O) interface 805 is also connected to the bus 804. The communication component 812 may be integrated, or may be configured with multiple sub-modules (for example, multiple IB network cards) on the bus link.
  • The following components are connected to the I/O interface 805: an input portion 806 including a keyboard, a mouse, and the like; an output portion 807 including a cathode ray tube (CRT), a liquid crystal display (LCD), a speaker, and the like; a storage portion 808 including a hard disk and the like; and a communication interface 809 including a network interface card such as a LAN card or a modem. A drive 810 is also connected to the I/O interface 805 as needed. A removable medium 811, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 810 as needed, so that a computer program read therefrom can be installed into the storage portion 808 as needed.
  • It should be noted that the architecture shown in FIG. 8 is only one optional implementation. In practice, the number and types of the components in FIG. 8 may be selected, removed, added, or replaced according to actual needs; different functional components may be set separately or integrated: for example, the GPU and the CPU may be set separately, or the GPU may be integrated on the CPU, and the communication component 812 may be set separately or integrated on the CPU or the GPU. These alternative implementations all fall within the protection scope of the present application.
  • In particular, according to the embodiments of the present application, the process described above with reference to the flowcharts may be implemented as a computer software program. For example, an embodiment of the present application includes a computer program product comprising a computer program tangibly embodied on a machine-readable medium; the computer program includes program code for executing the method shown in the flowchart, and the program code may include instructions corresponding to the method steps provided by the embodiments of the present application, for example: executable code for performing target object detection on an object in an image to be inspected and obtaining target object prediction information of the object; executable code for performing key point detection on the object in the image and obtaining key point prediction information of the object; executable code for fusing the target object prediction information and the key point prediction information to obtain comprehensive prediction information of the object; and executable code for recognizing the target object according to the comprehensive prediction information. In such an embodiment, the computer program can be downloaded and installed from a network via the communication element, and/or installed from the removable medium 811. When the computer program is executed by the central processing unit (CPU) 801, the functions defined in the methods of the embodiments of the present application are executed.
  • The electronic device provided by the embodiments of the present application can obtain target object prediction information of an object while performing target object detection on the object in an image to be inspected, obtain key point prediction information of the object while performing key point detection on the image, and fuse the target object prediction information and the key point prediction information to produce a comprehensive prediction evaluation of the object, yielding comprehensive prediction information indicating the overall image quality of the image for target object recognition; the target object is then recognized according to the comprehensive prediction evaluation result.
  • The methods, apparatuses, and devices of the present application may be implemented in many ways, for example by software, hardware, firmware, or any combination of software, hardware, and firmware. The above-described order of the steps of the methods is for illustration only, and the steps of the methods of the embodiments of the present application are not limited to the order specifically described above unless otherwise specifically stated.
  • In addition, in some embodiments, the present application may also be implemented as programs recorded in a recording medium, the programs including machine-readable instructions for implementing the methods according to the embodiments of the present application. Thus, the present application also covers a recording medium storing a program for executing the methods according to the present application.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Human Computer Interaction (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Computing Systems (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Geometry (AREA)
  • Image Analysis (AREA)

Abstract

Embodiments of the present application provide a target object recognition method and apparatus, a storage medium, and an electronic device. The target object recognition method includes: performing target object detection on an object in an image to be inspected to obtain target object prediction information of the object, the target object prediction information being confidence information that the detected object is a target object; performing key point detection on the object in the image to be inspected to obtain key point prediction information of the object, the key point prediction information being confidence information that the detected key points of the object are key points of the target object; fusing the target object prediction information and the key point prediction information to obtain comprehensive prediction information of the object; and recognizing the target object according to the comprehensive prediction information.

Description

Target object recognition method and apparatus, storage medium and electronic device
CROSS-REFERENCE TO RELATED APPLICATIONS
This application is based on, and claims priority to, Chinese Patent Application No. 201711181299.5, filed on November 23, 2017, the entire contents of which are incorporated herein by reference.
TECHNICAL FIELD
Embodiments of the present application relate to, but are not limited to, computer vision technology, and in particular to a target object recognition method and apparatus, a storage medium, and an electronic device.
BACKGROUND
The process of recognizing an object is generally divided into detection and tracking, key point detection and alignment, and feature extraction. Throughout this process, it is desirable to recognize the target object as accurately as possible while reducing misjudgments, that is, to pursue the highest recognition rate and the lowest false positive rate. However, related technologies still suffer from a high false positive rate when recognizing target objects (for example, faces), i.e., the expected recognition rate is not achieved.
SUMMARY
An embodiment of the present application provides a target object recognition method, including: performing target object detection on an object in an image to be inspected to obtain target object prediction information of the object, the target object prediction information being confidence information that the detected object is a target object; performing key point detection on the object in the image to be inspected to obtain key point prediction information of the object, the key point prediction information being confidence information that the detected key points of the object are key points of the target object; fusing the target object prediction information and the key point prediction information to obtain comprehensive prediction information of the object; and recognizing the target object according to the comprehensive prediction information.
An embodiment of the present application provides a target object recognition apparatus, including: an object detection module configured to perform target object detection on an object in an image to be inspected and obtain target object prediction information of the object, the target object prediction information being confidence information that the detected object is a target object; a key point detection module configured to perform key point detection on the object in the image to be inspected and obtain key point prediction information of the object, the key point prediction information being confidence information that the detected key points of the object are key points of the target object; a prediction information fusion module configured to fuse the target object prediction information obtained by the object detection module and the key point prediction information obtained by the key point detection module to obtain comprehensive prediction information of the object; and an object recognition module configured to recognize the target object according to the comprehensive prediction information obtained by the prediction information fusion module.
An embodiment of the present application provides an electronic device, including a processor, a memory, a communication element, and a communication bus, where the processor, the memory, and the communication element communicate with one another through the communication bus; the memory is configured to store at least one executable instruction that causes the processor to perform the operations corresponding to any of the target object recognition methods described above.
An embodiment of the present application provides a computer-readable storage medium having computer program instructions stored thereon, where the program instructions, when executed by a processor, implement the steps of any of the target object recognition methods described above.
An embodiment of the present application provides a computer program including computer program instructions, where the program instructions, when executed by a processor, implement the steps of any of the target object recognition methods described above.
According to the target object recognition solution provided by the embodiments of the present application, target object prediction information of an object can be obtained while performing target object detection on the object in an image to be inspected, key point prediction information of the object can be obtained while performing key point detection on the image, and the target object prediction information and the key point prediction information can be fused to produce a comprehensive prediction evaluation of the object, yielding comprehensive prediction information that indicates the overall image quality of the image for target object recognition; the target object is then recognized according to the comprehensive prediction evaluation result. This comprehensive prediction evaluation can filter out images of relatively low overall quality, thereby reducing the false positive rate produced when processing the target object; in addition, the comprehensive evaluation of the object in the image also helps ensure a high recognition rate.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a flowchart illustrating a target object recognition method according to an embodiment of the present application;
FIG. 2 is a flowchart illustrating a target object recognition method according to an embodiment of the present application;
FIG. 3 is a flowchart illustrating a target object recognition method according to an embodiment of the present application;
FIG. 4 is a flowchart illustrating a target object recognition method according to an embodiment of the present application;
FIG. 5 is a logic block diagram illustrating a target object recognition apparatus according to an embodiment of the present application;
FIG. 6 is a logic block diagram illustrating a target object recognition apparatus according to an embodiment of the present application;
FIG. 7 is a logic block diagram illustrating a target object recognition apparatus according to an embodiment of the present application;
FIG. 8 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
DETAILED DESCRIPTION
Exemplary embodiments of the present application are described in detail below with reference to the accompanying drawings.
In the present application, "a plurality of" means two or more, and "at least one" means one, two, or more. Any component, data, or structure mentioned in the present application may be understood as one or more such items unless it is explicitly limited to one.
FIG. 1 is a flowchart illustrating a target object recognition method according to an embodiment of the present application.
Referring to FIG. 1, in step S110, target object detection is performed on an object in an image to be inspected, and target object prediction information of the object is obtained, the target object prediction information being confidence information that the detected object is a target object.
The image to be inspected here is a photo or video frame image capturing one or more physical objects. The image should meet certain resolution requirements so that, at a minimum, the captured objects can be distinguished by the naked eye. The target object here is the kind of object intended to be recognized, including but not limited to a face, a pedestrian, a vehicle, a dog, a cat, an ID card, and the like.
Target object detection may be performed on the object in the image to be inspected by any suitable image analysis and processing method, so as to detect from the image an image region that may contain a target object; this image region may be a rectangular-box image region possibly containing the target object, or a region based on the outer contour of the preliminarily detected target object.
There may be multiple objects in the image to be inspected, and multiple rectangular-box image regions may be detected when detecting each target object. Therefore, during target object detection, the prediction accuracy of each detected rectangular-box image region is also evaluated to obtain the target object prediction information, which characterizes how accurately the detected object is predicted to be the target object; for example, the target object prediction information characterizes how accurately the detected image region is predicted to be the target object.
The target object prediction information includes, but is not limited to, an evaluation score, a prediction probability, or a detection confidence.
In step S120, key point detection is performed on the object in the image to be inspected, and key point prediction information of the object is obtained, the key point prediction information being confidence information that the detected key points of the object are key points of the target object.
For any target object intended to be detected, the key point locations of the target object are preset. Key point positioning here includes detecting the image coordinates of the target object's key points in the image. For example, five key points are typically set for a face, namely the mouth, the nose, the left eye, the right eye, and the top of the head; for a human body or pedestrian, 14 key points may be set at various key parts of the body.
The key points of the target object can be detected from the image to be inspected by any suitable key point positioning method for images. In addition, while detecting the object in the image, the positioning accuracy of the detected key points is also evaluated; this is the key point prediction information, which characterizes the confidence that the detected key points of the object are key points of the target object.
The key point prediction information includes, but is not limited to, an evaluation score, a prediction probability, or a detection confidence. When multiple key points are detected, the key point prediction information can be obtained by averaging the evaluation scores of the individual key points.
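As an editorial illustration only (this sketch is not part of the original disclosure), the five preset face key points and the averaging of per-key-point scores described above might be written as follows in Python; the key point names and score values are assumptions:

```python
# Editorial sketch: the five preset face key points and a simple average of
# per-key-point confidence scores (names and values are assumptions).

FACE_KEYPOINTS = ["mouth", "nose", "left_eye", "right_eye", "head_top"]

def keypoint_prediction_score(per_point_scores: dict) -> float:
    """Average the per-key-point confidences into one key point prediction score."""
    return sum(per_point_scores[name] for name in FACE_KEYPOINTS) / len(FACE_KEYPOINTS)

# Hypothetical confidences produced by some key point detector.
scores = {"mouth": 0.91, "nose": 0.95, "left_eye": 0.88,
          "right_eye": 0.90, "head_top": 0.76}
print(keypoint_prediction_score(scores))  # 0.88
```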
It should be noted here that, with existing computer vision techniques, the execution of step S120 does not need to rely on the detection result of step S110; that is, key point detection can be performed directly on the object in the image to be inspected even when the target object has not been detected. Therefore, step S110 and step S120 may be performed sequentially in either order, or in parallel.
In step S130, the target object prediction information and the key point prediction information are fused to obtain comprehensive prediction information of the object.
Based on the target object prediction information indicating target object detection and the key point prediction information indicating key point alignment, the comprehensive prediction information of the detected object can be obtained by fusing the two, for example by averaging, summing, or multiplying them.
Because the comprehensive prediction information is obtained by fusing at least two prediction accuracy indicators, the target object prediction information characterizing target object detection accuracy and the key point prediction information characterizing key point positioning accuracy, and both accuracies affect the result of target object recognition, the comprehensive prediction information can be used to indicate the overall image quality of the image to be inspected for target object recognition.
In step S140, the target object is recognized according to the comprehensive prediction information.
For example, if the obtained comprehensive prediction information meets a predetermined prediction quality threshold, target object recognition of the object in the image continues; otherwise, it can be presumed that the comprehensive prediction quality for target object detection is not high, and target object recognition is not performed on the object, or recognition is performed only after the image to be inspected has been filtered, cropped, enlarged, and brightened.
As another example, suppose the image to be inspected is a preview image captured by a camera; if the determined comprehensive prediction information meets the predetermined prediction quality threshold, the target object is recognized from the image according to any applicable target object recognition method.
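As an editorial illustration of steps S130 and S140 (not part of the original disclosure), the following minimal Python sketch fuses the two confidence scores by averaging, summing, or multiplying, and gates recognition on a prediction quality threshold; the threshold value is an assumption, since the text does not specify one:

```python
# Editorial sketch of steps S130/S140: fuse the two confidences and gate
# recognition on a predetermined quality threshold (threshold is assumed).

def fuse(det_conf: float, kp_conf: float, mode: str = "multiply") -> float:
    """Fuse detection confidence and key point confidence into one score."""
    if mode == "multiply":
        return det_conf * kp_conf
    if mode == "sum":
        return det_conf + kp_conf
    if mode == "average":
        return (det_conf + kp_conf) / 2.0
    raise ValueError(f"unknown fusion mode: {mode}")

QUALITY_THRESHOLD = 0.5  # hypothetical value; the text does not specify one

def should_recognize(det_conf: float, kp_conf: float) -> bool:
    """Continue with target object recognition only if the fused quality passes."""
    return fuse(det_conf, kp_conf) >= QUALITY_THRESHOLD

print(should_recognize(0.9, 0.8))  # True  (0.72 >= 0.5)
print(should_recognize(0.6, 0.3))  # False (0.18 <  0.5)
```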
According to the target object recognition method of the embodiments of the present application, target object prediction information of an object can be obtained while performing target object detection on the object in an image to be inspected, key point prediction information can be obtained while performing key point detection on the image, and the two can be fused to produce a comprehensive prediction evaluation of the object, yielding comprehensive prediction information indicating the overall image quality of the image for target object recognition; the target object is then recognized according to the comprehensive prediction evaluation result. This comprehensive prediction evaluation can filter out images of relatively low overall quality, thereby reducing the false positive rate produced when processing the target object; in addition, the comprehensive evaluation of the object in the image also helps ensure a high recognition rate.
FIG. 2 is a flowchart illustrating a target object recognition method according to an embodiment of the present application.
Referring to FIG. 2, in step S210, an image region corresponding to the object in the image to be inspected is acquired.
An image region that may contain a target object, such as an image region bounding the object's circumscribed rectangle, may be detected by the image analysis method in use.
In step S220, target object detection is performed on the image region corresponding to the object in the image to be inspected, and target object prediction information of the object is obtained.
After the image region that may contain the target object is acquired, target object detection may be performed on the region by an applicable image analysis method, and the target object prediction information of the object is obtained.
In addition, in some embodiments, a pre-trained neural network for object detection, including but not limited to a region proposal network or a convolutional neural network, may be used to detect the target object from the image region and to obtain target object prediction information indicating the accuracy of the target object detection, thereby improving the recognition rate of object detection.
In step S230, key point detection is performed on the image region corresponding to the object in the image to be inspected, and key point prediction information of the object is obtained.
Similarly, after the image region that may contain the target object is acquired, key point detection may be performed on the region to obtain the key point prediction information of the object.
In step S240, the target object prediction information and the key point prediction information are multiplied to obtain comprehensive prediction information of the object.
Here, multiplying the target object prediction information by the key point prediction information highlights images with both high target object prediction accuracy and high key point prediction accuracy, so that images of good overall quality are recalled preferentially in the target object recognition task. At the same time, a higher recognition rate can be ensured by adjusting the selection threshold used for the comprehensive quality assessment.
In step S250, the target object is recognized according to the comprehensive prediction information. The processing of this step is similar to that of the foregoing step S140 and is not repeated here.
In step S260, any of the following operations may be performed (a sketch of operations 2 and 3 follows this list).
Operation 1: the aforementioned image to be inspected is a video frame image in a sequence of video frames, and the target object is tracked according to the results of recognizing the target object in multiple video frame images, thereby performing an object tracking task.
Operation 2: according to the comprehensive prediction information obtained for each of multiple images to be inspected, the image with the highest comprehensive prediction quality is selected from them as a snapshot image. For example, during shooting, the image with the highest comprehensive prediction quality can be selected as the snapshot from among multiple images (preview images) captured within 2 seconds, stored in memory, and displayed to the user.
Operation 3: according to the comprehensive prediction information obtained for each of multiple images to be inspected, a predetermined number of images are selected from them and feature fusion is performed on the selected images; the fused image feature data can be further used for detection or other processing tasks.
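As an editorial illustration of operations 2 and 3 above (not part of the original disclosure), the following minimal Python sketch selects the highest-scoring preview frame as the snapshot and averages the feature vectors of the top-k frames as one simple form of feature fusion; the data layout and the use of mean pooling are assumptions:

```python
# Editorial sketch of operations 2 and 3: pick the best preview frame as the
# snapshot, and mean-pool the features of the k best frames (layout assumed).

import numpy as np

def select_snapshot(scores):
    """Index of the frame with the highest comprehensive prediction score."""
    return int(np.argmax(scores))

def fuse_topk_features(scores, features, k=3):
    """Average the feature vectors of the k highest-scoring frames."""
    topk = np.argsort(scores)[-k:]      # indices of the k best frames
    return features[topk].mean(axis=0)  # fused feature vector

scores = [0.42, 0.88, 0.61, 0.93, 0.57]  # one score per preview frame
features = np.random.rand(5, 128)        # one 128-d feature vector per frame
print(select_snapshot(scores))           # 3
print(fuse_topk_features(scores, features).shape)  # (128,)
```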
The above describes only a few exemplary kinds of target-object-oriented processing of the object in an image to be inspected; it should be understood that the results can be used for any image processing task.
According to the target object recognition method of the embodiments of the present application, the image region corresponding to the object in the image to be inspected is first acquired, target object detection and key point detection are then performed on that region to obtain the target object prediction information and the key point prediction information of the object, and the two are multiplied to obtain the comprehensive prediction information of the object. Moreover, after the target object is recognized according to the comprehensive prediction information, processing such as target object tracking, snapshot image selection, and image feature fusion is further performed, so that other image processing tasks related to the target object can be better executed based on the overall image quality assessment.
FIG. 3 is a flowchart illustrating a target object recognition method according to an embodiment of the present application.
Referring to FIG. 3, in step S310, an image region corresponding to the object in the image to be inspected is acquired.
In step S320, target object detection is performed on the image region corresponding to the object in the image to be inspected, and target object prediction information of the object is obtained.
In step S330, key point detection is performed on the object in the image to be inspected using a first neural network model for positioning key points, and key point prediction information of the object is obtained.
In some embodiments, a pre-trained first neural network model for positioning key points within object candidate boxes is used to perform key point detection directly on the acquired image region, obtaining the key points of the object and the corresponding key point prediction information.
According to another implementation of the present application, a first neural network model that positions key points over the whole image to be inspected is used to obtain the key points of the object and the corresponding key point prediction information from the image itself. That is, the image to be inspected, rather than the image region corresponding to the object, can be used as the input of the first neural network model, and the key points are detected from the image directly.
Thereafter, in step S340, deflection angle information of the object is detected from the image region corresponding to the object in the image to be inspected.
Usually, the deflection angle of the object is also detected during target object detection; therefore, the deflection angle information of the object can be obtained through the processing of step S340.
The deflection angle may include a horizontal deflection angle (yaw angle), a vertical deflection angle (pitch angle), or both.
For example, a second neural network model for object classification may be used to detect the object and obtain its deflection angle information from the image region corresponding to the object in the image to be inspected. A second neural network model for detecting the deflection angle information of objects may be pre-trained. The deflection angle information can also be obtained by other image analysis methods.
In step S350, the target object prediction information, the key point prediction information, and the deflection angle information are fused to obtain comprehensive prediction information of the object.
Because the deflection angle of a non-frontal object usually affects the recognition of the target object, the deflection angle information of the object is also used as one of the indicators for image quality evaluation.
Similarly to the processing of the foregoing step S130, the target object prediction information characterizing target object detection accuracy, the key point prediction information characterizing key point positioning accuracy, and the deflection angle information of the object are fused, for example by averaging, summing, or multiplying, to obtain the comprehensive prediction information of the object.
In step S360, the target object is recognized according to the comprehensive prediction information.
On this basis, in some embodiments, the processing of the foregoing step S260 can then be performed.
According to the target object recognition method of any embodiment of the present application, the deflection angle information of the object detected from the image region corresponding to the object is also used as one of the evaluation indicators; the deflection angle information is fused with the aforementioned target object prediction information and key point prediction information to produce a comprehensive quality assessment of the object for target object recognition, and the target object is then recognized according to the comprehensive prediction evaluation result. This approach helps evaluate the overall image quality with respect to the factors affecting target object recognition and filter out images of relatively low overall quality, thereby reducing the false positive rate in recognizing the target object while also ensuring a high recognition rate, so that the target object recognition task is performed more accurately.
FIG. 4 is a flowchart illustrating a target object recognition method according to an embodiment of the present application.
In this embodiment, the processing of the target object recognition method is described taking a human face as the target object.
Referring to FIG. 4, in step S410, face detection is performed on the object in the image to be inspected to obtain target object prediction information of the face.
Face detection may be performed on the object in the image to be inspected by any applicable face detection method, and the target object prediction information of the face is obtained.
In step S420, key point detection is performed on the object in the image to be inspected using the first neural network model for positioning key points, and key point prediction information of the face is obtained.
In step S430, a face pitch angle and/or a face yaw angle (side-turn angle) in the image to be inspected are acquired.
The face pitch angle refers to the deflection angle of the face about the horizontal axis; the face yaw angle refers to the deflection angle of the face about the vertical axis.
Usually, both the face pitch angle and the face yaw angle range from -90 degrees to +90 degrees.
In some embodiments, the face is detected, and the face pitch angle and/or the face yaw angle are obtained, from the detected face image region by the aforementioned second neural network model.
In this step, either or both of the face pitch angle and the face yaw angle can be acquired for subsequent processing.
In step S440, the face pitch angle and/or the face yaw angle are normalized according to an applicable exponential function.
For example, the face pitch angle is normalized by the exponential function exp(-10 × face pitch angle × face pitch angle / 8100); similarly, the face yaw angle is normalized by the exponential function exp(-10 × face yaw angle × face yaw angle / 8100). Alternatively, the formulas |face pitch angle / 90| and |face yaw angle / 90| can simply be used to normalize the face pitch angle and the face yaw angle, respectively. The normalized face pitch angle and face yaw angle may then be fused, for example by multiplying them, to generate angle evaluation information for the target object.
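As an editorial illustration (not part of the original disclosure), the two normalization options quoted above can be written as follows; note that exp(-10 × angle × angle / 8100) maps 0 degrees to 1.0 and ±90 degrees to exp(-10), roughly 4.5e-5:

```python
# Editorial sketch of the two normalization options quoted above.
# exp(-10 * angle^2 / 8100) maps 0 degrees to 1.0 and +/-90 degrees to
# exp(-10) ~= 4.5e-5, so near-frontal faces receive the highest weight.

import math

def normalize_angle_exp(angle_deg: float) -> float:
    """Exponential normalization: exp(-10 * angle * angle / 8100)."""
    return math.exp(-10.0 * angle_deg * angle_deg / 8100.0)

def normalize_angle_linear(angle_deg: float) -> float:
    """Simple normalization quoted in the text: |angle / 90|."""
    return abs(angle_deg / 90.0)

print(normalize_angle_exp(0.0))      # 1.0
print(normalize_angle_exp(45.0))     # exp(-2.5) ~= 0.082
print(normalize_angle_linear(45.0))  # 0.5
```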
In step S450, comprehensive prediction information of the object is obtained by one of the following operations (a sketch of the third option follows this list):
multiplying the target object prediction information, the key point prediction information, and the normalized face pitch angle to obtain the comprehensive prediction information of the object;
or,
multiplying the target object prediction information, the key point prediction information, and the normalized face yaw angle to obtain the comprehensive prediction information of the object;
or,
multiplying the target object prediction information, the key point prediction information, the normalized face pitch angle, and the normalized face yaw angle to obtain the comprehensive prediction information of the object.
That is, according to the needs of the face recognition task, either or both of the normalized face pitch angle and the normalized face yaw angle can be fused with the target object prediction information and the key point prediction information to obtain the comprehensive prediction information of the object.
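As an editorial illustration of the third option of step S450 (not part of the original disclosure), the following sketch multiplies the detection confidence, the key point confidence, and both exponentially normalized angles; the input values are illustrative only:

```python
# Editorial sketch of step S450, third option: multiply detection confidence,
# key point confidence, and both exponentially normalized angles.

import math

def normalize_angle_exp(angle_deg: float) -> float:
    return math.exp(-10.0 * angle_deg * angle_deg / 8100.0)

def comprehensive_score(det_conf: float, kp_conf: float,
                        pitch_deg: float, yaw_deg: float) -> float:
    return (det_conf * kp_conf
            * normalize_angle_exp(pitch_deg)
            * normalize_angle_exp(yaw_deg))

# A near-frontal face keeps most of its detection quality...
print(comprehensive_score(0.95, 0.90, 5.0, 10.0))  # ~0.73
# ...while a strongly turned face is heavily down-weighted.
print(comprehensive_score(0.95, 0.90, 5.0, 60.0))  # ~0.0097
```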
For example, if the obtained comprehensive prediction information meets a predetermined prediction quality threshold, face recognition of the object in the image to be inspected continues by an applicable face recognition method.
In addition, any existing network training method can be used to pre-train the neural network for object detection, the first neural network model for positioning key points, and/or the second neural network model for object classification. The aforementioned neural network models may be pre-trained using supervised, unsupervised, reinforcement, or semi-supervised learning methods according to the functions, characteristics, and training requirements to be realized.
According to the target object recognition method of the embodiments of the present application, building on the foregoing embodiments, key point positioning and deflection angle detection of the face can be performed by pre-trained models to ensure the accuracy of face detection, and the obtained target object prediction information, key point prediction information, and normalized face pitch angle and/or normalized face yaw angle are fused to obtain comprehensive quality data relevant to face recognition; the face is then recognized according to the comprehensive prediction evaluation result. This approach helps evaluate the overall image quality with respect to the factors affecting face recognition and filter out images of relatively low overall quality, thereby reducing the false positive rate in recognizing faces while also ensuring a high recognition rate, so that the face recognition task is performed more accurately.
Referring to FIG. 5, a target object recognition apparatus includes:
an object detection module 510, configured to perform target object detection on an object in an image to be inspected and obtain target object prediction information of the object, the target object prediction information being confidence information that the detected object is a target object;
a key point detection module 520, configured to perform key point detection on the object in the image to be inspected and obtain key point prediction information of the object, the key point prediction information being confidence information that the detected key points of the object are key points of the target object;
a prediction information fusion module 530, configured to fuse the target object prediction information obtained by the object detection module 510 and the key point prediction information obtained by the key point detection module 520 to obtain comprehensive prediction information of the object;
an object recognition module 540, configured to recognize the target object according to the comprehensive prediction information obtained by the prediction information fusion module.
The target object recognition apparatus of this embodiment is used to implement the corresponding target object recognition method of the foregoing method embodiments and has the beneficial effects of the corresponding method embodiments, which are not repeated here.
Referring to FIG. 6, the target object recognition apparatus provided by this embodiment includes, in addition to the aforementioned object detection module 510, key point detection module 520, prediction information fusion module 530, and object recognition module 540, an image region acquisition module 550.
The image region acquisition module 550 is configured to acquire the image region corresponding to the object in the image to be inspected. Correspondingly, the object detection module 510 is configured to perform target object detection on the image region, acquired by the image region acquisition module 550, corresponding to the object in the image to be inspected; the key point detection module 520 is configured to perform key point detection on that image region.
In some embodiments, the prediction information fusion module 530 is configured to multiply the target object prediction information and the key point prediction information to obtain the comprehensive prediction information of the object.
In some embodiments, the key point detection module 520 is configured to perform key point detection on the object in the image to be inspected using a neural network model for positioning key points, and to obtain the key point prediction information of the object.
In some embodiments, after the image region corresponding to the object in the image to be inspected is acquired, and before the target object prediction information and the key point prediction information are fused to obtain the comprehensive prediction information of the object, the apparatus further includes: a deflection angle detection module 560A, configured to detect the deflection angle information of the object from the image region acquired by the image region acquisition module 550. Correspondingly, the prediction information fusion module 530 is configured to fuse the target object prediction information, the key point prediction information, and the deflection angle information to obtain the comprehensive prediction information of the object.
In some embodiments, the deflection angle detection module 560A is configured to detect the deflection angle information of the object from the image region using a neural network model for object classification.
In some embodiments, the image to be inspected is a video frame image; after the target object is recognized according to the comprehensive prediction information, the apparatus further includes:
an object tracking module 570, configured to track the target object according to the results of recognizing the target object in multiple video frame images;
or,
a snapshot image selection module 580, configured to select, according to the comprehensive prediction information obtained for each of multiple video frame images, the video frame image with the highest comprehensive prediction quality as the snapshot image;
or,
a feature fusion module 590, configured to select, according to the comprehensive prediction information obtained for each of multiple video frame images, a predetermined number of video frame images and perform feature fusion on the selected video frame images.
According to an embodiment of the present application, the target object may be a human face.
Referring to FIG. 7, the target object recognition apparatus includes, in addition to the aforementioned object detection module 510, key point detection module 520, prediction information fusion module 530, object recognition module 540, and image region acquisition module 550, a face deflection angle detection module 560B.
Before the prediction information fusion module 530 fuses the target object prediction information and the key point prediction information to obtain the comprehensive prediction information of the object, the face deflection angle detection module 560B is configured to detect the face pitch angle and/or the face yaw angle from the image region acquired by the image region acquisition module 550.
Correspondingly, the prediction information fusion module 530 is configured to:
normalize the face pitch angle and/or the face yaw angle according to an applicable exponential function, and multiply the target object prediction information, the key point prediction information, and the normalized face pitch angle to obtain the comprehensive prediction information of the object;
or,
multiply the target object prediction information, the key point prediction information, and the normalized face yaw angle to obtain the comprehensive prediction information of the object;
or,
multiply the target object prediction information, the key point prediction information, the normalized face pitch angle, and the normalized face yaw angle to obtain the comprehensive prediction information of the object.
In some embodiments, the target object recognition apparatus further includes the object tracking module 570, the snapshot image selection module 580, or the feature fusion module 590.
The target object recognition apparatus of this embodiment is used to implement the corresponding target object recognition method of the foregoing method embodiments and has the beneficial effects of the corresponding method embodiments, which are not repeated here.
An embodiment of the present application provides a computer-readable storage medium having computer program instructions stored thereon, where the program instructions, when executed by a processor, implement the steps of the target object recognition method described in any of the foregoing embodiments, with the beneficial effects of the corresponding embodiments, which are not repeated here.
An embodiment of the present application provides an electronic device, which may be, for example, a mobile terminal, a personal computer (PC), a tablet computer, or a server. Referring now to FIG. 8, FIG. 8 is a schematic structural diagram of an electronic device 800 suitable for implementing a terminal device or server of an embodiment of the present application.
As shown in FIG. 8, the electronic device 800 includes one or more processors and communication elements, the one or more processors being, for example, one or more central processing units (CPUs) 801 and/or one or more graphics processing units (GPUs) 813; the processors may perform various appropriate actions and processing according to executable instructions stored in a read-only memory (ROM) 802 or loaded from a storage portion 808 into a random access memory (RAM) 803. The communication elements include a communication component 812 and a communication interface 809. The communication component 812 may include, but is not limited to, a network card, which may include, but is not limited to, an IB (InfiniBand) network card; the communication interface 809 includes a communication interface of a network interface card such as a LAN card or a modem, and performs communication processing via a network such as the Internet.
The processor can communicate with the read-only memory 802 and/or the random access memory 803 to execute executable instructions, is connected to the communication component 812 via a bus 804, and communicates with other target devices via the communication component 812, thereby completing the operations corresponding to any method provided by the embodiments of the present application, for example: performing target object detection on an object in an image to be inspected to obtain target object prediction information of the object, the target object prediction information being confidence information that the detected object is a target object; performing key point detection on the object in the image to be inspected to obtain key point prediction information of the object, the key point prediction information being confidence information that the detected key points of the object are key points of the target object; fusing the target object prediction information and the key point prediction information to obtain comprehensive prediction information of the object; and recognizing the target object according to the comprehensive prediction information.
In addition, the RAM 803 can also store various programs and data required for the operation of the device. The CPU 801, the ROM 802, and the RAM 803 are connected to one another through the bus 804. Where RAM 803 is present, the ROM 802 is an optional module. The RAM 803 stores executable instructions, or writes executable instructions into the ROM 802 at runtime, and the executable instructions cause the processor 801 to perform the operations corresponding to the above communication method. An input/output (I/O) interface 805 is also connected to the bus 804. The communication component 812 may be integrated, or may be configured with multiple sub-modules (for example, multiple IB network cards) on the bus link.
The following components are connected to the I/O interface 805: an input portion 806 including a keyboard, a mouse, and the like; an output portion 807 including a cathode ray tube (CRT), a liquid crystal display (LCD), a speaker, and the like; a storage portion 808 including a hard disk and the like; and a communication interface 809 including a network interface card such as a LAN card or a modem. A drive 810 is also connected to the I/O interface 805 as needed. A removable medium 811, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 810 as needed, so that a computer program read therefrom can be installed into the storage portion 808 as needed.
It should be noted that the architecture shown in FIG. 8 is only one optional implementation. In practice, the number and types of the components in FIG. 8 may be selected, removed, added, or replaced according to actual needs; different functional components may also be set separately or integrated: for example, the GPU and the CPU may be set separately, or the GPU may be integrated on the CPU, and the communication component 812 may be set separately or integrated on the CPU or the GPU, and so on. These alternative implementations all fall within the protection scope of the present application.
In particular, according to the embodiments of the present application, the process described above with reference to the flowcharts may be implemented as a computer software program. For example, an embodiment of the present application includes a computer program product comprising a computer program tangibly embodied on a machine-readable medium; the computer program includes program code for executing the method shown in the flowchart, and the program code may include instructions corresponding to the method steps provided by the embodiments of the present application, for example: executable code for performing target object detection on an object in an image to be inspected and obtaining target object prediction information of the object, the target object prediction information being confidence information that the detected object is a target object; executable code for performing key point detection on the object in the image to be inspected and obtaining key point prediction information of the object, the key point prediction information being confidence information that the detected key points of the object are key points of the target object; executable code for fusing the target object prediction information and the key point prediction information to obtain comprehensive prediction information of the object; and executable code for recognizing the target object according to the comprehensive prediction information. In such an embodiment, the computer program can be downloaded and installed from a network via the communication element, and/or installed from the removable medium 811. When the computer program is executed by the central processing unit (CPU) 801, the above-described functions defined in the methods of the embodiments of the present application are executed.
The electronic device provided by the embodiments of the present application can likewise obtain target object prediction information of an object while performing target object detection on the object in an image to be inspected, obtain key point prediction information of the object while performing key point detection on the image, and fuse the target object prediction information and the key point prediction information to produce a comprehensive prediction evaluation of the object, yielding comprehensive prediction information indicating the overall image quality of the image for target object recognition; the target object is then recognized according to the comprehensive prediction evaluation result. This comprehensive prediction evaluation can filter out images of relatively low overall quality, thereby reducing the false positive rate produced when processing the target object; in addition, the comprehensive evaluation of the object in the image also helps ensure a high recognition rate.
It should be pointed out that, according to implementation needs, each component/step described in the present application may be split into more components/steps, and two or more components/steps or partial operations of components/steps may also be combined into new components/steps to achieve the purposes of the embodiments of the present application.
The methods, apparatuses, and devices of the present application may be implemented in many ways. For example, the methods, apparatuses, and devices of the embodiments of the present application may be implemented by software, hardware, firmware, or any combination of software, hardware, and firmware. The above-described order of the steps of the methods is for illustration only, and the steps of the methods of the embodiments of the present application are not limited to the order specifically described above unless otherwise specifically stated. In addition, in some embodiments, the present application may also be implemented as programs recorded in a recording medium, the programs including machine-readable instructions for implementing the methods according to the embodiments of the present application. Thus, the present application also covers a recording medium storing programs for executing the methods according to the present application.
The description of the embodiments of the present application is given for the sake of example and description, and is not exhaustive or intended to limit the present application to the forms disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art. The embodiments were chosen and described in order to better explain the principles and practical applications of the present application, and to enable those of ordinary skill in the art to understand the present application so as to design various embodiments with various modifications suited to particular uses.

Claims (19)

  1. A target object recognition method, comprising:
    performing target object detection on an object in an image to be inspected to obtain target object prediction information of the object, the target object prediction information being confidence information that the detected object is a target object;
    performing key point detection on the object in the image to be inspected to obtain key point prediction information of the object, the key point prediction information being confidence information that the detected key points of the object are key points of the target object;
    fusing the target object prediction information and the key point prediction information to obtain comprehensive prediction information of the object; and
    recognizing the target object according to the comprehensive prediction information.
  2. The method according to claim 1, wherein before the performing target object detection on the object in the image to be inspected and the performing key point detection on the object in the image to be inspected, the method comprises:
    acquiring an image region corresponding to the object in the image to be inspected;
    the performing target object detection on the object in the image to be inspected comprises:
    performing target object detection on the image region corresponding to the object in the image to be inspected;
    the performing key point detection on the object in the image to be inspected comprises:
    performing key point detection on the image region corresponding to the object in the image to be inspected.
  3. The method according to claim 1 or 2, wherein the fusing the target object prediction information and the key point prediction information to obtain the comprehensive prediction information of the object comprises:
    multiplying the target object prediction information by the key point prediction information to obtain the comprehensive prediction information of the object.
  4. The method according to any one of claims 1 to 3, wherein the performing key point detection on the object in the image to be inspected to obtain the key point prediction information of the object comprises:
    performing key point detection on the object in the image to be inspected by using a neural network model for key point localization, to obtain the key point prediction information of the object.
  5. The method according to any one of claims 2 to 4, wherein after the acquiring the image region corresponding to the object in the image to be inspected, and before the fusing the target object prediction information and the key point prediction information to obtain the comprehensive prediction information of the object, the method further comprises:
    detecting deflection angle information of the object from the image region;
    the fusing the target object prediction information and the key point prediction information to obtain the comprehensive prediction information of the object comprises:
    fusing the target object prediction information, the key point prediction information, and the deflection angle information to obtain the comprehensive prediction information of the object.
  6. The method according to claim 5, wherein the detecting the deflection angle information of the object from the image region comprises:
    detecting the deflection angle information of the object from the image region by using a neural network model for object classification.
  7. The method according to any one of claims 1 to 6, wherein
    the target object is a human face;
    before the fusing the target object prediction information and the key point prediction information to obtain the comprehensive prediction information of the object, the method further comprises:
    detecting a face pitch angle and/or a face yaw angle from the image region;
    the fusing the target object prediction information and the key point prediction information to obtain the comprehensive prediction information of the object comprises:
    normalizing the face pitch angle and/or the face yaw angle according to an applicable exponential function, and multiplying the target object prediction information, the key point prediction information, and the normalized face pitch angle to obtain the comprehensive prediction information of the object;
    or,
    multiplying the target object prediction information, the key point prediction information, and the normalized face yaw angle to obtain the comprehensive prediction information of the object;
    or,
    multiplying the target object prediction information, the key point prediction information, the normalized face pitch angle, and the normalized face yaw angle to obtain the comprehensive prediction information of the object.
  8. The method according to any one of claims 1 to 7, wherein the image to be inspected is a video frame image;
    after the recognizing the target object according to the comprehensive prediction information, the method further comprises:
    tracking the target object according to results of recognizing the target object from a plurality of the video frame images;
    or,
    selecting, according to the comprehensive prediction information obtained for each of a plurality of the video frame images, the video frame image with the highest comprehensive prediction quality from the plurality of video frame images as a snapshot image;
    or,
    selecting, according to the comprehensive prediction information obtained for each of a plurality of the video frame images, a predetermined number of video frame images from the plurality of video frame images, and performing feature fusion on the selected video frame images.
  9. A target object recognition apparatus, comprising:
    an object detection module, configured to perform target object detection on an object in an image to be inspected to obtain target object prediction information of the object, the target object prediction information being confidence information that the detected object is a target object;
    a key point detection module, configured to perform key point detection on the object in the image to be inspected to obtain key point prediction information of the object, the key point prediction information being confidence information that the detected key points of the object are key points of the target object;
    a prediction information fusion module, configured to fuse the target object prediction information obtained by the object detection module and the key point prediction information obtained by the key point detection module to obtain comprehensive prediction information of the object; and
    an object recognition module, configured to recognize the target object according to the comprehensive prediction information obtained by the prediction information fusion module.
  10. The apparatus according to claim 9, wherein the apparatus further comprises:
    an image region acquisition module, configured to acquire an image region corresponding to the object in the image to be inspected;
    the object detection module is configured to perform target object detection on the image region, acquired by the image region acquisition module, corresponding to the object in the image to be inspected;
    the key point detection module is configured to perform key point detection on the image region, acquired by the image region acquisition module, corresponding to the object in the image to be inspected.
  11. The apparatus according to claim 9 or 10, wherein the prediction information fusion module is configured to multiply the target object prediction information by the key point prediction information to obtain the comprehensive prediction information of the object.
  12. The apparatus according to any one of claims 9 to 11, wherein the key point detection module is configured to perform key point detection on the object in the image to be inspected by using a neural network model for key point localization, to obtain the key point prediction information of the object.
  13. The apparatus according to any one of claims 10 to 12, wherein after the image region corresponding to the object in the image to be inspected is acquired, and before the target object prediction information and the key point prediction information are fused to obtain the comprehensive prediction information of the object, the apparatus further comprises:
    a deflection angle detection module, configured to detect deflection angle information of the object from the image region acquired by the image region acquisition module;
    the prediction information fusion module is configured to fuse the target object prediction information, the key point prediction information, and the deflection angle information to obtain the comprehensive prediction information of the object.
  14. The apparatus according to claim 13, wherein the deflection angle detection module is configured to detect the deflection angle information of the object from the image region by using a neural network model for object classification.
  15. The apparatus according to any one of claims 9 to 14, wherein
    the target object is a human face;
    before the target object prediction information and the key point prediction information are fused to obtain the comprehensive prediction information of the object, the apparatus further comprises:
    a face deflection angle detection module, configured to detect a face pitch angle and/or a face yaw angle from the image region;
    the prediction information fusion module is configured to normalize the face pitch angle and/or the face yaw angle according to an applicable exponential function, and multiply the target object prediction information, the key point prediction information, and the normalized face pitch angle to obtain the comprehensive prediction information of the object;
    or,
    multiply the target object prediction information, the key point prediction information, and the normalized face yaw angle to obtain the comprehensive prediction information of the object;
    or,
    multiply the target object prediction information, the key point prediction information, the normalized face pitch angle, and the normalized face yaw angle to obtain the comprehensive prediction information of the object.
  16. The apparatus according to any one of claims 9 to 15, wherein the image to be inspected is a video frame image;
    after the target object is recognized according to the comprehensive prediction information, the apparatus further comprises:
    an object tracking module, configured to track the target object according to results of recognizing the target object from a plurality of the video frame images;
    or,
    a snapshot selection module, configured to select, according to the comprehensive prediction information obtained for each of a plurality of the video frame images, the video frame image with the highest comprehensive prediction quality from the plurality of video frame images as a snapshot image;
    or,
    a feature fusion module, configured to select, according to the comprehensive prediction information obtained for each of a plurality of the video frame images, a predetermined number of video frame images from the plurality of video frame images, and perform feature fusion on the selected video frame images.
  17. An electronic device, comprising: a processor, a memory, a communication element, and a communication bus, wherein the processor, the memory, and the communication element communicate with one another through the communication bus;
    the memory is configured to store at least one executable instruction, and the executable instruction causes the processor to perform operations corresponding to the target object recognition method according to any one of claims 1 to 8.
  18. A computer-readable storage medium on which computer program instructions are stored, wherein the program instructions, when executed by a processor, implement the steps of the target object recognition method according to any one of claims 1 to 8.
  19. A computer program, comprising computer program instructions, wherein the program instructions, when executed by a processor, implement the steps of the target object recognition method according to any one of claims 1 to 8.
PCT/CN2018/111513 2017-11-23 2018-10-23 Target object recognition method and apparatus, storage medium, and electronic device WO2019100888A1

Priority Applications (4)

Application Number Priority Date Filing Date Title
JP2020500847A 2017-11-23 2018-10-23 Target object recognition method, apparatus, storage medium, and electronic device
SG11202000076WA SG11202000076WA (en) 2017-11-23 2018-10-23 Target object recognition method and device, storage medium, and electronic apparatus
KR1020207000574A 2017-11-23 2018-10-23 Target object recognition method, apparatus, storage medium, and electronic device
US16/734,336 US11182592B2 (en) 2017-11-23 2020-01-05 Target object recognition method and apparatus, storage medium, and electronic device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201711181299.5A 2017-11-23 2017-11-23 Target object recognition method and apparatus, storage medium, and electronic device
CN201711181299.5 2017-11-23

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US16/734,336 Continuation US11182592B2 (en) 2017-11-23 2020-01-05 Target object recognition method and apparatus, storage medium, and electronic device

Publications (1)

Publication Number Publication Date
WO2019100888A1 true WO2019100888A1 (zh) 2019-05-31

Family

ID=62652693

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/111513 WO2019100888A1 (zh) 2017-11-23 2018-10-23 目标对象识别方法、装置、存储介质和电子设备

Country Status (6)

Country Link
US (1) US11182592B2 (zh)
JP (1) JP6994101B2 (zh)
KR (1) KR20200015728A (zh)
CN (1) CN108229308A (zh)
SG (1) SG11202000076WA (zh)
WO (1) WO2019100888A1 (zh)


Families Citing this family (17)

Publication number Priority date Publication date Assignee Title
WO2018033137A1 2016-08-19 2018-02-22 北京市商汤科技开发有限公司 Method, apparatus, and electronic device for presenting a business object in a video image
CN108229308A 2017-11-23 2018-06-29 北京市商汤科技开发有限公司 Target object recognition method and apparatus, storage medium, and electronic device
CN109101901B 2018-07-23 2020-10-27 北京旷视科技有限公司 Human body action recognition and neural network generation method therefor, apparatus, and electronic device
TWI751381B 2018-09-26 2022-01-01 宏碁股份有限公司 Performance evaluation method and system for machine vision
CN109448007B 2018-11-02 2020-10-09 北京迈格威科技有限公司 Image processing method, image processing apparatus, and storage medium
CN111274852B 2018-12-05 2023-10-31 北京猎户星空科技有限公司 Target object key point detection method and apparatus
CN109800680A 2018-12-29 2019-05-24 上海依图网络科技有限公司 Method and apparatus for determining an object in a video
CN110335313B 2019-06-17 2022-12-09 腾讯科技(深圳)有限公司 Audio acquisition device positioning method and apparatus, and speaker recognition method and system
CN110532891B 2019-08-05 2022-04-05 北京地平线机器人技术研发有限公司 Target object state recognition method, apparatus, medium, and device
CN111062239A 2019-10-15 2020-04-24 平安科技(深圳)有限公司 Human body target detection method and apparatus, computer device, and storage medium
CN111105442B 2019-12-23 2022-07-15 中国科学技术大学 Switched target tracking method
CN111079717B 2020-01-09 2022-02-22 西安理工大学 Face recognition method based on reinforcement learning
CN111507244B 2020-04-15 2023-12-08 阳光保险集团股份有限公司 BMI detection method, apparatus, and electronic device
CN112612434A 2020-12-16 2021-04-06 杭州当虹科技股份有限公司 Vertical-screen video solution method based on AI technology
JPWO2022130616A1 (zh) * 2020-12-18 2022-06-23
CN114220063B 2021-11-17 2023-04-07 浙江大华技术股份有限公司 Target detection method and apparatus
CN115631525B 2022-10-26 2023-06-23 万才科技(杭州)有限公司 Instant insurance matching method based on face edge point recognition


Family Cites Families (18)

Publication number Priority date Publication date Assignee Title
JP4264663B2 2006-11-21 2009-05-20 ソニー株式会社 Imaging apparatus, image processing apparatus, image processing methods therein, and program for causing a computer to execute the methods
JP4389956B2 2007-04-04 2009-12-24 ソニー株式会社 Face recognition apparatus, face recognition method, and computer program
JP4999570B2 2007-06-18 2012-08-15 キヤノン株式会社 Facial expression recognition apparatus and method, and imaging apparatus
JP5072757B2 2008-07-24 2012-11-14 キヤノン株式会社 Image processing apparatus, image processing method, and program
AU2012219026B2 (en) 2011-02-18 2017-08-03 Iomniscient Pty Ltd Image quality assessment
US9858501B2 (en) 2012-02-16 2018-01-02 Nec Corporation Reliability acquiring apparatus, reliability acquiring method, and reliability acquiring program
JP6049448B2 2012-12-27 2016-12-21 キヤノン株式会社 Subject region tracking apparatus, control method therefor, and program
JP6222948B2 2013-03-14 2017-11-01 セコム株式会社 Feature point extraction apparatus
WO2014205768A1 2013-06-28 2014-12-31 中国科学院自动化研究所 Face tracking method with mutual matching of features and model, based on incremental principal component analysis
KR101612605B1 2014-05-07 2016-04-14 포항공과대학교 산학협력단 Facial feature point extraction method and apparatus for performing the same
CN105868769A 2015-01-23 2016-08-17 阿里巴巴集团控股有限公司 Method and apparatus for locating face key points in an image
CN105205486B 2015-09-15 2018-12-07 浙江宇视科技有限公司 Vehicle logo recognition method and apparatus
CN105631439B 2016-02-18 2019-11-08 北京旷视科技有限公司 Face image processing method and apparatus
CN106295567B 2016-08-10 2019-04-12 腾讯科技(深圳)有限公司 Key point locating method and terminal
CN106815566B 2016-12-29 2021-04-16 天津中科智能识别产业技术研究院有限公司 Face retrieval method based on a multi-task convolutional neural network
WO2018153267A1 2017-02-24 2018-08-30 腾讯科技(深圳)有限公司 Group video session method and network device
CN107273845B 2017-06-12 2020-10-02 大连海事大学 Facial expression recognition method based on confidence regions and weighted multi-feature fusion
WO2019000462A1 2017-06-30 2019-01-03 广东欧珀移动通信有限公司 Face image processing method and apparatus, storage medium, and electronic device

Patent Citations (4)

Publication number Priority date Publication date Assignee Title
KR20160033552A 2014-09-18 2016-03-28 한화테크윈 주식회사 Face recognition system and method based on key point descriptor matching and majority voting
CN106485230A 2016-10-18 2017-03-08 中国科学院重庆绿色智能技术研究院 Training of a neural-network-based face detection model, and face detection method and system
CN106778585A 2016-12-08 2017-05-31 腾讯科技(上海)有限公司 Face key point tracking method and apparatus
CN108229308A 2017-11-23 2018-06-29 北京市商汤科技开发有限公司 Target object recognition method and apparatus, storage medium, and electronic device

Cited By (7)

Publication number Priority date Publication date Assignee Title
JP2022503426A 2019-09-27 2022-01-12 ベイジン センスタイム テクノロジー デベロップメント カンパニー, リミテッド Human body detection method, apparatus, computer device, and storage medium
JP7101829B2 2019-09-27 2022-07-15 ベイジン・センスタイム・テクノロジー・デベロップメント・カンパニー・リミテッド Human body detection method, apparatus, computer device, and storage medium
US20210279883A1 (en) * 2020-03-05 2021-09-09 Alibaba Group Holding Limited Image processing method, apparatus, electronic device, and storage medium
US11816842B2 (en) * 2020-03-05 2023-11-14 Alibaba Group Holding Limited Image processing method, apparatus, electronic device, and storage medium
CN113657155A 2021-07-09 2021-11-16 浙江大华技术股份有限公司 Behavior detection method and apparatus, computer device, and storage medium
CN113505763A 2021-09-09 2021-10-15 北京爱笔科技有限公司 Key point detection method and apparatus, electronic device, and storage medium
CN113505763B 2021-09-09 2022-02-01 北京爱笔科技有限公司 Key point detection method and apparatus, electronic device, and storage medium

Also Published As

Publication number Publication date
US20200143146A1 (en) 2020-05-07
US11182592B2 (en) 2021-11-23
KR20200015728A (ko) 2020-02-12
JP2020527792A (ja) 2020-09-10
JP6994101B2 (ja) 2022-01-14
CN108229308A (zh) 2018-06-29
SG11202000076WA (en) 2020-02-27

Similar Documents

Publication Publication Date Title
WO2019100888A1 Target object recognition method and apparatus, storage medium, and electronic device
US10753881B2 (en) Methods and systems for crack detection
US10506174B2 (en) Information processing apparatus and method for identifying objects and instructing a capturing apparatus, and storage medium for performing the processes
US10055843B2 (en) System and methods for automatic polyp detection using convulutional neural networks
CN105938622B Method and apparatus for detecting an object in a moving image
US20080267458A1 (en) Face image log creation
WO2019020103A1 (zh) 目标识别方法、装置、存储介质和电子设备
US9626577B1 (en) Image selection and recognition processing from a video feed
US10185886B2 (en) Image processing method and image processing apparatus
CN109271848B Face detection method, face detection apparatus, and storage medium
US20210124928A1 (en) Object tracking methods and apparatuses, electronic devices and storage media
JP2006048322A Object image detection apparatus, face image detection program, and face image detection method
US9721153B2 (en) Image processing apparatus, image processing method, and storage medium that recognize an image based on a designated object type
US10664523B2 (en) Information processing apparatus, information processing method, and storage medium
US20180046842A1 (en) Line-of-sight detection device and line-of-sight detection method
CN108229289B Target retrieval method, apparatus, and electronic device
CN109145752B Method, apparatus, device, and medium for evaluating object detection and tracking algorithms
JP2018124689A Moving object detection apparatus, moving object detection system, and moving object detection method
US20240104769A1 (en) Information processing apparatus, control method, and non-transitory storage medium
US20180314893A1 (en) Information processing device, video image monitoring system, information processing method, and recording medium
JP2005134966A Face image candidate region search method, search system, and search program
US11132778B2 (en) Image analysis apparatus, image analysis method, and recording medium
JP2006323779A Image processing method and image processing apparatus
KR20120049605A Pupil center detection apparatus and method
Bâce et al. Accurate and robust eye contact detection during everyday mobile device interactions

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18880809

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 20207000574

Country of ref document: KR

Kind code of ref document: A

ENP Entry into the national phase

Ref document number: 2020500847

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS (EPO FORM 1205A DATED 10.09.2020)

122 Ep: pct application non-entry in european phase

Ref document number: 18880809

Country of ref document: EP

Kind code of ref document: A1