WO2022111044A1 - Image processing method and apparatus, and terminal control method and apparatus - Google Patents

Image processing method and apparatus, and terminal control method and apparatus Download PDF

Info

Publication number
WO2022111044A1
WO2022111044A1 (PCT/CN2021/121457)
Authority
WO
WIPO (PCT)
Prior art keywords
object detection
detection frame
translated
face
target
Prior art date
Application number
PCT/CN2021/121457
Other languages
French (fr)
Chinese (zh)
Inventor
黄耿石
滕家宁
邵婧
Original Assignee
Shenzhen SenseTime Technology Co., Ltd. (深圳市商汤科技有限公司)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen SenseTime Technology Co., Ltd. (深圳市商汤科技有限公司)
Publication of WO2022111044A1 publication Critical patent/WO2022111044A1/en

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/40Spoof detection, e.g. liveness detection
    • G06V40/45Detection of the body part being alive
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161Detection; Localisation; Normalisation
    • G06V40/166Detection; Localisation; Normalisation using acquisition arrangements
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172Classification, e.g. identification
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00Cameras or camera modules comprising electronic image sensors; Control thereof
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07Target detection

Definitions

  • the present disclosure relates to the technical field of image processing, and in particular, to an image processing method and device, and a terminal control method and device.
  • live body detection can be performed based on a set of images collected by a binocular camera.
  • liveness detection on a set of images collected by each module can be implemented by using a liveness detection model.
  • the embodiments of the present disclosure provide at least an image processing method and apparatus, and a terminal control method and apparatus.
  • an embodiment of the present disclosure provides an image processing method, including: acquiring two images to be detected, obtained by each camera of a binocular camera shooting a target object; performing target object detection on the two images to be detected respectively, to obtain the object detection frame of the target object in each image to be detected; performing expansion processing on each object detection frame and translating at least one expanded object detection frame, to obtain a translated object detection frame; and determining the recognition result of the target object based at least on the translated object detection frame.
  • the target object detection can be performed on the images to be detected to obtain the object detection frame of the target object in each image to be detected.
  • after the two object detection frames are expanded, one or both of the expanded object detection frames can be translated, so that the recognition result of the target object can be determined based on the processed object detection frames.
  • the above image processing method can focus on the target object based on target object detection, which initially reduces the influence of the baseline. Considering that recognizing the target object from the two images to be detected collected by the binocular camera requires depth information about the target object, which is determined from the parallax formed by the module's baseline, the embodiments of the present disclosure further achieve the effect of simulating the parallax of the human eye through the cooperative operations of expanding and translating the object detection frame, obtaining processed object detection frames for identifying the target object.
  • the embodiments of the present disclosure are highly versatile, and can achieve the purpose of different modules sharing a set of pseudo-baselines to simulate human eye parallax, thereby improving the generalization capability of the modules and reducing the time cost of subsequent applications such as target object recognition.
  • the binocular camera includes a first camera and a second camera; in the case of translating one expanded object detection frame, translating the at least one expanded object detection frame to obtain the translated object detection frame includes: expanding the object detection frame detected in the image to be detected collected by the first camera, to obtain a detection frame to be translated; and translating the detection frame to be translated in a direction away from the second camera, to obtain the translated object detection frame.
  • the translation direction of the detection frame to be translated can be determined based on the relative positional relationship between the two cameras included in the binocular camera.
  • the detection frame to be translated corresponds to the first camera.
  • the to-be-translated detection frame may be translated in a direction away from the second camera, and the to-be-translated detection frame moved based on this translation direction may meet the parallax requirement of the pseudo baseline.
  • the step of translating the detection frame to be translated in a direction away from the second camera to obtain the translated object detection frame includes: determining a translation distance based on size information of the detection frame to be translated; and moving the detection frame to be translated by the translation distance in the direction away from the second camera, to obtain the translated object detection frame.
  • determining the translation distance based on the size information of the detection frame to be translated includes: determining the translation distance based on the width value in the size information of the detection frame to be translated and a preset translation coefficient.
  • the size information of the object detection frame after imaging can be used to determine the translation distance for simulating parallax, and the pseudo-baseline constructed from this translation distance then achieves the recognition effect of binocular parallax.
  • translating the at least one expanded object detection frame to obtain the translated object detection frame includes: translating each of the two expanded object detection frames in a direction away from the other object detection frame, to obtain two translated object detection frames.
  • performing expansion processing on each object detection frame includes: for each object detection frame, determining, in the image to be detected corresponding to the object detection frame, the position coordinates of the corner points of the object detection frame; and performing the expansion processing on the object detection frame based on the determined corner position coordinates and a preset expansion ratio, to obtain the expanded object detection frame corresponding to that object detection frame.
  • the object detection frame can be expanded based on the corner position coordinates of the object detection frame and the preset expansion ratio.
  • the obtained object detection frame can not only help construct a pseudo-baseline, but also improve the accuracy of subsequent result recognition.
  • determining the recognition result of the target object based at least on the translated object detection frame includes: in the case where one expanded object detection frame is translated, determining the recognition result of the target object based on the one translated object detection frame and one expanded but untranslated object detection frame; or, in the case where both expanded object detection frames are translated, determining the recognition result of the target object based on the two translated object detection frames.
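  • The expand-then-translate flow summarized above can be sketched in code. This is only an illustration of the claimed steps, not the patent's implementation: the (x1, y1, x2, y2) box format, the expansion ratio, and the translation coefficient are all assumed values.

```python
def expand_box(box, ratio):
    """Expand a box (x1, y1, x2, y2) outward by `ratio` of its size on each side."""
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    return (x1 - w * ratio, y1 - h * ratio, x2 + w * ratio, y2 + h * ratio)

def translate_box(box, distance):
    """Shift a box horizontally; a positive distance moves it right."""
    x1, y1, x2, y2 = box
    return (x1 + distance, y1, x2 + distance, y2)

def process_pair(box_left, box_right, expand_ratio=0.4, shift_coeff=0.1):
    """Expand both detection boxes, then shift the right-camera box away
    from the left camera (rightward), as in the single-frame variant."""
    left = expand_box(box_left, expand_ratio)
    right = expand_box(box_right, expand_ratio)
    shift = (right[2] - right[0]) * shift_coeff  # distance = width * coefficient
    return left, translate_box(right, shift)
```

The processed pair would then be handed to whatever recognition model the module uses.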
  • an embodiment of the present disclosure further provides a terminal control method, where the terminal is provided with a binocular camera, and the method includes: acquiring a set of face images of a target face shot by the binocular camera, the set of face images including a first face image captured by a first camera of the binocular camera and a second face image captured by a second camera of the binocular camera; obtaining, through the image processing method of the first aspect and its embodiments, a recognition result corresponding to the set of face images, where the recognition result includes whether the target face is a real face; and, in response to the recognition result including that the target face is a real face and the person corresponding to the target face having passed identity authentication, controlling the terminal to perform a specified operation.
  • an embodiment of the present disclosure further provides an image processing apparatus, including: an acquisition module, configured to acquire two images to be detected, obtained by each camera of a binocular camera shooting a target object; a detection module, configured to perform target object detection on the two images to be detected respectively, to obtain an object detection frame of the target object in each image to be detected; an expansion module, configured to perform expansion processing on each object detection frame and translate at least one expanded object detection frame, to obtain a translated object detection frame; and a determining module, configured to determine a recognition result of the target object based at least on the translated object detection frame.
  • an embodiment of the present disclosure further provides a terminal control device, including: an acquisition module, configured to acquire a set of face images of a target face captured by the binocular camera, the set of face images including a first face image captured by the first camera of the binocular camera and a second face image captured by the second camera of the binocular camera; a determination module, configured to obtain, through the image processing method according to any one of the first aspect and its embodiments, a recognition result corresponding to the set of face images, where the recognition result includes whether the target face is a real face; and a control module, configured to control the terminal to perform a specified operation in response to the recognition result including that the target face is a real face and the person corresponding to the target face having passed identity authentication.
  • embodiments of the present disclosure further provide an electronic device, including a processor, a memory, and a bus, where the memory stores machine-readable instructions executable by the processor; when the electronic device runs, the processor and the memory communicate through the bus, and when the machine-readable instructions are executed by the processor, they perform the steps of the image processing method according to any one of the first aspect and its embodiments, or the steps of the terminal control method according to the second aspect.
  • an embodiment of the present disclosure further provides a computer-readable storage medium on which a computer program is stored; when the computer program is run by an electronic device, the electronic device executes the steps of the image processing method according to any one of the first aspect and its embodiments, or the steps of the terminal control method according to the second aspect.
  • an embodiment of the present disclosure further provides a computer program, including computer-readable code, which, when run in an electronic device, causes a processor in the electronic device to execute the steps of the method according to the first aspect and its embodiments.
  • FIG. 1 shows a flowchart of an image processing method provided by Embodiment 1 of the present disclosure
  • FIG. 2( a ) shows a schematic diagram of an application of an image processing method provided by Embodiment 1 of the present disclosure
  • FIG. 2(b) shows a schematic diagram of the application of an image processing method provided by Embodiment 1 of the present disclosure
  • Fig. 2(c) shows a schematic diagram of the application of an image processing method provided by Embodiment 1 of the present disclosure
  • FIG. 2(d) shows an application schematic diagram of an image processing method provided by Embodiment 1 of the present disclosure
  • FIG. 3 shows a flowchart of a terminal control method provided by Embodiment 1 of the present disclosure
  • FIG. 4 shows a schematic diagram of an image processing apparatus provided by Embodiment 2 of the present disclosure
  • FIG. 5 shows a schematic diagram of a terminal control apparatus provided by Embodiment 2 of the present disclosure
  • FIG. 6 shows a schematic diagram of an electronic device according to Embodiment 3 of the present disclosure
  • FIG. 7 shows a schematic diagram of another electronic device provided by Embodiment 3 of the present disclosure.
  • liveness detection on a set of images collected by each module can often be performed using a liveness detection model. For example, a picture of the object obtained by the left-eye camera of the binocular camera and another picture of the object obtained by the right-eye camera are input into the liveness detection model at the same time, so as to finally obtain a result indicating whether the object is a living body.
  • the present disclosure provides an image processing method and device, and a terminal control method and device, so as to improve the generalization capability of the module and reduce the time cost of subsequent applications such as target object recognition.
  • the execution body of the image processing method provided by the embodiments of the present disclosure is generally an electronic device with certain computing capability. Such devices include, for example, terminal devices, servers, or other processing devices; the terminal devices may be user equipment (UE), mobile devices, user terminals, cellular phones, cordless phones, personal digital assistants (PDA), handheld devices, computing devices, in-vehicle devices, wearable devices, and the like.
  • the image processing method may be implemented by the processor calling computer-readable instructions stored in the memory.
  • FIG. 1 is a flowchart of an image processing method provided by an embodiment of the present disclosure, the method includes steps S101-S104, wherein:
  • the binocular camera may include an RGB (Red Green Blue) camera and a near-infrared camera, that is, the two images to be detected may be an RGB image and an infrared image, respectively.
  • alternatively, the binocular camera may include two near-infrared cameras, in which case the two images to be detected are both infrared images.
  • the binocular camera may also include two RGB cameras, in which case the two images to be detected are both RGB images. The present application does not limit the specific structure of the binocular camera.
  • the above image processing method can be mainly applied to related applications of target recognition based on binocular cameras. For example, it can be used to perform liveness detection on faces captured by binocular cameras, and it can also be used to perform license plate recognition on vehicles captured by binocular cameras. It may also be other related applications, which are not specifically limited here.
  • two images may be acquired based on the binocular camera shooting the target object in the same scene.
  • a disparity map is obtained by using a stereo matching algorithm, and a depth map is then obtained to realize target recognition.
  • because the relative distance between the two cameras (the baseline) differs across modules, even if the same recognition method is used on the same target, the recognition results may differ due to the baseline. This is especially true when a target detection model is used for recognition: since training the model requires a large number of image samples, if the baseline corresponding to the image samples differs from the baseline of the module on which the target detection model is deployed, the accuracy of the model will be greatly reduced.
  • although different object detection models could be trained for different modules to ensure model accuracy, this approach would greatly increase the training cost.
  • the embodiments of the present disclosure provide an image processing method that can provide a general pseudo-baseline for different modules, and then can perform target recognition based on images collected by different modules.
  • target object detection can reduce the influence of the baseline of the current module; then, to facilitate subsequent target recognition, a pseudo-baseline can be constructed through the cooperation of expansion and translation, so that target recognition is achieved on a pseudo-baseline that eliminates the influence of the original baseline.
  • the two to-be-detected images collected by the binocular camera in the embodiment of the present disclosure may be determined based on the application scenario where the binocular camera is located.
  • for example, in a face recognition application, the two images to be detected may be images containing a human face; for another example, in an intelligent transportation application, they may be images containing vehicles.
  • performing target object detection on the images to be detected obtained by the two cameras yields object detection frames that eliminate the influence of the original baseline caused by the parallax.
  • the object detection frame where the target object is located can be detected from the image to be detected based on the traditional target object detection method.
  • the target object detection method here may be a frame difference method, a background subtraction method, an optical flow method, and the like.
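  • Of the traditional methods listed, the frame difference method is the simplest to illustrate; the following is a minimal sketch, where the nested-list grayscale frames and the threshold value are assumptions for illustration only:

```python
def frame_difference(prev_frame, curr_frame, threshold=25):
    """Return a binary mask marking pixels whose grayscale intensity
    changed by more than `threshold` between two consecutive frames
    (frames given as nested lists of equal shape)."""
    return [
        [1 if abs(c - p) > threshold else 0 for p, c in zip(prow, crow)]
        for prow, crow in zip(prev_frame, curr_frame)
    ]
```

A bounding box around the connected region of changed pixels would then serve as the object detection frame.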
  • in addition to using the above-mentioned traditional methods, object detection can also be performed based on a trained detection model.
  • the detection model here can be obtained by training on image samples annotated with object detection frames, where the training learns the correspondence between an input image sample and the output object detection frame.
  • the object detection frame in the image to be detected can be determined.
  • the detection model may be a separate neural network, or may be included in the above target detection model.
  • on this basis, the embodiments of the present disclosure can create a common pseudo-baseline through the translation operation.
  • the target object selected by the object detection frame is a complete object, such as a target face containing the facial features, hair, and neck. If the object detection frame were translated directly, the target face might become incomplete, which is not conducive to subsequent object recognition. Based on this, in the embodiments of the present disclosure, the expansion operation may be performed before the translation operation.
  • the above expansion operation enlarges the image area framed by the object detection frame to a certain extent. Considering that a larger image area contains more information, this can on the one hand improve the accuracy of subsequent target recognition, and on the other hand provide room for the subsequent translation operation.
  • the translation operation in this embodiment of the present disclosure may be performed on one of the two expanded object detection frames, in which case the other camera is used as the reference for translation; the translation operation may also be performed on both object detection frames, in which case the center position of the two cameras may be used as the reference for translation.
  • the recognition result of the target object may be determined based on one translated object detection frame and one object detection frame that has not been translated, or based on two translated object detection frames.
  • the trained target recognition model can be used to perform target recognition on the object detection frame after translation processing, so as to determine the recognition result of the target object.
  • the target recognition model in the embodiments of the present disclosure may be a liveness detection model related to face recognition. Inputting the above one translated object detection frame and one untranslated object detection frame into the trained liveness detection model can determine whether the target face corresponding to the object detection frames is a real face; alternatively, inputting the above two translated object detection frames into the trained liveness detection model can likewise determine whether the target face is a real face.
  • the above-mentioned target recognition model can also be a vehicle detection model related to vehicle recognition.
  • inputting the above one translated object detection frame and one untranslated object detection frame into the trained vehicle detection model can determine the type information of the target vehicle corresponding to the object detection frames; alternatively, inputting the above two translated object detection frames into the trained vehicle detection model can likewise determine the type information of the target vehicle.
  • for example, two images to be detected are collected by a binocular camera set up for face liveness detection: the right image is the face image collected by the right-eye camera (such as a near-infrared camera) of the binocular camera. After target face detection is performed on the two face images, the object detection frame of the target face can be generated in each of the two images, as shown in Figure 2(b).
  • the embodiment of the present disclosure may then perform expansion processing on the object detection frames, as shown in FIG. 2(c).
  • the object detection frame in the right image of Figure 2(c) can be translated to obtain the translated object detection frame, as shown in the right image of Figure 2(d); the left image of Figure 2(d) is the same as the left image of Figure 2(c), containing the expanded but untranslated object detection frame.
  • the two images shown in FIG. 2(d) can be cropped based on the object detection frames to obtain two corresponding face images.
  • it can be determined whether the target face is a real face.
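  • The cropping step just described can be sketched as follows; the helper and its (x1, y1, x2, y2) box format are hypothetical, and a real pipeline would also resize each crop to the liveness model's input size:

```python
def crop_to_box(image, box):
    """Crop a grayscale image (nested lists) to a box (x1, y1, x2, y2),
    clipping to the image bounds so an expanded or translated box never
    indexes outside the image."""
    h, w = len(image), len(image[0])
    x1, y1, x2, y2 = box
    x1, y1 = max(0, int(x1)), max(0, int(y1))
    x2, y2 = min(w, int(x2)), min(h, int(y2))
    return [row[x1:x2] for row in image[y1:y2]]
```

The two crops (one per camera) would then be fed together to the liveness detection model.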
  • the image on the right in Figure 2(d) is an image collected by a near-infrared camera
  • the embodiments of the present disclosure may, as in the above example, perform translation processing on the image collected by the right-eye camera, or instead on the image collected by the left-eye camera; the choice can be made based on the application scenario and is not specifically limited here.
  • the expansion processing may be performed on the object detection frame of each to-be-detected image by the following operations:
  • the position coordinates of the corner points of the object detection frame are determined; the object detection frame is expanded based on the determined position coordinates of the corner points and the preset expansion ratio to obtain the expanded object detection frame.
  • the position coordinates of the corner points of the object detection frame in the image to be detected can be determined, and the position coordinates of the corner points can correspond to the image coordinates of the four corners of the object detection frame.
  • the object detection frame may be expanded based on the position coordinates of the corner points and the preset expansion ratio to obtain the processed object detection frame.
  • the above-mentioned preset expansion ratio may be determined based on the actual size of the target object, mainly considering that different target objects have different actual sizes and thus different imaging sizes. In the same scene, when the binocular camera captures a face and a vehicle at the same position, the vehicle detection frame corresponding to the vehicle is much larger than the face detection frame corresponding to the face.
  • accordingly, a larger expansion coefficient can be set for a target object with a larger size, so as to cover the surrounding-area information of a target object with a larger imaging size, and a smaller expansion coefficient can be set for a target object with a smaller size, sufficient to cover the surrounding-area information of a target object with a smaller imaging size.
  • the expansion processing can also be realized by dragging the four corner points of the object detection frame outward. Alternatively, the lengths of the four borders (the upper, lower, left, and right borders) can be determined based on the position coordinates of the four corner points, and each border can then be expanded outward according to a preset expansion ratio set for that border; in a specific application, this can be achieved by dragging the borders outward.
  • for example, the left, right, upper, and lower borders can be expanded by 0.4, 0.4, 0.8, and 0.4 times respectively, so that the dimensions of all four borders of the expanded face object frame change.
  • the upper border is expanded by a larger multiple mainly because the eyes, which the upper border adjoins, lie in the upper part of the facial features detected by the face detection frame; therefore the upper border can appropriately be expanded by a larger multiple.
  • in addition, the expansion processing may be performed on one of the four borders alone, or on a pair of borders (e.g., the left and right borders) at the same time, or in other expansion manners, which are not repeated here.
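  • A per-border expansion following the 0.4/0.4/0.8/0.4 example above might look like this; the (x1, y1, x2, y2) box format and the coordinate convention (y increasing downward) are assumptions for illustration:

```python
def expand_box_per_edge(box, left=0.4, right=0.4, top=0.8, bottom=0.4):
    """Expand each border of a box (x1, y1, x2, y2) outward by a per-border
    fraction of the box's width/height; the larger `top` ratio mirrors the
    example of expanding the upper border by a bigger multiple."""
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    # y increases downward, so the upper border moves in the -y direction.
    return (x1 - w * left, y1 - h * top, x2 + w * right, y2 + h * bottom)
```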
  • the binocular camera includes a first camera and a second camera.
  • the translation processing may be performed by the following operations:
  • a detection frame to be translated can be selected from the two expanded object detection frames.
  • for example, if the object detection frame corresponding to the right-eye camera (the right detection frame) is selected as the detection frame to be translated, its translation direction can be determined from the relative positional relationship between the right-eye and left-eye cameras: since the right-eye camera is on the right, the right detection frame is translated to the right (i.e., away from the left-eye camera). If the object detection frame corresponding to the left-eye camera (the left detection frame) is selected instead, its translation direction is to the left.
  • in addition to translating in the above-mentioned translation direction, the translation may also be performed in combination with a translation distance.
  • the translation distance may be determined based on the size information of the detection frame to be translated.
  • when the size information of the detection frame to be translated is large, the target object is relatively close to the camera, and a closer target object has a larger parallax, so the required translation distance is large. When the size information of the detection frame to be translated is small, the target object is relatively far from the camera, its parallax is small, and so the required translation distance is small.
  • the translation distance can be determined based on a preset translation coefficient and the width value in the size information of the detection frame; that is, the translation distance is the product of the preset translation coefficient and the width value.
  • the preset translation coefficient is related to the distance between the two cameras in the binocular camera.
  • the width value of the detection frame corresponds to the size of the horizontal side of the detection frame.
  • the translation distance is thus proportional to the width: a detection frame with a larger width yields a larger translation distance, to balance the larger parallax of a large-sized target object, while a detection frame with a smaller width yields a smaller translation distance, to balance the smaller parallax of a small-sized target object. In this way, a common pseudo-baseline suitable for various target objects can be constructed.
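  • The width-proportional translation can be sketched as follows; the coefficient value and the side-naming convention are assumptions for illustration:

```python
def translation_distance(box, coeff=0.1):
    """distance = preset coefficient * box width, so larger (closer) targets
    get the larger shift needed to mimic their larger parallax."""
    x1, _, x2, _ = box
    return (x2 - x1) * coeff

def translate_away(box, other_camera_side, coeff=0.1):
    """Shift the box (x1, y1, x2, y2) away from the other camera: if the
    other camera is to the left, shift right (+); otherwise shift left (-)."""
    d = translation_distance(box, coeff)
    sign = 1 if other_camera_side == "left" else -1
    x1, y1, x2, y2 = box
    return (x1 + sign * d, y1, x2 + sign * d, y2)
```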
  • each of the two expanded object detection frames may be translated in the direction away from the other object detection frame, to obtain the processed object detection frames.
  • The translation directions of the left detection frame and the right detection frame can be determined based on the relative positional relationship between the right-eye camera and the left-eye camera included in the binocular camera.
  • Since the right-eye camera is relatively to the right, it can be determined that the translation direction of the right detection frame is to the right (that is, the direction away from the left detection frame); since the left-eye camera is relatively to the left, it can be determined that the translation direction of the left detection frame is to the left (that is, the direction away from the right detection frame).
  • the translation processing may also be implemented in combination with the translation distance.
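The two-frame case above (left frame moves left, right frame moves right, each away from the other) might be sketched as follows, assuming frames are given as (x1, y1, x2, y2) tuples; the width-proportional distance and its coefficient are assumptions for illustration.

```python
def translate_both(left_box, right_box, coeff=0.1):
    """Move each expanded detection frame away from the other,
    simulating binocular parallax: the left-camera frame shifts
    left, the right-camera frame shifts right."""
    def shift(box, sign):
        x1, y1, x2, y2 = box
        d = sign * coeff * (x2 - x1)  # distance proportional to width
        return (x1 + d, y1, x2 + d, y2)
    return shift(left_box, -1), shift(right_box, +1)

left, right = translate_both((50, 0, 150, 100), (60, 0, 160, 100))
```

The sign convention encodes the camera layout: the frame from the left-eye camera moves in the direction away from the right-eye camera, and vice versa.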
  • the image processing method provided by the embodiment of the present disclosure can overcome the problem of inaccurate recognition results caused by different baselines of different binocular cameras, and has strong robustness, so it can be widely used in various technical fields.
  • The above-mentioned image processing method can be applied to terminal control applications; as shown in FIG. 3, it can be implemented according to the following steps:
  • S301: Acquire a set of face images captured by a binocular camera of a target face, where the set of face images includes a first face image captured by a first camera in the binocular camera and a second face image captured by a second camera in the binocular camera;
  • The terminal can be controlled to perform the specified operation in combination with a successful identity-authentication result for the person corresponding to the target face; otherwise, the terminal can refuse to perform the specified operation and issue an alarm prompt.
  • the terminal control method when the terminal control method is applied to different terminals, the corresponding specified operations may also be different.
  • the terminal here may be a user terminal, a gate device terminal, a payment terminal, or the like.
  • For example, when the target face corresponding to a group of face images captured by the binocular camera of the user terminal is a real face, and the identity of the person corresponding to the target face is legal, the user terminal can be controlled to perform the specified operation.
  • If the target face is recognized as a real face and the identity is legal, the access switch connected to the gate equipment terminal can be controlled to open, thereby realizing automatic passage through the gate; if a non-real face is identified or the identity is illegal, passage is refused.
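The gate-control decision described above reduces to a simple conditional; the function name and return values below are hypothetical, used only to illustrate the two-condition check.

```python
def control_gate(is_real_face, identity_legal):
    """Open the access switch only when the target face is a real face
    AND the person's identity passes authentication; otherwise refuse
    passage and issue an alarm prompt (names are illustrative)."""
    if is_real_face and identity_legal:
        return "open_access_switch"
    return "refuse_and_alarm"

decision = control_gate(is_real_face=True, identity_legal=True)
```

Both conditions must hold: a real face with an illegal identity, or a legal identity presented via a non-real face (e.g. a photo), is refused.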
  • The image processing method provided by the embodiments of the present disclosure can not only be applied to the above unlocking applications and gate verification applications, but can also be applied to other scenarios, for example, pedestrian detection in a video surveillance scenario, or be embedded in financial equipment to perform liveness detection for financial services, which will not be repeated here.
  • The writing order of the steps does not imply a strict execution order or constitute any limitation on the implementation process; the specific execution order of each step should be determined by its function and possible internal logic.
  • the embodiments of the present disclosure also provide an image processing apparatus corresponding to the image processing method and a terminal control apparatus corresponding to the terminal control method.
  • Since the principle by which the apparatus solves the problem is similar to that of the above-mentioned methods, the implementation of the apparatus may refer to the implementation of the methods, and repeated descriptions are omitted.
  • the apparatus includes: an acquisition module 401, a detection module 402, an external expansion module 403, and a determination module 404; wherein,
  • an acquisition module 401 configured to acquire two images to be detected obtained by shooting the target object by each camera in the binocular camera;
  • a detection module 402 configured to perform target object detection on the two to-be-detected images respectively, to obtain an object detection frame of the target object in each of the to-be-detected images;
  • the expansion module 403 is configured to perform expansion processing on each of the object detection frames, and translate at least one object detection frame after the expansion processing to obtain the translated object detection frame;
  • a determination module 404 configured to determine a recognition result of the target object based on at least the translated object detection frame.
  • The embodiments of the present disclosure can focus on the target object based on target object detection, which can initially reduce the influence of the baseline. Considering that, in the process of performing target object recognition on the two images to be detected collected by the binocular camera, the parallax formed by a module (corresponding to one binocular camera) needs to be referenced to determine the depth information of the target object, the embodiments of the present disclosure further achieve the effect of simulating human-eye parallax through the cooperative processing operations of expanding and translating the object detection frames, obtaining processed object detection frames for identifying the target object.
  • the embodiments of the present disclosure are highly versatile, and can achieve the purpose of sharing a set of pseudo-baselines for different modules to simulate human eye parallax, thereby improving the generalization ability of the modules and reducing the time cost of subsequent applications such as target object recognition.
  • In a possible implementation, the binocular camera includes a first camera and a second camera. In the case of translating one expanded object detection frame, the expansion module 403 is configured to translate the at least one expanded object detection frame according to the following steps: perform expansion processing on the object detection frame detected in the to-be-detected image collected by the first camera to obtain a to-be-translated detection frame; and translate the to-be-translated detection frame in a direction away from the second camera to obtain the translated object detection frame.
  • The expansion module 403 is configured to translate the to-be-translated detection frame in a direction away from the second camera according to the following steps to obtain the translated object detection frame: determine a translation distance based on the size information of the to-be-translated detection frame; and move the to-be-translated detection frame by the translation distance in the direction away from the second camera to obtain the translated object detection frame.
  • The expansion module 403 is configured to determine the translation distance based on the size information of the to-be-translated detection frame according to the following steps: determine the translation distance based on the width value in the size information of the to-be-translated detection frame and a preset translation coefficient.
  • In the case of translating the two expanded object detection frames, the expansion module 403 is configured to translate the at least one expanded object detection frame according to the following steps to obtain the translated object detection frames: respectively translate each of the two expanded object detection frames in a direction away from the other object detection frame, to obtain the two translated object detection frames.
  • The expansion module 403 is configured to perform expansion processing on each of the object detection frames according to the following steps: for each object detection frame, determine, in the to-be-detected image corresponding to the object detection frame, the position coordinates of the corner points of the object detection frame; and perform expansion processing on the object detection frame based on the determined corner position coordinates and a preset expansion ratio, to obtain the expanded object detection frame corresponding to the object detection frame.
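The corner-based expansion the module performs might look like the following sketch; the expansion ratio value, the image size, and the clipping of the expanded frame to the image boundary are illustrative assumptions.

```python
def expand_box(box, ratio=0.5, img_w=640, img_h=480):
    """Expand a detection frame (x1, y1, x2, y2) outward by a preset
    ratio of its own size, clipped to the image boundary so the
    expanded frame stays inside the to-be-detected image."""
    x1, y1, x2, y2 = box
    dw = (x2 - x1) * ratio / 2  # grow each side by ratio/2 of width
    dh = (y2 - y1) * ratio / 2  # grow each side by ratio/2 of height
    return (max(0, x1 - dw), max(0, y1 - dh),
            min(img_w, x2 + dw), min(img_h, y2 + dh))

expanded = expand_box((100, 100, 200, 200))
```

Keeping some background inside the expanded frame matters for liveness detection, as the description notes that background regions also influence the recognition result.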
  • The determining module 404 is configured to determine the recognition result of the target object based on at least the translated object detection frame according to the following steps: in the case of translating one expanded object detection frame, determine the recognition result of the target object based on the translated object detection frame and an untranslated, expanded object detection frame; or, in the case of translating the two expanded object detection frames, determine the recognition result of the target object based on the two translated object detection frames.
  • In a possible implementation, the target object is a target face, and the determining module 404 is configured to determine the recognition result of the target object based on at least the translated object detection frame according to the following steps: use the trained liveness detection neural network to perform target face recognition on at least the translated object detection frame, and determine whether the target face corresponding to the object detection frame is a real face.
  • the apparatus includes: an acquisition module 501 , a determination module 502 and a control module 503 ; wherein,
  • the acquisition module 501 is configured to acquire a group of face images captured by the binocular camera on the target face, where the group of face images includes a first face image captured by a first camera in the binocular camera, and a second face image captured by the second camera in the binocular camera;
  • a determination module 502 configured to obtain a recognition result corresponding to the group of face images through the above-mentioned image processing method, where the recognition result includes whether the target face is a real face;
  • the control module 503 is configured to control the terminal to perform a specified operation in response to the identification result of the person including that the target face is a real face and the person corresponding to the target face has passed identity authentication.
  • An embodiment of the present disclosure further provides an electronic device.
  • a schematic structural diagram of the electronic device provided by an embodiment of the present disclosure includes: a processor 601 , a memory 602 , and a bus 603 .
  • The memory 602 stores machine-readable instructions executable by the processor 601 (for example, the execution instructions corresponding to the acquisition module 401, the detection module 402, the expansion module 403, and the determination module 404 in the image processing apparatus in FIG. 4, etc.).
  • When the electronic device is running, the processor 601 and the memory 602 communicate through the bus 603, and the machine-readable instructions are executed by the processor 601 to perform the following processing: acquire two images to be detected obtained by shooting the target object with each camera in the binocular camera; perform target object detection on the two to-be-detected images respectively to obtain an object detection frame of the target object in each of the to-be-detected images; perform expansion processing on each of the object detection frames and translate at least one expanded object detection frame to obtain a translated object detection frame; and determine a recognition result of the target object based on at least the translated object detection frame.
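The processing steps above can be sketched end to end as follows; the box format, expansion ratio, and translation coefficient are illustrative assumptions, and `detect` stands in for any object detector.

```python
def process_pair(img_left, img_right, detect, ratio=0.5, coeff=0.1):
    """Sketch of the flow: detect the target object in each image,
    expand both detection frames, then translate each frame away from
    the other to simulate binocular parallax."""
    def expand(box):
        x1, y1, x2, y2 = box
        dw, dh = (x2 - x1) * ratio / 2, (y2 - y1) * ratio / 2
        return (x1 - dw, y1 - dh, x2 + dw, y2 + dh)

    def shift(box, sign):  # sign: -1 moves left, +1 moves right
        x1, y1, x2, y2 = box
        d = sign * coeff * (x2 - x1)  # distance proportional to width
        return (x1 + d, y1, x2 + d, y2)

    left = shift(expand(detect(img_left)), -1)
    right = shift(expand(detect(img_right)), +1)
    return left, right  # processed frames fed to the recognizer
```

The two returned frames form the pseudo-baseline pair that a recognizer (e.g. the liveness detection network) would consume in place of raw detections.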
  • In a possible implementation, the binocular camera includes a first camera and a second camera. In the case of translating one expanded object detection frame, in the instructions executed by the processor 601, translating the at least one expanded object detection frame to obtain the translated object detection frame includes: performing expansion processing on the object detection frame detected in the to-be-detected image collected by the first camera to obtain a to-be-translated detection frame; and translating the to-be-translated detection frame in a direction away from the second camera to obtain the translated object detection frame.
  • In a possible implementation, translating the to-be-translated detection frame in a direction away from the second camera to obtain the translated object detection frame includes: determining a translation distance based on the size information of the to-be-translated detection frame; and moving the to-be-translated detection frame by the translation distance in the direction away from the second camera to obtain the translated object detection frame.
  • In a possible implementation, determining the translation distance based on the size information of the to-be-translated detection frame includes: determining the translation distance based on the width value in the size information of the to-be-translated detection frame and a preset translation coefficient.
  • In a possible implementation, in the case of translating the two expanded object detection frames, translating the at least one expanded object detection frame to obtain the translated object detection frames includes: respectively translating each of the two expanded object detection frames in a direction away from the other object detection frame, to obtain the two translated object detection frames.
  • In a possible implementation, performing expansion processing on each of the object detection frames includes: for each object detection frame, determining, in the to-be-detected image corresponding to the object detection frame, the position coordinates of the corner points of the object detection frame; and performing expansion processing on the object detection frame based on the determined corner position coordinates and a preset expansion ratio, to obtain the expanded object detection frame corresponding to the object detection frame.
  • In a possible implementation, determining the recognition result of the target object includes: in the case of translating one expanded object detection frame, determining the recognition result of the target object based on the translated object detection frame and an untranslated, expanded object detection frame; or, in the case of translating the two expanded object detection frames, determining the recognition result of the target object based on the two translated object detection frames.
  • An embodiment of the present disclosure further provides an electronic device.
  • a schematic structural diagram of the electronic device provided by an embodiment of the present disclosure includes: a processor 701 , a memory 702 , and a bus 703 .
  • The memory 702 stores machine-readable instructions executable by the processor 701 (for example, the execution instructions corresponding to the acquisition module 501, the determination module 502, and the control module 503 in the terminal control apparatus in FIG. 5, etc.).
  • When the electronic device is running, the processor 701 and the memory 702 communicate through the bus 703, and the machine-readable instructions are executed by the processor 701 to perform the following processing: acquire a group of face images captured by the binocular camera of a target face, the group of face images including a first face image captured by the first camera in the binocular camera and a second face image captured by the second camera in the binocular camera; obtain, through the image processing method described in the above embodiments, a recognition result corresponding to the group of face images, the recognition result including whether the target face is a real face; and, in response to the recognition result including that the target face is a real face and the person corresponding to the target face passing identity authentication, control the terminal to perform a specified operation.
  • Embodiments of the present disclosure further provide a computer-readable storage medium on which a computer program is stored; when the computer program is run by a processor, the steps of the image processing method and the terminal control method described in the foregoing method embodiments are executed.
  • the storage medium may be a volatile or non-volatile computer-readable storage medium.
  • The computer program product of the image processing method and the terminal control method provided by the embodiments of the present disclosure includes a computer-readable storage medium storing program code; the instructions included in the program code can be used to execute the steps of the image processing method and the terminal control method described in the above method embodiments. For details, reference may be made to the foregoing method embodiments, which will not be repeated here.
  • Embodiments of the present disclosure also provide a computer program, which implements any one of the methods in the foregoing embodiments when the computer program is executed by a processor.
  • the computer program product can be specifically implemented by hardware, software or a combination thereof.
  • In an optional embodiment, the computer program product is embodied as a computer storage medium; in another optional embodiment, the computer program product is embodied as a software product, such as a software development kit (Software Development Kit, SDK), etc.
  • the units described as separate components may or may not be physically separated, and components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution in this embodiment.
  • each functional unit in each embodiment of the present disclosure may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.
  • the functions, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a processor-executable non-volatile computer-readable storage medium.
  • The computer software product is stored in a storage medium and includes several instructions used to cause an electronic device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the various embodiments of the present disclosure.
  • The aforementioned storage medium includes: a U disk, a mobile hard disk, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk, an optical disk, or other media that can store program code.

Abstract

Provided are an image processing method and apparatus, and a terminal control method and apparatus. The image processing method comprises: acquiring two images to be detected which are obtained by means of cameras in a binocular camera photographing a target object; respectively performing target object detection on the two images to be detected, so as to obtain an object detection frame of the target object in each of the images to be detected; externally expanding each of the object detection frames and translating at least one externally expanded object detection frame so as to obtain translated object detection frames; and determining a recognition result for the target object at least on the basis of the translated object detection frames.

Description

Image processing method and apparatus, and terminal control method and apparatus
CROSS-REFERENCE TO RELATED APPLICATIONS
This application claims the priority of Chinese patent application No. 202011377063.0 filed on November 30, 2020, the entire contents of which are incorporated herein by reference.
Technical Field
The present disclosure relates to the technical field of image processing, and in particular, to an image processing method and apparatus, and a terminal control method and apparatus.
Background
With the continuous development of computer vision technology and the wide application of binocular cameras, image processing technology based on binocular cameras is widely used in various fields such as liveness detection and intelligent transportation. Taking liveness detection as an example, liveness detection can be performed based on a set of images collected by a binocular camera; for instance, liveness detection can be performed on the set of images collected by each module (each corresponding to one binocular camera) using a liveness detection model.
SUMMARY OF THE INVENTION
The embodiments of the present disclosure provide at least an image processing method and apparatus, and a terminal control method and apparatus.
In a first aspect, an embodiment of the present disclosure provides an image processing method, including: acquiring two images to be detected obtained by shooting a target object with each camera in a binocular camera; performing target object detection on the two images to be detected respectively, to obtain an object detection frame of the target object in each of the images to be detected; performing expansion processing on each of the object detection frames, and translating at least one expanded object detection frame to obtain a translated object detection frame; and determining a recognition result of the target object based on at least the translated object detection frame.
With the above image processing method, when two images to be detected collected by the binocular camera are acquired, target object detection can first be performed on the images to be detected to obtain an object detection frame of the target object in each image to be detected. On the premise that the two object detection frames are expanded, one or both of the expanded object detection frames can be translated, so that the recognition result of the target object is determined based on the processed object detection frames.
The above image processing method can focus on the target object based on target object detection, which can initially reduce the influence of the baseline. Considering that, in the process of performing target object recognition on the two images to be detected collected by the binocular camera, the parallax formed by a module (corresponding to one binocular camera) needs to be referenced to determine the depth information of the target object, the embodiments of the present disclosure further achieve the effect of simulating human-eye parallax through the cooperative processing operations of expanding and translating the object detection frames, obtaining processed object detection frames for identifying the target object. The embodiments of the present disclosure are highly versatile, and allow different modules to share one set of pseudo-baselines to simulate human-eye parallax, thereby improving the generalization ability of the modules and reducing the time cost of subsequent applications such as target object recognition.
In a possible implementation manner, the binocular camera includes a first camera and a second camera; in the case of translating one expanded object detection frame, translating the at least one expanded object detection frame to obtain the translated object detection frame includes: performing expansion processing on the object detection frame detected in the to-be-detected image collected by the first camera to obtain a to-be-translated detection frame; and translating the to-be-translated detection frame in a direction away from the second camera to obtain the translated object detection frame.
Here, once the detection frame to be translated has been selected, its translation direction can be determined based on the relative positional relationship between the two cameras included in the binocular camera. When the detection frame to be translated corresponds to the first camera, the detection frame may be translated in a direction away from the second camera, and the detection frame moved in this translation direction can meet the parallax requirement of the pseudo-baseline.
In a possible implementation manner, translating the to-be-translated detection frame in a direction away from the second camera to obtain the translated object detection frame includes: determining a translation distance based on the size information of the to-be-translated detection frame; and moving the to-be-translated detection frame by the translation distance in the direction away from the second camera to obtain the translated object detection frame.
In a possible implementation manner, determining the translation distance based on the size information of the to-be-translated detection frame includes: determining the translation distance based on the width value in the size information of the to-be-translated detection frame and a preset translation coefficient.
Considering that the parallax determined by the binocular camera is related to the distance between the target object and the binocular camera, and that the imaging size is also related to this distance, in the embodiments of the present disclosure, the translation distance for simulating parallax can be determined based on the size information of the imaged object detection frame, and the binocular-parallax recognition effect can then be achieved based on the pseudo-baseline constructed from the translation distance.
In a possible implementation manner, in the case of translating the two expanded object detection frames, translating the at least one expanded object detection frame to obtain the translated object detection frames includes: respectively translating each of the two expanded object detection frames in a direction away from the other object detection frame, to obtain the two translated object detection frames.
In a possible implementation manner, performing expansion processing on each of the object detection frames includes: for each object detection frame, determining, in the to-be-detected image corresponding to the object detection frame, the position coordinates of the corner points of the object detection frame; and performing expansion processing on the object detection frame based on the determined corner position coordinates and a preset expansion ratio, to obtain the expanded object detection frame corresponding to the object detection frame.
Here, in addition to the influence of the target object framed by the object detection frame on the recognition result, image regions other than the target object (for example, the background) also have a certain influence on the recognition result; in particular, for applications such as liveness detection, the background information can to a certain extent help determine the recognition result. Therefore, in the process of constructing the pseudo-baseline, the object detection frame can be expanded based on its corner position coordinates and the preset expansion ratio; the object detection frame obtained in this way not only helps construct the pseudo-baseline, but also improves the accuracy of subsequent result recognition.
In a possible implementation manner, determining the recognition result of the target object based on at least the translated object detection frame includes: in the case of translating one expanded object detection frame, determining the recognition result of the target object based on the translated object detection frame and an untranslated, expanded object detection frame; or, in the case of translating the two expanded object detection frames, determining the recognition result of the target object based on the two translated object detection frames.
In a possible implementation manner, the target object is a target face, and determining the recognition result of the target object based on at least the translated object detection frame includes: using a trained liveness detection neural network to perform target face recognition on at least the translated object detection frame, and determining whether the target face corresponding to the object detection frame is a real face.
In a second aspect, an embodiment of the present disclosure further provides a terminal control method, where the terminal is provided with a binocular camera, and the method includes: acquiring a group of face images captured by the binocular camera of a target face, the group of face images including a first face image captured by a first camera in the binocular camera and a second face image captured by a second camera in the binocular camera; obtaining, through the image processing method according to the first aspect or any of its implementations, a recognition result corresponding to the group of face images, the recognition result including whether the target face is a real face; and, in response to the recognition result including that the target face is a real face and the person corresponding to the target face passing identity authentication, controlling the terminal to perform a specified operation.
第三方面，本公开实施例还提供了一种图像处理装置，包括：获取模块，用于获取通过双目摄像头中各摄像头拍摄目标对象得到的两个待检测图像；检测模块，用于对所述两个待检测图像分别进行目标对象检测，得到所述目标对象在每个所述待检测图像中的对象检测框；外扩模块，用于对每个所述对象检测框进行外扩处理，并对外扩处理后的至少一个对象检测框进行平移，得到平移后的对象检测框；确定模块，用于至少基于所述平移后的对象检测框，确定对所述目标对象的识别结果。In a third aspect, an embodiment of the present disclosure further provides an image processing apparatus, including: an acquisition module configured to acquire two to-be-detected images obtained by capturing a target object with each camera of a binocular camera; a detection module configured to perform target object detection on each of the two to-be-detected images to obtain an object detection frame of the target object in each to-be-detected image; an expansion module configured to perform expansion processing on each object detection frame and translate at least one expanded object detection frame to obtain a translated object detection frame; and a determination module configured to determine a recognition result of the target object based on at least the translated object detection frame.
第四方面，本公开实施例还提供了一种终端控制装置，包括：获取模块，用于获取所述双目摄像头对目标人脸拍摄的一组人脸图像，所述一组人脸图像包括通过所述双目摄像头中第一摄像头拍摄的第一人脸图像，以及通过所述双目摄像头中第二摄像头拍摄的第二人脸图像；确定模块，用于通过第一方面及其各种实施方式任一项所述的图像处理方法，得到所述一组人脸图像对应的识别结果，所述识别结果包括所述目标人脸是否为真实人脸；控制模块，用于响应于所述人物的识别结果包括所述目标人脸为真实人脸，且所述目标人脸对应的人物通过身份认证，控制所述终端执行指定操作。In a fourth aspect, an embodiment of the present disclosure further provides a terminal control apparatus, including: an acquisition module configured to acquire a set of face images captured of a target face by the binocular camera, the set of face images including a first face image captured by a first camera of the binocular camera and a second face image captured by a second camera of the binocular camera; a determination module configured to obtain, by the image processing method of the first aspect or any of its implementations, a recognition result corresponding to the set of face images, the recognition result including whether the target face is a real face; and a control module configured to, in response to the recognition result including that the target face is a real face and that the person corresponding to the target face has passed identity authentication, control the terminal to perform a specified operation.
第五方面，本公开实施例还提供了一种电子设备，包括：处理器、存储器和总线，所述存储器存储有所述处理器可执行的机器可读指令，当电子设备运行时，所述处理器与所述存储器之间通过总线通信，所述机器可读指令被所述处理器执行时执行如第一方面及其各种实施方式任一所述的图像处理方法的步骤或者第二方面所述的终端控制方法的步骤。In a fifth aspect, an embodiment of the present disclosure further provides an electronic device, including a processor, a memory, and a bus, where the memory stores machine-readable instructions executable by the processor; when the electronic device runs, the processor and the memory communicate via the bus, and when the machine-readable instructions are executed by the processor, the steps of the image processing method of the first aspect or any of its implementations, or the steps of the terminal control method of the second aspect, are performed.
第六方面，本公开实施例还提供了一种计算机可读存储介质，所述计算机可读存储介质上存储有计算机程序，所述计算机程序被电子设备运行时，所述电子设备执行如第一方面及其各种实施方式任一所述的图像处理方法的步骤或者第二方面所述的终端控制方法的步骤。In a sixth aspect, an embodiment of the present disclosure further provides a computer-readable storage medium storing a computer program, where, when the computer program is run by an electronic device, the electronic device performs the steps of the image processing method of the first aspect or any of its implementations, or the steps of the terminal control method of the second aspect.
第七方面，本公开实施例还提供了一种计算机程序，包括计算机可读代码，当所述代码在电子设备中执行时，促使所述电子设备中的处理器执行如第一方面及其各种实施方式任一所述的图像处理方法的步骤或者第二方面所述的终端控制方法的步骤。In a seventh aspect, an embodiment of the present disclosure further provides a computer program including computer-readable code, where, when the code is executed in an electronic device, a processor in the electronic device is caused to perform the steps of the image processing method of the first aspect or any of its implementations, or the steps of the terminal control method of the second aspect.
关于上述装置、电子设备、及计算机可读存储介质的效果描述参见上述方法的说明,这里不再赘述。For the description of the effects of the foregoing apparatus, electronic device, and computer-readable storage medium, reference may be made to the description of the foregoing method, and details are not repeated here.
为使本公开的上述目的、特征和优点能更明显易懂,下文特举较佳实施例,并配合所附附图,作详细说明如下。In order to make the above-mentioned objects, features and advantages of the present disclosure more obvious and easy to understand, the preferred embodiments are exemplified below, and are described in detail as follows in conjunction with the accompanying drawings.
附图说明Description of drawings
为了更清楚地说明本公开实施例的技术方案，下面将对实施例中所需要使用的附图作简单地介绍，此处的附图被并入说明书中并构成本说明书中的一部分，这些附图示出了符合本公开的实施例，并与说明书一起用于说明本公开的技术方案。应当理解，以下附图仅示出了本公开的某些实施例，因此不应被看作是对范围的限定，对于本领域普通技术人员来讲，在不付出创造性劳动的前提下，还可以根据这些附图获得其他相关的附图。To explain the technical solutions of the embodiments of the present disclosure more clearly, the drawings required in the embodiments are briefly introduced below; these drawings are incorporated into and constitute a part of this specification, illustrate embodiments consistent with the present disclosure, and together with the description serve to explain the technical solutions of the present disclosure. It should be understood that the following drawings show only some embodiments of the present disclosure and should therefore not be regarded as limiting the scope; those of ordinary skill in the art can obtain other related drawings from these drawings without creative effort.
图1示出了本公开实施例一所提供的一种图像处理方法的流程图;FIG. 1 shows a flowchart of an image processing method provided by Embodiment 1 of the present disclosure;
图2(a)示出了本公开实施例一所提供的一种图像处理方法的应用示意图;FIG. 2( a ) shows a schematic diagram of an application of an image processing method provided by Embodiment 1 of the present disclosure;
图2(b)示出了本公开实施例一所提供的一种图像处理方法的应用示意图;FIG. 2(b) shows a schematic diagram of the application of an image processing method provided by Embodiment 1 of the present disclosure;
图2(c)示出了本公开实施例一所提供的一种图像处理方法的应用示意图;Fig. 2(c) shows a schematic diagram of the application of an image processing method provided by Embodiment 1 of the present disclosure;
图2(d)示出了本公开实施例一所提供的一种图像处理方法的应用示意图;FIG. 2(d) shows an application schematic diagram of an image processing method provided by Embodiment 1 of the present disclosure;
图3示出了本公开实施例一所提供的一种终端控制方法的流程图;FIG. 3 shows a flowchart of a terminal control method provided by Embodiment 1 of the present disclosure;
图4示出了本公开实施例二所提供的一种图像处理装置的示意图;FIG. 4 shows a schematic diagram of an image processing apparatus provided by Embodiment 2 of the present disclosure;
图5示出了本公开实施例二所提供的一种终端控制装置的示意图;FIG. 5 shows a schematic diagram of a terminal control apparatus provided by Embodiment 2 of the present disclosure;
图6示出了本公开实施例三所提供的一种电子设备的示意图;FIG. 6 shows a schematic diagram of an electronic device according to Embodiment 3 of the present disclosure;
图7示出了本公开实施例三所提供的另一种电子设备的示意图。FIG. 7 shows a schematic diagram of another electronic device provided by Embodiment 3 of the present disclosure.
具体实施方式Detailed ways
为使本公开实施例的目的、技术方案和优点更加清楚，下面将结合本公开实施例中附图，对本公开实施例中的技术方案进行清楚、完整地描述，显然，所描述的实施例仅仅是本公开一部分实施例，而不是全部的实施例。通常在此处描述和示出的本公开实施例的组件可以以各种不同的配置来布置和设计。因此，以下对本公开的实施例的详细描述并非旨在限制要求保护的本公开的范围，而是仅仅表示本公开的选定实施例。基于本公开的实施例，本领域技术人员在没有做出创造性劳动的前提下所获得的所有其他实施例，都属于本公开保护的范围。To make the purposes, technical solutions, and advantages of the embodiments of the present disclosure clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present disclosure. The components of the embodiments of the present disclosure generally described and illustrated herein may be arranged and designed in a variety of different configurations. Therefore, the following detailed description of the embodiments is not intended to limit the scope of the claimed disclosure, but merely represents selected embodiments. All other embodiments obtained by those skilled in the art based on the embodiments of the present disclosure without creative work fall within the protection scope of the present disclosure.
目前，往往可以利用活体检测模型对每个模组(对应一个双目摄像头)采集的一组图像实现活体检测。例如，将双目摄像头的左目摄像头对物体拍摄获得的一幅图片和双目摄像头的右目摄像头对该物体拍摄获得的另一幅图片同时输入活体检测模型，从而最终得到对该物体是否为活体的结果。At present, liveness detection is often performed with a liveness detection model on a set of images collected by each module (each module corresponding to one binocular camera). For example, a picture of an object captured by the left camera of the binocular camera and another picture of the same object captured by the right camera are input into the liveness detection model simultaneously, finally yielding a result of whether the object is a living body.
通常，针对不同模组会训练不同的活体检测模型。这是由于不同的模组的基线不同，即，不同双目摄像头中的两个摄像头的相对距离不同，这将导致在一个模组上表现良好的模型，在另外一个基线不同的模组上的精度表现却较差。也就是，同一活体检测模型对不同模组的适配能力较弱，因此往往需要针对不同的模组训练不同的活体检测模型，模型训练的时间成本巨大。Usually, different liveness detection models are trained for different modules. This is because different modules have different baselines, that is, the relative distance between the two cameras differs between binocular cameras, so a model that performs well on one module shows poor accuracy on another module with a different baseline. In other words, the same liveness detection model adapts poorly across modules, so a separate liveness detection model often has to be trained for each module, at an enormous cost in training time.
基于此,本公开提供了一种图像处理方法及装置、终端控制方法及装置,以提升模组的泛化能力,降低后续进行目标对象识别等应用的时间成本。Based on this, the present disclosure provides an image processing method and device, and a terminal control method and device, so as to improve the generalization capability of the module and reduce the time cost of subsequent applications such as target object recognition.
为便于对本实施例进行理解，首先对本公开实施例所公开的一种图像处理方法进行详细介绍，本公开实施例所提供的图像处理方法的执行主体一般为具有一定计算能力的电子设备，该电子设备例如包括：终端设备或服务器或其它处理设备，终端设备可以为用户设备(User Equipment,UE)、移动设备、用户终端、蜂窝电话、无绳电话、个人数字助理(Personal Digital Assistant,PDA)、手持设备、计算设备、车载设备、可穿戴设备等。在一些可能的实现方式中，该图像处理方法可以通过处理器调用存储器中存储的计算机可读指令的方式来实现。To facilitate understanding of this embodiment, an image processing method disclosed in an embodiment of the present disclosure is first introduced in detail. The execution body of the image processing method provided by the embodiments of the present disclosure is generally an electronic device with certain computing capability, for example a terminal device, a server, or another processing device; the terminal device may be a user equipment (UE), a mobile device, a user terminal, a cellular phone, a cordless phone, a personal digital assistant (PDA), a handheld device, a computing device, an in-vehicle device, a wearable device, or the like. In some possible implementations, the image processing method may be implemented by a processor calling computer-readable instructions stored in a memory.
下面对本公开实施例提供的图像处理方法加以说明。The image processing method provided by the embodiments of the present disclosure will be described below.
参见图1所示,为本公开实施例提供的图像处理方法的流程图,方法包括步骤S101~S104,其中:Referring to FIG. 1, which is a flowchart of an image processing method provided by an embodiment of the present disclosure, the method includes steps S101-S104, wherein:
S101、获取通过双目摄像头中各摄像头拍摄目标对象得到的两个待检测图像;S101. Acquire two images to be detected obtained by shooting the target object by each camera in the binocular camera;
S102、对所述两个待检测图像分别进行目标对象检测,得到目标对象在每个待检测图像中的对象检测框;S102, performing target object detection on the two images to be detected, respectively, to obtain an object detection frame of the target object in each image to be detected;
S103、对每个所述对象检测框进行外扩处理,并对外扩处理后的至少一个对象检测框进行平移,得到平移后的对象检测框;S103, performing external expansion processing on each of the object detection frames, and translating at least one object detection frame after the external expansion processing to obtain a translated object detection frame;
S104、至少基于平移后的对象检测框,确定对目标对象的识别结果。S104. Determine the recognition result of the target object based on at least the translated object detection frame.
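Steps S101–S104 above can be sketched as a single pipeline. This is a minimal illustration, not the claimed implementation: the function `process_pair` and the caller-supplied callables `detect`, `expand`, `shift`, and `recognize` are hypothetical stand-ins for the detector, the expansion processing, the translation, and the recognition model described in the text, and the choice to translate only the second image's frame is just one of the variants the disclosure allows.

```python
def process_pair(img_left, img_right, detect, expand, shift, recognize):
    """Sketch of S101-S104: detect, expand, translate, recognize.

    `detect`, `expand`, `shift` and `recognize` are hypothetical
    caller-supplied callables standing in for the components the
    disclosure describes.
    """
    # S102: detect the target object in each of the two images
    box_l = detect(img_left)
    box_r = detect(img_right)
    # S103: expand both detection frames, then translate at least one
    box_l, box_r = expand(box_l), expand(box_r)
    box_r = shift(box_r)
    # S104: determine the recognition result from (at least) the
    # translated detection frame
    return recognize(img_left, box_l, img_right, box_r)
```

The same skeleton covers the two-frame-translation variant by also applying `shift` to `box_l` with the opposite direction.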
在一些例子中,所述双目摄像头可以包括一个RGB(Red Green Blue)摄像头和一个近红外摄像头,即所述两个待检测图像可以分别为RGB图像和红外图像。在另一些例子中,还可以包括两个红外摄像头,即所述两个待检测图像都为红外图像。在再一些例子中,还可以包括两个RGB摄像头,即所述两个待检测图像都为RGB图像。本申请对双目摄像头的具体构成不做限定。In some examples, the binocular camera may include an RGB (Red Green Blue) camera and a near-infrared camera, that is, the two images to be detected may be an RGB image and an infrared image, respectively. In other examples, two infrared cameras may also be included, that is, the two images to be detected are both infrared images. In still other examples, two RGB cameras may also be included, that is, the two images to be detected are both RGB images. This application does not limit the specific structure of the binocular camera.
这里,为了便于理解本公开实施例提供的图像处理方法,可以首先对该图像处理方法的应用场景进行简单描述。上述图像处理方法主要可以应用于基于双目摄像头进行目标识别的相关应用中,例如,可以是针对双目摄像头抓拍的人脸进行活体检测,还可以是针对双目摄像头抓拍的车辆进行车牌识别,还可以是其它相关应用,在此不做具体的限制。Here, in order to facilitate understanding of the image processing method provided by the embodiments of the present disclosure, an application scenario of the image processing method may be briefly described first. The above image processing method can be mainly applied to related applications of target recognition based on binocular cameras. For example, it can be used to perform liveness detection on faces captured by binocular cameras, and it can also be used to perform license plate recognition on vehicles captured by binocular cameras. It may also be other related applications, which are not specifically limited here.
本公开实施例中，可以基于双目摄像头拍摄同一场景下的目标对象来获取两幅图像。运用立体匹配算法获取视差图，进而获取深度图实现目标识别。考虑到在不同模组(每个模组对应一个双目摄像头)所包括的两个摄像头之间的相对距离不同的情况下，即使利用同一识别方法针对同一目标进行目标识别，也可能会因为基线(两个摄像头之间的相对距离)的不同而导致识别结果的不同，特别是在利用目标检测模型进行目标识别的过程中，由于训练模型本身需要输入大量的图像样本，若采用的图像样本所对应的基线与目标检测模型所对应的模组的基线不同，将导致模型准确度大大降低。尽管可以针对不同的模组训练不同的目标检测模型以确保模型准确度，然而这种方式将导致训练成本大为增加。In the embodiments of the present disclosure, two images may be acquired by capturing a target object in the same scene with a binocular camera; a disparity map is obtained with a stereo matching algorithm, and a depth map is then obtained to realize target recognition. Considering that the relative distance between the two cameras differs between modules (each module corresponding to one binocular camera), even if the same recognition method is applied to the same target, the recognition result may differ because of the different baselines (the relative distance between the two cameras). In particular, when a target detection model is used for target recognition, training the model requires a large number of image samples; if the baseline of the image samples differs from the baseline of the module on which the model is used, model accuracy drops greatly. Although a separate target detection model could be trained for each module to ensure accuracy, doing so greatly increases the training cost.
正是为了解决这一问题，本公开实施例才提供了一种能够为不同模组提供通用的伪基线，进而可以基于不同模组采集的图像进行目标识别的图像处理方法，该方法首先可以利用目标对象检测来降低当前模组所存在的基线的影响，而后为了便于进行后续的目标识别，可以基于外扩和平移的配合处理来构建伪基线，从而确保在消除原始基线的影响下，基于统一的伪基线来实现目标识别。It is precisely to solve this problem that the embodiments of the present disclosure provide an image processing method that constructs a common pseudo-baseline for different modules, so that target recognition can be performed on images collected by any module. The method first uses target object detection to reduce the influence of the actual baseline of the current module, and then, to facilitate subsequent target recognition, constructs a pseudo-baseline through coordinated expansion and translation processing, thereby realizing target recognition on a unified pseudo-baseline with the influence of the original baseline eliminated.
其中，本公开实施例中双目摄像头采集的两张待检测图像可以是基于双目摄像头所在应用场景确定的，例如，在人脸识别应用中，这里所采集的两张待检测图像可以是包含人脸的图像；再如，在智能交通应用中，这里所采集的两张待检测图像可以是包含车辆的图像。The two to-be-detected images collected by the binocular camera in the embodiments of the present disclosure may be determined by the application scenario of the binocular camera. For example, in a face recognition application, the two to-be-detected images may be images containing a human face; in an intelligent transportation application, they may be images containing a vehicle.
考虑到双目摄像头所包括的两个摄像头之间的相对距离(对应基线)使得两个摄像头摄取同一个目标对象时存在有方向差异(即视差)，为了构建统一的伪基线，首先可以基于对两个摄像头获取的待检测图像分别进行目标对象检测所得到的对象检测框消除视差所带来的原始基线的影响。Considering that the relative distance (the baseline) between the two cameras of the binocular camera causes a directional difference (i.e., parallax) when the two cameras capture the same target object, to construct a unified pseudo-baseline, the influence of the original baseline caused by parallax can first be eliminated based on the object detection frames obtained by performing target object detection on the to-be-detected images acquired by the two cameras.
本公开实施例中,可以基于传统的目标对象检测方法从待检测图像中检测出目标对象所在对象检测框。这里的目标对象检测方法可以是帧差法、背景减除法、光流法等。In the embodiment of the present disclosure, the object detection frame where the target object is located can be detected from the image to be detected based on the traditional target object detection method. The target object detection method here may be a frame difference method, a background subtraction method, an optical flow method, and the like.
本公开实施例除了可以采用上述传统方法实现对象检测，还可以基于训练好的检测模型进行对象检测。这里的检测模型可以是带有对象检测框标注的图像样本训练得到的，训练的可以是输入的图像样本与输出的对象检测框之间的对应关系，这样，在将待检测图像输入到训练好的检测模型的情况下，即可以确定出待检测图像中的对象检测框。该检测模型可以是单独的一个神经网络，也可以包含在上述目标检测模型之中。In addition to the traditional methods above, object detection in the embodiments of the present disclosure may also be performed with a trained detection model. The detection model may be trained on image samples annotated with object detection frames, learning the correspondence between an input image sample and an output object detection frame; thus, when a to-be-detected image is input into the trained detection model, the object detection frame in that image can be determined. The detection model may be a separate neural network, or may be included in the target detection model described above.
在消除原始基线的影响下，考虑到由基线所产生的视差对于目标识别的关键作用，本公开实施例可以基于平移操作创建出共用的伪基线以消除视差。With the influence of the original baseline eliminated, and considering the key role that baseline-induced parallax plays in target recognition, the embodiments of the present disclosure can create a shared pseudo-baseline based on a translation operation so as to eliminate the parallax.
考虑到在进行对象检测框平移之前，对象检测框内框选的目标对象是完整的对象，如可以是一个包含人的五官、头发、脖子的目标人脸，若直接进行对象检测框的平移，可能导致目标人脸的不完整，从而不利于进行后续的对象识别。基于此，本公开实施例在进行平移操作之前，可以先进行外扩操作。Before the object detection frame is translated, the target object framed by the detection frame is a complete object, for example a target face including the facial features, hair, and neck; translating the object detection frame directly may leave the target face incomplete, which is detrimental to subsequent object recognition. Therefore, in the embodiments of the present disclosure, the expansion operation may be performed before the translation operation.
其中,上述外扩操作一定程度上可以扩大对象检测框所框选的图像区域。考虑到图像区域越大,一定程度上说明该区域包含的信息内容也越多,这一方面可以提升后续目标识别的准确率,另一方面还可以为后续的平移操作提供平移依据。Wherein, the above expansion operation can expand the image area framed by the object detection frame to a certain extent. Considering that the larger the image area, the more information content the area contains to a certain extent. This can improve the accuracy of subsequent target recognition, and on the other hand, can provide translation basis for subsequent translation operations.
需要说明的是，本公开实施例中的平移操作可以是针对外扩处理后的两个对象检测框中的其中一个进行平移操作，这里，可以以另一个未平移的对象检测框所对应的摄像头作为平移的参考依据，还可以是针对两个对象检测框均进行平移操作，这里，可以以两个摄像头的中心位置作为参考依据进行平移。It should be noted that the translation operation in the embodiments of the present disclosure may be performed on one of the two expanded object detection frames, in which case the camera corresponding to the other, untranslated object detection frame serves as the reference for the translation; alternatively, both object detection frames may be translated, in which case the center position between the two cameras serves as the reference for the translation.
本公开实施例提供的图像处理方法在经过平移处理之后，可以基于一个平移处理后的对象检测框和一个未进行平移的对象检测框，确定对目标对象的识别结果，也可以基于两个平移处理后的对象检测框，确定对目标对象的识别结果。After the translation processing, the image processing method provided by the embodiments of the present disclosure may determine the recognition result of the target object based on one translated object detection frame and one untranslated object detection frame, or based on two translated object detection frames.
在具体应用中,可以利用训练好的目标识别模型对平移处理后的对象检测框进行目标识别,以确定目标对象的识别结果。In a specific application, the trained target recognition model can be used to perform target recognition on the object detection frame after translation processing, so as to determine the recognition result of the target object.
本公开实施例中的目标识别模型可以是有关人脸识别的活体检测模型，将上述一个平移处理后的对象检测框和一个未进行平移的对象检测框输入到训练好的活体检测模型，可以判断出对象检测框对应的目标人脸是否为真实人脸，或者将上述两个平移处理后的对象检测框输入到训练好的活体检测模型，可以判断出对象检测框对应的目标人脸是否为真实人脸。The target recognition model in the embodiments of the present disclosure may be a liveness detection model for face recognition: inputting the one translated object detection frame and the one untranslated object detection frame into the trained liveness detection model, or inputting the two translated object detection frames into it, makes it possible to determine whether the target face corresponding to the object detection frames is a real face.
另外，上述目标识别模型还可以是有关车辆识别的车辆检测模型，将上述一个平移处理后的对象检测框和一个未进行平移的对象检测框输入到训练好的车辆检测模型，可以判断出对象检测框对应的目标车辆的类型信息，或者将上述两个平移处理后的对象检测框输入到训练好的车辆检测模型，可以判断出对象检测框对应的目标车辆的类型信息。In addition, the target recognition model may also be a vehicle detection model for vehicle recognition: inputting the one translated object detection frame and the one untranslated object detection frame into the trained vehicle detection model, or inputting the two translated object detection frames into it, makes it possible to determine the type information of the target vehicle corresponding to the object detection frames.
为了便于进一步理解上述目标识别的过程,接下来可以以人脸识别为例,结合图2(a)~2(d)对上述过程进行详细说明。In order to further understand the above process of target recognition, the above process can be described in detail with reference to Figures 2(a) to 2(d) by taking face recognition as an example.
如图2(a)所示为针对人脸活体检测所设置的双目摄像头采集的两张待检测图像，左边图像为双目摄像头所包括的左目摄像头(如RGB摄像头)所采集的人体图像1，右边图像为双目摄像头所包括的右目摄像头(如近红外摄像头)所采集的人体图像2，在针对上述两张人脸图像进行目标人脸检测之后，可以生成目标人脸在两张人体图像中的对象检测框，如图2(b)所示。Figure 2(a) shows two to-be-detected images collected by a binocular camera configured for face liveness detection: the left image is human body image 1 collected by the left camera (e.g., an RGB camera) of the binocular camera, and the right image is human body image 2 collected by the right camera (e.g., a near-infrared camera). After target face detection is performed on the two images, the object detection frames of the target face in the two human body images can be generated, as shown in Figure 2(b).
针对图2(b)所示的两个对象检测框，本公开实施例可以进行外扩处理，如图2(c)所示。这里，可以针对图2(c)的右边图像中的对象检测框进行平移处理，得到平移处理后的对象检测框，如图2(d)的右边图像所示，图2(d)的左边图像则与图2(c)的左边图像相同，包含有未平移处理的外扩处理后的对象检测框。For the two object detection frames shown in Figure 2(b), the embodiments of the present disclosure may perform expansion processing, as shown in Figure 2(c). Here, the object detection frame in the right image of Figure 2(c) may be translated to obtain the translated object detection frame shown in the right image of Figure 2(d); the left image of Figure 2(d) is the same as the left image of Figure 2(c) and contains the expanded, untranslated object detection frame.
本公开实施例中，针对图2(d)所示的两张图像可以基于对象检测框进行剪切，获得对应的两张人脸图像，在将两张人脸图像输入到活体检测模型的情况下，即可以确定出目标人脸是否为真实人脸。In the embodiments of the present disclosure, the two images shown in Figure 2(d) may be cropped along the object detection frames to obtain two corresponding face images; when the two face images are input into the liveness detection model, whether the target face is a real face can be determined.
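The cropping step described here is a plain rectangular cut along the (possibly expanded and translated) detection frame. A minimal, library-free sketch, where the `(x1, y1, x2, y2)` box convention and the row-major list-of-rows image representation are illustrative assumptions:

```python
def crop_box(img, box):
    """Crop a rectangular region from an image stored as a list of rows.

    img: a list of rows (row-major), e.g. img[y][x] is one pixel.
    box: (x1, y1, x2, y2) corner coordinates, an assumed convention;
    fractional coordinates from expansion/translation are truncated.
    """
    x1, y1, x2, y2 = (int(v) for v in box)
    return [row[x1:x2] for row in img[y1:y2]]
```

With array libraries the same cut is typically `img[y1:y2, x1:x2]`; the cropped pair would then be fed to the liveness detection model.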
这里，考虑到图2(d)的右边图像是利用近红外摄像头所采集的图像，在确定对象检测框内存在纸张、屏幕等非活体元素的情况下，可以直接确定上述目标人脸不是真实人脸，对应的活体检测得分为0分。Here, considering that the right image in Figure 2(d) was collected by the near-infrared camera, if non-living elements such as paper or a screen are determined to be present within the object detection frame, it can be directly determined that the target face is not a real face, and the corresponding liveness detection score is 0.
需要说明的是，本公开实施例可以按照上述示例方式，针对右目摄像头所采集的图像进行平移处理，还可以是针对左目摄像头所采集的图像进行平移处理，本公开实施例可以基于不同的应用场景来选取，这里不做具体的限制。It should be noted that the embodiments of the present disclosure may translate the image collected by the right camera, as in the example above, or translate the image collected by the left camera; the choice may be made according to the application scenario and is not specifically limited here.
考虑到外扩处理对于目标识别的关键影响,接下来可以详细描述有关检测框外扩处理的相关内容。本公开实施例中,具体可以通过如下操作针对每个待检测图像的对象检测框进行外扩处理:Considering the key influence of the expansion processing on target recognition, the relevant content of the expansion processing of the detection frame can be described in detail next. In the embodiment of the present disclosure, the expansion processing may be performed on the object detection frame of each to-be-detected image by the following operations:
在待检测图像中,确定对象检测框的角点位置坐标;基于确定的角点位置坐标、以及预设外扩比例对对象检测框进行外扩处理,得到外扩处理后的对象检测框。In the image to be detected, the position coordinates of the corner points of the object detection frame are determined; the object detection frame is expanded based on the determined position coordinates of the corner points and the preset expansion ratio to obtain the expanded object detection frame.
这里，首先可以确定对象检测框在待检测图像中的角点位置坐标，该角点位置坐标可以对应的是对象检测框四个边角的图像坐标，在确定这些角点位置坐标的情况下，可以基于角点位置坐标以及预设外扩比例对对象检测框进行外扩处理以得到处理后的对象检测框。Here, the corner position coordinates of the object detection frame in the to-be-detected image can first be determined; these corner position coordinates may correspond to the image coordinates of the four corners of the object detection frame. Once these corner position coordinates are determined, the object detection frame can be expanded based on them and a preset expansion ratio to obtain the processed object detection frame.
其中，上述预设外扩比例，可以是基于目标对象的实际尺寸所确定的。这主要是考虑到针对不同的目标对象，其实际尺寸不同，对应的成像尺寸也不同。在同一场景下利用双目摄像头抓拍同一位置处的人脸和车辆，车辆所对应的车辆检测框远远大于人脸所对应的人脸检测框。这里，可以针对较大尺寸的目标对象设置更大比例的外扩系数，以可以实现对较大成像尺寸的目标对象周边区域信息的覆盖，针对较小尺寸的目标对象设置更小比例的外扩系数，以足以实现对较小成像尺寸的目标对象周边区域信息的覆盖。The preset expansion ratio may be determined based on the actual size of the target object, mainly because different target objects have different actual sizes and hence different imaging sizes: when a binocular camera captures a face and a vehicle at the same position in the same scene, the vehicle detection frame is far larger than the face detection frame. Here, a larger expansion coefficient can be set for a larger target object, so as to cover the information in the surrounding region of a target object with a larger imaging size, and a smaller expansion coefficient can be set for a smaller target object, sufficient to cover the information in the surrounding region of a target object with a smaller imaging size.
以对对象检测框外扩0.5倍为例，还可以是通过拖动对象检测框的四个边角的方式实现外扩处理；还可以是先基于四个边角的角点位置坐标确定四个边框(对应上边框、下边框、左边框和右边框)所对应的长度，进而通过为每个边框设置的预设外扩比例实现针对边框的外扩处理，在具体应用时，可以是通过边框的向外拖动方式来实现。Taking expanding the object detection frame by 0.5 times as an example, the expansion processing may be realized by dragging the four corners of the object detection frame; alternatively, the lengths of the four borders (the upper, lower, left, and right borders) may first be determined based on the corner position coordinates of the four corners, and the expansion of each border may then be realized through the preset expansion ratio set for that border; in a specific application, this may be realized by dragging the border outward.
需要说明的是，针对对象检测框的四个边角所形成的四个边框而言，可以设置不同的比例系数以适应目标对象所包括关键部位的外扩需求。针对人脸对象框，可以分别针对上述四个边框在其外扩方向上，按照一定的外扩比例进行外扩，例如，可以将左边框、右边框、上边框、下边框四个边框分别外扩0.4倍、0.4倍、0.8倍、0.4倍，外扩后的人脸对象框的四个边框的尺寸均发生了变化。这里之所以针对上边框所对应的眼睛这一部位进行更大倍数的外扩，主要是考虑到眼睛是人脸检测框所检测到的五官中更为靠上的部位，这一部分之上还有额头等活体元素，为了提升后续的活体检测准确率，可以适当的对上边框外扩更大的倍数。It should be noted that different ratio coefficients may be set for the four borders formed by the four corners of the object detection frame, to suit the expansion needs of the key parts of the target object. For a face object frame, each of the four borders may be expanded in its outward direction by a certain expansion ratio; for example, the left, right, upper, and lower borders may be expanded by 0.4, 0.4, 0.8, and 0.4 times respectively, so that the sizes of all four borders of the expanded face object frame change. The reason the upper border, corresponding to the eyes, is expanded by a larger multiple is mainly that the eyes are the uppermost of the facial features detected by the face detection frame, and living-body elements such as the forehead lie above them; to improve the accuracy of subsequent liveness detection, the upper border can appropriately be expanded by a larger multiple.
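The per-border expansion above can be sketched as follows. The `(x1, y1, x2, y2)` corner convention, the function name, and the interpretation of each ratio as a fraction of the box width (left/right) or height (top/bottom) are illustrative assumptions; the default ratios follow the 0.4/0.4/0.8/0.4 face example in the text, with the larger top ratio keeping the forehead region:

```python
def expand_box(box, img_w, img_h, ratios=(0.4, 0.4, 0.8, 0.4)):
    """Expand a detection box outward by per-border ratios.

    box: (x1, y1, x2, y2) corner coordinates (assumed convention).
    ratios: (left, right, top, bottom) expansion ratios; left/right are
    fractions of the box width, top/bottom fractions of the box height.
    The result is clipped to the image bounds (img_w x img_h).
    """
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    left, right, top, bottom = ratios
    nx1 = max(0, x1 - left * w)
    nx2 = min(img_w, x2 + right * w)
    ny1 = max(0, y1 - top * h)        # larger top ratio keeps the forehead
    ny2 = min(img_h, y2 + bottom * h)
    return (nx1, ny1, nx2, ny2)
```

A single uniform ratio (e.g. the 0.5x example above) is the special case `ratios=(r, r, r, r)`.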
需要说明的是,本公开实施例中可以针对四个边框中的一个边框单独进行外扩处理, 还可以是针对四个边框中的一对边框(例如左边框和右边框)同时进行外扩处理,还可以是其它外扩处理方式,在此不再赘述。It should be noted that, in the embodiment of the present disclosure, the expansion processing may be performed on one frame of the four frames alone, or the expansion processing may be performed on a pair of frames (eg, the left frame and the right frame) among the four frames at the same time. , and may also be other external expansion processing methods, which will not be repeated here.
考虑到平移处理对于伪基线构建的关键作用,接下来可以详细描述有关检测框平移处理的相关内容。Considering the key role of translation processing in the construction of pseudo-baselines, the related content of detection frame translation processing can be described in detail next.
In the embodiments of the present disclosure, the binocular camera includes a first camera and a second camera. When translation is applied to a single expanded object detection frame, it may be performed through the following operations:
The object detection frame detected in the to-be-detected image captured by the first camera is expanded to obtain a to-be-translated detection frame; the to-be-translated detection frame is then translated in the direction away from the second camera to obtain the translated object detection frame.
In the embodiments of the present disclosure, one to-be-translated detection frame can be selected from the expanded object detection frames. If the object detection frame corresponding to the right camera of the binocular pair (i.e., the right detection frame, the first camera) is selected as the to-be-translated detection frame, its translation direction can be determined from the relative position of the right camera with respect to the left camera: since the right camera lies to the right, the right detection frame is translated to the right (i.e., away from the left camera). Conversely, if the object detection frame corresponding to the left camera (i.e., the left detection frame) is selected as the to-be-translated detection frame, it is translated to the left.
In the embodiments of the present disclosure, in addition to translating along the direction described above, the translation may also incorporate a translation distance, which can be determined from the size information of the to-be-translated detection frame.
This accounts for the fact that target objects at different distances from the camera differ in imaged size. When the size of the to-be-translated detection frame is large, the target object is, to some extent, close to the camera; a close target exhibits larger disparity, so a larger translation distance is required. When the size of the to-be-translated detection frame is small, the target object is, to some extent, far from the camera; a distant target exhibits smaller disparity, so a smaller translation distance suffices.
To construct a pseudo-baseline usable for both large and small target objects, the translation distance can be determined from a preset translation coefficient and the width value in the size information of the detection frame; for example, the translation distance is the product of the preset translation coefficient and the width value. The preset translation coefficient is related to the distance between the two cameras of the binocular pair. Here, the width value of a detection frame corresponds to the length of its horizontal side.
The translation distance is thus proportional to the width: a wider detection frame yields a larger translation distance, balancing the larger disparity of large target objects, while a narrower detection frame yields a smaller translation distance, balancing the smaller disparity of small target objects. In this way a shared pseudo-baseline adapted to various target objects can be constructed.
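The translation rule above (distance = preset coefficient × box width, direction away from the other camera) can be sketched as follows; the coefficient value of 0.1 and the function name are hypothetical placeholders:

```python
def translate_box(box, other_camera_side, coeff=0.1):
    """Translate an expanded detection box away from the other camera.

    box: (x1, y1, x2, y2); other_camera_side: 'left' or 'right', the side
    on which the other camera of the binocular pair lies. The translation
    distance is proportional to the box width, so near (large) faces are
    shifted more than far (small) faces, forming a shared pseudo-baseline.
    The coefficient 0.1 is a stand-in; per the text it would depend on the
    physical distance between the two cameras of the module.
    """
    x1, y1, x2, y2 = box
    dist = coeff * (x2 - x1)                             # distance ∝ width
    dx = dist if other_camera_side == 'left' else -dist  # move away
    return (x1 + dx, y1, x2 + dx, y2)

# A right-camera box (other camera on the left) shifts to the right:
print(translate_box((60, 20, 240, 240), 'left'))  # (78.0, 20, 258.0, 240)
```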
When translation is applied to both expanded object detection frames, in the embodiments of the present disclosure each of the two expanded object detection frames may be translated in the direction away from the other, yielding the processed object detection frames.
Here, the translation directions of the left and right detection frames can each be determined from the relative positions of the left and right cameras of the binocular pair: the right camera lies to the right, so the right detection frame is translated to the right (i.e., away from the left detection frame); the left camera lies to the left, so the left detection frame is translated to the left (i.e., away from the right detection frame).
Here too, the translation may incorporate a translation distance; the details follow the description above and are not repeated.
The image processing method provided by the embodiments of the present disclosure overcomes the inaccurate recognition results caused by the differing baselines of different binocular cameras and is highly robust, so it can be widely applied in various technical fields.
First, the above image processing method can be applied to terminal control, as shown in FIG. 3, through the following steps:
S301: Acquire a set of face images captured of a target face by a binocular camera, the set including a first face image captured by a first camera of the binocular camera and a second face image captured by a second camera of the binocular camera;
S302: Obtain, through the image processing method above, a recognition result corresponding to the set of face images, the recognition result including whether the target face is a real face;
S303: In response to the recognition result including that the target face is a real face, and the person corresponding to the target face passing identity authentication, control the terminal to perform a specified operation.
Here, when the image processing method above determines that the target face corresponding to a set of face images is a real face, and identity authentication of the person corresponding to that face succeeds, the terminal can be controlled to perform the specified operation; otherwise, the terminal may refuse to perform the specified operation and issue an alarm.
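The control flow of steps S301 to S303 can be sketched as follows. The liveness check and authentication steps are injected as callables because their implementations (the image processing method above and the identity check) are described elsewhere in the disclosure; all names here are illustrative:

```python
def control_terminal(left_img, right_img,
                     is_real_face, authenticate, perform_op, alarm):
    """Terminal control flow of steps S301-S303.

    is_real_face: applies the image processing method above to the pair
    of face images; authenticate: identity check for the person.
    The specified operation (unlock, open the gate, pay, ...) runs only
    when both the liveness check and the identity check succeed.
    """
    if is_real_face(left_img, right_img) and authenticate():
        perform_op()   # S303: e.g. unlock the terminal or open the gate
        return True
    alarm()            # refuse the specified operation and raise an alert
    return False

# A spoofed face is rejected even if authentication would have succeeded:
print(control_terminal(None, None,
                       is_real_face=lambda a, b: False,
                       authenticate=lambda: True,
                       perform_op=lambda: None,
                       alarm=lambda: None))  # False
```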
In the embodiments of the present disclosure, when this terminal control method is applied to different terminals, the specified operations performed may differ accordingly. The terminal here may be a user terminal, a gate device terminal, a payment terminal, or the like.
For example, in an unlocking scenario, if the target face corresponding to the set of face images captured by the binocular camera of the user terminal is determined to be a real face and the identity of the corresponding person is legitimate, unlocking succeeds; if a non-real face or an illegitimate identity is detected, unlocking fails, in which case an unlock-failure notification can be returned to the user terminal to remind the user.
As another example, in a gate verification scenario, when the recognition result indicates a real face with a legitimate identity, the passage switch connected to the gate device terminal is opened, enabling automatic passage through the gate; if a non-real face or an illegitimate identity is detected, passage is denied.
It should be noted that the image processing method provided by the embodiments of the present disclosure can be applied not only to the unlocking and gate verification applications above, but also to other scenarios, such as pedestrian detection in video surveillance, or embedded in financial equipment for liveness detection in financial services, which are not detailed here.
Those skilled in the art can understand that, in the above methods of the specific embodiments, the order in which the steps are written does not imply a strict execution order or constitute any limitation on the implementation; the specific execution order of the steps should be determined by their functions and possible internal logic.
Based on the same inventive concept, the embodiments of the present disclosure further provide an image processing apparatus corresponding to the image processing method and a terminal control apparatus corresponding to the terminal control method. Since the principles by which these apparatuses solve the problem are similar to those of the above methods, the implementation of the apparatuses may refer to the implementation of the methods, and repeated descriptions are omitted.
Referring to FIG. 4, a schematic diagram of an image processing apparatus provided by an embodiment of the present disclosure, the apparatus includes: an acquisition module 401, a detection module 402, an expansion module 403 and a determination module 404; wherein
the acquisition module 401 is configured to acquire two to-be-detected images obtained by capturing a target object with each camera of a binocular camera;
the detection module 402 is configured to perform target object detection on each of the two to-be-detected images to obtain an object detection frame of the target object in each to-be-detected image;
the expansion module 403 is configured to expand each object detection frame and to translate at least one expanded object detection frame to obtain a translated object detection frame;
the determination module 404 is configured to determine a recognition result for the target object based on at least the translated object detection frame.
In the embodiments of the present disclosure, target object detection focuses processing on the target object and preliminarily reduces the influence of the baseline. Considering further that recognizing the target object in the two to-be-detected images captured by a binocular camera requires the disparity formed by one module (corresponding to one binocular camera) to determine depth information about the target object, the embodiments of the present disclosure additionally expand and translate the object detection frames in coordination, so as to simulate the disparity of human eyes and obtain processed object detection frames for recognizing the target object. The embodiments of the present disclosure are highly general: different modules can share one set of pseudo-baselines to simulate human-eye disparity, improving the generalization ability of the modules and reducing the time cost of subsequent applications such as target object recognition.
In a possible implementation, the binocular camera includes a first camera and a second camera; when a single expanded object detection frame is translated, the expansion module 403 is configured to translate the at least one expanded object detection frame as follows: expand the object detection frame detected in the to-be-detected image captured by the first camera to obtain a to-be-translated detection frame; translate the to-be-translated detection frame in the direction away from the second camera to obtain the translated object detection frame.
In a possible implementation, the expansion module 403 is configured to translate the to-be-translated detection frame in the direction away from the second camera as follows: determine a translation distance based on the size information of the to-be-translated detection frame; move the to-be-translated detection frame by the translation distance in the direction away from the second camera to obtain the translated object detection frame.
In a possible implementation, the expansion module 403 is configured to determine the translation distance based on the size information of the to-be-translated detection frame as follows: determine the translation distance based on the width value in the size information of the to-be-translated detection frame and a preset translation coefficient.
In a possible implementation, when both expanded object detection frames are translated, the expansion module 403 is configured to translate the at least one expanded object detection frame as follows: translate each of the two expanded object detection frames in the direction away from the other object detection frame, obtaining the two translated object detection frames.
In a possible implementation, the expansion module 403 is configured to expand each object detection frame as follows: for each object detection frame, determine the corner position coordinates of that frame in its corresponding to-be-detected image; based on the determined corner position coordinates and a preset expansion ratio, expand the object detection frame to obtain the corresponding expanded object detection frame.
In a possible implementation, the determination module 404 is configured to determine the recognition result for the target object based on at least the translated object detection frame as follows: when a single expanded object detection frame is translated, determine the recognition result for the target object based on one translated object detection frame and one untranslated, expanded object detection frame; or, when both expanded object detection frames are translated, determine the recognition result for the target object based on the two translated object detection frames.
In a possible implementation, the target object is a target face; the determination module 404 is configured to determine the recognition result for the target object based on at least the translated object detection frame as follows: use a trained liveness detection neural network to recognize the target face on at least the translated object detection frame, and determine whether the target face corresponding to the object detection frame is a real face.
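A minimal sketch of how the processed frames might be fed to a liveness network follows. The cropping step, the network interface (a callable mapping a pair of crops to a real-face score in [0, 1]) and the 0.5 threshold are all assumptions for illustration, not the disclosure's exact pipeline:

```python
import numpy as np

def crop(img, box):
    """Crop a detection box from an image, clamped to the image bounds."""
    x1, y1, x2, y2 = (int(round(v)) for v in box)
    h, w = img.shape[:2]
    return img[max(0, y1):min(h, y2), max(0, x1):min(w, x2)]

def is_real_face(liveness_net, left_img, right_img, left_box, right_box):
    """Run a trained liveness network on the two processed crops.

    liveness_net is assumed to return a real-face score in [0, 1];
    the 0.5 decision threshold is a placeholder.
    """
    score = liveness_net(crop(left_img, left_box), crop(right_img, right_box))
    return score > 0.5
```

Because the translation pushes the boxes partly outside the image, the clamping in `crop` matters in practice; a real implementation might instead pad the image before cropping.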
Referring to FIG. 5, a schematic diagram of a terminal control apparatus provided by an embodiment of the present disclosure, the apparatus includes: an acquisition module 501, a determination module 502 and a control module 503; wherein
the acquisition module 501 is configured to acquire a set of face images captured of a target face by the binocular camera, the set including a first face image captured by a first camera of the binocular camera and a second face image captured by a second camera of the binocular camera;
the determination module 502 is configured to obtain, through the image processing method above, a recognition result corresponding to the set of face images, the recognition result including whether the target face is a real face;
the control module 503 is configured to control the terminal to perform a specified operation in response to the recognition result including that the target face is a real face and the person corresponding to the target face passing identity authentication.
For descriptions of the processing flow of each module in the apparatuses and the interaction flow between the modules, reference may be made to the relevant descriptions in the above method embodiments, which are not detailed here.
An embodiment of the present disclosure further provides an electronic device. As shown in FIG. 6, a schematic structural diagram of the electronic device, it includes: a processor 601, a memory 602 and a bus 603. The memory 602 stores machine-readable instructions executable by the processor 601 (for example, execution instructions corresponding to the acquisition module 401, the detection module 402, the expansion module 403 and the determination module 404 in the image processing apparatus in FIG. 4). When the electronic device runs, the processor 601 and the memory 602 communicate via the bus 603, and the machine-readable instructions, when executed by the processor 601, perform the following processing: acquire two to-be-detected images obtained by capturing a target object with each camera of a binocular camera; perform target object detection on each of the two to-be-detected images to obtain an object detection frame of the target object in each to-be-detected image; expand each object detection frame and translate at least one expanded object detection frame to obtain a translated object detection frame; determine a recognition result for the target object based on at least the translated object detection frame.
In a possible implementation, the binocular camera includes a first camera and a second camera; in the instructions executed by the processor 601, when a single expanded object detection frame is translated, translating the at least one expanded object detection frame to obtain the translated object detection frame includes: expanding the object detection frame detected in the to-be-detected image captured by the first camera to obtain a to-be-translated detection frame; translating the to-be-translated detection frame in the direction away from the second camera to obtain the translated object detection frame.
In a possible implementation, in the instructions executed by the processor 601, translating the to-be-translated detection frame in the direction away from the second camera to obtain the translated object detection frame includes: determining a translation distance based on the size information of the to-be-translated detection frame; moving the to-be-translated detection frame by the translation distance in the direction away from the second camera to obtain the translated object detection frame.
In a possible implementation, in the instructions executed by the processor 601, determining the translation distance based on the size information of the to-be-translated detection frame includes: determining the translation distance based on the width value in the size information of the to-be-translated detection frame and a preset translation coefficient.
In a possible implementation, when both expanded object detection frames are translated, in the instructions executed by the processor 601, translating at least one expanded object detection frame to obtain the translated object detection frame includes: translating each of the two expanded object detection frames in the direction away from the other object detection frame, obtaining the two translated object detection frames.
In a possible implementation, in the instructions executed by the processor 601, expanding each object detection frame includes: for each object detection frame, determining the corner position coordinates of that frame in its corresponding to-be-detected image; based on the determined corner position coordinates and a preset expansion ratio, expanding the object detection frame to obtain the corresponding expanded object detection frame.
In a possible implementation, in the instructions executed by the processor 601, determining the recognition result for the target object based on at least the translated object detection frame includes: when a single expanded object detection frame is translated, determining the recognition result for the target object based on one translated object detection frame and one untranslated, expanded object detection frame; or, when both expanded object detection frames are translated, determining the recognition result for the target object based on the two translated object detection frames.
In a possible implementation, the target object is a target face; determining the recognition result for the target object based on at least the translated object detection frame includes: using a trained liveness detection neural network to recognize the target face on at least the translated object detection frame, and determining whether the target face corresponding to the object detection frame is a real face.
An embodiment of the present disclosure further provides an electronic device. As shown in FIG. 7, a schematic structural diagram of the electronic device, it includes: a processor 701, a memory 702 and a bus 703. The memory 702 stores machine-readable instructions executable by the processor 701 (for example, execution instructions corresponding to the acquisition module 501, the determination module 502 and the control module 503 in the terminal control apparatus in FIG. 5). When the electronic device runs, the processor 701 and the memory 702 communicate via the bus 703, and the machine-readable instructions, when executed by the processor 701, perform the following processing: acquire a set of face images captured of a target face by the binocular camera, the set including a first face image captured by a first camera of the binocular camera and a second face image captured by a second camera of the binocular camera; obtain, through the image processing method described in the above embodiments, a recognition result corresponding to the set of face images, the recognition result including whether the target face is a real face; in response to the recognition result including that the target face is a real face, and the person corresponding to the target face passing identity authentication, control the terminal to perform a specified operation.
For the specific execution process of the above instructions, reference may be made to the steps of the methods described in the embodiments of the present disclosure, which are not repeated here.
An embodiment of the present disclosure further provides a computer-readable storage medium on which a computer program is stored; when the computer program is run by a processor, the steps of the image processing method and the terminal control method described in the above method embodiments are executed. The storage medium may be a volatile or non-volatile computer-readable storage medium.
The computer program product of the image processing method and the terminal control method provided by the embodiments of the present disclosure includes a computer-readable storage medium storing program code; the instructions included in the program code can be used to execute the steps of the image processing method and the terminal control method described in the above method embodiments, for which reference may be made to the above method embodiments, and which are not repeated here.
An embodiment of the present disclosure further provides a computer program which, when executed by a processor, implements any one of the methods of the foregoing embodiments. The computer program product may be implemented by hardware, software or a combination thereof. In an optional embodiment, the computer program product is embodied as a computer storage medium; in another optional embodiment, it is embodied as a software product, such as a software development kit (SDK).
Those skilled in the art can clearly understand that, for convenience and brevity of description, for the specific working processes of the systems and apparatuses described above, reference may be made to the corresponding processes in the foregoing method embodiments, which are not repeated here. In the several embodiments provided by the present disclosure, it should be understood that the disclosed systems, apparatuses and methods may be implemented in other ways. The apparatus embodiments described above are merely illustrative; for example, the division of the units is only a logical functional division, and there may be other divisions in actual implementation; as another example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. Further, the mutual couplings or direct couplings or communication connections shown or discussed may be indirect couplings or communication connections through some communication interfaces, apparatuses or units, and may be electrical, mechanical or in other forms.
The units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
In addition, the functional units in the embodiments of the present disclosure may be integrated into one processing unit, each unit may exist physically alone, or two or more units may be integrated into one unit.
If the functions are implemented in the form of software functional units and sold or used as independent products, they may be stored in a processor-executable non-volatile computer-readable storage medium. Based on this understanding, the technical solutions of the present disclosure, in essence, the part contributing to the prior art, or a part of the technical solutions, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing an electronic device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of the present disclosure. The aforementioned storage medium includes any medium capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
Finally, it should be noted that the embodiments described above are merely specific implementations of the present disclosure, intended to illustrate rather than limit its technical solutions, and the protection scope of the present disclosure is not limited thereto. Although the present disclosure has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that any person skilled in the art may, within the technical scope disclosed herein, still modify the technical solutions described in the foregoing embodiments, readily conceive of variations, or make equivalent replacements of some of the technical features; such modifications, variations, or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present disclosure, and shall all fall within the protection scope of the present disclosure. Therefore, the protection scope of the present disclosure shall be subject to the protection scope of the claims.

Claims (14)

  1. An image processing method, comprising:
    acquiring two images to be detected, each obtained by photographing a target object with a respective camera of a binocular camera;
    performing target object detection on each of the two images to be detected to obtain an object detection frame of the target object in each of the images to be detected;
    performing expansion processing on each of the object detection frames, and translating at least one expanded object detection frame to obtain a translated object detection frame; and
    determining a recognition result for the target object based at least on the translated object detection frame.
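The four claimed steps can be illustrated with a minimal sketch. Everything here is an assumption for illustration only, not the patented implementation: the `(x1, y1, x2, y2)` box representation, the `expand_box`/`translate_box` helpers, the symmetric 0.1 expansion ratio, and the 0.25 translation coefficient.

```python
# Illustrative sketch of the claimed expand-then-translate pipeline.
# Boxes are (x1, y1, x2, y2) tuples in image coordinates.

def expand_box(box, ratio):
    """Grow a detection box outward by `ratio` of its size on each side."""
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    return (x1 - w * ratio, y1 - h * ratio, x2 + w * ratio, y2 + h * ratio)

def translate_box(box, dx):
    """Shift a box horizontally by dx pixels (the sign encodes direction)."""
    x1, y1, x2, y2 = box
    return (x1 + dx, y1, x2 + dx, y2)

def process(left_box, right_box, ratio=0.1, coeff=0.25):
    """Expand both detection boxes, then translate the left-camera box
    away from the second camera (claims 3-4 describe a width-based
    translation distance; the leftward direction is assumed)."""
    left = expand_box(left_box, ratio)
    right = expand_box(right_box, ratio)
    dx = -(left[2] - left[0]) * coeff  # move away from the other camera
    return translate_box(left, dx), right
```

The returned pair would then feed the recognition step of the final claimed operation.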
  2. The image processing method according to claim 1, wherein the binocular camera comprises a first camera and a second camera; and
    in a case where one expanded object detection frame is translated, the translating at least one expanded object detection frame to obtain a translated object detection frame comprises:
    performing expansion processing on the object detection frame detected in the image to be detected captured by the first camera, to obtain a detection frame to be translated; and
    translating the detection frame to be translated in a direction away from the second camera, to obtain the translated object detection frame.
  3. The image processing method according to claim 2, wherein the translating the detection frame to be translated in a direction away from the second camera, to obtain the translated object detection frame, comprises:
    determining a translation distance based on size information of the detection frame to be translated; and
    moving the detection frame to be translated by the translation distance in the direction away from the second camera, to obtain the translated object detection frame.
  4. The image processing method according to claim 3, wherein the determining a translation distance based on the size information of the detection frame to be translated comprises:
    determining the translation distance based on a width value in the size information of the detection frame to be translated and a preset translation coefficient.
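Claims 3 and 4 together amount to a width-proportional horizontal shift. A minimal sketch, where the 0.25 coefficient and the sign convention (`-1` for leftward) are illustrative assumptions, since the claims do not fix concrete values:

```python
def translation_distance(box, coeff=0.25):
    """Translation distance = box width * preset coefficient (claim 4).
    The 0.25 coefficient is only an example value."""
    x1, _, x2, _ = box
    return (x2 - x1) * coeff

def translate_away(box, distance, away_sign):
    """Move the box by `distance` in the direction away from the other
    camera (claim 3); away_sign is -1 for leftward, +1 for rightward."""
    x1, y1, x2, y2 = box
    dx = away_sign * distance
    return (x1 + dx, y1, x2 + dx, y2)
```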
  5. The image processing method according to claim 1, wherein, in a case where two expanded object detection frames are translated, the translating at least one expanded object detection frame to obtain a translated object detection frame comprises:
    translating each of the two expanded object detection frames in a direction away from the other object detection frame, to obtain two translated object detection frames.
  6. The image processing method according to any one of claims 1-5, wherein the performing expansion processing on each of the object detection frames comprises:
    for each object detection frame:
    determining corner position coordinates of the object detection frame in the image to be detected corresponding to the object detection frame; and
    performing expansion processing on the object detection frame based on the determined corner position coordinates and a preset expansion ratio, to obtain an expanded object detection frame corresponding to the object detection frame.
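The corner-based expansion of claim 6 could look like the following sketch. The symmetric per-side growth and the clamping to the image bounds are illustrative choices; the claim itself only specifies corner coordinates and a preset expansion ratio:

```python
def expand_by_corners(corners, ratio, img_w, img_h):
    """Expand a box given its top-left and bottom-right corner
    coordinates ((x1, y1), (x2, y2)) by `ratio` per side (claim 6).
    Clamping to the image rectangle is an added assumption."""
    (x1, y1), (x2, y2) = corners
    w, h = x2 - x1, y2 - y1
    nx1 = max(0, x1 - w * ratio)
    ny1 = max(0, y1 - h * ratio)
    nx2 = min(img_w, x2 + w * ratio)
    ny2 = min(img_h, y2 + h * ratio)
    return ((nx1, ny1), (nx2, ny2))
```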
  7. The image processing method according to any one of claims 1-6, wherein the determining a recognition result for the target object based at least on the translated object detection frame comprises:
    in a case where one expanded object detection frame is translated, determining the recognition result for the target object based on the one translated object detection frame and one expanded object detection frame that has not been translated; or
    in a case where two expanded object detection frames are translated, determining the recognition result for the target object based on the two translated object detection frames.
  8. The image processing method according to any one of claims 1-7, wherein the target object is a target face, and the determining a recognition result for the target object based at least on the translated object detection frame comprises:
    performing target face recognition on at least the translated object detection frame using a trained liveness detection neural network, to determine whether the target face corresponding to the object detection frame is a real face.
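The liveness step of claim 8 can be sketched as a crop-and-classify call. The claim does not fix a network architecture, so the `liveness_net` callable, the nested-list image, and the 0.5 threshold are all placeholders for whatever trained model an implementation would use:

```python
def is_real_face(image, box, liveness_net, threshold=0.5):
    """Crop the translated detection box from the image and run an
    (assumed) liveness network returning a real-face probability."""
    x1, y1, x2, y2 = (int(round(v)) for v in box)
    crop = [row[x1:x2] for row in image[y1:y2]]  # image as rows of pixels
    return liveness_net(crop) >= threshold
```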
  9. A terminal control method, wherein the terminal is provided with a binocular camera, and the method comprises:
    acquiring a group of face images captured of a target face by the binocular camera, the group of face images comprising a first face image captured by a first camera of the binocular camera and a second face image captured by a second camera of the binocular camera;
    obtaining, by the image processing method according to any one of claims 1-8, a recognition result corresponding to the group of face images, the recognition result comprising whether the target face is a real face; and
    in response to the recognition result comprising that the target face is a real face and the person corresponding to the target face passing identity authentication, controlling the terminal to perform a specified operation.
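The control flow of claim 9 reduces to a conjunction: the specified operation runs only when the liveness result AND identity authentication both pass. A sketch with assumed callables (`detect_liveness`, `authenticate`, and `perform_op` are placeholders not named in the claim):

```python
def control_terminal(face_images, detect_liveness, authenticate, perform_op):
    """Gate the terminal operation on liveness and identity (claim 9)."""
    is_real = detect_liveness(face_images)   # result of the claims 1-8 pipeline
    if is_real and authenticate(face_images):
        perform_op()                          # e.g. unlock the terminal
        return True
    return False
```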
  10. An image processing apparatus, comprising:
    an acquisition module configured to acquire two images to be detected, each obtained by photographing a target object with a respective camera of a binocular camera;
    a detection module configured to perform target object detection on each of the two images to be detected to obtain an object detection frame of the target object in each of the images to be detected;
    an expansion module configured to perform expansion processing on each of the object detection frames and translate at least one expanded object detection frame to obtain a translated object detection frame; and
    a determination module configured to determine a recognition result for the target object based at least on the translated object detection frame.
  11. A terminal control apparatus, comprising:
    an acquisition module configured to acquire a group of face images captured of a target face by a binocular camera, the group of face images comprising a first face image captured by a first camera of the binocular camera and a second face image captured by a second camera of the binocular camera;
    a determination module configured to obtain, by the image processing method according to any one of claims 1-8, a recognition result corresponding to the group of face images, the recognition result comprising whether the target face is a real face; and
    a control module configured to control the terminal to perform a specified operation in response to the recognition result comprising that the target face is a real face and the person corresponding to the target face passing identity authentication.
  12. An electronic device, comprising a processor, a memory, and a bus, wherein the memory stores machine-readable instructions executable by the processor; when the electronic device runs, the processor communicates with the memory through the bus, and the machine-readable instructions, when executed by the processor, perform the steps of the image processing method according to any one of claims 1-8 or the steps of the terminal control method according to claim 9.
  13. A computer-readable storage medium having a computer program stored thereon, wherein, when the computer program is run by an electronic device, the electronic device performs the steps of the image processing method according to any one of claims 1-8 or the steps of the terminal control method according to claim 9.
  14. A computer program, comprising computer-readable code which, when executed in an electronic device, causes a processor in the electronic device to perform the steps of the image processing method according to any one of claims 1-8 or the steps of the terminal control method according to claim 9.
PCT/CN2021/121457 2020-11-30 2021-09-28 Image processing method and apparatus, and terminal control method and apparatus WO2022111044A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011377063.0 2020-11-30
CN202011377063.0A CN112560592A (en) 2020-11-30 2020-11-30 Image processing method and device, and terminal control method and device

Publications (1)

Publication Number Publication Date
WO2022111044A1 true WO2022111044A1 (en) 2022-06-02

Family

ID=75046804

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/121457 WO2022111044A1 (en) 2020-11-30 2021-09-28 Image processing method and apparatus, and terminal control method and apparatus

Country Status (2)

Country Link
CN (1) CN112560592A (en)
WO (1) WO2022111044A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117523431A (en) * 2023-11-17 2024-02-06 中国科学技术大学 Firework detection method and device, electronic equipment and storage medium

Families Citing this family (4)

CN112560592A (en) * 2020-11-30 2021-03-26 深圳市商汤科技有限公司 Image processing method and device, and terminal control method and device
CN113159161A (en) * 2021-04-16 2021-07-23 深圳市商汤科技有限公司 Target matching method and device, equipment and storage medium
CN112949661B (en) * 2021-05-13 2021-08-06 北京世纪好未来教育科技有限公司 Detection frame self-adaptive external expansion method and device, electronic equipment and storage medium
CN113392800A (en) * 2021-06-30 2021-09-14 浙江商汤科技开发有限公司 Behavior detection method and device, computer equipment and storage medium

Citations (4)

US20080037838A1 (en) * 2006-08-11 2008-02-14 Fotonation Vision Limited Real-Time Face Tracking in a Digital Image Acquisition Device
CN106127170A (en) * 2016-07-01 2016-11-16 重庆中科云丛科技有限公司 A kind of merge the training method of key feature points, recognition methods and system
CN110619656A (en) * 2019-09-05 2019-12-27 杭州宇泛智能科技有限公司 Face detection tracking method and device based on binocular camera and electronic equipment
CN112560592A (en) * 2020-11-30 2021-03-26 深圳市商汤科技有限公司 Image processing method and device, and terminal control method and device

Family Cites Families (1)

CN107813310B (en) * 2017-11-22 2020-10-20 浙江优迈德智能装备有限公司 Multi-gesture robot control method based on binocular vision


Also Published As

Publication number Publication date
CN112560592A (en) 2021-03-26

Similar Documents

Publication Publication Date Title
WO2022111044A1 (en) Image processing method and apparatus, and terminal control method and apparatus
US10956714B2 (en) Method and apparatus for detecting living body, electronic device, and storage medium
US10896518B2 (en) Image processing method, image processing apparatus and computer readable storage medium
CN108764091B (en) Living body detection method and apparatus, electronic device, and storage medium
TWI554976B (en) Surveillance systems and image processing methods thereof
TW505892B (en) System and method for promptly tracking multiple faces
WO2020063100A1 (en) Augmented reality image display method and apparatus, and device
WO2021027537A1 (en) Method and apparatus for taking identification photo, device and storage medium
WO2021147418A1 (en) Image dehazing method and apparatus, device and computer storage medium
CN106981078B (en) Sight line correction method and device, intelligent conference terminal and storage medium
TW202026948A (en) Methods and devices for biological testing and storage medium thereof
TWI669664B (en) Eye state detection system and method for operating an eye state detection system
WO2018082389A1 (en) Skin colour detection method and apparatus, and terminal
WO2020024737A1 (en) Method and apparatus for generating negative sample of face recognition, and computer device
WO2023011013A1 (en) Splicing seam search method and apparatus for video image, and video image splicing method and apparatus
CN111814564A (en) Multispectral image-based living body detection method, device, equipment and storage medium
KR102338984B1 (en) System for providing 3D model augmented reality service using AI and method thereof
CN114615480A (en) Projection picture adjusting method, projection picture adjusting device, projection picture adjusting apparatus, storage medium, and program product
CN113642639B (en) Living body detection method, living body detection device, living body detection equipment and storage medium
WO2021238163A1 (en) Image processing method and apparatus, electronic device, and storage medium
CN109726613B (en) Method and device for detection
US11620759B2 (en) Systems and methods for machine learning enhanced image registration
WO2024001617A1 (en) Method and apparatus for identifying behavior of playing with mobile phone
CN111383255B (en) Image processing method, device, electronic equipment and computer readable storage medium
US20210150745A1 (en) Image processing method, device, electronic apparatus, and computer readable storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 21896546; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established (Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 20.09.2023))
122 Ep: pct application non-entry in european phase (Ref document number: 21896546; Country of ref document: EP; Kind code of ref document: A1)