WO2020151750A1 - Image processing method and device - Google Patents

Image processing method and device

Info

Publication number
WO2020151750A1
Authority
WO
WIPO (PCT)
Prior art keywords
face area
image sample
image
area
face
Prior art date
Application number
PCT/CN2020/073836
Other languages
French (fr)
Chinese (zh)
Inventor
李圣喜
柴振华
孟欢欢
Original Assignee
北京三快在线科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京三快在线科技有限公司
Publication of WO2020151750A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/24 Aligning, centring, orientation detection or correction of the image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features

Definitions

  • the embodiments of the present invention relate to the field of image processing technology, and in particular to an image processing method and device.
  • a pre-trained convolutional neural network can be used to recognize faces.
  • Convolutional neural networks require image sample sets for training.
  • the image samples can be augmented when generating the image sample set.
  • Image augmentation performs a series of random transformations on image samples to obtain similar but different image samples to expand the image sample set.
  • the random transformation specifically includes random cropping, random segmentation, random lighting and so on.
  • The above solution may lose part of the face information during the random transformation, so a model trained on the resulting samples recognizes faces with poor accuracy.
  • the present invention provides an image processing method and device to solve the above-mentioned problems in the prior art.
  • an image processing method including:
  • a plurality of target regions are determined based on the face region to obtain a plurality of second image samples, wherein the target region is formed by expanding toward a preset direction based on the face region.
  • the step of determining target areas of multiple sizes based on the face area includes:
  • the first size and the second size are multiples of the vertical size of the face area, and the third size and the fourth size are multiples of the horizontal size of the face area.
  • the step of obtaining the first image sample includes:
  • the target video is composed of multiple sub videos, and the sub videos are separated by a preset mark.
  • an image processing device comprising:
  • the first image sample obtaining module is used to obtain the first image sample
  • a face area recognition module configured to recognize the face area in the first image sample
  • the second image sample generation module is configured to determine multiple target areas based on the face area to obtain multiple second image samples, wherein the target area is formed by expanding toward a preset direction based on the face area.
  • the second image sample generating module includes:
  • the first target area expansion sub-module is configured to expand a plurality of first sizes upward based on the face area; and/or,
  • the second target area expansion sub-module is used to expand a plurality of second sizes downward based on the face area; and/or,
  • the third target area expansion submodule is used to expand a plurality of third sizes to the left based on the face area; and/or,
  • the fourth target area expansion sub-module is used to expand a plurality of fourth sizes to the right based on the face area.
  • the first size and the second size are multiples of the vertical size of the face area, and the third size and the fourth size are multiples of the horizontal size of the face area.
  • the first image sample acquisition module includes:
  • the first image sample receiving sub-module is configured to receive the first image sample sent by the photographing device, wherein the environmental brightness of the first image sample, the distance and/or the angle between the photographing object and the photographing device are not completely the same.
  • the target video is composed of multiple sub videos, and the sub videos are separated by a preset mark.
  • an electronic device including:
  • a processor, a memory, and a computer program that is stored on the memory and can run on the processor, wherein the processor implements the foregoing method when executing the program.
  • a readable storage medium, wherein, when instructions in the storage medium are executed by a processor of an electronic device, the electronic device can execute the aforementioned method.
  • the embodiment of the present invention provides an image processing method and device.
  • the image processing method includes: acquiring a first image sample; identifying a face area in the first image sample; and determining a plurality of target areas based on the face area to obtain a plurality of second image samples, where the target area is formed by expanding toward a preset direction based on the face area.
  • the target area can be expanded based on the face area to obtain the second image sample, so that the generated second image sample must include the face area, which helps improve the accuracy of the model in recognizing the face.
  • FIG. 1 is a flowchart of specific steps of an image processing method according to Embodiment 1 of the present invention
  • FIG. 2 is a flowchart of specific steps of an image processing method according to the second embodiment of the present invention.
  • FIGS. 3A, 3B, 3C, 3D, 3E, 3F, 3G, and 3H are schematic diagrams of the target area in the second embodiment of the present invention.
  • FIG. 4 is a structural diagram of an image processing device according to Embodiment 3 of the present invention.
  • FIG. 5 is a structural diagram of an image processing apparatus provided by Embodiment 4 of the present invention.
  • Fig. 6 schematically shows a block diagram of a computing processing device for executing the method according to the present application.
  • Fig. 7 schematically shows a storage unit for holding or carrying program codes for implementing the method according to the present application.
  • Step 101 Obtain a first image sample.
  • the embodiment of the present invention can generate similar but different images for the original image, and can be used for the augmentation of image samples in deep learning.
  • The recognition device needs to obtain the face image and verify the facial features recognized from it. If the verification passes, the user is allowed to perform operations; if it fails, the user is not.
  • the application scenarios of face authentication include: driver verification and access control verification of online car-hailing.
  • There is a risk that an illegal user obtains a recorded video or picture of a legitimate user in advance, plays it on a playback device, and lets the camera of the authentication device shoot the screen of the playback device to perform identity verification. In that case, face recognition fails to achieve its verification purpose, which is a security risk.
  • the frame part of the playback device usually exists in the captured picture, the screen of the playback device also has reflections, and there may also be a certain angle between the playback device and the authentication device.
  • the image samples generated in the embodiment of the present invention can enable the trained model (for example, convolutional neural network) to recognize features such as borders, reflections, and angles, thereby identifying the above illegal authentication.
  • the first image sample is an original image sample, which may be a picture including various information, for example, including: the screen, frame, or other information around the playback device.
  • the first image sample may be obtained by shooting, and the shooting object may be a playback device that plays videos or pictures.
  • Step 102 Identify the face area in the first image sample.
  • a face recognition technology may be used to identify facial feature points (for example, facial features, contours, etc.) from the first image sample, so as to determine the face area.
  • Template matching method: establish a three-dimensional adjustable model frame according to the regularities of facial features. After the face area is located, use the model frame to locate and adjust the feature parts of the face, which addresses changes in observation angle, occlusion, and facial expression during face recognition.
  • Singular value feature method: the singular value features of the face image matrix reflect the essential attributes of the image and can be used for face recognition.
  • Subspace analysis method: because of its strong descriptiveness, low computational cost, easy implementation, and good separability, it is widely used in facial feature extraction and has become one of the mainstream methods of face recognition.
  • LPP: Locality Preserving Projections.
  • Image feature-based method: first match the overall size, contour, and three-dimensional orientation of the face; then, while keeping the pose fixed, perform local matching of different feature points of the face (these feature points are identified manually).
  • Model variable parameter method: combine the three-dimensional deformation of a general face model with distance-mapping-based iterative matrix minimization to recover the head pose and the three-dimensional face.
  • the attitude parameters are continuously updated as the relationship between the deformation of the model changes, and this process is repeated until the minimum scale reaches the requirement.
  • the biggest difference between the method based on model variable parameters and the method based on image features is that the latter needs to search for the coordinates of the feature points every time the face pose changes, while the former only needs to adjust the parameters of the 3D deformed model.
  • Step 103 Determine multiple target regions based on the face region to obtain multiple second image samples, wherein the target region is formed by expanding toward a preset direction based on the face region.
  • The face area is expanded by different sizes in one of the up, down, left, and right directions, or in several of these directions at once, to obtain multiple target areas that include the face area.
  • The shape of the face area and the target area may be rectangular, diamond, circular, trapezoidal, etc., selected according to the scene; rectangular is preferred.
  • The embodiment of the present invention does not limit the shape.
  • The embodiment of the present invention can determine target regions that contain the face region and have different sizes, so that the expanded samples include not only the face region but also as much information from other regions as possible. This helps the model learn more information and improves the accuracy of its face recognition. In this way, more image samples can be generated from existing images, which improves the accuracy of the model in subsequent training.
  • One fraudulent method in face recognition is to play images or streaming media files on a playback device and then shoot them with the camera of the authentication device to perform face verification. Because existing face authentication algorithms first extract the face area from the captured authentication image, authentication is unaffected even when the camera captures the frame of the playback device. After expansion in the foregoing manner, at least some of the image samples include the frame of the playback device, which improves the accuracy of the trained model.
  • the image contained in the target area forms a second image sample.
  • the second image sample includes the face feature and information of other areas.
  • When generating image samples for training, first make a copy of the first image sample; then crop the face area and the target areas from the first image sample; finally, save the cropped target-area images as second image samples. The image formed by the face area, the multiple second image samples, and the first image sample can all be used as image samples for training the model.
  • Since multiple target areas can be determined based on the face area, cropping them from the first image sample yields multiple second image samples, and the face area is located at a different position within each of them.
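  • The cropping described above can be illustrated with a minimal sketch. It assumes rectangular areas given as (left, top, right, bottom) pixel boxes with the origin at the top-left corner, and an image stored as a list of pixel rows; the names `crop` and `build_training_samples` are hypothetical and do not come from the application.

```python
from typing import List, Tuple

# Assumed conventions (not from the application): a box is
# (left, top, right, bottom) with right/bottom exclusive, and an image
# is a list of rows of pixel values with the origin at the top-left.
Box = Tuple[int, int, int, int]
Image = List[List[int]]


def crop(image: Image, box: Box) -> Image:
    """Return a copy of the pixels that lie inside the box."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


def build_training_samples(first_sample: Image, face_box: Box,
                           target_boxes: List[Box]) -> List[Image]:
    """Collect the first image sample, the face crop, and one second
    image sample per target area."""
    samples = [first_sample, crop(first_sample, face_box)]
    samples.extend(crop(first_sample, box) for box in target_boxes)
    return samples


# Toy 5x6 image whose pixel value encodes its position (10 * row + column).
image = [[10 * r + c for c in range(6)] for r in range(5)]
face = (2, 2, 4, 4)
targets = [(2, 0, 4, 4), (0, 2, 4, 4)]  # expanded upward / to the left
samples = build_training_samples(image, face, targets)
```

  • In this toy example the face pixels sit at the bottom of the first target crop and at the right edge of the second, which mirrors the statement that the face area occupies different positions within the second image samples.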
  • An embodiment of the present invention provides an image processing method, the method including: acquiring a first image sample; identifying a face area in the first image sample; and determining a plurality of target regions based on the face area to obtain a plurality of second image samples, wherein the target region is formed by expanding toward a preset direction based on the face region.
  • the target area can be expanded based on the face area to obtain the second image sample, so that the generated second image sample must include the face area, which helps improve the accuracy of the model in recognizing the face.
  • Step 201 Acquire a first image sample through a photographing device, where the environmental brightness of the first image sample, the distance and/or the angle between the photographing object and the photographing device are not completely the same.
  • an image sample used as a training sample can be generated into a streaming media file that is played in a loop, and then played by a playback device.
  • The screen of the playback device is photographed by the shooting device to obtain multiple first image samples from the streaming media file. By continuously changing the brightness of the environment where the playback device is located, and its distance and/or angle relative to the shooting device, more training samples can be obtained from these image samples.
  • the shooting device can be any device with a camera, for example, a mobile phone, a tablet computer, a camera, and the like.
  • The playback device may be any playback device whose screen angle can be adjusted automatically.
  • A pan-tilt may carry the playback device; the pan-tilt may rotate continuously to adjust the angle between the screen of the playback device and the shooting device.
  • The pan-tilt can rotate in a two-dimensional plane so that multiple different image samples are generated from one image; it can also rotate in three-dimensional space to keep rotating the screen while the streaming media file is played.
  • the target video may be an image sequence including a human face area. It can be understood that after the image sequence is captured by the shooting device, the image sequence is split into each frame of images to obtain multiple first image samples. That is, the multiple first image samples correspond to multiple frames of images formed by splitting the target video shot by the shooting device.
  • An image sample set composed of multiple first image samples with different angles, brightnesses, and distances can be obtained through device shooting, which diversifies and automates the augmentation of training samples. Compared with manual shooting, this helps reduce the workload, and the diversified samples can improve the accuracy of training.
  • the target video is composed of multiple sub-videos, and the sub-videos are separated by a preset mark.
  • the mark can be an image frame with a designated mark, or a mark frame, or other forms of marks.
  • multiple small videos can be spliced into a video with a larger length, so that the first image samples can be generated in batches, which helps to further reduce the workload.
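  • A minimal sketch of how a spliced target video might be split back into its sub-videos at the preset marks. It assumes the video has already been decoded into a sequence of frames and that a caller-supplied predicate recognizes a mark frame; `split_by_marker` and `is_marker` are hypothetical names, and strings stand in for real frames.

```python
from typing import Callable, List, Sequence, TypeVar

Frame = TypeVar("Frame")


def split_by_marker(frames: Sequence[Frame],
                    is_marker: Callable[[Frame], bool]) -> List[List[Frame]]:
    """Split a frame sequence into sub-videos at mark frames.

    Mark frames themselves are dropped, and empty groups (for example,
    two adjacent marks) are skipped.
    """
    sub_videos: List[List[Frame]] = []
    current: List[Frame] = []
    for frame in frames:
        if is_marker(frame):
            if current:
                sub_videos.append(current)
            current = []
        else:
            current.append(frame)
    if current:
        sub_videos.append(current)
    return sub_videos


# Strings stand in for decoded frames; "MARK" plays the role of the preset mark.
MARK = "MARK"
frames = ["a1", "a2", MARK, "b1", MARK, MARK, "c1", "c2"]
groups = split_by_marker(frames, lambda f: f == MARK)
```

  • Each resulting group corresponds to one sub-video, and every frame in a group can then serve as a first image sample.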
  • Step 202 Identify the face area in the first image sample.
  • For this step, refer to the detailed description of step 102, which will not be repeated here.
  • Step 203 Use the face area as a reference to expand multiple first sizes upward to form multiple target areas, and obtain multiple second image samples.
  • the size can be expressed in terms of the number of pixels, and the first size is the number of pixels expanded upward.
  • the embodiment of the present invention may extend the first size above the face area to obtain the target area, so that the target area includes the face area and the upper first size area.
  • The gray area F1 is the entire image area of the first image sample, and the shaded area F2 is the face area in the first image sample.
  • The target area obtained by upward expansion is the area F3 (including the area F2) enclosed by the dotted line in FIG. 3(B).
  • The first size may take a plurality of different values, yielding target areas of different sizes such as F3.
  • The target area can be obtained by upward expansion, so that a target area that includes the screen, the device or screen frame, and the angle above the face is used as a second image sample. During training, the model learns these features and can then identify screen reflections, the device or screen frame, and the screen angle above the face area, which lets it distinguish attempts to authenticate with pre-recorded videos or pictures and improves the accuracy of face recognition.
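  • As a minimal sketch of step 203, the function below produces one target area per first size by moving the upper boundary of the face box upward. It assumes (left, top, right, bottom) pixel boxes with the origin at the top-left corner, so "upward" means a smaller top coordinate; clamping at the image edge is an added assumption, and `expand_upward` is a hypothetical name.

```python
from typing import List, Tuple

# Assumed convention: (left, top, right, bottom), origin at the top-left,
# y growing downward, so expanding upward decreases `top`.
Box = Tuple[int, int, int, int]


def expand_upward(face_box: Box, first_sizes: List[int]) -> List[Box]:
    """Return one target box per first size (in pixels), each containing
    the face box plus the strip of that height above it."""
    left, top, right, bottom = face_box
    boxes = []
    for size in first_sizes:
        new_top = max(0, top - size)  # assumption: stop at the image edge
        boxes.append((left, new_top, right, bottom))
    return boxes


face = (40, 60, 80, 100)
targets = expand_upward(face, [10, 20, 100])
```

  • The downward, leftward, and rightward expansions of steps 204 to 206 would follow the same pattern with the bottom, left, and right boundaries respectively.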
  • Step 204 Expand a plurality of second sizes downward based on the face region to form a plurality of target regions, and obtain a plurality of second image samples.
  • the second size is the number of pixels expanded downward.
  • the embodiment of the present invention can extend the second size below the face area to obtain the target area, so that the target area includes the face area and the second size area below.
  • the target area obtained by downward expansion is the area F4 (including the area F2) enclosed by the dotted line in FIG. 3(C).
  • The second size may take a plurality of different values, yielding target areas of different sizes such as F4.
  • The target area can be obtained by downward expansion, so that a target area that includes the screen, the device or screen frame, and the angle below the face is used as a second image sample. During training, the model learns these features and can then identify screen reflections, the device or screen frame, and the screen angle below the face area, which lets it distinguish attempts to authenticate with pre-recorded videos or pictures and improves the accuracy of face recognition.
  • Step 205 Expand multiple third sizes to the left based on the face area to form multiple target areas and obtain multiple second image samples.
  • the third size is the number of pixels extended to the left.
  • the embodiment of the present invention can extend the third size to the left of the face area to obtain the target area, so that the target area includes the face area and the third size area on the left.
  • the target area obtained by expanding to the left is the area F5 (including the area F2) enclosed by the dotted line in Fig. 3(D).
  • The third size may take a plurality of different values, yielding target areas of different sizes such as F5.
  • The target area can be obtained by expanding to the left, so that a target area that includes the screen, the device or screen frame, and the angle on the left is used as a second image sample. During training, the model learns these features and can then identify screen reflections, the device or screen frame, and the screen angle to the left of the face area, which lets it distinguish attempts to authenticate with pre-recorded videos or pictures and improves the accuracy of face recognition.
  • Step 206 Expand multiple fourth sizes to the right based on the face area to form multiple target areas and obtain multiple second image samples.
  • the fourth size is the number of pixels extended to the right.
  • the embodiment of the present invention may extend the fourth size to the right of the face area to obtain the target area, so that the target area includes the face area and the fourth size area on the right.
  • the target area expanded to the right is the area F6 (including the area F2) enclosed by the dotted line in FIG. 3(E).
  • The fourth size may take a plurality of different values, yielding target areas of different sizes such as F6.
  • A single direction can be selected for expansion, or directions can be combined: combining up and down gives the target area F7 enclosed by the dashed line in FIG. 3(F); combining left and right gives the target area F8 enclosed by the dashed line in FIG. 3(G); combining up, down, left, and right gives the target area F9 enclosed by the dashed line in FIG. 3(H). The four directions can also be combined arbitrarily to obtain a target area, for example expanding up and left at once, up, left, and right at once, or up, down, and right at once.
  • the first size, the second size, the third size, and the fourth size may be the same or different.
  • the same first size and second size may be used when combining up and down.
  • the same third size and fourth size can be used when combining left and right.
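  • The direction combinations above can be enumerated with a short sketch. It assumes (left, top, right, bottom) pixel boxes with the origin at the top-left corner and, for simplicity, the same size in every chosen direction; the names `expand` and `all_direction_combinations` are hypothetical, and no clamping to the image boundary is done here.

```python
from itertools import combinations
from typing import Dict, Tuple

Box = Tuple[int, int, int, int]  # assumed: (left, top, right, bottom)
DIRECTIONS = ("up", "down", "left", "right")


def expand(face_box: Box, sizes: Dict[str, int]) -> Box:
    """Expand the face box by the given pixel size in each listed
    direction; directions missing from `sizes` are left unchanged."""
    left, top, right, bottom = face_box
    return (left - sizes.get("left", 0),
            top - sizes.get("up", 0),
            right + sizes.get("right", 0),
            bottom + sizes.get("down", 0))


def all_direction_combinations(face_box: Box,
                               size: int) -> Dict[Tuple[str, ...], Box]:
    """Map every non-empty combination of directions to its target box."""
    boxes = {}
    for k in range(1, len(DIRECTIONS) + 1):
        for combo in combinations(DIRECTIONS, k):
            boxes[combo] = expand(face_box, {d: size for d in combo})
    return boxes


face = (40, 60, 80, 100)
boxes = all_direction_combinations(face, 10)
```

  • Four directions give 15 non-empty combinations, which cover the single-direction areas as well as the combined up-down, left-right, and four-direction areas described for FIG. 3.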
  • The target area can be obtained by expanding to the right, so that a target area that includes the screen, the device or screen frame, and the angle on the right is used as a second image sample. During training, the model learns these features and can then identify screen reflections, the device or screen frame, and the screen angle to the right of the face area, which lets it distinguish attempts to authenticate with pre-recorded videos or pictures and improves the accuracy of face recognition.
  • The first size and the second size are multiples of the vertical size of the face area, and the third size and the fourth size are multiples of the horizontal size of the face area.
  • The vertical size of the face area can be understood as its height, and the horizontal size as its width.
  • The first size is the distance between the upper boundary of the target area and the upper boundary of the face area; it can be 1, 2, 3, etc. times the height of the face area.
  • The second size is the distance between the lower boundary of the target area and the lower boundary of the face area; it can be 1, 2, 3, etc. times the height of the face area.
  • The third size is the distance between the left boundary of the target area and the left boundary of the face area; it can be 1, 2, 3, etc. times the width of the face area.
  • The fourth size is the distance between the right boundary of the target area and the right boundary of the face area; it can be 1, 2, 3, etc. times the width of the face area.
  • the maximum multiple may be determined by the image size, and the multiple is continuously expanded until the boundary of the first image sample is reached.
  • The first size may determine its maximum multiple based on the upper boundary, the second size based on the lower boundary, the third size based on the left boundary, and the fourth size based on the right boundary.
  • the embodiment of the present invention can expand the target area according to the multiple of the size of the face area, and can simply and effectively determine the target area including the screen, the screen frame, and the device frame.
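  • A minimal sketch of multiple-based expansion follows. It assumes (left, top, right, bottom) pixel boxes with the origin at the top-left corner, and it interprets "until the boundary is reached" as keeping only whole multiples that still fit inside the first image sample; that interpretation and the name `target_boxes_by_multiples` are assumptions, not details from the application.

```python
from typing import Dict, List, Tuple

Box = Tuple[int, int, int, int]  # assumed: (left, top, right, bottom)


def target_boxes_by_multiples(face_box: Box, image_width: int,
                              image_height: int) -> Dict[str, List[Box]]:
    """For each direction, expand by 1x, 2x, 3x, ... the face height
    (up/down) or face width (left/right) while the box stays in the image."""
    left, top, right, bottom = face_box
    height = bottom - top
    width = right - left
    boxes: Dict[str, List[Box]] = {"up": [], "down": [], "left": [], "right": []}
    # The maximum multiple per direction is the free space divided by the
    # face height or width (integer division).
    for m in range(1, top // height + 1):
        boxes["up"].append((left, top - m * height, right, bottom))
    for m in range(1, (image_height - bottom) // height + 1):
        boxes["down"].append((left, top, right, bottom + m * height))
    for m in range(1, left // width + 1):
        boxes["left"].append((left - m * width, top, right, bottom))
    for m in range(1, (image_width - right) // width + 1):
        boxes["right"].append((left, top, right + m * width, bottom))
    return boxes


# A 200x300 image with a 40x50 face box.
result = target_boxes_by_multiples((80, 100, 120, 150), 200, 300)
```

  • In this example the face can be expanded upward by at most 2 face heights and downward by at most 3, because 100 pixels are free above the face and 150 below it.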
  • An embodiment of the present invention provides an image processing method, the method including: acquiring a first image sample; identifying a face area in the first image sample; and determining a plurality of target regions based on the face area to obtain a plurality of second image samples, wherein the target region is formed by expanding toward a preset direction based on the face region.
  • the target area can be expanded based on the face area to obtain the second image sample, so that the generated second image sample must include the face area, which helps improve the accuracy of the model in recognizing the face.
  • FIG. 4 shows a structural diagram of an image processing apparatus provided in Embodiment 3 of the present invention, which is specifically as follows.
  • the first image sample acquisition module 301 is used to acquire the first image sample.
  • the face area recognition module 302 is configured to recognize the face area in the first image sample.
  • The second image sample generation module 303 is configured to determine multiple target areas based on the face area to obtain multiple second image samples, wherein the target area is formed by expanding toward a preset direction based on the face area.
  • An embodiment of the present invention provides an image processing device, the device including: a first image sample acquisition module for acquiring a first image sample; a face region recognition module for identifying the face area in the first image sample; and a second image sample generation module for determining multiple target areas based on the face area to obtain multiple second image samples, wherein the target area is formed by expanding toward a preset direction based on the face area.
  • the target area can be expanded based on the face area to obtain the second image sample, so that the generated second image sample must include the face area, which helps improve the accuracy of the model in recognizing the face.
  • the third embodiment is an apparatus embodiment corresponding to method embodiment 1.
  • FIG. 5 shows a structural diagram of an image processing apparatus provided by Embodiment 4 of the present invention, which is specifically as follows.
  • the first image sample acquisition module 401 is configured to acquire a first image sample; optionally, in another embodiment of the present invention, the first image sample acquisition module 401 includes:
  • the first image sample receiving sub-module 4011 is configured to receive a first image sample sent by a photographing device, wherein the environmental brightness of the first image sample, the distance and/or angle between the photographing object and the photographing device are not completely the same.
  • the face region recognition module 402 is used to identify the face region in the first image sample.
  • The second image sample generation module 403 is configured to determine multiple target areas based on the face area to obtain multiple second image samples, wherein the target area is formed by expanding toward a preset direction based on the face area, and includes:
  • the first target area expansion submodule 4031 is configured to expand a plurality of first sizes upward based on the face area; and/or,
  • the second target area expansion submodule 4032 is configured to expand a plurality of second sizes downward based on the face area; and/or,
  • the third target area expansion submodule 4033 is configured to expand a plurality of third sizes to the left based on the face area; and/or,
  • the fourth target area expansion sub-module 4034 is configured to expand a plurality of fourth sizes to the right based on the face area.
  • The first size and the second size are multiples of the vertical size of the face area, and the third size and the fourth size are multiples of the horizontal size of the face area.
  • the target video is composed of multiple sub-videos, and the sub-videos are separated by a preset mark.
  • An embodiment of the present invention provides an image processing device, the device including: a first image sample acquisition module for acquiring a first image sample; a face region recognition module for identifying the face area in the first image sample; and a second image sample generation module for determining multiple target areas based on the face area to obtain multiple second image samples, wherein the target area is formed by expanding toward a preset direction based on the face area.
  • the target area can be expanded based on the face area to obtain the second image sample, so that the generated second image sample must include the face area, which helps improve the accuracy of the model in recognizing the face.
  • the fourth embodiment is a device embodiment corresponding to the second method embodiment.
  • An embodiment of the present invention also provides an electronic device, including: a processor, a memory, and a computer program that is stored on the memory and can run on the processor. The processor implements the aforementioned method when executing the program.
  • The embodiment of the present invention also provides a readable storage medium.
  • When the instructions in the storage medium are executed by the processor of the electronic device, the electronic device can execute the aforementioned method.
  • Because the device embodiments are basically similar to the method embodiments, their description is relatively brief; for related parts, refer to the description of the method embodiments.
  • The modules, units, or components in the embodiments can be combined into one module, unit, or component, and they can also be divided into multiple sub-modules, sub-units, or sub-components. Except where at least some of such features and/or processes or units are mutually exclusive, all features disclosed in this specification (including the accompanying claims, abstract and drawings) and all processes or units of any method or device so disclosed may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract and drawings) may be replaced by an alternative feature serving the same, equivalent or similar purpose.
  • the various component embodiments of the present invention may be implemented by hardware, or by software modules running on one or more processors, or by a combination of them.
  • a microprocessor or a digital signal processor (DSP) may be used in practice to implement some or all of the functions of some or all components in the image processing device according to the embodiments of the present invention.
  • the present invention can also be implemented as a device or device program for executing part or all of the methods described herein.
  • Such a program for realizing the present invention may be stored on a computer-readable medium, or may have the form of one or more signals. Such signals can be downloaded from Internet websites, or provided on carrier signals, or provided in any other form.
  • FIG. 6 shows a computing processing device that can implement the method according to the present application.
  • the computing processing device conventionally includes a processor 1010 and a computer program product in the form of a memory 1020 or a computer readable medium.
  • the memory 1020 may be an electronic memory such as flash memory, EEPROM (Electrically Erasable Programmable Read Only Memory), EPROM, hard disk, or ROM.
  • the memory 1020 has a storage space 1030 for program code 1031 that executes any of the method steps in the above methods.
  • the storage space 1030 for program codes may include various program codes 1031 for implementing various steps in the above method. These program codes can be read from or written into one or more computer program products.
  • These computer program products include program code carriers such as hard disks, compact disks (CDs), memory cards or floppy disks.
  • Such a computer program product is usually a portable or fixed storage unit as described with reference to FIG. 7.
  • the storage unit may have storage segments, storage spaces, etc. arranged similarly to the memory 1020 in the computing processing device of FIG. 6.
  • the program code may, for example, be compressed in a suitable form.
  • the storage unit includes computer-readable codes 1031', that is, codes that can be read by, for example, a processor such as 1010. These codes, when run by a computing processing device, cause the computing processing device to execute the various steps of the method described above.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)
  • Collating Specific Patterns (AREA)

Abstract

The present invention provides an image processing method and device. The method comprises: obtaining a first image sample; recognizing a face region in the first image sample; a second image sample generation module determining a plurality of target regions by using the face region as a reference so as to obtain a plurality of second image samples, wherein the target regions are formed by using the face region as a reference and expanding therefrom in preset directions. The target regions are formed by using the face region as a reference and expanding therefrom so as to obtain the second image samples, such that the generated second image samples definitely comprise the face region, thereby improving the accuracy of facial recognition models.

Description

Image processing method and device
This application claims priority to Chinese patent application No. 201910069268.3, entitled "Image Processing Method and Apparatus", filed with the Chinese Patent Office on January 24, 2019, the entire content of which is incorporated into this application by reference.
Technical field
The embodiments of the present invention relate to the field of image processing technology, and in particular to an image processing method and device.
Background
Face recognition has become an important means of identity verification, and a pre-trained convolutional neural network can be used to recognize faces. Training a convolutional neural network requires an image sample set.
In the prior art, to increase the number of image samples, the image samples can be augmented when the image sample set is generated. Image augmentation applies a series of random transformations to the image samples to obtain similar but different image samples, thereby enlarging the image sample set. The random transformations include random cropping, random segmentation, random changes of illumination, and so on.
However, the above scheme may lose part of the face information during the random transformation, so that a model trained on such samples recognizes faces with poor accuracy.
Summary of the invention
The present invention provides an image processing method and device to solve the above-mentioned problems in the prior art.
According to a first aspect of the present invention, an image processing method is provided, the method including:
obtaining a first image sample;
identifying the face area in the first image sample;
determining multiple target areas based on the face area to obtain multiple second image samples, wherein each target area is formed by expanding toward a preset direction with the face area as the reference.
Optionally, the step of determining target areas of multiple sizes based on the face area includes:
expanding upward by multiple first sizes based on the face area; and/or,
expanding downward by multiple second sizes based on the face area; and/or,
expanding to the left by multiple third sizes based on the face area; and/or,
expanding to the right by multiple fourth sizes based on the face area.
Optionally, the first size and the second size are multiples of the vertical size of the face area, and the third size and the fourth size are multiples of the horizontal size of the face area.
Optionally, the step of obtaining the first image sample includes:
receiving first image samples sent by a shooting device, wherein the environmental brightness of the first image samples and the distance and/or angle between the photographed object and the shooting device are not all the same.
Optionally, the target video is composed of multiple sub-videos, and the sub-videos are separated by a preset mark.
According to a second aspect of the present invention, an image processing device is provided, the device including:
a first image sample acquisition module, configured to obtain a first image sample;
a face area recognition module, configured to identify the face area in the first image sample;
a second image sample generation module, configured to determine multiple target areas based on the face area to obtain multiple second image samples, wherein each target area is formed by expanding toward a preset direction with the face area as the reference.
Optionally, the second image sample generation module includes:
a first target area expansion sub-module, configured to expand upward by multiple first sizes based on the face area; and/or,
a second target area expansion sub-module, configured to expand downward by multiple second sizes based on the face area; and/or,
a third target area expansion sub-module, configured to expand to the left by multiple third sizes based on the face area; and/or,
a fourth target area expansion sub-module, configured to expand to the right by multiple fourth sizes based on the face area.
Optionally, the first size and the second size are multiples of the vertical size of the face area, and the third size and the fourth size are multiples of the horizontal size of the face area.
Optionally, the first image sample acquisition module includes:
a first image sample receiving sub-module, configured to receive first image samples sent by a shooting device, wherein the environmental brightness of the first image samples and the distance and/or angle between the photographed object and the shooting device are not all the same.
Optionally, the target video is composed of multiple sub-videos, and the sub-videos are separated by a preset mark.
According to a third aspect of the present invention, an electronic device is provided, including:
a processor, a memory, and a computer program that is stored on the memory and can run on the processor, wherein the processor implements the aforementioned method when executing the program.
According to a fourth aspect of the present invention, a readable storage medium is provided; when instructions in the storage medium are executed by a processor of an electronic device, the electronic device can execute the aforementioned method.
The embodiments of the present invention provide an image processing method and device. The image processing method includes: obtaining a first image sample; identifying the face area in the first image sample; and determining multiple target areas based on the face area to obtain multiple second image samples, wherein each target area is formed by expanding toward a preset direction with the face area as the reference. Because each target area is formed by expanding from the face area, every generated second image sample necessarily includes the face area, which helps improve the accuracy of the model in recognizing faces.
The above description is only an overview of the technical solutions of this application. So that the technical means of this application can be understood more clearly and implemented in accordance with the content of the specification, and so that the above and other objectives, features and advantages of this application are more apparent, specific implementations of this application are set out below.
Description of the drawings
To describe the technical solutions in the embodiments of the present application or the prior art more clearly, the drawings needed for the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show some embodiments of the present application; those of ordinary skill in the art can obtain other drawings from them without creative work.
FIG. 1 is a flowchart of the specific steps of an image processing method according to Embodiment 1 of the present invention;
FIG. 2 is a flowchart of the specific steps of an image processing method according to Embodiment 2 of the present invention;
FIGS. 3A, 3B, 3C, 3D, 3E, 3F, 3G and 3H are schematic diagrams of target areas in Embodiment 2 of the present invention;
FIG. 4 is a structural diagram of an image processing device according to Embodiment 3 of the present invention;
FIG. 5 is a structural diagram of an image processing device according to Embodiment 4 of the present invention;
FIG. 6 schematically shows a block diagram of a computing processing device for executing the method according to the present application; and
FIG. 7 schematically shows a storage unit for holding or carrying program code that implements the method according to the present application.
Specific embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative work fall within the protection scope of the present invention.
Embodiment 1
Referring to FIG. 1, a flowchart of the specific steps of an image processing method according to Embodiment 1 of the present invention is shown.
Step 101: Obtain a first image sample.
The embodiment of the present invention can generate images that are similar to, but different from, an original image, and can be used to augment image samples in deep learning. In identity verification based on face recognition, the recognition device obtains a face image and verifies the facial features recognized from it. If the verification passes, the user is allowed to proceed; if it fails, the user is not. Application scenarios of face authentication include identity verification of ride-hailing drivers and access control verification. In one embodiment, there is a risk: an illegitimate user can obtain a recorded video or picture of a legitimate user in advance, play it on a playback device, and let the camera of the authentication device shoot the screen of that playback device, so that identity verification is performed on the captured picture. The face recognition then fails to serve its verification purpose, which is a security risk.
In such an illegitimate authentication, the frame of the playback device usually appears in the captured picture, the screen of the playback device shows reflections, and there may be a certain angle between the playback device and the authentication device. Based on these characteristics, the image samples generated in the embodiment of the present invention allow the trained model (for example, a convolutional neural network) to recognize features such as frames, reflections, and angles, and thereby to detect the illegitimate authentication.
The first image sample is an original image sample and may be a picture containing various information, for example the screen and frame of the playback device, or other information around the playback device.
In the embodiment of the present invention, the first image sample may be obtained by shooting, and the photographed object may be a playback device that plays videos or pictures.
Step 102: Identify the face area in the first image sample.
Specifically, a face recognition technique may be used to identify facial feature points (for example, facial features and contours) in the first image sample, and thereby determine the face area.
In the prior art, face recognition technology is already mature. Two-dimensional face recognition algorithms include:
1. Template matching: a three-dimensional adjustable model frame is built according to the regularities of facial features. After the face area is located, the model frame is used to locate and adjust the characteristic parts of the face, which addresses the influence of viewing angle, occlusion, and expression changes during face recognition.
2. Singular value features: the singular value features of the face image matrix reflect the essential attributes of the image and can be used for face recognition.
3. Subspace analysis: because it is strongly descriptive, computationally cheap, easy to implement, and gives good separability, it is widely used for facial feature extraction and has become one of the mainstream face recognition methods.
4. Locality Preserving Projections (LPP): a newer subspace analysis method that is a linear approximation of a nonlinear method. It overcomes both the difficulty principal component analysis has in preserving the nonlinear manifold of the original data and the difficulty nonlinear methods have in obtaining low-dimensional projections of new sample points.
5. Principal component analysis: reduces the complexity of data processing through dimensionality reduction and increases computation speed.
Three-dimensional face recognition algorithms include:
1. Methods based on image features: first match the overall size, contour, and three-dimensional orientation of the face; then, with the pose held fixed, locally match different facial feature points (these feature points are identified manually).
2. Methods based on variable model parameters: combine the three-dimensional deformation of a generic face model with iterative matrix minimization based on distance mapping to recover the head pose and the three-dimensional face. The pose parameters are updated continuously as the correspondences of the model deformation change, and the process is repeated until the minimization measure meets the requirement. The main difference between the two approaches is that the image-feature method has to search for the coordinates of the feature points again every time the face pose changes, whereas the variable-parameter method only needs to adjust the parameters of the 3D deformable model.
It can be understood that the embodiment of the present invention places no restriction on the face recognition algorithm used.
Step 103: Determine multiple target areas based on the face area to obtain multiple second image samples, wherein each target area is formed by expanding toward a preset direction with the face area as the reference.
Specifically, with the face area as the center, the area is expanded by different sizes in one of the upward, downward, leftward, and rightward directions, or in several of these directions at once, to obtain multiple target areas that contain the face area.
It can be understood that the shape of the face area and the target area may be chosen according to the scene, for example a rectangle, diamond, circle, or trapezoid, with a rectangle preferred. The embodiment of the present invention places no restriction on the shape.
The embodiment of the present invention can determine target areas that contain the face area and differ in size, so that the expanded samples include not only the face area but also as much information from other areas as possible. This helps the model learn more information and improves its face recognition accuracy. In this way, more image samples can be generated from existing images, which improves the accuracy of the model in subsequent training.
One fraud technique in face recognition is to play an image or streaming media file on a playback device and then shoot it with the shooting device used for authentication in order to pass face authentication. Existing face authentication algorithms first extract the face area from the captured authentication image and authenticate on that area alone, so authentication is unaffected even when the shooting device captures the frame of the playback device. After expansion in the foregoing manner, at least some of the image samples include the frame of the playback device, so that the trained model achieves better accuracy.
It can be understood that the image contained in a target area forms a second image sample.
In the embodiment of the present invention, because the target area includes the face area and extends into its surroundings, the second image sample includes both the facial features and information from other areas.
In one embodiment, when image samples for training are generated, the first image sample is first copied; then the face area and the target areas are cropped out of the first image sample; finally, the cropped images are saved as second image samples. The image formed by the face area, the multiple second image samples, and the first image sample can then all be used as image samples for training the model. In one embodiment, since multiple target areas can be determined based on the face area, the target areas can be cropped out of the first image sample to obtain multiple second image samples, and the face area is located at different positions within the multiple second image samples.
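The expand-and-crop procedure of step 103 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names `Box`, `expand`, `crop`, and `second_samples` are hypothetical, the image is modelled as a nested list of pixel rows, both areas are rectangles, and clamping a target area to the image bounds is an assumption the text does not spell out.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle in pixels; right and bottom are exclusive."""
    left: int
    top: int
    right: int
    bottom: int


def expand(face: Box, image_width: int, image_height: int,
           up: int = 0, down: int = 0, left: int = 0, right: int = 0) -> Box:
    """Grow the face box by the given pixel counts in each direction.

    Assumption: the result is clamped so it never leaves the image.
    """
    return Box(
        left=max(0, face.left - left),
        top=max(0, face.top - up),
        right=min(image_width, face.right + right),
        bottom=min(image_height, face.bottom + down),
    )


def crop(image: Sequence[Sequence], box: Box) -> List[list]:
    """Cut the pixels covered by box out of a row-major image."""
    return [list(row[box.left:box.right]) for row in image[box.top:box.bottom]]


def second_samples(image: Sequence[Sequence], face: Box,
                   expansions: Sequence[Dict[str, int]]) -> List[List[list]]:
    """Return one cropped second image sample per requested expansion."""
    height, width = len(image), len(image[0])
    return [crop(image, expand(face, width, height, **e)) for e in expansions]


if __name__ == "__main__":
    # Toy 8x6 "image" whose pixels record their own (x, y) coordinates.
    image = [[(x, y) for x in range(8)] for y in range(6)]
    face = Box(left=3, top=2, right=5, bottom=4)
    samples = second_samples(
        image, face, [{"up": 2}, {"down": 1}, {"left": 2, "right": 1}]
    )
    for sample in samples:
        print(len(sample[0]), "x", len(sample))  # width x height of each crop
```

Because the growth is never negative, every target area contains the face area, and the face ends up at a different position inside each crop depending on the direction of expansion.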
In summary, the embodiment of the present invention provides an image processing method, the method including: obtaining a first image sample; identifying the face area in the first image sample; and determining multiple target areas based on the face area to obtain multiple second image samples, wherein each target area is formed by expanding toward a preset direction with the face area as the reference. Because each target area is formed by expanding from the face area, every generated second image sample necessarily includes the face area, which helps improve the accuracy of the model in recognizing faces.
Embodiment 2
Referring to FIG. 2, a flowchart of the specific steps of an image processing method according to Embodiment 2 of the present invention is shown.
Step 201: Obtain first image samples through a shooting device, wherein the environmental brightness of the first image samples and the distance and/or angle between the photographed object and the shooting device are not all the same.
Specifically, the image samples that serve as training samples can be assembled into a streaming media file that is played in a loop on a playback device. The shooting device shoots the screen of the playback device to obtain multiple first image samples from the streaming media file. By continuously varying the brightness of the environment in which the playback device is located, and its distance and/or angle relative to the shooting device, more training samples can be obtained from these image samples.
The shooting device may be any device with a camera, for example a mobile phone, tablet computer, or camera.
The playback device may be any playback device whose screen angle can be adjusted automatically. In one embodiment, the playback device is carried on a pan-tilt head, which rotates continuously to adjust the angle between the screen of the playback device and the shooting device. The pan-tilt head may rotate within a two-dimensional plane, so that multiple different image samples are generated from one image; it may also rotate in three-dimensional space, so that the screen keeps turning while the streaming media file is played. In addition, to vary the distance, the shooting device or the playback device can be placed on a movable device, so that the distance between the two is adjusted during playback. Finally, the playback device and the shooting device can be placed in a lighting environment of continuously changing brightness, so that the brightness varies continuously.
The target video may be an image sequence that includes the face area. It can be understood that, after shooting the image sequence, the shooting device splits it into individual frames to obtain multiple first image samples. That is, the multiple first image samples correspond to the frames obtained by splitting the target video shot by the shooting device.
With the embodiment of the present invention, an image sample set composed of multiple first image samples with different angles, brightness levels, and distances can be captured by the devices, which makes training sample augmentation both diverse and automated. Compared with manual shooting, this reduces the workload, and the more diverse samples can improve training accuracy.
Optionally, in another embodiment of the present invention, the target video is composed of multiple sub-videos, and the sub-videos are separated by a preset mark.
The mark may be an image frame carrying a designated identifier, a dedicated marker frame, or a mark in another form.
The embodiment of the present invention can splice multiple short videos into one longer video, so that first image samples can be generated in batches, which further reduces the workload.
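Recovering the sub-videos from the captured frames can be sketched as below. This is only an illustration under assumptions: the function name `split_by_marker` and the `is_marker` predicate are hypothetical, and how a marker frame is actually detected (a designated identifier in the image, a dedicated marker frame, or another form) is left to the caller.

```python
from typing import Callable, List, Sequence, TypeVar

Frame = TypeVar("Frame")


def split_by_marker(frames: Sequence[Frame],
                    is_marker: Callable[[Frame], bool]) -> List[List[Frame]]:
    """Group consecutive frames into sub-videos, dropping the marker frames."""
    sub_videos: List[List[Frame]] = []
    current: List[Frame] = []
    for frame in frames:
        if is_marker(frame):
            # A marker closes the current sub-video; repeated markers are ignored.
            if current:
                sub_videos.append(current)
            current = []
        else:
            current.append(frame)
    if current:
        sub_videos.append(current)
    return sub_videos


if __name__ == "__main__":
    # Frames are stand-in strings here; "MARK" plays the role of the preset mark.
    frames = ["a1", "a2", "MARK", "b1", "MARK", "MARK", "c1"]
    print(split_by_marker(frames, lambda f: f == "MARK"))
```

Each returned group can then be treated as the batch of first image samples that belongs to one of the original short videos.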
Step 202: Identify the face area in the first image sample.
For this step, refer to the detailed description of step 102, which is not repeated here.
Step 203: Expand upward by multiple first sizes based on the face area to form multiple target areas, and obtain multiple second image samples.
A size can be expressed as a number of pixels; the first size is the number of pixels by which the area is expanded upward.
The embodiment of the present invention can expand the face area upward by the first size to obtain a target area, so that the target area contains the face area and the region of the first size above it. As shown in FIG. 3(A), the gray area F1 is the entire image area of the first image sample, and the shaded area F2 is the face area in the first image sample. Taking the face area F2 as the reference and expanding upward gives the target area F3 (which contains F2), enclosed by the dashed line in FIG. 3(B).
It can be understood that the first size may take multiple different values, which yields target areas like F3 of different sizes.
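As a small sketch of step 203, the first sizes can be derived from the face height, following the optional feature stated in the summary that the first size is a multiple of the vertical size of the face area. The function name `upward_targets`, the tuple layout of `Region`, the example multiples, and the clamping at the top edge of the image are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

# (left, top, right, bottom) in pixels; right and bottom are exclusive.
Region = Tuple[int, int, int, int]


def upward_targets(face: Region, multiples: Sequence[float]) -> List[Region]:
    """Return one target area per multiple, each extending the face area upward."""
    left, top, right, bottom = face
    face_height = bottom - top
    targets: List[Region] = []
    for m in multiples:
        first_size = int(m * face_height)   # first size, as a pixel count
        new_top = max(0, top - first_size)  # assumption: stop at the image's top edge
        targets.append((left, new_top, right, bottom))
    return targets


if __name__ == "__main__":
    face_f2 = (100, 200, 180, 300)  # hypothetical face area, 100 pixels tall
    for target in upward_targets(face_f2, [0.5, 1.0, 1.5, 3.0]):
        print(target)
```

With these hypothetical values the last multiple would reach above the image, so its top edge is clamped to row 0; the other three give target areas of increasing height that all share the face area's bottom edge.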
本发明实施例可以通过向上扩展得到目标区域,从而将上方包括屏幕、或设备或屏幕边框以及角度的目标区域作为第二图像样本,并在模型训练时学习到屏幕、或设备或屏幕边框以及角度的特点,可以识别到人脸区域上方的屏幕反光、设备或屏幕边框以及屏幕角度等信息,从而可以区分采用预先录制的视频或图片进行身份验证的场景,提高了人脸识别的准确度。In the embodiment of the present invention, the target area can be obtained by upward expansion, so that the target area including the screen, or device or screen frame and angle above is used as the second image sample, and the screen, or device or screen frame and angle are learned during model training. It can identify the screen reflectance above the face area, the device or screen frame, and the screen angle, so that it can distinguish scenes that use pre-recorded videos or pictures for identity verification, and improve the accuracy of face recognition.
Step 204: taking the face area as the reference, extend downward by a plurality of second sizes to form a plurality of target areas, obtaining a plurality of second image samples.
Here, the second size is the number of pixels by which the area is extended downward.
In an embodiment of the present invention, the face area may be extended downward by the second size to obtain the target area, so that the target area contains the face area plus a region of the second size below it. As shown in FIG. 3(A), taking the face area F2 as the reference and extending downward yields the target area F4 (which contains F2), enclosed by the dotted line in FIG. 3(C).
It can be understood that the second size may take multiple different values, yielding target areas of different sizes, such as F4.
By extending downward, the embodiment of the present invention obtains a target area that includes the screen, the device or screen frame, and the angle below the face, and uses it as a second image sample. During model training, the characteristics of the screen, the device or screen frame, and the angle are learned, so that information such as screen reflections, the device or screen frame, and the screen angle below the face area can be recognized. Scenes in which a pre-recorded video or picture is used for identity verification can thereby be distinguished, which improves the accuracy of face recognition.
Step 205: taking the face area as the reference, extend to the left by a plurality of third sizes to form a plurality of target areas, obtaining a plurality of second image samples.
Here, the third size is the number of pixels by which the area is extended to the left.
In an embodiment of the present invention, the face area may be extended to the left by the third size to obtain the target area, so that the target area contains the face area plus a region of the third size to its left. As shown in FIG. 3(A), taking the face area F2 as the reference and extending to the left yields the target area F5 (which contains F2), enclosed by the dotted line in FIG. 3(D).
It can be understood that the third size may take multiple different values, yielding target areas of different sizes, such as F5.
By extending to the left, the embodiment of the present invention obtains a target area that includes the screen, the device or screen frame, and the angle to the left of the face, and uses it as a second image sample. During model training, the characteristics of the screen, the device or screen frame, and the angle are learned, so that information such as screen reflections, the device or screen frame, and the screen angle to the left of the face area can be recognized. Scenes in which a pre-recorded video or picture is used for identity verification can thereby be distinguished, which improves the accuracy of face recognition.
Step 206: taking the face area as the reference, extend to the right by a plurality of fourth sizes to form a plurality of target areas, obtaining a plurality of second image samples.
Here, the fourth size is the number of pixels by which the area is extended to the right.
In an embodiment of the present invention, the face area may be extended to the right by the fourth size to obtain the target area, so that the target area contains the face area plus a region of the fourth size to its right. As shown in FIG. 3(A), taking the face area F2 as the reference and extending to the right yields the target area F6 (which contains F2), enclosed by the dotted line in FIG. 3(E).
It can be understood that the fourth size may take multiple different values, yielding target areas of different sizes, such as F6.
In one embodiment, expansion may be performed in a single direction. The up and down directions may also be combined, giving the target area F7 enclosed by the dashed line in FIG. 3(F); the left and right directions may be combined, giving the target area F8 enclosed by the dashed line in FIG. 3(G); or all four directions may be combined, giving the target area F9 enclosed by the dashed line in FIG. 3(H). Any other combination of the four directions may also be used, for example extending up and left simultaneously; up, left, and right simultaneously; or up, down, and right simultaneously. The first, second, third, and fourth sizes may be the same or different. In particular, for simplicity and symmetry, the same first and second sizes may be used when combining up and down, and the same third and fourth sizes may be used when combining left and right.
By extending to the right, the embodiment of the present invention obtains a target area that includes the screen, the device or screen frame, and the angle to the right of the face, and uses it as a second image sample. During model training, the characteristics of the screen, the device or screen frame, and the angle are learned, so that information such as screen reflections, the device or screen frame, and the screen angle to the right of the face area can be recognized. Scenes in which a pre-recorded video or picture is used for identity verification can thereby be distinguished, which improves the accuracy of face recognition.
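Single-direction and combined expansions such as F3 to F9 can all be expressed with one helper that takes a separate margin per direction. The sketch below is an assumption-laden illustration: the function name `expand`, the `(left, top, right, bottom)` tuple layout, and the clamping to the image bounds are introduced here and are not taken from the application.

```python
from typing import Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom), in pixels


def expand(face: Box, image_size: Tuple[int, int],
           up: int = 0, down: int = 0, left: int = 0, right: int = 0) -> Box:
    """Extend the face box by the given margins, clamped to the image bounds.

    up/down/left/right play the role of the first/second/third/fourth sizes.
    """
    width, height = image_size
    l, t, r, b = face
    return (
        max(0, l - left),      # move the left edge leftward
        max(0, t - up),        # move the top edge upward (y grows downward)
        min(width, r + right),  # move the right edge rightward
        min(height, b + down),  # move the bottom edge downward
    )


face = (40, 100, 120, 200)
size = (300, 400)  # (width, height) of the first image sample

up_down = expand(face, size, up=50, down=50)        # like F7: symmetric up/down
up_left = expand(face, size, up=50, left=30)        # an arbitrary combination
all_four = expand(face, size, 500, 500, 500, 500)   # oversized margins are clamped
```

Leaving a margin at zero reproduces the single-direction cases, and passing equal `up` and `down` (or `left` and `right`) gives the symmetric combinations mentioned above.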
Optionally, in another embodiment of the present invention, the first size and the second size are multiples of the vertical dimension of the face area, and the third size and the fourth size are multiples of the horizontal dimension of the face area.
Here, the vertical dimension of the face area can be understood as the height of the face area, and the horizontal dimension as its width.
Specifically, if the first size denotes the distance between the upper boundary of the target area and the upper boundary of the face area, the first size may be 1, 2, 3, etc. times the height of the face area. If the second size denotes the distance between the lower boundary of the target area and the lower boundary of the face area, the second size may be 1, 2, 3, etc. times the height of the face area. If the third size denotes the distance between the left boundary of the target area and the left boundary of the face area, the third size may be 1, 2, 3, etc. times the width of the face area. If the fourth size denotes the distance between the right boundary of the target area and the right boundary of the face area, the fourth size may be 1, 2, 3, etc. times the width of the face area.
In one embodiment, the maximum multiple may be determined by the image size: the multiple is increased until the boundary of the first image sample is reached. For example, the maximum multiple of the first size may be determined by the upper boundary, that of the second size by the lower boundary, that of the third size by the left boundary, and that of the fourth size by the right boundary.
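As a rough sketch of how the multiples and their per-direction maximum might be computed, the code below derives the largest whole multiple of the face height or width that still fits inside the image in each direction. The function names and the choice to use only whole multiples (rather than also adding a final partial step that touches the boundary exactly) are assumptions made for this illustration.

```python
from typing import Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom), in pixels


def max_multiples(face: Box, image_size: Tuple[int, int]) -> Dict[str, int]:
    """Largest whole multiple of the face height/width that fits in each direction."""
    width, height = image_size
    left, top, right, bottom = face
    face_w, face_h = right - left, bottom - top
    if face_w <= 0 or face_h <= 0:
        raise ValueError("face box must have positive width and height")
    return {
        "up": top // face_h,                 # limited by the upper boundary
        "down": (height - bottom) // face_h,  # limited by the lower boundary
        "left": left // face_w,              # limited by the left boundary
        "right": (width - right) // face_w,   # limited by the right boundary
    }


def sizes_for(face: Box, image_size: Tuple[int, int], direction: str) -> List[int]:
    """Expansion sizes 1x, 2x, ... up to the maximum multiple for one direction."""
    left, top, right, bottom = face
    unit = (bottom - top) if direction in ("up", "down") else (right - left)
    limit = max_multiples(face, image_size)[direction]
    return [k * unit for k in range(1, limit + 1)]


face = (100, 250, 180, 350)   # an 80 x 100 pixel face area
image = (400, 600)            # (width, height) of the first image sample
limits = max_multiples(face, image)
first_sizes = sizes_for(face, image, "up")
```

Here the face is 100 pixels high with 250 pixels of image above it, so the first size can be 100 or 200 pixels before the upper boundary is reached.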
The embodiment of the present invention can extend the target area by multiples of the size of the face area, which is a simple and effective way to determine a target area that contains the screen, the screen frame, and the device frame.
In summary, an embodiment of the present invention provides an image processing method, the method including: acquiring a first image sample; identifying a face area in the first image sample; and determining a plurality of target areas based on the face area to obtain a plurality of second image samples, wherein each target area is formed by extending from the face area toward a preset direction. Because the target area is formed by extending from the face area, every generated second image sample necessarily includes the face area, which helps improve the accuracy with which the model recognizes faces.
Embodiment 3
Referring to FIG. 4, a structural diagram of an image processing apparatus provided in Embodiment 3 of the present invention is shown, as follows.
The first image sample acquisition module 301 is configured to acquire a first image sample.
The face area recognition module 302 is configured to identify the face area in the first image sample.
The second image sample generation module 303 is configured to determine a plurality of target areas based on the face area to obtain a plurality of second image samples, wherein each target area is formed by extending from the face area toward a preset direction.
In summary, an embodiment of the present invention provides an image processing apparatus, the apparatus including: a first image sample acquisition module, configured to acquire a first image sample; a face area recognition module, configured to identify the face area in the first image sample; and a second image sample generation module, configured to determine a plurality of target areas based on the face area to obtain a plurality of second image samples, wherein each target area is formed by extending from the face area toward a preset direction. Because the target area is formed by extending from the face area, every generated second image sample necessarily includes the face area, which helps improve the accuracy with which the model recognizes faces.
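The three-module structure can be pictured roughly as three small classes wired in sequence. Everything in this sketch is hypothetical: the class and method names merely mirror the module names, the image is a plain list of pixel rows, the face detector is an injected callable rather than any specific model, and only upward expansion is shown.

```python
from typing import Callable, Iterable, List, Tuple

Box = Tuple[int, int, int, int]   # (left, top, right, bottom), in pixels
Image = List[List[int]]           # rows of pixel values (stand-in for a real image)


class FirstImageSampleAcquisitionModule:
    """Holds and returns the first image samples (source is assumed in-memory)."""
    def __init__(self, samples: Iterable[Image]) -> None:
        self._samples = list(samples)

    def acquire(self) -> List[Image]:
        return self._samples


class FaceAreaRecognitionModule:
    """Wraps any face detector; the detector itself is outside this sketch."""
    def __init__(self, detector: Callable[[Image], Box]) -> None:
        self._detector = detector

    def recognize(self, image: Image) -> Box:
        return self._detector(image)


class SecondImageSampleGenerationModule:
    """Crops one second image sample per upward expansion size."""
    def __init__(self, up_sizes: List[int]) -> None:
        self._up_sizes = up_sizes

    def generate(self, image: Image, face: Box) -> List[Image]:
        left, top, right, bottom = face
        samples = []
        for size in self._up_sizes:
            new_top = max(0, top - size)  # assumption: clamp at the image edge
            samples.append([row[left:right] for row in image[new_top:bottom]])
        return samples


# Tiny 6 x 4 "image" whose pixel value encodes its position: row * 10 + column.
image = [[r * 10 + c for c in range(4)] for r in range(6)]
acquire = FirstImageSampleAcquisitionModule([image])
recognize = FaceAreaRecognitionModule(lambda img: (1, 3, 3, 5))  # fixed box for the demo
generate = SecondImageSampleGenerationModule(up_sizes=[1, 2])

second_samples = []
for first_sample in acquire.acquire():
    face_box = recognize.recognize(first_sample)
    second_samples.extend(generate.generate(first_sample, face_box))
```

Injecting the detector keeps the sketch independent of any particular face detection library.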
Embodiment 3 is the apparatus embodiment corresponding to method Embodiment 1. For details, refer to the detailed description of Embodiment 1, which is not repeated here.
Embodiment 4
Referring to FIG. 5, a structural diagram of an image processing apparatus provided in Embodiment 4 of the present invention is shown, as follows.
The first image sample acquisition module 401 is configured to acquire a first image sample. Optionally, in another embodiment of the present invention, the first image sample acquisition module 401 includes:
a first image sample receiving submodule 4011, configured to receive first image samples sent by a photographing device, wherein the ambient brightness and the distance and/or angle between the photographed object and the photographing device are not completely the same across the first image samples.
The face area recognition module 402 is configured to identify the face area in the first image sample.
The second image sample generation module 403 is configured to determine a plurality of target areas based on the face area to obtain a plurality of second image samples, wherein each target area is formed by extending from the face area toward a preset direction, and includes:
a first target area expansion submodule 4031, configured to extend upward by a plurality of first sizes based on the face area; and/or,
a second target area expansion submodule 4032, configured to extend downward by a plurality of second sizes based on the face area; and/or,
a third target area expansion submodule 4033, configured to extend to the left by a plurality of third sizes based on the face area; and/or,
a fourth target area expansion submodule 4034, configured to extend to the right by a plurality of fourth sizes based on the face area.
Optionally, in another embodiment of the present invention, the first size and the second size are multiples of the vertical dimension of the face area, and the third size and the fourth size are multiples of the horizontal dimension of the face area.
Optionally, in another embodiment of the present invention, the target video is composed of a plurality of sub-videos, and the sub-videos are separated by a preset mark.
In summary, an embodiment of the present invention provides an image processing apparatus, the apparatus including: a first image sample acquisition module, configured to acquire a first image sample; a face area recognition module, configured to identify the face area in the first image sample; and a second image sample generation module, configured to determine a plurality of target areas based on the face area to obtain a plurality of second image samples, wherein each target area is formed by extending from the face area toward a preset direction. Because the target area is formed by extending from the face area, every generated second image sample necessarily includes the face area, which helps improve the accuracy with which the model recognizes faces.
Embodiment 4 is the apparatus embodiment corresponding to method Embodiment 2. For details, refer to the detailed description of Embodiment 2, which is not repeated here.
An embodiment of the present invention further provides an electronic device, including: a processor, a memory, and a computer program stored in the memory and executable on the processor, wherein the processor implements the foregoing method when executing the program.
An embodiment of the present invention further provides a readable storage medium, wherein instructions in the storage medium, when executed by a processor of an electronic device, enable the electronic device to perform the foregoing method.
Because the apparatus embodiments are substantially similar to the method embodiments, they are described relatively briefly; for relevant details, refer to the corresponding parts of the method embodiments.
The algorithms and displays provided herein are not inherently related to any particular computer, virtual system, or other device. Various general-purpose systems may also be used with the teachings herein. From the above description, the structure required to construct such systems is apparent. In addition, the present invention is not directed to any particular programming language. It should be understood that the content of the invention described herein may be implemented in various programming languages, and that the above description of a specific language is given to disclose the best mode of the invention.
Numerous specific details are set forth in the description provided herein. However, it is understood that embodiments of the present invention may be practiced without these specific details. In some instances, well-known methods, structures, and techniques are not shown in detail so as not to obscure the understanding of this description.
Similarly, it should be understood that, in order to streamline the disclosure and aid the understanding of one or more of the various inventive aspects, features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof in the above description of exemplary embodiments. However, this method of disclosure should not be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in fewer than all features of a single foregoing disclosed embodiment. Therefore, the claims following the detailed description are hereby expressly incorporated into the detailed description, with each claim standing on its own as a separate embodiment of the invention.
Those skilled in the art will understand that the modules of the device in an embodiment may be adaptively changed and arranged in one or more devices different from that embodiment. The modules, units, or components of the embodiments may be combined into a single module, unit, or component, and may furthermore be divided into multiple submodules, subunits, or subcomponents. Except where at least some of such features and/or processes or units are mutually exclusive, all features disclosed in this specification (including the accompanying claims, abstract, and drawings), and all processes or units of any method or device so disclosed, may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract, and drawings) may be replaced by an alternative feature serving the same, an equivalent, or a similar purpose.
The component embodiments of the present invention may be implemented in hardware, in software modules running on one or more processors, or in a combination thereof. Those skilled in the art should understand that a microprocessor or a digital signal processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components of the image processing device according to embodiments of the present invention. The present invention may also be implemented as a device or apparatus program for performing part or all of the method described herein. Such a program implementing the present invention may be stored on a computer-readable medium, or may take the form of one or more signals. Such signals may be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.
For example, FIG. 6 shows a computing processing device that can implement the method according to the present application. The computing processing device conventionally includes a processor 1010 and a computer program product or computer-readable medium in the form of a memory 1020. The memory 1020 may be an electronic memory such as flash memory, EEPROM (electrically erasable programmable read-only memory), EPROM, a hard disk, or ROM. The memory 1020 has a storage space 1030 for program code 1031 for performing any of the method steps described above. For example, the storage space 1030 for program code may include individual program codes 1031, each implementing one of the steps of the above method. These program codes may be read from, or written to, one or more computer program products. These computer program products include program code carriers such as hard disks, compact discs (CDs), memory cards, or floppy disks. Such a computer program product is typically a portable or fixed storage unit as described with reference to FIG. 7. The storage unit may have storage segments, storage space, and the like arranged similarly to the memory 1020 in the computing processing device of FIG. 6. The program code may, for example, be compressed in a suitable form. Typically, the storage unit includes computer-readable code 1031', that is, code that can be read by a processor such as the processor 1010. When run by a computing processing device, this code causes the computing processing device to perform the steps of the method described above.
It should be noted that the above embodiments illustrate rather than limit the present invention, and that those skilled in the art can design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference sign placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In a unit claim enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, and so on does not indicate any order; these words may be interpreted as names.
Those skilled in the art can clearly understand that, for convenience and brevity of description, the specific working processes of the systems, apparatuses, and units described above may refer to the corresponding processes in the foregoing method embodiments, and are not repeated here.
The above are only preferred embodiments of the present invention and are not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.
The above are only specific embodiments of the present invention, but the scope of protection of the present invention is not limited thereto. Any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within its scope of protection. Therefore, the scope of protection of the present invention shall be determined by the scope of protection of the claims.

Claims (18)

  1. An image processing method, characterized in that the method comprises:
    acquiring a first image sample;
    identifying a face area in the first image sample;
    determining a plurality of target areas based on the face area to obtain a plurality of second image samples, wherein each target area is formed by extending from the face area toward a preset direction.
  2. The method according to claim 1, characterized in that the step of extending toward a preset direction based on the face area comprises:
    extending upward by a plurality of first sizes based on the face area; and/or,
    extending downward by a plurality of second sizes based on the face area; and/or,
    extending to the left by a plurality of third sizes based on the face area; and/or,
    extending to the right by a plurality of fourth sizes based on the face area.
  3. The method according to claim 1, characterized in that the first size and the second size are multiples of the vertical dimension of the face area, and the third size and the fourth size are multiples of the horizontal dimension of the face area.
  4. The method according to claim 1, characterized in that the step of acquiring a first image sample comprises:
    acquiring a plurality of first image samples by means of a photographing device,
    wherein the ambient brightness and the distance and/or angle between the photographed object and the photographing device are not completely the same across the plurality of first image samples.
  5. The method according to claim 4, characterized in that the plurality of first image samples correspond to multiple frames of images formed by splitting a target video captured by the photographing device.
  6. The method according to claim 5, characterized in that the target video is composed of a plurality of sub-videos, and the sub-videos are separated by a preset mark.
  7. The method according to claim 4, characterized in that the photographed object is a playback device, and the playback device is carried by a rotatable pan-tilt head.
  8. The method according to claim 1, characterized in that the first image sample includes a screen or frame of a playback device;
    when the target area is formed by extending from the face area toward the preset direction, the screen or frame of the playback device is included in the target area.
  9. An image processing apparatus, characterized in that the apparatus comprises:
    a first image sample acquisition module, configured to acquire a first image sample;
    a face area recognition module, configured to identify a face area in the first image sample;
    a second image sample generation module, configured to determine a plurality of target areas based on the face area to obtain a plurality of second image samples, wherein each target area is formed by extending from the face area toward a preset direction.
  10. The apparatus according to claim 9, characterized in that the second image sample generation module comprises:
    a first target area expansion submodule, configured to extend upward by a plurality of first sizes based on the face area; and/or,
    a second target area expansion submodule, configured to extend downward by a plurality of second sizes based on the face area; and/or,
    a third target area expansion submodule, configured to extend to the left by a plurality of third sizes based on the face area; and/or,
    a fourth target area expansion submodule, configured to extend to the right by a plurality of fourth sizes based on the face area.
  11. The apparatus according to claim 9, characterized in that the first size and the second size are multiples of the vertical dimension of the face area, and the third size and the fourth size are multiples of the horizontal dimension of the face area.
  12. The apparatus according to claim 9, characterized in that the first image sample acquisition module comprises:
    a first image sample receiving submodule, configured to receive first image samples sent by a photographing device, wherein the ambient brightness and the distance and/or angle between the photographed object and the photographing device are not completely the same across the first image samples.
  13. The apparatus according to claim 12, characterized in that the plurality of first image samples correspond to multiple frames of images formed by splitting a target video captured by the photographing device.
  14. The apparatus according to claim 13, characterized in that the target video is composed of a plurality of sub-videos, and the sub-videos are separated by a preset mark.
  15. The method according to claim 12, characterized in that the photographed object is a playback device, and the playback device is carried by a rotatable pan-tilt head.
  16. The method according to claim 9, characterized in that the first image sample includes a screen or frame of a playback device;
    when the target area is formed by extending from the face area toward the preset direction, the screen or frame of the playback device is included in the target area.
  17. 一种电子设备,其特征在于,包括:An electronic device, characterized in that it comprises:
    处理器、存储器以及存储在所述存储器上并可在所述处理器上运行的计算机程序,其特征在于,所述处理器执行所述程序时实现如权利要求1至8中一个或多个所述的方法。A processor, a memory, and a computer program that is stored on the memory and can run on the processor, wherein the processor executes the program as described in one or more of claims 1 to 8. The method described.
  18. A readable storage medium, wherein, when instructions in the storage medium are executed by a processor of an electronic device, the electronic device is enabled to perform the method according to one or more of claims 1 to 8.
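
The expansion described in claims 11 and 16 can be illustrated with a minimal sketch. It is not the applicant's implementation. It assumes that the first and second sizes extend the face area upward and downward and the third and fourth sizes extend it to the left and right, each given as a multiple of the face area's own height or width. The names `Box` and `expand_face_box`, the multiplier values, and the clamping to the image bounds are all hypothetical choices made for illustration.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned rectangle in pixel coordinates."""
    left: int
    top: int
    right: int
    bottom: int


def expand_face_box(face: Box, image_w: int, image_h: int,
                    up: float, down: float,
                    left: float, right: float) -> Box:
    """Expand a face area into a larger target area.

    `up` and `down` are multiples of the face height (assumed to play the
    role of the first and second sizes); `left` and `right` are multiples
    of the face width (assumed third and fourth sizes). The result is
    clamped to the image, which is an assumption of this sketch.
    """
    face_h = face.bottom - face.top
    face_w = face.right - face.left
    return Box(
        left=max(0, round(face.left - left * face_w)),
        top=max(0, round(face.top - up * face_h)),
        right=min(image_w, round(face.right + right * face_w)),
        bottom=min(image_h, round(face.bottom + down * face_h)),
    )


# Hypothetical example: a 100x120 face area inside a 640x480 image.
face = Box(left=100, top=100, right=200, bottom=220)
target = expand_face_box(face, 640, 480, up=0.5, down=0.5, left=1.0, right=1.0)
print(target)
```

Larger multipliers make it more likely that the screen or frame of a playback device falls inside the target area, at the cost of including more background.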
PCT/CN2020/073836 2019-01-24 2020-01-22 Image processing method and device WO2020151750A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910069268.3 2019-01-24
CN201910069268.3A CN109919010A (en) 2019-01-24 2019-01-24 Image processing method and device

Publications (1)

Publication Number Publication Date
WO2020151750A1 true WO2020151750A1 (en) 2020-07-30

Family

ID=66960672

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/073836 WO2020151750A1 (en) 2019-01-24 2020-01-22 Image processing method and device

Country Status (2)

Country Link
CN (1) CN109919010A (en)
WO (1) WO2020151750A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111915642A (en) * 2020-09-14 2020-11-10 北京百度网讯科技有限公司 Image sample generation method, device, equipment and readable storage medium
CN112308758A (en) * 2020-10-30 2021-02-02 上海禾儿盟智能科技有限公司 Near-infrared image data online augmentation device, system and method

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109919010A (en) * 2019-01-24 2019-06-21 北京三快在线科技有限公司 Image processing method and device
CN110490054B (en) * 2019-07-08 2021-03-09 北京三快在线科技有限公司 Target area detection method and device, electronic equipment and readable storage medium
CN110619600B (en) * 2019-09-17 2023-12-08 南京旷云科技有限公司 Neural network model training method and device, storage medium and electronic equipment
CN110956147B (en) * 2019-12-05 2022-09-30 京东科技控股股份有限公司 Method and device for generating training data
CN112529097B (en) * 2020-12-23 2024-03-26 北京百度网讯科技有限公司 Sample image generation method and device and electronic equipment
CN112738404B (en) * 2020-12-30 2022-11-25 维沃移动通信(杭州)有限公司 Electronic equipment control method and electronic equipment
CN113052166A (en) * 2021-02-05 2021-06-29 杭州依图医疗技术有限公司 Pathological image display method and device
CN113011468B (en) * 2021-02-25 2022-12-13 上海皓桦科技股份有限公司 Image feature extraction method and device
CN113420597A (en) * 2021-05-24 2021-09-21 北京三快在线科技有限公司 Method and device for identifying roundabout, electronic equipment and storage medium
CN113938615B (en) * 2021-08-26 2023-04-18 秒针信息技术有限公司 Method and device for acquiring human face anti-counterfeiting data set and electronic equipment

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120177293A1 (en) * 2009-09-18 2012-07-12 Kabushiki Kaisha Toshiba Feature extraction device
CN104573657A (en) * 2015-01-09 2015-04-29 安徽清新互联信息科技有限公司 Blind driving detection method based on head lowing characteristics
CN107203754A (en) * 2017-05-26 2017-09-26 北京邮电大学 A kind of license plate locating method and device based on deep learning
CN108154134A (en) * 2018-01-11 2018-06-12 天格科技(杭州)有限公司 Internet live streaming pornographic image detection method based on depth convolutional neural networks
CN109919010A (en) * 2019-01-24 2019-06-21 北京三快在线科技有限公司 Image processing method and device

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1282943C (en) * 2002-12-30 2006-11-01 佳能株式会社 Image processing method and device
CN108875730B (en) * 2017-05-16 2023-08-08 中兴通讯股份有限公司 Deep learning sample collection method, device, equipment and storage medium
CN107194376A (en) * 2017-06-21 2017-09-22 北京市威富安防科技有限公司 Mask fraud convolutional neural networks training method and human face in-vivo detection method
CN107220635A (en) * 2017-06-21 2017-09-29 北京市威富安防科技有限公司 Human face in-vivo detection method based on many fraud modes
CN108182409B (en) * 2017-12-29 2020-11-10 智慧眼科技股份有限公司 Living body detection method, living body detection device, living body detection equipment and storage medium

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111915642A (en) * 2020-09-14 2020-11-10 北京百度网讯科技有限公司 Image sample generation method, device, equipment and readable storage medium
CN111915642B (en) * 2020-09-14 2024-05-14 北京百度网讯科技有限公司 Image sample generation method, device, equipment and readable storage medium
CN112308758A (en) * 2020-10-30 2021-02-02 上海禾儿盟智能科技有限公司 Near-infrared image data online augmentation device, system and method
CN112308758B (en) * 2020-10-30 2023-11-03 上海禾儿盟智能科技有限公司 Near infrared image data on-line amplification device, system and method

Also Published As

Publication number Publication date
CN109919010A (en) 2019-06-21

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 20745603; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: PCT application non-entry in European phase (Ref document number: 20745603; Country of ref document: EP; Kind code of ref document: A1)