WO2020151750A1

WO2020151750A1 - Image processing method and device

Info

Publication number: WO2020151750A1
Application number: PCT/CN2020/073836
Authority: WO
Inventors: 李圣喜; 柴振华; 孟欢欢
Original assignee: 北京三快在线科技有限公司
Priority date: 2019-01-24
Filing date: 2020-01-22
Publication date: 2020-07-30
Also published as: CN109919010A

Abstract

The present invention provides an image processing method and device. The method comprises: obtaining a first image sample; recognizing a face region in the first image sample; a second image sample generation module determining a plurality of target regions by using the face region as a reference so as to obtain a plurality of second image samples, wherein the target regions are formed by using the face region as a reference and expanding therefrom in preset directions. The target regions are formed by using the face region as a reference and expanding therefrom so as to obtain the second image samples, such that the generated second image samples definitely comprise the face region, thereby improving the accuracy of facial recognition models.

Description

Image processing method and device

This application claims the priority of a Chinese patent application filed with the Chinese Patent Office with the application number 201910069268.3 and the invention title "Image Processing Method and Apparatus" on January 24, 2019, the entire content of which is incorporated into this application by reference.

Technical field

The embodiments of the present invention relate to the field of image processing technology, and in particular to an image processing method and device.

Background art

Face recognition has become a very important way of identity verification. A pre-trained convolutional neural network can be used to recognize faces. Convolutional neural networks require image sample sets for training.

In the prior art, to increase the number of image samples, the image samples can be augmented when the image sample set is generated. Image augmentation applies a series of random transformations to the image samples to obtain similar but different image samples, thereby expanding the image sample set. The random transformations include random cropping, random segmentation, random lighting changes, and so on.

However, the above solution may lose part of the face information during the random transformation, so that a model trained on these samples recognizes faces with poor accuracy.

Summary of the invention

The present invention provides an image processing method and device to solve the above-mentioned problems in the prior art.

According to a first aspect of the present invention, there is provided an image processing method, the method including:

obtaining a first image sample;

identifying a face area in the first image sample;

determining a plurality of target areas based on the face area to obtain a plurality of second image samples, wherein each target area is formed by expanding from the face area, taken as a reference, toward a preset direction.

Optionally, the step of determining target areas of multiple sizes based on the face area includes:

expanding upward from the face area by a plurality of first sizes; and/or,

expanding downward from the face area by a plurality of second sizes; and/or,

expanding to the left from the face area by a plurality of third sizes; and/or,

expanding to the right from the face area by a plurality of fourth sizes.

Optionally, the first size and the second size are multiples of the vertical size of the face area, and the third size and the fourth size are multiples of the horizontal size of the face area.

Optionally, the step of obtaining the first image sample includes:

receiving first image samples sent by a photographing device, wherein the ambient brightness, the distance between the photographed object and the photographing device, and/or the angle between them are not all the same across the first image samples.

Optionally, the target video is composed of multiple sub videos, and the sub videos are separated by a preset mark.

According to a second aspect of the present invention, there is provided an image processing device, the device comprising:

a first image sample acquisition module, configured to acquire the first image sample;

A face area recognition module, configured to recognize the face area in the first image sample;

a second image sample generation module, configured to determine a plurality of target areas based on the face area to obtain a plurality of second image samples, wherein each target area is formed by expanding from the face area, taken as a reference, toward a preset direction.

Optionally, the second image sample generation module includes:

a first target area expansion sub-module, configured to expand upward from the face area by a plurality of first sizes; and/or,

a second target area expansion sub-module, configured to expand downward from the face area by a plurality of second sizes; and/or,

a third target area expansion sub-module, configured to expand to the left from the face area by a plurality of third sizes; and/or,

a fourth target area expansion sub-module, configured to expand to the right from the face area by a plurality of fourth sizes.

Optionally, the first image sample acquisition module includes:

a first image sample receiving sub-module, configured to receive first image samples sent by the photographing device, wherein the ambient brightness, the distance between the photographed object and the photographing device, and/or the angle between them are not all the same across the first image samples.

According to a third aspect of the present invention, there is provided an electronic device, including:

A processor, a memory, and a computer program that is stored on the memory and can run on the processor, and the processor implements the foregoing method when the program is executed.

According to a fourth aspect of the present invention, there is provided a readable storage medium, wherein, when instructions in the storage medium are executed by a processor of an electronic device, the electronic device is enabled to execute the aforementioned method.

The embodiments of the present invention provide an image processing method and device. The image processing method includes: acquiring a first image sample; identifying a face area in the first image sample; and determining a plurality of target areas based on the face area to obtain a plurality of second image samples, where each target area is formed by expanding from the face area, taken as a reference, toward a preset direction. Because each target area is obtained by expanding from the face area, every generated second image sample necessarily includes the face area, which helps improve the accuracy of the model in recognizing faces.

The above description is only an overview of the technical solutions of this application. So that the technical means of this application can be understood more clearly and implemented in accordance with the content of the specification, and so that the above and other objectives, features, and advantages of this application become more obvious and understandable, specific implementations of this application are set out below.

Description of the drawings

In order to describe the technical solutions in the embodiments of the present application or the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show some embodiments of the application. For those of ordinary skill in the art, other drawings can be obtained based on these drawings without creative work.

FIG. 1 is a flowchart of specific steps of an image processing method according to Embodiment 1 of the present invention;

FIG. 2 is a flowchart of specific steps of an image processing method according to Embodiment 2 of the present invention;

FIGS. 3A, 3B, 3C, 3D, 3E, 3F, 3G, and 3H are schematic diagrams of the target area in Embodiment 2 of the present invention;

FIG. 4 is a structural diagram of an image processing device according to Embodiment 3 of the present invention;

FIG. 5 is a structural diagram of an image processing device according to Embodiment 4 of the present invention;

FIG. 6 schematically shows a block diagram of a computing processing device for executing the method according to the present application; and

FIG. 7 schematically shows a storage unit for holding or carrying program codes for implementing the method according to the present application.

Detailed description of the embodiments

The technical solutions in the embodiments of the present invention will be clearly and completely described below in conjunction with the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are part of the embodiments of the present invention, not all of them. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative work shall fall within the protection scope of the present invention.

Embodiment 1

Referring to FIG. 1, a flowchart of specific steps of an image processing method according to Embodiment 1 of the present invention is shown.

Step 101: Obtain a first image sample.

The embodiments of the present invention can generate similar but different images from an original image, and can be used to augment image samples in deep learning. In face recognition authentication, the recognition device obtains a face image and verifies the facial features recognized from it. If the verification passes, the user is allowed to perform operations; if the verification fails, the user is not allowed to perform operations. Application scenarios of face authentication include driver verification for online car-hailing and access control verification. These scenarios carry a risk: an illegitimate user can obtain a recorded video or picture of a legitimate user in advance, play it on a playback device, and let the camera of the authentication device shoot the screen of the playback device to perform identity verification. The face recognition then fails to serve its verification purpose, which creates a security risk.

In the above illegal authentication process, the frame part of the playback device usually exists in the captured picture, the screen of the playback device also has reflections, and there may also be a certain angle between the playback device and the authentication device. Based on the above features, the image samples generated in the embodiment of the present invention can enable the trained model (for example, convolutional neural network) to recognize features such as borders, reflections, and angles, thereby identifying the above illegal authentication.

Wherein, the first image sample is an original image sample, which may be a picture including various information, for example, including: the screen, frame, or other information around the playback device.

In the embodiment of the present invention, the first image sample may be obtained by shooting, and the shooting object may be a playback device that plays videos or pictures.

Step 102: Identify the face area in the first image sample.

Specifically, a face recognition technology may be used to identify facial feature points (for example, facial features, contours, etc.) from the first image sample, so as to determine the face area.

In the prior art, face recognition technology is already very mature. Two-dimensional face recognition algorithms include:

1. Template matching method: establish a three-dimensional adjustable model frame according to the regularities of facial features. After the face area is located, the model frame is used to locate and adjust the feature parts of the face, so as to handle changes in observation angle, occlusion, and facial expression during face recognition.

2. Singular value feature method: The singular value feature of the face image matrix reflects the essential attributes of the image and can be used for face recognition.

3. Subspace analysis method: Because of its strong descriptiveness, low computational cost, easy implementation and good separability, it is widely used in facial feature extraction and has become one of the mainstream methods of face recognition.

4. Locality Preserving Projections (LPP): a newer subspace analysis method that is a linear approximation of a nonlinear method. It overcomes the difficulty that principal component analysis has in preserving the nonlinear manifold of the original data, and also overcomes the difficulty that nonlinear methods have in obtaining low-dimensional projections of new sample points.

5. Principal component analysis method: reduce the complexity of data processing through dimensionality reduction and increase the calculation speed.

Three-dimensional face recognition algorithms include:

1. Image feature-based method: first match the overall size, contour, and three-dimensional orientation of the face; then, while keeping the pose fixed, perform local matching of different feature points of the face (these feature points are identified manually).

2. Method based on variable model parameters: combine the three-dimensional deformation of a general face model with distance-map-based iterative matrix minimization to recover the head pose and the three-dimensional face. The pose parameters are updated continuously as the deformation of the model changes, and this process is repeated until the minimized measure meets the requirement. The biggest difference between the two methods is that the image feature-based method needs to search for the coordinates of the feature points again every time the face pose changes, whereas the method based on variable model parameters only needs to adjust the parameters of the three-dimensional deformable model.

It can be understood that the embodiment of the present invention does not impose restrictions on the face recognition algorithm used.

Step 103: Determine multiple target regions based on the face region to obtain multiple second image samples, wherein the target region is formed by expanding toward a preset direction based on the face region.

Specifically, taking the face area as the reference, the area is expanded by different sizes in one of the up, down, left, and right directions, or in several of these directions at the same time, to obtain multiple target areas that each include the face area.
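The expansion described in step 103 can be sketched in code. The following is a minimal illustration only, not the applicant's implementation: the `Box` type, the `expand` and `target_regions` helpers, the top-left pixel origin, and the clamping of regions to the image bounds are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle in pixels (assumed: origin top-left, right/bottom exclusive)."""
    left: int
    top: int
    right: int
    bottom: int


def expand(face: Box, img_w: int, img_h: int,
           up: int = 0, down: int = 0, left: int = 0, right: int = 0) -> Box:
    """Grow the face box by the given pixel amounts, clamped to the image (assumption)."""
    return Box(
        left=max(face.left - left, 0),
        top=max(face.top - up, 0),
        right=min(face.right + right, img_w),
        bottom=min(face.bottom + down, img_h),
    )


def target_regions(face: Box, img_w: int, img_h: int, sizes: List[int]) -> List[Box]:
    """One target region per (size, direction) pair; each region still contains the face."""
    regions = []
    for size in sizes:
        for direction in ("up", "down", "left", "right"):
            regions.append(expand(face, img_w, img_h, **{direction: size}))
    return regions


# Hypothetical face box inside a 100x100 image, expanded by 10 and 20 pixels.
face = Box(left=40, top=30, right=60, bottom=50)
regions = target_regions(face, img_w=100, img_h=100, sizes=[10, 20])
```

Because a region only ever grows outward from the face box, every region produced this way contains the face area, which is the property the method relies on.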

It can be understood that the shape of the face area and the target area may be rectangular, diamond-shaped, circular, trapezoidal, etc., selected according to the scene; a rectangle is preferred. The embodiments of the present invention do not limit the shape.

The embodiments of the present invention can determine target areas that contain the face area and have different sizes, so that the expanded samples include not only the face area but also as much information from other areas as possible. This helps the model learn more information and improves the accuracy of face recognition by the model. In this way, more image samples can be generated from existing images, improving the accuracy of the model in subsequent training.

One of the fraudulent methods in face recognition is to play images or streaming media files on a playback device and then shoot the screen with the camera of the authentication device to perform face verification. Since existing face authentication algorithms first extract the face area from the obtained authentication image and authenticate on it, the authentication is not affected even when the camera captures the frame of the playback device. After expansion in the foregoing manner, at least some of the image samples can include the frame of the playback device, which improves the accuracy of the trained model.

It can be understood that the image contained in the target area forms a second image sample.

In the embodiment of the present invention, since the target area includes the face area and extends to the surroundings, the second image sample includes the face feature and information of other areas.

In one embodiment, when generating image samples for training, a copy of the first image sample is made first; then the face area and the target areas are cropped from the first image sample; finally, each cropped image is saved as a second image sample. The image formed by the face area, the multiple second image samples, and the first image sample can then all be used as image samples for training the model. In one embodiment, since multiple target areas can be determined based on the face area, the target areas can be cropped from the first image sample to obtain multiple second image samples, and the face area is located at different positions within the multiple second image samples.

In summary, an embodiment of the present invention provides an image processing method. The method includes: acquiring a first image sample; identifying a face area in the first image sample; and determining a plurality of target areas based on the face area to obtain a plurality of second image samples, wherein each target area is formed by expanding from the face area, taken as a reference, toward a preset direction. Because each target area is obtained by expanding from the face area, every generated second image sample necessarily includes the face area, which helps improve the accuracy of the model in recognizing faces.
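The copy-then-crop procedure can be illustrated as follows. This is a hedged sketch under stated assumptions: the image is modeled as a nested list of pixel values, boxes are `(left, top, right, bottom)` tuples with exclusive right and bottom edges, and the helper names `crop`, `make_training_samples`, and `face_offset` are hypothetical.

```python
import copy
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # assumed layout: (left, top, right, bottom), right/bottom exclusive
Image = List[List[int]]          # rows of pixel values (stand-in for a real image type)


def crop(image: Image, box: Box) -> Image:
    """Return the sub-image covered by the box."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


def make_training_samples(first_image: Image, face_box: Box,
                          target_boxes: List[Box]) -> Tuple[Image, List[Image]]:
    """Copy the first image sample, then crop the face area and every target area from the copy."""
    working = copy.deepcopy(first_image)  # the original sample stays untouched
    face_crop = crop(working, face_box)
    second_samples = [crop(working, box) for box in target_boxes]
    return face_crop, second_samples


def face_offset(face_box: Box, target_box: Box) -> Tuple[int, int]:
    """Position (x, y) of the face area inside a cropped second image sample."""
    return face_box[0] - target_box[0], face_box[1] - target_box[1]


# 6x6 toy image whose pixel value encodes its position as 10 * row + column.
first_image = [[10 * r + c for c in range(6)] for r in range(6)]
face_box = (2, 2, 4, 4)
target_boxes = [(2, 0, 4, 4), (0, 2, 4, 4)]  # expanded upward / expanded to the left
face_crop, second_samples = make_training_samples(first_image, face_box, target_boxes)
```

In this toy case `face_offset` gives `(0, 2)` for the upward expansion and `(2, 0)` for the leftward expansion, showing how the face area ends up at different positions within the second image samples.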

Embodiment 2

Referring to FIG. 2, a flowchart of specific steps of an image processing method according to Embodiment 2 of the present invention is shown.

Step 201: Acquire first image samples through a photographing device, where the ambient brightness, the distance between the photographed object and the photographing device, and/or the angle between them are not all the same across the first image samples.

Specifically, image samples used as training samples can be assembled into a streaming media file that is played in a loop on a playback device. The screen of the playback device is photographed by the shooting device to obtain multiple first image samples from the streaming media file. While this happens, the brightness of the environment where the playback device is located, and its distance and/or angle relative to the shooting device, are changed continuously, so that more training samples can be obtained from these image samples.

Among them, the shooting device can be any device with a camera, for example, a mobile phone, a tablet computer, a camera, and the like.

The playback device may be any playback device whose screen angle can be adjusted automatically. In one embodiment, the playback device may be mounted on a pan-tilt head, which can rotate continuously to adjust the angle between the screen of the playback device and the shooting device. The pan-tilt head can rotate in a two-dimensional plane to generate multiple different image samples from one image; it can also rotate in three-dimensional space so that the screen rotates continuously during streaming media file playback. In addition, to change the distance, the shooting device or the playback device can be placed on a movable device, so that the distance between the two is adjusted during playback. Finally, the playback device and the shooting device can be placed in a lighting environment whose brightness changes continuously, so that continuous changes in brightness are realized.

The target video may be an image sequence including a human face area. It can be understood that after the image sequence is captured by the shooting device, the image sequence is split into each frame of images to obtain multiple first image samples. That is, the multiple first image samples correspond to multiple frames of images formed by splitting the target video shot by the shooting device.

In the embodiments of the present invention, an image sample set composed of multiple first image samples with different angles, different brightness levels, and different distances can be obtained through device shooting, which diversifies and automates the augmentation of training samples. Compared with manual shooting, this helps reduce the workload, and the diversified samples can improve the accuracy of training.

Optionally, in another embodiment of the present invention, the target video is composed of multiple sub-videos, and the sub-videos are separated by a preset mark.

Wherein, the mark can be an image frame with a designated mark, or a mark frame, or other forms of marks.

In the embodiment of the present invention, multiple small videos can be spliced into a video with a larger length, so that the first image samples can be generated in batches, which helps to further reduce the workload.
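As an illustration of how a spliced target video might be split back into its sub-videos at the preset marks, consider the sketch below. It is an assumption-laden example: frames are represented by arbitrary Python objects, `split_by_marker` is a hypothetical helper, and the marker test is passed in as a predicate because the source leaves the form of the mark open.

```python
from typing import Callable, List, Sequence, TypeVar

Frame = TypeVar("Frame")


def split_by_marker(frames: Sequence[Frame],
                    is_marker: Callable[[Frame], bool]) -> List[List[Frame]]:
    """Split a frame sequence into sub-videos, dropping the marker frames themselves."""
    sub_videos: List[List[Frame]] = []
    current: List[Frame] = []
    for frame in frames:
        if is_marker(frame):
            if current:  # ignore empty runs caused by consecutive markers
                sub_videos.append(current)
            current = []
        else:
            current.append(frame)
    if current:
        sub_videos.append(current)
    return sub_videos


# Toy frame sequence; "MARK" stands in for a designated marker frame.
frames = ["a1", "a2", "MARK", "b1", "MARK", "MARK", "c1", "c2"]
sub_videos = split_by_marker(frames, lambda f: f == "MARK")
```

Each resulting sub-video can then be split into frames to yield first image samples in batches.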

Step 202: Identify the face area in the first image sample.

For this step, refer to the detailed description of step 102, which will not be repeated here.

Step 203: Taking the face area as a reference, expand upward by multiple first sizes to form multiple target areas, and obtain multiple second image samples.

Wherein, the size can be expressed in terms of the number of pixels, and the first size is the number of pixels expanded upward.

The embodiments of the present invention may extend upward from the face area by the first size to obtain the target area, so that the target area includes the face area and an area of the first size above it. As shown in FIG. 3(A), the gray area F1 is the entire image area of the first image sample, and the shaded area F2 is the face area in the first image sample. Taking the face area F2 as a reference, the target area obtained by upward expansion is the area F3 (including the area F2) enclosed by the dotted line in FIG. 3(B).

It can be understood that the first size may take a plurality of different values, so as to obtain target areas of different sizes, such as F3.

In the embodiments of the present invention, the target area can be obtained by upward expansion, so that a target area that includes the screen, the device or screen frame, and the angle information above the face area is used as a second image sample. During model training, the features of the screen, the device or screen frame, and the angle are learned, so that the model can identify screen reflections, the device or screen frame, and the screen angle above the face area. The model can thereby distinguish scenes in which pre-recorded videos or pictures are used for identity verification, which improves the accuracy of face recognition.

Step 204: Taking the face area as a reference, expand downward by multiple second sizes to form multiple target areas, and obtain multiple second image samples.

Among them, the second size is the number of pixels expanded downward.

The embodiment of the present invention can extend the second size below the face area to obtain the target area, so that the target area includes the face area and the second size area below. As shown in FIG. 3(A), taking the face area F2 as a reference, the target area obtained by downward expansion is the area F4 (including the area F2) enclosed by the dotted line in FIG. 3(C).

It can be understood that the second size may take a plurality of different values, so as to obtain target areas of different sizes, such as F4.

In the embodiments of the present invention, the target area can be obtained by downward expansion, so that a target area that includes the screen, the device or screen frame, and the angle information below the face area is used as a second image sample. During model training, the features of the screen, the device or screen frame, and the angle are learned, so that the model can identify screen reflections, the device or screen frame, and the screen angle below the face area. The model can thereby distinguish scenes in which pre-recorded videos or pictures are used for identity verification, which improves the accuracy of face recognition.

Step 205: Taking the face area as a reference, expand to the left by multiple third sizes to form multiple target areas, and obtain multiple second image samples.

Among them, the third size is the number of pixels extended to the left.

The embodiment of the present invention can extend the third size to the left of the face area to obtain the target area, so that the target area includes the face area and the third size area on the left. As shown in Fig. 3(A), taking the face area F2 as a reference, the target area obtained by expanding to the left is the area F5 (including the area F2) enclosed by the dotted line in Fig. 3(D).

It can be understood that the third size may take a plurality of different values, so as to obtain target areas of different sizes, such as F5.

In the embodiments of the present invention, the target area can be obtained by expanding to the left, so that a target area that includes the screen, the device or screen frame, and the angle information to the left of the face area is used as a second image sample. During model training, the features of the screen, the device or screen frame, and the angle are learned, so that the model can identify screen reflections, the device or screen frame, and the screen angle to the left of the face area. The model can thereby distinguish scenes in which pre-recorded videos or pictures are used for identity verification, which improves the accuracy of face recognition.

Step 206: Taking the face area as a reference, expand to the right by multiple fourth sizes to form multiple target areas, and obtain multiple second image samples.

Among them, the fourth size is the number of pixels extended to the right.

The embodiment of the present invention may extend the fourth size to the right of the face area to obtain the target area, so that the target area includes the face area and the fourth size area on the right. As shown in FIG. 3(A), taking the face area F2 as a reference, the target area expanded to the right is the area F6 (including the area F2) enclosed by the dotted line in FIG. 3(E).

It can be understood that the fourth size may take a plurality of different values, so as to obtain target areas of different sizes, such as F6.

In one embodiment, a single direction can be selected for expansion, or directions can be combined. Combining the up and down directions gives the target area F7 enclosed by the dashed line in FIG. 3(F); combining the left and right directions gives the target area F8 enclosed by the dashed line in FIG. 3(G); and combining all four directions gives the target area F9 enclosed by the dashed line in FIG. 3(H). The up, down, left, and right directions can also be combined arbitrarily, for example: expanding upward and to the left simultaneously; expanding upward, to the left, and to the right simultaneously; or expanding upward, downward, and to the right simultaneously. The first size, the second size, the third size, and the fourth size may be the same or different. In particular, for simplicity and symmetry, the same first size and second size may be used when combining up and down, and the same third size and fourth size may be used when combining left and right.
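The arbitrary combination of directions can be enumerated mechanically. The sketch below is illustrative only: the tuple layout `(left, top, right, bottom)`, the `sizes` dictionary, and both helper functions are assumptions, and clamping to the image boundary is deliberately left out to keep the example short.

```python
from itertools import combinations
from typing import Dict, List, Tuple

Box = Tuple[int, int, int, int]  # assumed layout: (left, top, right, bottom)
DIRECTIONS = ("up", "down", "left", "right")


def direction_combinations() -> List[Tuple[str, ...]]:
    """All non-empty subsets of the four directions (15 in total)."""
    return [combo
            for r in range(1, len(DIRECTIONS) + 1)
            for combo in combinations(DIRECTIONS, r)]


def expand_combined(face: Box, sizes: Dict[str, int], dirs: Tuple[str, ...]) -> Box:
    """Expand the face box in every direction listed in dirs (no boundary clamping here)."""
    left, top, right, bottom = face
    if "up" in dirs:
        top -= sizes["up"]        # first size
    if "down" in dirs:
        bottom += sizes["down"]   # second size
    if "left" in dirs:
        left -= sizes["left"]     # third size
    if "right" in dirs:
        right += sizes["right"]   # fourth size
    return (left, top, right, bottom)


# Symmetric sizes: first == second and third == fourth, as suggested for simplicity.
face = (40, 30, 60, 50)
sizes = {"up": 10, "down": 10, "left": 5, "right": 5}
targets = {dirs: expand_combined(face, sizes, dirs) for dirs in direction_combinations()}
```

Here `targets[("up", "down")]` plays the role of F7, `targets[("left", "right")]` of F8, and the four-direction entry of F9.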

In the embodiments of the present invention, the target area can be obtained by expanding to the right, so that a target area that includes the screen, the device or screen frame, and the angle information to the right of the face area is used as a second image sample. During model training, the features of the screen, the device or screen frame, and the angle are learned, so that the model can identify screen reflections, the device or screen frame, and the screen angle to the right of the face area. The model can thereby distinguish scenes in which pre-recorded videos or pictures are used for identity verification, which improves the accuracy of face recognition.

Optionally, in another embodiment of the present invention, the first size and the second size are multiples of the vertical size of the face area, and the third size and the fourth size are multiples of the left-right size of the face area.

Among them, the vertical size of the face area can be understood as the height of the face area, and the left-right size of the face area can be understood as the width of the face area.

Specifically, if the first size represents the distance between the upper boundary of the target area and the upper boundary of the face area, the first size can be 1, 2, 3, etc. times the height of the face area. If the second size represents the distance between the lower boundary of the target area and the lower boundary of the face area, the second size can be 1, 2, 3, etc. times the height of the face area. If the third size represents the distance between the left boundary of the target area and the left boundary of the face area, the third size can be 1, 2, 3, etc. times the width of the face area. If the fourth size represents the distance between the right boundary of the target area and the right boundary of the face area, the fourth size can be 1, 2, 3, etc. times the width of the face area.

In one embodiment, the maximum multiple may be determined by the image size, and the multiple is continuously expanded until the boundary of the first image sample is reached. For example, the first size may determine the maximum multiple based on the upper boundary, the second size may determine the maximum multiple based on the lower boundary, the third size may determine the maximum multiple based on the left boundary, and the fourth size may determine the maximum multiple based on the right boundary.
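One way to read the multiple-based expansion is shown below. This sketch makes assumptions that the source does not state: the maximum multiple is taken as the largest whole multiple that still fits inside the image (floor division), boxes are `(left, top, right, bottom)` tuples with a top-left origin, and `max_multiples` and `upward_regions` are hypothetical names.

```python
from typing import Dict, List, Tuple

Box = Tuple[int, int, int, int]  # assumed layout: (left, top, right, bottom), right/bottom exclusive


def max_multiples(face: Box, img_w: int, img_h: int) -> Dict[str, int]:
    """Largest whole multiple of the face height/width that fits before each image boundary."""
    left, top, right, bottom = face
    width, height = right - left, bottom - top
    return {
        "up": top // height,                # limited by the upper boundary
        "down": (img_h - bottom) // height, # limited by the lower boundary
        "left": left // width,              # limited by the left boundary
        "right": (img_w - right) // width,  # limited by the right boundary
    }


def upward_regions(face: Box, img_w: int, img_h: int) -> List[Box]:
    """Target areas expanded upward by 1x, 2x, ... the face height, up to the maximum multiple."""
    left, top, right, bottom = face
    height = bottom - top
    k_max = max_multiples(face, img_w, img_h)["up"]
    return [(left, top - k * height, right, bottom) for k in range(1, k_max + 1)]


# A 20x20 face area inside a 100x130 image.
face = (40, 40, 60, 60)
limits = max_multiples(face, img_w=100, img_h=130)
up_regions = upward_regions(face, img_w=100, img_h=130)
```

With these toy numbers the face can be expanded upward by at most 2 face heights and downward by at most 3, so the upward expansion yields two target areas.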

The embodiment of the present invention can expand the target area according to the multiple of the size of the face area, and can simply and effectively determine the target area including the screen, the screen frame, and the device frame.

Embodiment 3

Referring to FIG. 4, it shows a structural diagram of an image processing apparatus provided in Embodiment 3 of the present invention, which is specifically as follows.

The first image sample acquisition module 301 is used to acquire the first image sample.

The face area recognition module 302 is configured to recognize the face area in the first image sample.

The second image sample generation module 303 is configured to determine multiple target areas based on the face area to obtain multiple second image samples, wherein each target area is formed by expanding from the face area, taken as a reference, toward a preset direction.

In summary, an embodiment of the present invention provides an image processing device. The device includes: a first image sample acquisition module for acquiring a first image sample; a face area recognition module for identifying the face area in the first image sample; and a second image sample generation module for determining multiple target areas based on the face area to obtain multiple second image samples, wherein each target area is formed by expanding from the face area, taken as a reference, toward a preset direction. Because each target area is obtained by expanding from the face area, every generated second image sample necessarily includes the face area, which helps improve the accuracy of the model in recognizing faces.
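The three-module structure can be pictured as a small pipeline. The class below is a rough sketch rather than the device itself: the class and parameter names are hypothetical, the acquisition and face-detection modules are injected as plain callables, images are nested lists, and only upward expansion is shown for brevity.

```python
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]  # assumed layout: (left, top, right, bottom), right/bottom exclusive
Image = List[List[int]]


class ImageProcessingDevice:
    """Toy composition of the acquisition, recognition, and generation modules."""

    def __init__(self, acquire: Callable[[], Image],
                 detect_face: Callable[[Image], Box],
                 first_sizes: List[int]) -> None:
        self.acquire = acquire          # stands in for module 301
        self.detect_face = detect_face  # stands in for module 302
        self.first_sizes = first_sizes  # upward expansion sizes in pixels

    def generate(self, image: Image, face: Box) -> List[Image]:
        """Stands in for module 303: crop one second image sample per upward size."""
        left, top, right, bottom = face
        samples = []
        for size in self.first_sizes:
            new_top = max(top - size, 0)  # clamp at the upper image boundary (assumption)
            samples.append([row[left:right] for row in image[new_top:bottom]])
        return samples

    def run(self) -> List[Image]:
        image = self.acquire()
        face = self.detect_face(image)
        return self.generate(image, face)


# Stub modules: a fixed 5x5 image and a fixed face box instead of a real camera and detector.
device = ImageProcessingDevice(
    acquire=lambda: [[10 * r + c for c in range(5)] for r in range(5)],
    detect_face=lambda image: (1, 2, 3, 4),
    first_sizes=[1, 5],
)
samples = device.run()
```

Injecting the detector as a callable mirrors the statement that the face recognition algorithm is not restricted.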

The third embodiment is an apparatus embodiment corresponding to method embodiment 1. For detailed information, please refer to the detailed description of embodiment 1, which will not be repeated here.

Embodiment 4

Referring to FIG. 5, it shows a structural diagram of an image processing apparatus provided by Embodiment 4 of the present invention, which is specifically as follows.

The first image sample acquisition module 401 is configured to acquire a first image sample; optionally, in another embodiment of the present invention, the first image sample acquisition module 401 includes:

The first image sample receiving sub-module 4011 is configured to receive first image samples sent by a photographing device, wherein the ambient brightness, and the distance and/or angle between the photographed object and the photographing device, are not all the same across the first image samples.

The face region recognition module 402 is used to identify the face region in the first image sample.

The second image sample generation module 403 is configured to determine multiple target areas based on the face area to obtain multiple second image samples, wherein each target area is formed by expanding toward a preset direction with the face area as a reference. The second image sample generation module 403 includes:

The first target area expansion submodule 4031 is configured to expand a plurality of first sizes upward based on the face area; and/or,

The second target area expansion submodule 4032 is configured to expand a plurality of second sizes downward based on the face area; and/or,

The third target area expansion submodule 4033 is configured to expand a plurality of third sizes to the left based on the face area; and/or,

The fourth target area expansion sub-module 4034 is configured to expand a plurality of fourth sizes to the right based on the face area.
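The four expansion sub-modules above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the embodiment's implementation: the function name `expand_face_area` and its parameters are hypothetical, a box is assumed to be a `(left, top, right, bottom)` tuple with exclusive right and bottom edges, and each size is expressed as a multiple of the face height (up, down) or face width (left, right), as in the claims. Combining every listed multiple across the four directions and clamping the result to the image bounds are additional assumptions.

```python
import itertools
from typing import List, Sequence, Tuple

# Assumption: (left, top, right, bottom), with right and bottom exclusive.
Box = Tuple[int, int, int, int]


def expand_face_area(
    face: Box,
    image_width: int,
    image_height: int,
    up: Sequence[float] = (0.0,),
    down: Sequence[float] = (0.0,),
    left: Sequence[float] = (0.0,),
    right: Sequence[float] = (0.0,),
) -> List[Box]:
    """Return one target area per combination of expansion multiples.

    `up` and `down` are multiples of the face height; `left` and `right`
    are multiples of the face width. A multiple of 0.0 means no expansion.
    """
    l, t, r, b = face
    w, h = r - l, b - t
    targets: List[Box] = []
    for mu, md, ml, mr in itertools.product(up, down, left, right):
        target = (
            max(0, l - int(ml * w)),             # expand to the left
            max(0, t - int(mu * h)),             # expand upward
            min(image_width, r + int(mr * w)),   # expand to the right
            min(image_height, b + int(md * h)),  # expand downward
        )
        if target not in targets:  # skip duplicates produced by clamping
            targets.append(target)
    return targets
```

For example, `expand_face_area((40, 40, 60, 60), 100, 100, up=(0.5, 1.0))` yields two target areas that extend upward by half and by the full face height. Because every edge only moves outward from the face area, each target area contains the face area.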

The fourth embodiment is a device embodiment corresponding to the second method embodiment. For detailed information, please refer to the detailed description of the second embodiment, which will not be repeated here.

An embodiment of the present invention also provides an electronic device, including: a processor, a memory, and a computer program that is stored on the memory and can run on the processor. When the processor executes the program, the aforementioned method is implemented.

The embodiment of the present invention also provides a readable storage medium. When the instructions in the storage medium are executed by the processor of the electronic device, the electronic device can execute the aforementioned method.

As for the device embodiment, since it is basically similar to the method embodiment, the description is relatively simple, and for related parts, please refer to the part of the description of the method embodiment.

The algorithms and displays provided here are not inherently related to any particular computer, virtual system or other equipment. Various general-purpose systems can also be used with the teachings herein. From the above description, the structure required to construct this type of system is obvious. In addition, the present invention is not directed to any specific programming language. It should be understood that various programming languages can be used to implement the content of the present invention described herein, and the above description of a specific language is intended to disclose the best embodiment of the present invention.

In the instructions provided here, a lot of specific details are explained. However, it can be understood that the embodiments of the present invention can be practiced without these specific details. In some instances, well-known methods, structures and technologies are not shown in detail so as not to obscure the understanding of this specification.

Similarly, it should be understood that, in order to streamline the present disclosure and aid understanding of one or more of the various inventive aspects, the features of the present invention are sometimes grouped together into a single embodiment, figure, or description thereof in the above description of exemplary embodiments. However, this method of disclosure should not be construed as reflecting an intention that the claimed invention requires more features than those explicitly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all the features of a single previously disclosed embodiment. Therefore, the claims following the detailed description are hereby expressly incorporated into the detailed description, with each claim standing on its own as a separate embodiment of the present invention.

Those skilled in the art can understand that the modules in the device of an embodiment can be adaptively changed and arranged in one or more devices different from that embodiment. The modules or units or components in the embodiments can be combined into one module or unit or component, and they can also be divided into multiple sub-modules or sub-units or sub-components. Except where at least some of such features and/or processes or units are mutually exclusive, all features disclosed in this specification (including the accompanying claims, abstract and drawings) and all processes or units of any method or device so disclosed may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract and drawings) may be replaced by an alternative feature serving the same, an equivalent or a similar purpose.

The various component embodiments of the present invention may be implemented by hardware, or by software modules running on one or more processors, or by a combination of them. Those skilled in the art should understand that a microprocessor or a digital signal processor (DSP) may be used in practice to implement some or all of the functions of some or all components in the image processing device according to the embodiments of the present invention. The present invention can also be implemented as a device or device program for executing part or all of the methods described herein. Such a program for realizing the present invention may be stored on a computer-readable medium, or may have the form of one or more signals. Such signals can be downloaded from Internet websites, or provided on carrier signals, or provided in any other form.

For example, FIG. 6 shows a computing processing device that can implement the method according to the present application. The computing processing device conventionally includes a processor 1010 and a computer program product in the form of a memory 1020 or a computer-readable medium. The memory 1020 may be an electronic memory such as flash memory, EEPROM (Electrically Erasable Programmable Read Only Memory), EPROM, hard disk, or ROM. The memory 1020 has a storage space 1030 for program codes 1031 for executing any of the method steps in the above methods. For example, the storage space 1030 for program codes may include various program codes 1031 for implementing the various steps in the above method. These program codes can be read from or written into one or more computer program products. These computer program products include program code carriers such as hard disks, compact disks (CDs), memory cards or floppy disks. Such a computer program product is usually a portable or fixed storage unit as described with reference to FIG. 7. The storage unit may have storage segments, storage spaces, etc. arranged similarly to the memory 1020 in the computing processing device of FIG. 6. The program code can, for example, be compressed in a suitable form. Generally, the storage unit includes computer-readable codes 1031', that is, codes that can be read by a processor such as 1010. These codes, when run by a computing processing device, cause the computing processing device to execute the various steps of the method described above.

It should be noted that the above-mentioned embodiments illustrate rather than limit the present invention, and those skilled in the art can design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses should not be construed as limiting the claims. The word "comprising" does not exclude the presence of elements or steps not listed in the claims. The word "a" or "an" preceding an element does not exclude the presence of multiple such elements. The invention can be implemented by means of hardware comprising several different elements and by means of a suitably programmed computer. In a unit claim enumerating several devices, several of these devices may be embodied by one and the same item of hardware. The use of the words first, second, and third does not indicate any order. These words can be interpreted as names.

Those skilled in the art can clearly understand that, for the convenience and conciseness of description, the specific working process of the above-described system, device, and unit can refer to the corresponding process in the foregoing method embodiment, which is not repeated here.

The above are only the preferred embodiments of the present invention and are not intended to limit the present invention. Any modification, equivalent replacement and improvement made within the spirit and principle of the present invention shall be included within the protection scope of the present invention.

The above are only specific embodiments of the present invention, but the protection scope of the present invention is not limited thereto. Any changes or substitutions that a person skilled in the art can easily conceive of within the technical scope disclosed by the present invention shall be covered within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims

An image processing method, characterized in that the method includes:

Obtaining a first image sample;

Identifying the face area in the first image sample;

A plurality of target regions are determined based on the face region to obtain a plurality of second image samples, wherein the target region is formed by expanding toward a preset direction based on the face region.
The method according to claim 1, wherein the step of expanding toward a predetermined direction based on the face area comprises:

Expanding a plurality of first sizes upward based on the face area; and/or,

Expanding a plurality of second sizes downward based on the face area; and/or,

Expanding a plurality of third sizes to the left based on the face area; and/or,

Expanding a plurality of fourth sizes to the right based on the face area.
The method according to claim 1, wherein the first size and the second size are multiples of the vertical size of the face area, and the third size and the fourth size are multiples of the horizontal size of the face area.
The method according to claim 1, wherein the step of obtaining a first image sample comprises:

Acquire multiple first image samples through the shooting device,

Wherein, the environmental brightness, the distance and/or the angle between the shooting object and the shooting device of the plurality of first image samples are not completely the same.
The method according to claim 4, wherein the plurality of first image samples correspond to multiple frames of images formed by splitting the target video shot by the shooting device.
The method according to claim 5, wherein the target video is composed of multiple sub-videos, and the sub-videos are separated by a preset mark.
The method according to claim 4, wherein the shooting object is a playback device, and the playback device is carried by a pan-tilt that can rotate.
The method according to claim 1, wherein the first image sample comprises: a screen or frame of a playback device;

When expanding toward a preset direction based on the face area to form a target area, the screen or frame of the playback device is included in the target area.
An image processing device, characterized in that the device includes:

The first image sample obtaining module is used to obtain the first image sample;

A face area recognition module, configured to recognize the face area in the first image sample;

The second image sample generation module is configured to determine multiple target areas based on the face area to obtain multiple second image samples, wherein each target area is formed by expanding toward a preset direction based on the face area.
The device according to claim 9, wherein the second image sample generating module comprises:

The first target area expansion sub-module is configured to expand a plurality of first sizes upward based on the face area; and/or,

The second target area expansion sub-module is used to expand a plurality of second sizes downward based on the face area; and/or,

The third target area expansion submodule is used to expand a plurality of third sizes to the left based on the face area; and/or,

The fourth target area expansion sub-module is used to expand a plurality of fourth sizes to the right based on the face area.
The device according to claim 9, wherein the first size and the second size are multiples of the vertical size of the face area, and the third size and the fourth size are multiples of the horizontal size of the face area.
The device according to claim 9, wherein the first image sample acquisition module comprises:

The first image sample receiving sub-module is configured to receive first image samples sent by the photographing device, wherein the ambient brightness, and the distance and/or angle between the photographed object and the photographing device, are not all the same across the first image samples.
The apparatus according to claim 12, wherein the plurality of first image samples correspond to multiple frames of images formed by splitting the target video shot by the shooting device.
The device according to claim 13, wherein the target video is composed of a plurality of sub-videos, and the sub-videos are separated by a preset mark.
The device according to claim 12, wherein the shooting object is a playback device, and the playback device is carried by a pan-tilt that can rotate.
The device according to claim 9, wherein the first image sample comprises: a screen or frame of a playback device;

When expanding toward a preset direction based on the face area to form a target area, the screen or frame of the playback device is included in the target area.
An electronic device, characterized in that it comprises:

A processor, a memory, and a computer program that is stored on the memory and can run on the processor, wherein the processor, when executing the program, implements the method according to one or more of claims 1 to 8.
A readable storage medium, characterized in that, when the instructions in the storage medium are executed by the processor of the electronic device, the electronic device can execute the method according to one or more of the method claims 1 to 8.