WO2018137623A1

WO2018137623A1 - Image processing method and apparatus, and electronic device

Info

Publication number: WO2018137623A1
Application number: PCT/CN2018/073882
Authority: WO
Inventors: 刘建博; 严琼; 鲍旭; 王子彬
Original assignee: 深圳市商汤科技有限公司
Priority date: 2017-01-24
Filing date: 2018-01-23
Publication date: 2018-08-02
Also published as: CN108230252B; CN108230252A

Abstract

Provided are an image processing method and apparatus, and an electronic device. The image processing method comprises: determining target object information from an image to be processed; determining a foreground area and a background area in the image according to the target object information and a pre-set object profile template; and performing blurring processing on the foreground area and/or the background area. The technical solution provided in the embodiments of the present application improves the convenience and accuracy of an image blurring processing.

Description

Image processing method, device and electronic device

This application claims the priority of the Chinese Patent Application filed on Jan. 24, 2017, the application number of which is CN201710060426.X, and the application name is "user image processing method, device and electronic device", the entire contents of which are incorporated by reference. In this application.

Technical field

The embodiments of the present application relate to image processing technologies, and in particular, to an image processing method, apparatus, and electronic device.

Background technique

When processing an image, it is often necessary to blur the background of the subject to highlight the subject and create a SLR camera. In the existing ambiguous processing method, it is usually required for the user to manually specify an area (usually a background area) to be falsified, and then ambiguize the area.

On the other hand, when posting or playing photos or videos on media such as the Internet, television, newspapers, etc., in order to protect personal privacy, some content in the photos or videos needs to be blurred. For example, when playing a news photo or video about a crime, it is necessary to blur the face of the witness or teenager appearing therein. In the existing processing method, it is usually also adopted to manually specify an area (usually a face area) to be processed, and then perform corresponding blurring processing on the area.

Summary of the invention

The embodiment of the present application provides an image processing technical solution.

According to an aspect of an embodiment of the present application, an image processing method includes: determining target object information from an image to be processed; determining a foreground region and the foreground region in the image according to the target object information and a predetermined object contour template. a background area; performing blurring processing on the foreground area and/or the background area.

Optionally, the determining the foreground area and the background area in the image according to the target object information and the predetermined object contour template comprises: matching at least a partial area in the object outline template with the target object information; Determining difference information between an object contour in the object contour template and a contour of the target object in the image according to the matching result; adjusting an object contour in the object contour template according to the difference information; adjusting the object A contour is mapped into the image to obtain a foreground region of the image including the target object and a background region including at least a portion of the foreground region.

Optionally, the difference information includes: scaling information, offset information, and/or angle information between an object contour in the object contour template and a contour of the target object in the image.

Optionally, the image comprises: a still image or a video frame image.

Optionally, the image is a video frame image; the determining target object information from the image to be processed includes: according to target object information determined from a video frame image before the video frame image to be processed, Determining the target image information by the video frame image to be processed; or determining the target object information in each video frame image in the video stream by performing video-by-video frame image detection on the video stream to be processed.

Optionally, the image processing method further includes: determining a transition region between the foreground region and the background region; and performing a blurring process on the transition region.

Optionally, the performing the blurring process on the transition region comprises: performing progressive blurring processing or spot processing on the transition region.

Optionally, the determining target object information from the image to be processed includes: acquiring object selection information; and determining the target object information from the to-be-processed image according to the object selection information.

Optionally, the determining target object information from the image to be processed includes: detecting a target object from the image to be processed, and obtaining the target object information.

Optionally, detecting the target object from the image to be processed, obtaining the target object information, including: detecting a target object from the image to be processed through a pre-trained depth neural network, and obtaining the target object information .

Optionally, the target object information includes any one or more of the following: face information, license plate information, house number information, address information, identity ID information, and trademark information.

Optionally, the face information includes any one or more of the following: information of a face key point, face position information, face size information, and face angle information.

Optionally, the object contour template includes any one or more of the following: a face contour template, a human body contour template, a license plate contour template, a house card contour template, and a predetermined frame contour template.

Optionally, the object contour template includes: a plurality of human body contour templates respectively corresponding to different human face angles; and determining the foreground area and the background area in the image according to the target object information and the predetermined object contour template And further comprising: determining, from the object contour template, a human body contour template corresponding to the face angle information in the face information.

Optionally, the depth neural network is configured to detect face key point information and pre-training by: acquiring a first sample set, the first sample set including a plurality of unlabeled sample images; a neural network, performing key point position labeling on each of the unlabeled sample images in the first sample set to obtain a second sample set, wherein the deep neural network is used to perform key point positioning on the image; The partial sample image and the third sample set in the second sample set are adjusted to adjust parameters of the deep neural network, wherein the third sample set includes a plurality of labeled sample images.

Optionally, the depth point neural network is used to perform key point position labeling on each of the unlabeled sample images in the first sample set to obtain a second sample set, including: collecting the first sample set Each of the unlabeled sample images is subjected to image transformation processing to obtain a fourth sample set; wherein the image transformation process includes any one or more of the following: rotation, translation, scaling, noise addition, and occlusion; a deep neural network, performing key point position labeling on the fourth sample set and each sample image in the first sample set to obtain the second sample set.

Optionally, the parameter of the deep neural network is adjusted according to at least part of the sample image and the third sample set in the second sample set, including: for each unlabeled sample image in the first sample set, Determining, according to the key point position information of the unlabeled sample image, the key point position information of the unlabeled sample image is an optional sample; wherein the key point position information of the unlabeled sample image is And key point position information after performing image transformation processing are all included in the second sample set; adjusting the deep neural network according to each of the selectable samples and the third sample set in the second sample set Parameters.

Optionally, the face key point includes any one or more of the following: an eye key point, a nose key point, a mouth key point, an eyebrow key point, and a face contour key point.

According to another aspect of the embodiments of the present application, an image processing apparatus is further provided, including: an object information determining module, configured to determine target object information from an image to be processed; a front background determining module, configured to use the object information according to the object information Determining the target object information determined by the module and the predetermined object contour template to determine a foreground area and a background area in the image; a blurring processing module, configured to determine a foreground area and/or the background area of the front background determining module Perform blurring.

Optionally, the front background determining module includes: a template matching unit, configured to match at least a local area in the object contour template with the target object information; and a difference determining unit, configured to perform, according to the template matching unit The matching result determines difference information between the object contour in the object contour template and the contour of the target object in the image; the contour adjusting unit is configured to adjust the object contour template according to the difference information determined by the difference determining unit An object contour; a front background determining unit, configured to map an object contour adjusted by the contour adjusting unit into the image, obtain a foreground region including the target object in the image, and include at least a portion of the foreground region Background area.

Optionally, the image comprises: a still image or a video frame image.

Optionally, the image is a video frame image; the object information determining module includes: a first object information determining unit, configured to determine, according to the target object information determined from the video frame image before the video frame image to be processed, Determining the target object information from the to-be-processed video frame image; or, the second object information determining unit is configured to perform video-by-video frame image detection by the video stream to be processed, and determine each video frame image in the video stream. Target object information in .

Optionally, the image processing apparatus further includes: a transition area determining module, configured to determine a transition area between the foreground area and the background area; and a transition blur processing module, configured to determine a transition of the transition area determining module The area is blurred.

Optionally, the transition blur processing module is configured to perform progressive blurring processing or spot processing on the transition region.

Optionally, the object information determining module includes: a selection information acquiring unit, configured to acquire object selection information; and a third object information determining unit, configured to: according to the object selection information acquired by the selection information acquiring unit, from the to-be-processed The target object information is determined in the image.

Optionally, the object information determining module includes: a fourth object information determining unit, configured to detect a target object from the image to be processed, and obtain the target object information.

Optionally, the fourth object information determining unit is configured to detect the target object from the image to be processed by using a pre-trained depth neural network to obtain the target object information.

Optionally, the object contour template includes: a plurality of human body contour templates respectively corresponding to different human face angles; the front background determining module is further configured to determine the image according to the target object information and a predetermined object contour template. Before the foreground area and the background area, a human body contour template corresponding to the face angle information in the face information is determined from the object outline templates.

Optionally, the device further includes: a sample set obtaining module, configured to acquire a first sample set, the first sample set includes a plurality of unlabeled sample images; and a key point position labeling module is configured to be based on the deep nerve a network, performing key point position labeling on each of the unlabeled sample images in the first sample set to obtain a second sample set, wherein the deep neural network is used for key point positioning of the image; the network parameter adjustment module And a parameter for adjusting the depth neural network according to at least a partial sample image and a third sample set in the second sample set, wherein the third sample set includes a plurality of labeled sample images.

Optionally, the key point location labeling module includes: an image transform processing unit, configured to perform image transform processing on each of the unlabeled sample images in the first sample set to obtain a fourth sample set; The image transformation process includes any one or more of the following: rotation, translation, scaling, noise addition, and occlusion; a key point location unit for using the fourth sample set and the location based on the depth neural network Each of the unlabeled sample images in the first sample set performs key point position labeling to obtain the second sample set.

Optionally, the network parameter adjustment module includes: an optional sample determining unit, configured to perform, for each unlabeled sample image in the first sample set, a key point after image transformation processing based on the unlabeled sample image Position information, determining whether the key point position information of the unlabeled sample image is an optional sample; wherein the key point position information of the unlabeled sample image and the key point position information after performing image transformation processing are included in And the network parameter adjustment unit is configured to adjust parameters of the deep neural network according to each of the selectable samples and the third sample set in the second sample set.

According to still another aspect of the embodiments of the present application, an electronic device is further provided, including:

A processor and an image processing apparatus according to any of the preceding claims;

When the processor runs the image processing apparatus, the units in the image processing apparatus according to any of the above embodiments of the present application are executed.

According to still another aspect of the embodiments of the present application, another electronic device is provided, including: a processor and a memory;

The memory is configured to store at least one executable instruction that causes the processor to perform operations corresponding to any of the image processing methods described above.

According to still another aspect of embodiments of the present application, there is also provided a computer program comprising computer readable code, the processor in the device executing the above-described implementation of the present application when the computer readable code is run on a device The instructions of the steps in the image processing method described in any of the embodiments.

According to still another aspect of the embodiments of the present application, a computer readable storage medium is provided for storing computer readable instructions, and when the instructions are executed, implementing the image processing method according to any one of the foregoing embodiments of the present application. The operation of each step in the process.

According to the image processing technology provided by the embodiment of the present application, the image to be processed is detected to determine target object information, and the foreground area and the background area in the image to be processed are acquired according to the determined target object information and the object contour template, and then The background area and/or the foreground area are blurred, so that the foreground area or the background area in which the blurring process needs to be performed can be automatically determined by the target object information detected from the image without manually marking the user to perform the blurring process The area or manual execution of the blur (blur) operation improves the convenience and accuracy of the operation.

The technical solutions of the present application are further described in detail below through the accompanying drawings and embodiments.

DRAWINGS

The accompanying drawings, which are incorporated in FIG.

The present application can be more clearly understood from the following detailed description, in which:

1 is a flow chart of an image processing method according to an embodiment of the present application;

2 is a flowchart of an image processing method according to another embodiment of the present application;

3 is a flowchart of an image processing method according to still another embodiment of the present application;

4 is a schematic diagram of an exemplary character outline template including a human body and a face contour template including a human face in the embodiment of the present application;

5 is a flow chart of an exemplary method of training a keypoint location model in an embodiment of the present application;

Figure 6 is a logic block diagram of an image processing apparatus according to an embodiment of the present application;

FIG. 7 is a logic block diagram showing an image processing apparatus according to another embodiment of the present application; FIG.

FIG. 8 is a logic block diagram showing an image processing apparatus according to still another embodiment of the present application; FIG.

FIG. 9 is a schematic structural view showing an electronic device according to an embodiment of the present application.

detailed description

Exemplary embodiments of the embodiments of the present application are described in detail below with reference to the accompanying drawings. It should be noted that the relative arrangement of the components and steps, numerical expressions and numerical values set forth in the embodiments are not intended to limit the scope of the application.

In the meantime, it should be understood that the dimensions of the various parts shown in the drawings are not drawn in the actual scale relationship for the convenience of the description.

The following description of the at least one exemplary embodiment is merely illustrative and is in no way

Techniques, methods and apparatus known to those of ordinary skill in the relevant art may not be discussed in detail, but the techniques, methods and apparatus should be considered as part of the specification, where appropriate.

It should be noted that similar reference numerals and letters indicate similar items in the following figures, and therefore, once an item is defined in one figure, it is not required to be further discussed in the subsequent figures.

Embodiments of the present application can be applied to electronic devices such as terminal devices, computer systems, servers, etc., which can operate with numerous other general purpose or special purpose computing system environments or configurations. Examples of well-known terminal devices, computing systems, environments, and/or configurations suitable for use with electronic devices such as terminal devices, computer systems, servers, and the like include, but are not limited to, personal computer systems, server computer systems, thin clients, thick clients Machines, handheld or laptop devices, microprocessor-based systems, set-top boxes, programmable consumer electronics, networked personal computers, small computer systems, mainframe computer systems, and distributed cloud computing technology environments including any of the above, and the like.

Electronic devices such as terminal devices, computer systems, servers, etc., can be described in the general context of computer system executable instructions (such as program modules) being executed by a computer system. Generally, program modules may include routines, programs, target programs, components, logic, data structures, and the like that perform particular tasks or implement particular abstract data types. The computer system/server can be implemented in a distributed cloud computing environment where tasks are performed by remote processing devices that are linked through a communication network. In a distributed cloud computing environment, program modules may be located on a local or remote computing system storage medium including storage devices.

1 is a flow chart of an image processing method according to an embodiment of the present application. The image processing method can be implemented in any terminal device, personal computer or server. Referring to FIG. 1, the image processing method of this embodiment includes:

At step S110, target object information is determined from the image to be processed.

In the embodiment of the present application, the image to be processed has a certain resolution, and may be an image taken by using a shooting device (such as a mobile phone, a digital camera, a camera, etc.), or may be a pre-stored image (such as an image in a mobile phone album). It can also be an image in a video sequence. The image may be an image of a subject, an animal, a vehicle, an object (such as a business card, an ID card, a license plate, etc.). If the image is a person image, the image may also be a portrait (close-up), a bust, or a full-body photo.

At this step S110, the target object information can be determined/detected from the image to be processed by any suitable image analysis technique. The detected target object information can be used to locate the area occupied by the target object in the image.

The target object information may include, but is not limited to, any one or more of the following: location, size of the target object, information of the key part (such as the position of the nose, face position and size, etc.), key points of the target object, targets The attribute information of the object (such as the skin color of the person).

In an optional example, the step S110 may be performed by a processor invoking a corresponding instruction stored in the memory, or may be performed by the object information determining module 610 being executed by the processor.

In step S120, a foreground area and a background area in the image are determined based on the target object information and a predetermined object outline template.

As described above, the target object information determined in step S110 can be used to locate an area occupied by the target object in the image, and thus can be distinguished according to the determined target object information and the object outline template representing the shape and proportional relationship of the target object. An area occupied by the target object in the image to be processed, and an area occupied by the target object in the image to be processed is determined as a foreground area of the image, and at least a part of the image area outside the foreground area is determined as a background region. For example, the human face has a relatively determined position and proportional relationship in the whole human body, and can match the detected target object information with the character outline template that characterizes the shape and proportion of the human body, thereby delineating that the character occupies the image to be processed. The area is used as the foreground area, and all or part of the area other than the foreground area in the image to be processed is determined as the background area.

In an alternative example, the step S120 may be performed by a processor invoking a corresponding instruction stored in the memory, or may be performed by a front background determination module 620 that is executed by the processor.

In step S130, the determined foreground area and/or background area is subjected to blurring processing.

In the embodiment of the present application, the background area and/or the foreground area may be blurred according to the needs of the application scenario. For example, the determined background area may be blurred to highlight the captured target object in the image screen to improve the shooting effect; or the foreground area (such as the character area or the license plate) may be blurred to blur the display target. The object (person, ID number, or license plate number, etc.) protects the privacy information; or, the determined background area and foreground area may be blurred at the same time.

In the embodiment of the present application, the foreground area and/or the background area may be blurred by using any suitable image blurring technique. For example, the blurring filter can be used to blur the background area and/or the foreground area, that is, to change the adjacent pixel values by Gaussian filtering to achieve a blurred visual effect. The above is only an exemplary implementation, and the foreground area and/or the background area may be blurred by any other blurring method.

In an alternative example, the step S130 may be performed by a processor invoking a corresponding instruction stored in the memory, or may be performed by a blurring processing module 630 executed by the processor.

According to the image processing method of the above embodiment of the present application, the image to be processed is detected to determine target object information, and the foreground area and the background area in the image to be processed are acquired according to the determined target object information and the object contour template, and then The background area and/or the foreground area are blurred, so that the foreground area or the background area in which the blurring process needs to be performed can be automatically determined by the target object information detected from the image without manually marking the user to perform the blurring process The area or manual execution of the blur (blur) operation improves the convenience and accuracy of the operation.

2 is a flow chart of an image processing method according to another embodiment of the present application. Referring to FIG. 2, the image processing method of this embodiment includes:

In step S210, target object information is determined from the image to be processed.

Here, the target object may be a character, an animal, or any object (such as a license plate, a vehicle, an ID card). The determined target object information may include any one or more of the following: face information, license plate information, house number information, address information, identification (ID) information, trademark information, but is not limited to the above information. The target object information each characterizes at least a portion of the features of the target object in the image in which the target object is captured.

Optionally, in an implementation manner of the embodiment of the present application, the step S210 may include step S212 and step S213. In step S212, object selection information is acquired, and the object selection information may be, for example, information of an image area specified by (user), identification (ID) information of an object, information of an object type, and the like. In step S213, target object information is determined from the image to be processed based on the object selection information. For example, the target object information is determined in the specified image area based on the information of the image area specified by the user. In an optional example, the steps S21 and S213 may be performed by a processor invoking a corresponding instruction stored in the memory, or may be performed by a selection information acquiring unit 6103 and a third object information determining unit 6104, respectively, which are executed by the processor. By the processing of steps S212 and S213, the image can be detected based on the object selection information provided separately, and the target object information can be acquired.

Optionally, in another implementation manner of the embodiment of the present application, step S210 may include: S214: detecting a target object from the image to be processed, and obtaining the detected target object information. That is to say, the target object is first detected from the image, and the target object information is determined according to the detected target object.

Optionally, the target object may be detected from the image to be processed through a pre-trained deep neural network to obtain the detected target object information. Alternatively, a deep neural network for detecting a target object may be pre-trained by a sample image labeled with object object information for detecting a deep neural network of a target object such as a vehicle, a face, a pedestrian, an animal, or the like. In the detection process, the image to be processed is input to the deep neural network, and the target object information is acquired by the detection process of the deep neural network.

On the other hand, the image to be processed may be a still image captured, a video frame image in the recorded video content, or a video frame image in the online video stream.

Correspondingly, according to still another implementation manner of the embodiments of the present application, step S210 may include: S215, determining the target object information from the video frame image to be processed according to the target object information determined from the previous video frame image. The position and size of the same target object between successive video frames are relatively close. Therefore, the video to be processed can be detected from the current video frame image to be detected according to the target object information determined from the previous or previous video frame images. The target object information of the frame image, thereby improving the detection efficiency. In an optional example, the step S215 may be performed by a processor invoking a corresponding instruction stored in the memory, or may be performed by the first object information determining unit 6101 being executed by the processor.

Alternatively, according to still another implementation manner of the embodiment of the present application, step S210 may include: S216, performing video-by-video frame image detection by the video stream to be processed, and determining target object information in each video frame image in the video stream. By performing frame-by-frame detection on the video frame image in the video stream, the background/foreground blur processing of each frame is respectively performed by the detection result of each frame, thereby effectively ensuring the stability and accuracy of the detection, since each frame is performed. Blurring processing, so from the perspective of the entire video stream, it is equivalent to realizing dynamic tracking blurring of the same target object. In an alternative example, the step S216 may be performed by the processor invoking a corresponding instruction stored in the memory, or may be performed by the second object information determining unit 6102 being executed by the processor.

It should be noted that the video frame in the video stream to be processed mentioned above may represent a real frame in the video stream, and may also be represented as a sample frame in the video stream that needs to be processed. .

The target object information is detected from the image to be processed by the processing of any of the foregoing embodiments.

In an optional example, the step S210 may be performed by a processor invoking a corresponding instruction stored in the memory, or may be performed by the object information determining module 610 executed by the processor.

In step S220, a foreground area and a background area in the image are determined according to the target object information and a predetermined object outline template.

Optionally, step S220 in this embodiment includes the following steps S221, S223, S225, S227, and S229.

In step S221, at least a partial area in the object contour template is matched with the determined target object information. Although there are differences between individual target objects (such as people, dogs, vehicles, license plates, etc.), each type of target object has commonality from the overall outline. Therefore, the object contour template can be preset to outline the outline of the target object that may appear in the image, or that is of interest or to be detected. For example, a character outline template, a car outline template, a dog's outline template, and the like may be set in advance for matching with the target object information.

In an optional example of the embodiments of the present application, the object contour template may include, but is not limited to, any one or more of the following: a face contour template, a human body contour template, a license plate contour template, a house card contour template, a predetermined frame. Outline templates, etc. The face contour template is used to match the silhouette of the person in the recent photo of the person, the human body contour template is used to match the silhouette of the person in the whole body or the half body photo, and the license plate contour template is used to match the license plate contour on the vehicle in the image, the predetermined frame The contour template is used to match the contour of an object having a predetermined shape, such as an identity card or the like.

Optionally, in the step S221, at least a local area in the object contour template may be matched with the determined target object information. For example, assuming that the determined target object information is the license plate information of the vehicle, since the license plate of the vehicle is usually disposed in the middle of the head of the vehicle, the contour template of the front side of the vehicle can be matched with respect to the position of the license plate.

In addition, since the target object may not be photographed at the time of photographing, when the object contour template is matched with the target object information, the local region of the object contour template may be matched with the determined target object information to determine that the target object is in the image. The area occupied by.

In an optional example, the step S221 may be performed by a processor invoking a corresponding instruction stored in the memory, or may be performed by a template matching unit 6201 executed by the processor.

In step S223, difference information between the object contour in the object contour template and the contour of the target object in the image is determined according to the matching result.

Since the object contour template that characterizes the common features of the object may not be the same size as the object size in the image to be processed, and the position, posture angle, and the like of the object may deviate from the position, posture angle, and the like in the object contour template, During the matching process, the object contour template may be first scaled, translated, and/or rotated, and then matched with the determined object's position, size, or key point to obtain the object contour and the image to be processed in the object contour template. Information about the difference between the contours of the object.

Here, the difference information may include, but is not limited to, scaling information and/or offset information between the object contour in the object contour template and the contour of the target object in the image, and the like, and may also include, for example, an object contour template. Angle information between the contour of the object and the contour of the target object in the image, and the like.

In an optional example, the step S223 may be performed by a processor invoking a corresponding instruction stored in the memory, or may be performed by a difference determining unit 6202 executed by the processor.

In step S225, the object contour in the object contour template is adjusted according to the difference information.

Alternatively, the object contour in the object contour template may be scaled, translated, rotated, etc. according to the difference information including the aforementioned scaling information, offset information, etc., to match the area in which the target object is located in the image.

In an optional example, the step S225 may be performed by the processor invoking a corresponding instruction stored in the memory, or may be performed by the contour adjustment unit 6203 executed by the processor.

In step S227, the adjusted object contour is mapped into the image to be processed, and the foreground region including the target object and the background region including at least part of the non-foreground region are obtained in the image.

By mapping the adjusted object contour into the image to be processed, the portion of the image to be processed that falls within the adjusted character contour can be determined to include the foreground region of the target object, which is the region occupied by the target object. Further, an image area including the foreground area or an image area including a part of the non-foreground area is determined as the background area of the image.

In an alternative example, the step S227 may be performed by the processor invoking a corresponding instruction stored in the memory, or may be performed by the front background determining unit 6204 being executed by the processor.

At step S229, a transition area between the foreground area and the background area is determined.

Alternatively, an image area in the background area that is smaller than a predetermined extended distance from the outer edge of the area where the target object is located may be determined as the transition area. That is to say, the outer edge of the contour of the target object is extended outward by a certain distance, and the extended area is used as the transition area.

In an alternative example, the step S229 may be performed by a processor invoking a corresponding instruction stored in the memory, or may be performed by a transition region determination module 640 that is executed by the processor.

In an alternative example, step S220 or S221-229 may be performed by a processor invoking a corresponding instruction stored in the memory, or may be performed by a front background determination module 620 that is executed by the processor.

In step S230, the determined foreground area and/or the background area are subjected to blurring processing, and progressive blurring processing or spot processing is performed on the determined transition area.

The blurring process performed on the determined foreground area and/or the background area is similar to the processing of step S130, and will not be described herein. Progressive blurring or spot processing can be performed on the transition area to make the effect of the blurring process more natural.

In an alternative example, the step S230 may be performed by a processor invoking a corresponding instruction stored in the memory, or may be performed by a transition blur processing module 650 executed by the processor.

According to the image processing method of the above embodiment of the present application, the still image or the video frame image to be processed is detected in various manners, and the target object information in the still image or the video frame image is determined, according to the determined target object information and the object contour template. Obtaining a foreground area, a background area, and a transition area in the image to be processed, and then blurring the background area and/or the foreground area, and performing a blurring process on the transition area, so as to pass the static image Or the target object information detected by the video frame image to automatically determine the foreground area, the background area, and the transition area that need to perform the blurring process without the user manually marking the area to be subjected to the blurring process or manually performing the blurring (blurring) ) operation, improve the convenience and accuracy of the operation, and make the blur effect more natural.

3 is a flow chart of an image processing method according to still another embodiment of the present application. The image processing method of the present embodiment will be described below with an example in which a person is a target object. Here, the face key point is used as the face information. It should be noted that the face key point is only a feasible implementation manner of the embodiment of the present application, and the face information may further include face location information, face size information, and face angle information. Any one or more of the others. Referring to FIG. 3, the image processing method of this embodiment includes:

At step S310, face information is detected from the image to be processed.

According to an implementation manner, the face key point is detected from the image to be processed by the pre-trained key point positioning model, and the detected face key point is used as the face information. An exemplary method of training a keypoint location model will be described later. Although there are differences in body type between individuals as individuals, they have commonality from the outline of the overall figure, for example, the head is elliptical and the torso is roughly triangular. 4 is a schematic diagram of an exemplary character outline template including a human body and a face contour template including a human face in the embodiment of the present application. In addition, since the person being photographed may be in a plurality of different angles and distances during the shooting of the person, it is also possible to preset a plurality of character contour templates such as a face, a half body, a portrait, a side body, etc., for matching from different shooting distances. Or the image to be processed captured by the shooting angle. Therefore, at this step S310, face angle information can also be detected from the image to be processed.

In an optional example, the step S310 may be performed by the processor invoking a corresponding instruction stored in the memory, or may be performed by the fourth object information determining unit 6105 being executed by the processor.

In step S320, a human body contour template corresponding to the face angle information is determined from among predetermined body contour templates.

In an alternative example, the step S320 may be performed by a processor invoking a corresponding instruction stored in the memory, or may be performed by a front background determination module 620 that is executed by the processor.

In step S330, a foreground area and a background area in the image are determined according to the face information and a predetermined human body contour template.

The processing of step S330 is similar to the processing of the foregoing step S120 or S221 to S229, and details are not described herein.

In an alternative example, the step S330 may be performed by a processor invoking a corresponding instruction stored in the memory, or may be performed by a front background determination module 620 that is executed by the processor.

In step S340, the foreground area and/or the background area are subjected to blurring processing.

This step is similar to the processing of step S130 and will not be described here.

In an alternative example, the step S340 may be performed by a processor invoking a corresponding instruction stored in the memory, or may be performed by a blurring processing module 630 executed by the processor.

According to the image processing method of the above embodiment of the present application, the face information is obtained by detecting the image to be processed, and the foreground area and the background area in the image to be processed are acquired according to the detected face information and the character outline template, and then The background area and/or the foreground area are blurred, so that when the image related to the person is processed, the foreground area and the background area that need to be processed can be automatically and accurately determined by the face information detected from the image. To perform the blurring process on the foreground area or the background area without the user manually marking the area where the blurring process is to be performed or manually performing the processing operation, improving the convenience and accuracy of the operation.

An exemplary method of training a keypoint location model is described below.

5 is a flow chart of an exemplary method of training a keypoint location model in an embodiment of the present application. Referring to FIG. 5, an exemplary method of the training keypoint location model includes:

At step S510, a first sample set is acquired, the first sample set including a plurality of unlabeled sample images.

In practical applications, an image that has been input into the model and has been marked with key position information is generally referred to as an annotated sample image. The key position information refers to the coordinate information of the key point in the image coordinate system. Optionally, the sample image may be marked in advance by manual labeling or the like.

Taking the key points of the face as an example, the key points of the face are mainly distributed in the facial organs and facial contours, such as key points of the eyes, key points of the nose, key points of the mouth, key points of the facial contour, and the like. The face key position information is the coordinate information of the face key point in the face image coordinate system. For example, the upper left corner of a sample image containing a human face is recorded as the coordinate origin, and the horizontal direction is the positive direction of the X-axis, and the vertical direction is the positive direction of the Y-axis. The face image coordinate system is established, and the i-th person is The sitting key of the face key point in the face image coordinate system is (x _i , y _i ), and the sample image obtained by the above method is the labeled sample image. Conversely, if the processing of the above-described key point position labeling is not performed on the sample image, the sample image can be understood as an unlabeled sample image. The first sample set in this step is an image set containing a plurality of the above unlabeled sample images.

In an optional example, the step S510 may be performed by a processor invoking a corresponding instruction stored in the memory, or may be performed by a sample set acquisition module 660 executed by the processor.

In step S520, based on the depth neural network, key point position annotation is performed on each of the unlabeled sample images in the first sample set to obtain a second sample set.

Wherein, the deep neural network is used for key point positioning of an image.

The deep neural network may be a convolutional neural network, but is not limited thereto. Since the deep neural network is used to locate the key points of the image, the unlabeled sample images in the first sample set are input into the deep neural network, and the key point position annotation can be realized for each unlabeled sample image. . It should be noted that the key point position labeling is to mark the key point position information (ie, coordinate information) in the unlabeled sample image.

Optionally, the key points may include, but are not limited to, any one or more of the following: a face key point, a limb key point, a palm print key point, and a marker key point. When the key point includes a face key point, the face key point may include, but is not limited to, any one or more of the following: an eye key point, a nose key point, a mouth key point, an eyebrow key point, and a face contour key point.

Taking an unlabeled sample image containing a human face as an example, an unlabeled sample image containing a human face is input into a deep neural network, and the output is an unlabeled sample image itself, and key position information of the unlabeled sample image, such as an eye key. The coordinate information of the point, the coordinate information of the nose key point, and the like. Thus, when a plurality of unlabeled sample images including faces are input to the deep neural network, a large number of unlabeled sample images themselves, and key point position information of the unlabeled sample images constitute the second sample set in this step.

In an alternative example, the step S520 may be performed by a processor invoking a corresponding instruction stored in the memory, or may be performed by a keypoint location labeling module 670 that is executed by the processor.

In step S530, parameters of the deep neural network are adjusted according to at least part of the sample image and the third sample set in the second sample set.

The third sample set includes a plurality of labeled sample images.

A partial sample image or a full sample image in the second set of samples may be used, and the third set of samples together adjust the parameters of the deep neural network. Here, the labeled sample image can be referred to the description and explanation in step S510 of the embodiment, and details are not described herein again.

In an alternative example, the step S530 may be performed by a processor invoking a corresponding instruction stored in the memory or by a network parameter adjustment module 680 executed by the processor.

The method for training a key point localization model provided by the above embodiment of the present application uses two sample sets to adjust parameters of the deep neural network, one of which is a second sample set, the second sample set is derived from a deep neural network, including A first sample set of a plurality of unlabeled sample images is obtained by key point position annotation; and the other is a third sample set including a plurality of labeled sample images. The embodiment of the present application can improve the training accuracy of the key point positioning model under the premise that the images input to the model are not all labeled images, in other words, the sample resource waste can be avoided, and the efficiency of the model training can be improved.

According to an implementation manner of the embodiment of the present application, step S520 may include a process of performing image transformation processing on each unlabeled sample image in the first sample set to obtain a fourth sample set, where the image transformation The processing may include, for example but not limited to, any one or more of the following: rotation, translation, scaling, noise addition, and occlusion, but is not limited thereto; based on the depth neural network, the fourth sample set and the Each sample image in the first sample set is subjected to key point position annotation to obtain the second sample set.

For example, if an unlabeled sample image is rotated to set an angle, the set angle is usually in the range of (-20°, 20°), that is, a rotation transformation of a small amplitude. Similarly, the translation processing is also a small displacement. Pan. It is assumed that the first sample set includes 10,000 unlabeled sample images, and each of the unlabeled sample images is subjected to image transformation processing (such as scaling, translation, etc.) to obtain 10 image-converted unlabeled sample images. At this time, 10,000 unlabeled sample images become 100,000 unlabeled sample images, and the 100,000 unlabeled sample images constitute the fourth sample set. It should be noted that, regardless of any combination of image transformation processing, it is within the technical scope of the embodiments of the present application to achieve the same or different image transformation processing effects for each unlabeled sample image in the first sample set. In addition, which image transformation processing can be selected for the unlabeled sample image, and the image transformation processing suitable for the sample image can be combined with the characteristics of the sample image itself.

Since the fourth sample set and the first sample set are both unlabeled sample images, the unlabeled sample image is input to the deep neural network based on the same principle as described in the foregoing embodiment, and the fourth sample set and the first sample set are output. Each sample image itself, as well as key point location information for each sample image.

In addition, according to an implementation manner of the embodiment of the present application, step S530 may include: performing key transformation after image transformation processing based on the unlabeled sample image for each unlabeled sample image in the first sample set Position information, determining whether the key point position information of the unlabeled sample image is an optional sample; wherein the key point position information of the unlabeled sample image and the key point position information after performing image transformation processing are included in The second sample set; adjusting parameters of the deep neural network according to each of the selectable samples and the third sample set in the second sample set.

The key point position information of the unlabeled sample image and the key point position information after the image transform processing are included in the second sample set.

First, the key point position information after the image conversion processing is performed on the unlabeled sample image, and image correction processing is performed. It should be noted that the image correction processing is the inverse transformation processing of the image transformation processing described above. For example, if an unlabeled sample image is shifted to the right by 5 mm, the key point position information after the image conversion processing is performed on the unlabeled sample image, It is necessary to translate 5 mm to the left to implement image correction processing. Secondly, the covariance matrix Cov1 is obtained by image-corrected key position information (ie, coordinate values of a series of points), and the covariance matrix Cov1 is expanded into a vector form by column or row by row, and normalized into a unit vector Cov1_v. . And, the covariance matrix Cov2 is obtained for the key point position information of the unlabeled sample image, and Cov2 is expanded into a vector form column by column or row by row, and normalized into a unit vector Cov2_v. Calculate the inner product of Cov1_v and Cov2_v and record the inner product as D. Finally, D is compared with the set inner product threshold. If D is less than the inner product threshold, the key position information of the unlabeled sample image is an optional sample. Conversely, if D is greater than or equal to the inner product threshold, the key position information of the unlabeled sample image is not an optional sample. By analogy, based on the key point position information before and after the image conversion processing in the second sample set, the above-described judging process is performed on each unlabeled sample image in the first sample set, and each of the selectable samples can be selected.

In addition, another way to select an optional sample is that the difference from the above-mentioned judging process is only in the last step. If D is less than the set threshold, image conversion processing is performed on the unlabeled sample image. After the key point position information, image correction processing is performed to obtain image corrected key point position information. Then, the result of the data distribution of the key position information corrected by the image (for example, the mean value of the coordinate values of a series of points) is used, and the key point position is marked on the unlabeled sample image, and the key position information of the label is used as Samples are selected, including in the second sample set.

The parameters of the deep neural network may be adjusted according to the commonly used training methods of the deep neural network, and the parameters of the deep neural network are adjusted according to the selected samples and the third sample set in the second sample set, and details are not described herein.

Any of the methods provided by the foregoing embodiments of the present application may be performed by any suitable device having data processing capabilities, including but not limited to: a terminal device, a server, and the like. Alternatively, any of the methods provided by the foregoing embodiments of the present application may be executed by a processor, such as the processor, by executing a corresponding instruction stored in a memory to perform any of the methods mentioned in the foregoing embodiments of the present application. This will not be repeated below.

A person skilled in the art can understand that all or part of the steps of implementing the above method embodiments may be completed by using hardware related to the program instructions. The foregoing program may be stored in a computer readable storage medium, and the program is executed when executed. The foregoing steps include the steps of the foregoing method embodiments; and the foregoing storage medium includes: a medium that can store program codes, such as a ROM, a RAM, a magnetic disk, or an optical disk. Figure 6 is a logic block diagram of an image processing apparatus according to an embodiment of the present application. Referring to FIG. 6, the image processing apparatus of Embodiment 5 includes an object information determining module 610, a front background determining module 620, and a blurring processing module 630. among them:

The object information determining module 610 is configured to determine target object information from the image to be processed.

The front background determining module 620 is configured to determine a foreground area and a background area in the image according to the target object information determined by the object information determining module 610 and the predetermined object contour template.

The blurring processing module 630 is configured to perform a blurring process on the foreground area and/or the background area determined by the front background determining module 620.

The image processing apparatus of the present embodiment can be used to implement the corresponding image processing method in the foregoing method embodiments, and has the beneficial effects of the corresponding method embodiments, and details are not described herein again.

FIG. 7 is a logic block diagram showing an image processing apparatus according to another embodiment of the present application. Referring to FIG. 7, the image processing apparatus of Embodiment 6 includes an object information determining module 610, a front background determining module 620, and a blurring processing module 630. The front background determining module 620 includes a template matching unit 6201, a difference determining unit 6202, a contour adjusting unit 6203, and a front background determining unit 6204. among them:

The template matching unit 6201 is configured to match at least a partial area in the foregoing object contour template with the determined target object information.

The difference determining unit 6202 is configured to determine difference information between the object contour in the object contour template and the contour of the target object in the image according to the matching result of the template matching unit 6201.

The contour adjustment unit 6203 is configured to adjust an object contour in the object contour template according to the difference information determined by the difference determining unit 6202.

The front background determining unit 6204 is configured to map an object contour adjusted by the contour adjusting unit 6203 into the image, obtain a foreground region including the target object in the image, and a background region including at least a portion not the foreground region.

Optionally, the image may include, but is not limited to, a still image or a video frame image.

According to an implementation manner of the embodiments of the present application, the image is a video frame image. The object information determining module 610 includes: a first object information determining unit 6101, configured to determine the target object information from the to-be-processed video frame image according to target object information determined from a video frame image before the video frame image to be processed; Alternatively, the second object information determining unit 6102 is configured to perform target-by-video frame image detection by the video stream to be processed, and determine target object information in each of the video frame images.

Optionally, the image processing apparatus of this embodiment further includes: a transition area determining module 640, configured to determine a transition area between the foreground area and the background area; and a transition blur processing module 650, configured to use the transition area The transition area determined by the determination module 640 is blushed.

Optionally, the transition blur processing module 650 is optionally configured to perform progressive blurring processing or spot processing on the transition region.

According to an embodiment of the present application, the object information determining module 610 includes: a selection information acquiring unit 6103 for acquiring object selection information; and a third object information determining unit 6104 for acquiring according to the selection information acquiring unit 6103. The object selection information determines the target object information from the image to be processed.

According to another implementation manner of the present application, the object information determining module 610 includes: a fourth object information determining unit 6105, configured to detect a target object from the image to be processed, and obtain the detected target object information.

Optionally, the fourth object information determining unit 6105 is configured to detect the target object from the image to be processed through a pre-trained depth neural network, and obtain the detected target object information.

Optionally, the target object information may include, but is not limited to, any one or more of the following: face information, license plate information, house number information, address information, identity ID information, and trademark information.

Optionally, the face information may include, but is not limited to, any one or more of the following: information of a face key point, face position information, face size information, and face angle information.

Optionally, the object contour template may include, but is not limited to, any one or more of the following: a face contour template, a human body contour template, a license plate contour template, a house card contour template, and a predetermined frame contour template.

Optionally, the predetermined object contour template may include: a plurality of human body contour templates respectively corresponding to different human face angles; correspondingly, the front background determining module 620 is further configured to use the target object information according to the target object information and the predetermined object contour Before the template determines the foreground area and the background area in the image, a human body contour template corresponding to the face angle information in the face information is determined from among the predetermined object outline templates.

The image processing apparatus of the present embodiment is used to implement the corresponding image processing method in the foregoing method embodiments of the present application, and has the beneficial effects of the corresponding method embodiments, and details are not described herein again.

FIG. 8 is a logic block diagram showing an image processing apparatus according to still another embodiment of the present application. Referring to FIG. 8, the image processing apparatus of this embodiment includes an object information determining module 610, a front background determining module 620, and a blurring processing module 630. Optionally, the image processing apparatus further includes a transition region determining module 640 and a transition blur processing module 650. In addition, the image processing apparatus further includes a sample set acquisition module 660, a key point location labeling module 670, and a network parameter adjustment module 680. among them:

The sample set obtaining module 660 is configured to acquire a first sample set, where the first sample set includes a plurality of unlabeled sample images.

a key point location labeling module 670, configured to perform key point position labeling on each of the unlabeled sample images in the first sample set based on a depth neural network, to obtain a second sample set, where the depth neural network is used For key point positioning of the image.

The network parameter adjustment module 680 is configured to adjust parameters of the deep neural network according to at least a partial sample image and a third sample set in the second sample set, where the third sample set includes a plurality of labeled sample images .

Optionally, the key point location labeling module 670 may include: an image transformation processing unit 6701, configured to perform image transformation processing on each unlabeled sample image in the first sample set to obtain a fourth sample set, where The image transformation processing may include, but is not limited to, any one or more of the following: rotation, translation, scaling, noise addition, and occlusion; a key point location labeling unit 6702, for the fourth based on the depth neural network The sample set and each unlabeled sample image in the first sample set perform key point position labeling to obtain the second sample set.

Optionally, the network parameter adjustment module 680 includes: an optional sample determining unit 6801, configured to perform image transformation processing on the basis of the unlabeled sample image for each unlabeled sample image in the first sample set. Point position information, determining whether the key point position information of the unlabeled sample image is an optional sample, wherein the key point position information of the unlabeled sample image and the key point position information after the image transformation processing are included In the second sample set, the network parameter adjustment unit 6802 is configured to adjust parameters of the deep neural network according to each of the selectable samples and the third sample set in the second sample set.

Optionally, the face key point may include, but is not limited to, any one or more of the following: an eye key point, a nose key point, a mouth key point, an eyebrow key point, and a face contour key point.

The image processing apparatus of the present embodiment is used to implement the corresponding image processing method in the foregoing method embodiments, and has the beneficial effects of the corresponding method embodiments, and details are not described herein again.

In addition, the embodiment of the present application further provides an electronic device, including: a processor and an image processing apparatus according to any of the foregoing embodiments of the present application. When the processor runs the image processing apparatus, the image of any of the foregoing embodiments of the present application The units in the processing unit are operated.

In addition, the embodiment of the present application further provides another electronic device, including: a processor and a memory, where the memory is used to store at least one executable instruction, and the executable instruction causes the processor to perform image processing in any of the foregoing embodiments of the present application. The corresponding operation of the method.

The embodiment of the present application further provides an electronic device, such as a mobile terminal, a personal computer (PC), a tablet computer, a server, and the like. FIG. 9 is a schematic structural view showing an electronic device according to an embodiment of the present application. Referring now to Figure 9, a block diagram of an electronic device 900 suitable for use in implementing a terminal device or server of an embodiment of the present application is shown. As shown in FIG. 9, electronic device 900 includes one or more processors, communication elements, etc., one or more processors such as one or more central processing units (CPUs) 901, and/or one or more image processing The GPU 913 or the like, the processor may perform various appropriate operations according to executable instructions stored in the read only memory (ROM) 902 or executable instructions loaded from the storage portion 908 into the random access memory (RAM) 903. Action and processing. The communication component includes a communication component 912 and a communication interface 909. The communication component 912 can include, but is not limited to, a network card, which can include, but is not limited to, an IB (Infiniband) network card, the communication interface 909 includes a communication interface of a network interface card such as a LAN card, a modem, etc., and the communication interface 909 is via a network such as the Internet. Perform communication processing.

The processor can communicate with the read only memory 902 and/or the random access memory 930 to execute executable instructions, connect to the communication component 912 via the bus 904, and communicate with other target devices via the communication component 912, thereby completing the embodiments of the present application. Corresponding operations of any one of the methods, for example, determining target object information from the image to be processed; determining foreground and background regions in the image according to the target object information and the predetermined object contour template; / or the background area is blurred.

Further, in the RAM 903, various programs and data required for the operation of the device can be stored. The CPU 901, the ROM 902, and the RAM 903 are connected to each other by a bus 904. In the case of RAM 903, ROM 902 is an optional module. The RAM 903 stores executable instructions or writes executable instructions to the ROM 902 at runtime, the executable instructions causing the processor 901 to perform operations corresponding to the above-described communication methods. An input/output (I/O) interface 905 is also coupled to bus 904. The communication component 912 can be integrated or can be configured to have multiple sub-modules (eg, multiple IB network cards) and be on a bus link.

The following components are connected to the I/O interface 905: an input portion 906 including a keyboard, a mouse, etc.; an output portion 907 including, for example, a cathode ray tube (CRT), a liquid crystal display (LCD), and the like, and a storage portion 908 including a hard disk or the like. And a communication interface 909 including a network interface card such as a LAN card, a modem, or the like. Driver 910 is also connected to I/O interface 905 as needed. A removable medium 911 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory or the like is mounted on the drive 910 as needed so that a computer program read therefrom is installed into the storage portion 908 as needed.

It should be noted that the architecture shown in FIG. 9 is only an optional implementation manner. In an optional practice process, the number and types of components in FIG. 9 may be selected, deleted, added, or replaced according to actual needs. In different function component settings, implementations such as separate settings or integrated settings may also be adopted. For example, the GPU and the CPU may be detachably set or the GPU may be integrated on the CPU, the communication component 912 may be separately configured, or may be integrated in the CPU or On the GPU, and so on. These alternative embodiments are all within the scope of the present application.

In particular, according to embodiments of the present application, the processes described above with reference to the flowcharts may be implemented as a computer software program. For example, embodiments of the present application include a computer program product comprising a computer program tangibly embodied on a machine readable medium, the computer program comprising program code for executing the method illustrated in the flowchart, the program code comprising the corresponding execution The instruction corresponding to the method step provided by the embodiment of the present application determines, for example, the target object information from the image to be processed; and determines the foreground area and the background area in the image according to the target object information and the predetermined object contour template; The foreground area and/or the background area are subjected to blurring processing. In such an embodiment, the computer program can be downloaded and installed from the network via a communication component, and/or installed from the removable media 911. When the computer program is executed by the central processing unit (CPU) 901, the above-described functions defined in the method of the embodiment of the present application are executed.

The electronic device 900 of the eighth embodiment detects the image to be processed to determine target object information, and acquires a foreground area and a background area in the image to be processed according to the determined target object information and the object contour template, and then the background area or the foreground area. The blurring process is performed so that the foreground area or the background area in which the blurring process needs to be performed can be automatically determined by the target object information detected from the image without the user manually marking the area where the blurring processing is to be performed or manually performing the virtual (fuzzy) operation to improve the convenience and accuracy of operation.

In addition, the embodiment of the present application further provides a computer program, including computer readable code, when the computer readable code is run on the device, the processor in the device is configured to implement any of the foregoing embodiments of the present application. Instructions for each step in the image processing method.

In addition, the embodiment of the present application further provides a computer readable storage medium for storing computer readable instructions, which are executed to implement the operations of the steps in the image processing method of any of the foregoing embodiments of the present application.

The methods, apparatus, and apparatus of the present application may be implemented in a number of ways. For example, the method, apparatus, and apparatus of the embodiments of the present application can be implemented by software, hardware, firmware, or any combination of software, hardware, and firmware. The above-described sequence of steps for the method is for illustrative purposes only, and the steps of the method of the embodiments of the present application are not limited to the order of the above optional description unless otherwise specified. Moreover, in some embodiments, the present application may also be embodied as a program recorded in a recording medium, the programs including machine readable instructions for implementing a method in accordance with embodiments of the present application. Thus, the present application also covers a recording medium storing a program for executing the method according to the present application.

The description of the embodiments of the present application has been presented for purposes of illustration and description Many modifications and variations will be apparent to those skilled in the art. The embodiments were chosen and described in order to best explain the principles and embodiments of the embodiments of the invention,

Claims

An image processing method comprising:

Determining target object information from the image to be processed;

Determining a foreground area and a background area in the image according to the target object information and a predetermined object outline template;

The foreground area and/or the background area are blurred.
The method of claim 1, wherein the determining the foreground area and the background area in the image according to the target object information and the predetermined object outline template comprises:

Matching at least a partial area in the object contour template with the target object information;

Determining difference information between an object contour in the object contour template and a contour of the target object in the image according to the matching result;

Adjusting an object contour in the object contour template according to the difference information;

Mapping the adjusted object contour into the image obtains a foreground region of the image including the target object and a background region including at least a portion of the foreground region.
The method of claim 2, wherein the difference information comprises: scaling information, offset information, and/or angle information between an object contour in the object contour template and a contour of the target object in the image .
The method according to any one of claims 1 to 3, wherein the image comprises: a still image or a video frame image.
The method of claim 4 wherein said image is a video frame image;

Determining target object information from the image to be processed, including:

Determining the target object information from the to-be-processed video frame image according to target object information determined from the video frame image preceding the video frame image to be processed; or performing a video-by-video frame image through the video stream to be processed Detecting, determining target object information in each video frame image in the video stream.
The method according to any one of claims 1 to 5, further comprising:

Determining a transition region between the foreground region and the background region;

The transition area is blurred.
The method according to claim 6, wherein said performing blurring processing on said transition region comprises: performing progressive blurring processing or spot processing on said transition region.
The method according to any one of claims 1 to 7, wherein the determining the target object information from the image to be processed comprises:

Obtain object selection information;

Determining the target object information from the image to be processed according to the object selection information.
The method according to any one of claims 1 to 8, wherein the determining the target object information from the image to be processed comprises:

A target object is detected from the image to be processed, and the target object information is obtained.
The method according to claim 9, wherein detecting the target object from the image to be processed and obtaining the target object information comprises:

The target object is obtained by detecting a target object from the image to be processed through a pre-trained deep neural network.
The method according to claim 10, wherein the target object information comprises any one or more of the following: face information, license plate information, house number information, address information, identity ID information, trademark information.
The method according to claim 11, wherein the face information comprises any one or more of the following: information of a face key point, face position information, face size information, face angle information.
The method according to claim 11 or 12, wherein the object outline template comprises any one or more of the following: a face contour template, a human body contour template, a license plate contour template, a house card contour template, a predetermined frame contour template.
The method according to any one of claims 11 to 13, wherein the object contour template comprises: a plurality of human body contour templates respectively corresponding to different face angles;

Before determining the foreground area and the background area in the image according to the target object information and the predetermined object contour template, the method further includes: determining, from the object contour template, face angle information in the face information Corresponding human contour template.
The method according to claim 14, wherein said deep neural network is used for detecting face key point information and pre-training by the following method:

Obtaining a first sample set, the first sample set including a plurality of unlabeled sample images;

And performing, according to the depth neural network, performing key point position labeling on each of the unlabeled sample images in the first sample set to obtain a second sample set; wherein the deep neural network is used to perform key point positioning on the image;

Adjusting parameters of the deep neural network according to at least a partial sample image and a third sample set in the second sample set, wherein the third sample set includes a plurality of labeled sample images.
The method according to claim 15, wherein the depth point neural network is used to perform key point position labeling on each of the unlabeled sample images in the first sample set to obtain a second sample set, including:

Performing image transformation processing on each of the unlabeled sample images in the first sample set to obtain a fourth sample set; wherein the image transformation process includes any one or more of the following: rotation, translation, scaling, and addition Noise and occlusion;

And performing, according to the depth neural network, key point location annotation on each of the fourth sample set and each of the unlabeled sample images in the first sample set to obtain the second sample set.
The method according to claim 16, wherein the adjusting the parameters of the deep neural network according to at least a partial sample image and a third sample set in the second sample set comprises:

For each unlabeled sample image in the first sample set, based on the key point position information after the image transformation process is performed on the unlabeled sample image, determining whether the key point position information of the unlabeled sample image is an optional sample Wherein the key point position information of the unlabeled sample image and the key point position information after the image transformation processing are included in the second sample set;

And adjusting parameters of the deep neural network according to each of the selectable samples and the third sample set in the second sample set.
The method according to any one of claims 15 to 17, wherein the face key point comprises any one or more of the following: an eye key point, a nose key point, a mouth key point, an eyebrow key point, and a face Contour key points.
An image processing apparatus comprising:

An object information determining module, configured to determine target object information from the image to be processed;

a front background determining module, configured to determine a foreground area and a background area in the image according to the target object information determined by the object information determining module and the predetermined object contour template;

And a blurring processing module, configured to perform a blurring process on the foreground area and/or the background area determined by the front background determining module.
The apparatus of claim 19, wherein the front background determination module comprises:

a template matching unit, configured to match at least a local area in the object contour template with the target object information;

a difference determining unit, configured to determine, according to a matching result of the template matching unit, difference information between an object contour in the object contour template and a contour of the target object in the image;

a contour adjustment unit, configured to adjust an object contour in the object contour template according to the difference information determined by the difference determining unit;

And a front background determining unit, configured to map an object contour adjusted by the contour adjusting unit into the image, obtain a foreground region including the target object in the image, and a background region including at least a portion not the foreground region.
The apparatus according to claim 20, wherein the difference information comprises: scaling information, offset information, and/or angle information between an object contour in the object contour template and a contour of the target object in the image .
The apparatus according to any one of claims 19 to 21, wherein the image comprises: a still image or a video frame image.
The apparatus of claim 22, wherein the image is a video frame image;

The object information determining module includes:

a first object information determining unit, configured to determine the target object information from the to-be-processed video frame image according to target object information determined from a video frame image before the video frame image to be processed; or

The second object information determining unit is configured to perform target-by-video frame image detection by the video stream to be processed, and determine target object information in each video frame image in the video stream.
The apparatus according to any one of claims 19 to 23, further comprising:

a transition area determining module, configured to determine a transition area between the foreground area and the background area;

The transition blur processing module is configured to perform a blurring process on the transition region determined by the transition region determining module.
The apparatus according to claim 24, wherein said transition blur processing module is configured to perform progressive blurring processing or spot processing on said transition region.
The apparatus according to any one of claims 19 to 25, wherein the object information determining module comprises:

Selecting an information obtaining unit, configured to acquire object selection information;

The third object information determining unit is configured to determine the target object information from the image to be processed according to the object selection information acquired by the selection information acquiring unit.
The apparatus according to any one of claims 19 to 26, wherein the object information determining module comprises:

And a fourth object information determining unit, configured to detect the target object from the image to be processed, and obtain the target object information.
The apparatus according to claim 27, wherein said fourth object information determining unit is configured to detect the target object from said image to be processed through a pre-trained depth neural network to obtain said target object information.
The apparatus according to claim 28, wherein said target object information comprises any one or more of the following: face information, license plate information, house number information, address information, identity ID information, trademark information.
The device according to claim 29, wherein the face information comprises any one or more of the following: information of a face key point, face position information, face size information, face angle information.
The apparatus according to claim 29 or 30, wherein the object contour template comprises any one or more of the following: a face contour template, a human body contour template, a license plate contour template, a house card contour template, a predetermined frame contour template.
The apparatus according to any one of claims 29 to 31, wherein the object contour template comprises: a plurality of human body contour templates respectively corresponding to different face angles;

The front background determining module is further configured to determine from the object contour template and the face information before determining the foreground area and the background area in the image according to the target object information and the predetermined object contour template. The human face contour information corresponding to the face angle information.
The device according to claim 32, further comprising:

a sample set obtaining module, configured to acquire a first sample set, where the first sample set includes a plurality of unlabeled sample images;

a key point location labeling module, configured to perform key point position labeling on each of the unlabeled sample images in the first sample set based on a depth neural network, to obtain a second sample set, wherein the deep neural network is used for Key point positioning of the image;

And a network parameter adjustment module, configured to adjust parameters of the deep neural network according to at least a partial sample image and a third sample set in the second sample set, wherein the third sample set includes a plurality of labeled sample images.
The apparatus of claim 33, wherein the keypoint location labeling module comprises:

The image transformation processing unit is configured to perform image transformation processing on each of the unlabeled sample images in the first sample set to obtain a fourth sample set; wherein the image transformation processing includes any one or more of the following: Rotate, pan, zoom, add noise, and add occlusion;

a key point location labeling unit, configured to perform key point position labeling on each of the fourth sample set and each of the unlabeled sample images in the first sample set based on the depth neural network, to obtain the second sample set.
The device of claim 34, wherein the network parameter adjustment module comprises:

The optional sample determining unit is configured to determine, according to the unmarked sample image, the key point position information after the image transform processing on the unlabeled sample image in the first sample set, and determine the key of the unlabeled sample image Whether the point position information is an optional sample; wherein the key point position information of the unlabeled sample image and the key point position information after the image transformation processing are included in the second sample set;

And a network parameter adjustment unit, configured to adjust parameters of the deep neural network according to each of the selectable samples and the third sample set in the second sample set.
The device according to any one of claims 32 to 35, wherein the face key point comprises any one or more of the following: an eye key point, a nose key point, a mouth key point, an eyebrow key point, and a face Contour key points.
An electronic device, comprising:

A processor and the image processing device according to any one of claims 19 to 36;

The unit in the image processing apparatus according to any one of claims 19 to 36 is operated when the processor operates the image processing apparatus.
An electronic device comprising: a processor and a memory;

The memory is configured to store at least one executable instruction that causes the processor to perform an operation corresponding to the image processing method of any one of claims 1-18.
A computer program comprising computer readable code, wherein when the computer readable code is run on a device, a processor in the device performs the implementation of any one of claims 1-18 The instructions of each step in the image processing method.
A computer readable storage medium for storing computer readable instructions, wherein the instructions are executed to perform the operations of the steps of the image processing method according to any one of claims 1 to 18.