WO2022261828A1

WO2022261828A1 - Image processing method and apparatus, electronic device, and computer-readable storage medium

Info

Publication number: WO2022261828A1
Application number: PCT/CN2021/100148
Authority: WO
Inventors: 内山寛之; 刘锴
Original assignee: Oppo广东移动通信有限公司
Priority date: 2021-06-15
Filing date: 2021-06-15
Publication date: 2022-12-22

Abstract

Embodiments of the present application disclose an image processing method and apparatus, an electronic device, and a computer-readable storage medium. The method comprises: preprocessing an original person image to obtain a region of interest image of the original person image and a region segmentation image corresponding to the region of interest image, the region segmentation image comprising portrait region information of the region of interest image; generating a first hair mask according to the region of interest image and the region segmentation image; and optimizing the first hair mask so as to obtain a target hair mask corresponding to the original person image. According to the image processing method and apparatus, the electronic device, and the computer-readable storage medium, an accurate hair mask corresponding to a person image can be obtained, so that the hair mask can be used to accurately position a hair region in the person image, thereby improving the image processing effect.

Description

Image processing method, device, electronic device, and computer-readable storage medium

technical field

The present application relates to the field of image technology, in particular to an image processing method, device, electronic equipment, and computer-readable storage medium.

Background technique

In the field of image technology, it is a relatively common image processing process to separate the foreground area and the background area in the image. For a person image containing a person, when separating the foreground portrait area from the background area, it is difficult to accurately separate the hair area of the portrait due to the many details of human hair, which affects the separation effect of the foreground and background of the person image.

Contents of the invention

The embodiment of the present application discloses an image processing method, device, electronic equipment, and computer-readable storage medium, which can obtain an accurate hair mask corresponding to a character image, so that the hair mask can be used to accurately locate the hair region in the character image , which improves the image processing effect.

The embodiment of the present application discloses an image processing method, including: preprocessing the original person image, obtaining the ROI image of the original person image, and the region segmentation image corresponding to the ROI image, the The region segmentation image includes portrait area information of the region of interest image; generating a first hair mask according to the region of interest image and the region segmentation image; optimizing the first hair mask to obtain the The target hair mask corresponding to the original character image.

The embodiment of the present application discloses an image processing device, including: a preprocessing module, configured to preprocess the original character image, obtain the ROI image of the original character image, and the ROI image corresponding to the ROI image Region segmentation image, the region segmentation image includes portrait area information of the region of interest image; mask generation module, used to generate a first hair mask according to the region of interest image and the region segmentation image; optimization module , for optimizing the first hair mask to obtain a target hair mask corresponding to the original person image.

The embodiment of the present application discloses an electronic device, including a memory and a processor, the memory stores a computer program, and when the computer program is executed by the processor, the processor performs the following steps: The image is preprocessed to obtain the ROI image of the original person image, and the region segmentation image corresponding to the ROI image, and the region segmentation image includes the portrait region information of the ROI image; according to the The region of interest image and the region segmentation image are used to generate a first hair mask; and the first hair mask is optimized to obtain a target hair mask corresponding to the original person image.

The embodiment of the present application discloses a computer-readable storage medium, on which a computer program is stored. When the computer program is executed by a processor, the processor performs the following steps: preprocessing the original character image to obtain the The region of interest image of the original person image, and the region segmentation image corresponding to the region of interest image, the region segmentation image includes portrait region information of the region of interest image; according to the region of interest image and the region of interest image Generate a first hair mask from the region segmentation image; and optimize the first hair mask to obtain a target hair mask corresponding to the original person image.

The details of one or more embodiments of the application are set forth in the accompanying drawings and the description below. Other features and beneficial effects of the present application will appear from the description, drawings and claims.

Description of drawings

In order to more clearly illustrate the technical solutions in the embodiments of the present application, the following will briefly introduce the accompanying drawings that need to be used in the embodiments. Obviously, the accompanying drawings in the following description are only some embodiments of the present application. For Those of ordinary skill in the art can also obtain other drawings based on these drawings without making creative efforts.

Fig. 1 is a block diagram of an image processing circuit in an embodiment;

Fig. 2 is a flowchart of an image processing method in an embodiment;

Fig. 3 is a schematic diagram of preprocessing the original character image in one embodiment;

Fig. 4 is a flow chart of preprocessing the original character image in one embodiment;

FIG. 5A is a schematic diagram of a portrait segmentation image in an embodiment;

Fig. 5B is a schematic diagram of calculating hair contour lines in one embodiment;

FIG. 5C is a schematic diagram of determining a region of interest in matting in an embodiment;

Fig. 5D is a schematic diagram of correcting the original character image and the segmented portrait image in one embodiment;

Fig. 6 is the flowchart of image processing method in another embodiment;

Fig. 7 is a schematic diagram of generating a first hair mask through an image processing model in an embodiment;

Fig. 8 is a flow chart of calculating the background complexity image in one embodiment;

FIG. 9A is a schematic diagram of calculating background complexity in an embodiment;

Fig. 9B is a schematic diagram of merging the first hair mask before corrosion treatment and the first hair mask after corrosion treatment in one embodiment;

Figure 10 is a schematic diagram of filling holes in the first hair mask in one embodiment;

Fig. 11 is a schematic diagram of enhancing the hair region of the first hair mask in one embodiment;

Fig. 12 is a schematic diagram of softening the first hair mask in one embodiment;

Fig. 13 is a schematic diagram of performing upsampling and filtering on a second hair mask through a guided filter in an embodiment;

Figure 14 is a block diagram of an image processing device in an embodiment;

Fig. 15 is a structural block diagram of an electronic device in one embodiment.

detailed description

The following will clearly and completely describe the technical solutions in the embodiments of the application with reference to the drawings in the embodiments of the application. Apparently, the described embodiments are only some, not all, embodiments of the application. Based on the embodiments in this application, all other embodiments obtained by persons of ordinary skill in the art without making creative efforts belong to the scope of protection of this application.

It should be noted that the terms "comprising" and "having" and any variations thereof in the embodiments of the present application and the drawings are intended to cover non-exclusive inclusion. For example, a process, method, system, product or device comprising a series of steps or units is not limited to the listed steps or units, but optionally also includes unlisted steps or units, or optionally further includes For other steps or units inherent in these processes, methods, products or apparatuses.

It can be understood that the terms "first", "second" and the like used in this application may be used to describe various elements herein, but these elements are not limited by these terms. These terms are only used to distinguish one element from another element. For example, a first hair mask could be termed a second hair mask, and, similarly, a second hair mask could be termed a first hair mask, without departing from the scope of the present application. Both the first hair mask and the second hair mask are hair masks, but they are not the same hair mask.

An embodiment of the present application provides an electronic device. The electronic device includes an image processing circuit, and the image processing circuit may be implemented by hardware and/or software components, and may include various processing units defining an ISP (Image Signal Processing, image signal processing) pipeline. Figure 1 is a block diagram of an image processing circuit in one embodiment. For ease of description, FIG. 1 only shows various aspects of the image processing technology related to the embodiment of the present application.

As shown in FIG. 1 , the image processing circuit includes an ISP processor 140 and a control logic 150 . Image data captured by imaging device 110 is first processed by ISP processor 140 , which analyzes the image data to capture image statistics that can be used to determine one or more control parameters of imaging device 110 . Imaging device 110 may include one or more lenses 112 and image sensor 114 . The image sensor 114 may include a color filter array (such as a Bayer filter), and the image sensor 114 may obtain light intensity and wavelength information captured by each imaging pixel and provide a set of raw image data that may be processed by the ISP processor 140 . The attitude sensor 120 (such as a three-axis gyroscope, Hall sensor, accelerometer, etc.) can provide the collected image processing parameters (such as anti-shake parameters) to the ISP processor 140 based on the interface type of the attitude sensor 120 . The attitude sensor 120 interface may adopt a SMIA (Standard Mobile Imaging Architecture, Standard Mobile Imaging Architecture) interface, other serial or parallel camera interfaces, or a combination of the above interfaces.

It should be noted that although only one imaging device 110 is shown in FIG. 1 , at least two imaging devices 110 may be included in this embodiment of the application, and each imaging device 110 may correspond to an image sensor 114 respectively, or may Multiple imaging devices 110 correspond to one image sensor 114 , which is not limited here. The working process of each imaging device 110 may refer to the content described above.

In addition, the image sensor 114 can also send the original image data to the attitude sensor 120, and the attitude sensor 120 can provide the original image data to the ISP processor 140 based on the attitude sensor 120 interface type, or the attitude sensor 120 can store the original image data in the image memory 130 in.

The ISP processor 140 processes raw image data on a pixel-by-pixel basis in various formats. For example, each image pixel may have a bit depth of 8, 10, 12, or 14 bits, and the ISP processor 140 may perform one or more image processing operations on the raw image data, gather statistical information about the image data. Among other things, image processing operations can be performed with the same or different bit depth precision.

ISP processor 140 may also receive image data from image memory 130 . For example, the attitude sensor 120 interface sends raw image data to the image storage 130, and the raw image data in the image storage 130 is provided to the ISP processor 140 for processing. The image memory 130 may be a part of a memory device, a storage device, or an independent dedicated memory in an electronic device, and may include a DMA (Direct Memory Access) feature.

When receiving raw image data from the image sensor 114 interface or from the attitude sensor 120 interface or from the image memory 130, the ISP processor 140 may perform one or more image processing operations, such as temporal filtering. The processed image data may be sent to image memory 130 for additional processing before being displayed. The ISP processor 140 receives processed data from the image memory 130 and subjects the processed data to image data processing in the original domain and in the RGB and YCbCr color spaces. The image data processed by the ISP processor 140 may be output to the display 160 for viewing by the user and/or for further processing by a graphics engine or a GPU (Graphics Processing Unit, graphics processor). In addition, the output of the ISP processor 140 can also be sent to the image memory 130 , and the display 160 can read image data from the image memory 130 . In one embodiment, image memory 130 may be configured to implement one or more frame buffers.

Statistics determined by ISP processor 140 may be sent to control logic 150 . For example, the statistical data may include the vibration frequency of the gyroscope, automatic exposure, automatic white balance, automatic focus, flicker detection, black level compensation, lens 112 shading correction and other image sensor 114 statistical information. Control logic 150 may include a processor and/or a microcontroller that executes one or more routines (e.g., firmware) that determine control parameters of imaging device 110 and ISP processing based on received statistical data. The control parameters of the device 140. For example, the control parameters of the imaging device 110 may include attitude sensor 120 control parameters (such as gain, integration time of exposure control, anti-shake parameters, etc.), camera flash control parameters, camera anti-shake displacement parameters, lens 112 control parameters (such as focus or focal length for zooming) or a combination of these parameters. ISP control parameters may include gain levels and color correction matrices for automatic white balance and color adjustment (eg, during RGB processing), as well as lens 112 shading correction parameters.

By way of example, the image processing method provided by the embodiment of the present application will be described with reference to the image processing circuit in FIG. 1 . The ISP processor 140 can acquire the original character image from the imaging device 110 or the image memory 130, and can perform preprocessing on the original character image to obtain the ROI image of the original character image and the region corresponding to the ROI image Split the image. The ISP processor 140 may generate a first hair mask according to the ROI image and the region segmentation image, and optimize the first hair mask to obtain a target hair mask corresponding to the original person image.

In some embodiments, after the ISP processor 140 obtains the target hair mask corresponding to the original character image, it can accurately determine the hair region in the original character image according to the target hair mask, and use the target hair mask to analyze the hair area of the original character. The image is separated from foreground and background regions. Optionally, image processing can also be performed on the separated background area or foreground area, such as blurring the background area and beautifying the foreground area (such as increasing brightness, whitening portraits, defogging, etc.), etc. , but not limited to this. The ISP processor 140 can send the processed image to the image memory 130 for storage, and can also send the processed image to the display 160 for display, so that the user can observe the processed image through the display 160 conveniently.

As shown in Figure 2, in one embodiment, an image processing method is provided, which can be applied to the above-mentioned electronic equipment, which may include but not limited to mobile phones, smart wearable devices, tablet computers, PC (Personal Computer, personal computer), vehicle-mounted terminal, digital camera, etc., which are not limited in this embodiment of the present application. The image processing method may include the following steps:

Step 210, preprocessing the original person image to obtain the ROI image of the original person image and the region segmentation image corresponding to the ROI image.

The original character image can refer to an image containing a character, and the original character image can be a color image, such as an image in RGB (Red Green Blue, red green blue) format or YUV (Y represents brightness, U and V represent chroma) format images, etc. The original person image may be an image in which a foreground portrait area needs to be separated from a background area. The original person image may be an image pre-stored in the memory of the electronic device, or an image collected in real time by the electronic device through a camera.

As an implementation, the depth information corresponding to the foreground portrait area and the background area in the original person image is quite different, and the depth information can be used to represent the distance between the photographed object and the camera, and the greater the depth information, the greater the distance. Therefore, the depth information corresponding to each pixel in the original person image can be used to divide the foreground area and the background area in the original person image, for example, the background area can be an area composed of pixels whose depth information is greater than the first threshold, the The foreground area may be an area composed of pixels whose depth information is less than the second threshold.

As another implementation manner, face recognition may also be used to divide the foreground area and the background area of the original person image. The electronic device can perform face recognition on the original person image, determine the face area in the original person image, and then determine the portrait area according to the face area. The portrait area refers to the image area where the entire human body is located, and the face area refers to the is the image area where the face of the person is located, and the portrait area includes the face area. Other image areas in the original person image except the portrait area can be determined as the background area.

The electronic device can preprocess the original person image to determine a region of interest (Region of Interest, ROI) in the original person image. (hair matting) image area, the matting area of interest may include the face area. An accurate hair mask corresponding to the original character image can be obtained through hair matting, so as to accurately locate the hair region of the original character image through the precise hair mask. After determining the matting region of interest in the original person image, the electronic device can extract the matting region of interest from the original person image to obtain an image of the region of interest, and at the same time obtain a region segmentation image corresponding to the region of interest image, The region segmentation image may include portrait region information of the region of interest image, and the region segmentation image may be understood as an image obtained by extracting a portrait from the region of interest image.

Fig. 3 is a schematic diagram of preprocessing an original person image in an embodiment. As shown in FIG. 3 , the electronic device can preprocess the original character image 310, determine the matting region of interest 312 in the original character image 310, and extract the matting region of interest 312 from the original character image 310 to obtain a sense The region of interest image 320, and the region segmentation image 330 corresponding to the region of interest image 320 can be obtained at the same time. The region segmentation image 330 matches the ROI image 320, and the region segmentation image 330 can be used to represent the portrait area in the ROI image 320 (ie, the black region in the region segmentation image 330).

Step 220, generating a first hair mask according to the ROI image and the region segmentation image.

The first hair mask can be used to characterize the hair region in the region of interest image, the electronic device can first deduce the hair region in the region of interest image according to the region of interest image and the corresponding region segmentation image, and generate the first hair mask . As an implementation, the electronic device can use machine learning to generate the first hair mask, and can input the image of the region of interest and the corresponding region segmentation image into the pre-trained image processing model, and use the image processing model to generate the hair mask of interest The region image and the region segmentation image are processed to obtain the first hair mask. Wherein, the image processing model can be obtained by training according to multiple sets of sample training images, and each set of sample training images can include a sample person image, a sample portrait segmentation image corresponding to the sample person image, and a sample hair mask, and the sample hair Masks can be used to label hair regions in sample person images.

In other embodiments, the electronic device can also use other methods to generate the first hair mask. For example, the electronic device can determine the profile of the portrait in the image of the region of interest according to the region segmentation image, and determine the portrait area according to the profile of the portrait, and then Perform image recognition on the portrait area, extract image features in the portrait area, and analyze the image features to determine the hair area. Optionally, the image features may include but not limited to edge features, color features, position features, etc., for example, the color of the hair region is usually black, has more edge information, and is located above the face (especially the eyes located in the face) area above), etc.

Step 230, optimize the first hair mask to obtain a target hair mask corresponding to the original person image.

The first hair mask is a hair mask initially obtained from the region-of-interest image and the region-segmented image. The optimization process adjusts and corrects the first hair mask, so that a more detailed and precise target hair mask can be obtained, and the target hair mask can be used to accurately locate the hair region in the original person image. Optionally, the optimization processing may include but not limited to enhancement processing, erosion processing, filling processing, etc., to optimize the edges of the hair region in the first hair mask, and alleviate the lack of hairline edges or It is the case that contains the edge of non-hair content, etc., to get an accurate target hair mask.

After the electronic device obtains the target hair mask, it can separate the foreground portrait area and the background area in the original character image according to the target hair mask. Since the hair area of the original character image is accurately positioned in the target hair mask, it can be accurately Separate the hair area of the portrait from the background area to achieve hair-level image separation.

After the foreground portrait area and background area in the original person image are separated, the separated portrait area and/or background area may be further processed. For example, the background area can be blurred, the brightness of the portrait area can be adjusted, and the white balance parameters of the portrait area can be adjusted. The embodiment of the present application does not limit the image processing after separation.

In the embodiment of the present application, by preprocessing the original person image, the ROI image of the original person image and the region segmentation image corresponding to the ROI image are obtained, and the ROI image and the region segmentation image are generated according to the ROI image and the region segmentation image. the first hair mask, and optimize the first hair mask to obtain the target hair mask corresponding to the original character image, after generating the first hair mask, optimize and correct the first hair mask, In this way, a finer and more accurate target hair mask can be obtained, and the hair region in the original character image can be accurately located by using the target hair mask, thereby improving the image processing of the subsequent image processing such as foreground and background separation of the original character image Effect.

As shown in Figure 4, in one embodiment, the step of preprocessing the original person image to obtain the region of interest image of the original person image and the region segmentation image corresponding to the region of interest image may include the following steps:

Step 402: Determine the matting region of interest in the original person image according to the original person image and the person segmentation image corresponding to the original person image.

A segmented portrait image is an image obtained by extracting a portrait from an original person image, and the segmented portrait image may include portrait area information of the original person image. As an implementation manner, the electronic device may directly acquire an original character image and a segmented portrait image corresponding to the original character image, and perform preprocessing on the original character image according to the segmented portrait image. The segmented portrait image may be an image pre-stored in the memory, and the electronic device may perform portrait extraction on the original person image in advance to obtain the segmented portrait image, and store the segmented portrait image in the memory. That is, the preprocessing process of the original person image does not include the step of extracting the portrait from the original person image.

As another implementation, the preprocessing process of the original person image may include the step of extracting the portrait of the original person image. When the electronic device preprocesses the original person image, it may first perform portrait extraction on the original person image to obtain A segmented image of a portrait, and then based on the segmented image of a portrait, a matting region of interest in the original person image is determined.

In some embodiments, the electronic device can extract the image features of the original person image through the first segmentation model, identify the portrait region in the original person image based on the image features, and perform portrait extraction on the original person image according to the portrait region to obtain the portrait Split the image. The first segmentation model may be obtained by training according to a first set of segmented sample images, which may include a plurality of sample person images, and a sample portrait segmentation image corresponding to each sample person image. Optionally, the first segmented sample image set may only contain multiple sample person images, and each sample person image may be marked with person area information.

Fig. 5A is a schematic diagram of a segmented portrait image in an embodiment. As shown in FIG. 5A , the original person image 310 corresponds to the portrait segmented image 304, which is obtained after portrait extraction is performed on the original person image 310, and the portrait segmented image 304 can be used to represent the portrait area in the original person image 310 .

In some embodiments, the step of determining the matting region of interest in the original character image according to the original character image and the segmented portrait image corresponding to the original character image may include: acquiring the hair segmentation image corresponding to the original character image, according to the The hair contour line is calculated from the hair segmentation image and the portrait segmentation image, and the matting region of interest in the original person image is determined according to the hair contour line.

The hair segmentation image is an image obtained by performing hair segmentation on the original person image. The hair segmentation image may include hair region information of the original person image, and the hair region in the original person image may be identified and extracted to obtain the hair segmentation image. In some embodiments, the electronic device can identify the hair region in the original person image through the second segmentation model, and the second segmentation model can extract the image features of the original person image, and identify the hair region in the original person image based on the image features , and extract the hair region in the original person image to obtain the hair segmentation image. Optionally, the second segmentation model may be obtained by training according to a second set of segmented sample images, which may include a plurality of sample person images and a sample hair segment image corresponding to each sample person image . Optionally, the second set of segmented sample images may only include multiple sample person images marked with hair region information.

In some embodiments, a segmentation model can also be used to identify the portrait region and hair region of the original person image at the same time, and output the portrait segmentation image and hair segmentation image, and the sample person image and the sample portrait corresponding to the sample person image can be simultaneously Segmented images and sample hair segmented images are used as a training set to train the segmentation model, so that it has the ability to simultaneously output portrait segmented images and hair segmented images.

The above segmentation model can use deeplab semantic segmentation algorithm, U-Net network structure, FCN (Fully Convolutional Networks, fully convolutional neural network) and other methods to perform portrait segmentation, which is not limited in the embodiment of the present application.

The hair contour line may be used to describe the contour of the hair region, and the hair contour line may include each pixel point on the outer edge of the hair region, and the outer edge refers to an edge adjacent to the background region. In some embodiments, the electronic device can compare the hair segmentation image with the portrait segmentation image, determine the same pixel points on the outer edge of the hair region, and determine the hair contour line according to the same pixel points. As a specific implementation, the electronic device can use the portrait segmentation image to corrode the hair region in the hair segmentation image, so that the hair region of the hair segmentation image is reduced, and only the edges in the hair segmentation image that coincide with the edges of the portrait segmentation image are retained Pixels, the remaining edge pixels constitute the hair contour.

Fig. 5B is a schematic diagram of calculating hair contour in one embodiment. As shown in FIG. 5B , the electronic device can compare the segmented portrait image 510 with the segmented hair image 520 , determine the same pixel points on the outer edge of the hair region, and obtain the hair contour line 530 . The calculation formula of the hair contour line 530 can be formula (1):

hair_outline=hair_seg-erode(seg) formula (1);

Among them, hair_outline represents the hair contour line 530 , hair_seg represents the hair segmentation image 520 , and seg represents the portrait segmentation image 510 .

After the hair contour line is calculated by the electronic device, the matting region of interest in the original person image can be determined according to the hair contour line. In some embodiments, the face area in the original person image can be determined first, and the initial interest area can be obtained according to the face area. area. Optionally, face recognition can be performed on the original person image to determine the face area. The face area only contains the image content of the face part of the person. The shape of the face area can be a fixed shape, such as a fixed square, rectangle etc.

Optionally, the hair segmented image can also be used to determine the face area in the original person image, the hair segmented image can include edge information that the hair area is around the face, and the hair area can be determined by using the edge information that the hair area is around the face face area.

Since the face area is smaller than the matting area of interest, and the face area needs to be located inside the matting area of interest, the initial area of interest can be obtained based on the determined face area according to the preset area division rules. The position and area size of the initial ROI can be determined according to the determined face area. For example, the region division rule may include that the face region is located in the middle of the initial region of interest, and the size of the initial region of interest is twice the size of the face region; or the lower border of the face region coincides with the lower border of the initial region of interest , and the size of the initial region of interest is 1.5 times that of the face region, etc., the determined face region can also be directly used as the initial region of interest, but not limited thereto, and the region division rules can be set according to actual needs. For different original person images, the determined face regions may occupy different image areas, and the region division rules may also be adjusted accordingly.

Since the initial region of interest is only a rough matting region of interest, it is also necessary to use the hair contour to correct the initial region of interest in order to obtain an accurate matting region of interest. The electronic device may respectively project the hair contour line on the abscissa axis and the ordinate axis of the original character image to obtain a first projection distribution of the hair contour line on the abscissa axis and a second projection distribution on the ordinate axis. Wherein, the axis of abscissa and the axis of ordinate belong to the same plane coordinate system, and the plane coordinate system may include an image coordinate system, a pixel coordinate system, and the like. The first projection distribution can reflect the position of the hair contour line on the abscissa axis, and the second projection distribution can reflect the position of the hair contour line on the ordinate axis.

The electronic device may correct the initial region of interest according to the first projection distribution and the second projection distribution to obtain the matting region of interest. The correction may include using the first projection distribution and the second projection distribution to modify the size and and/or position adjustments. The horizontal range of the matting region of interest can be fixed according to the first projection distribution of the hair contour line on the abscissa axis, and the vertical range of the matting region of interest can be fixed according to the second projection distribution of the hair contour line on the ordinate axis. The horizontal range and vertical range determine the region of interest for matting. The horizontal range may refer to the coordinate range of the matting region of interest on the abscissa axis of the original character image, and the vertical range may refer to the coordinate range of the matting region of interest on the ordinate axis of the original character image, for example , the horizontal range is the abscissa Xa~Xb, and the vertical range is the ordinate Ym~Yn.

The electronic device may adjust the horizontal range of the initial region of interest according to the first projection distribution, so as to fix the horizontal range of the region of interest in the cutout. For example, the horizontal range of the initial region of interest is adjusted so that the horizontal range includes the first projection distribution, and the first projection distribution is located in the middle of the horizontal range. The electronic device can adjust the vertical range of the initial region of interest according to the second projection distribution, so as to fix the vertical range of the region of interest in the matting. For example, the vertical range of the initial region of interest can be adjusted so that the vertical range includes the second projection distribution, and the minimum ordinate of the vertical range can be set to be smaller than the minimum ordinate of the second projection distribution, and the minimum ordinate of the vertical range is the same as the second projection distribution The distance between the minimum ordinates of the two projection distributions is the first pixel distance, the maximum ordinate of the vertical range is greater than the maximum ordinate of the second projection distribution, and the maximum ordinate of the second projection distribution is the same as the maximum ordinate of the second projection distribution The distance between the coordinates is the second pixel distance.

It should be noted that the shape and size of the cutout region of interest can be set according to actual needs, for example, the shape can include rectangle, square, etc., and the above-mentioned first pixel distance, second pixel distance, etc. can be set according to actual needs. The matting region of interest may be entirely within the original person image, or part of it may not be within the original person image.

Correcting the initial region of interest by using the hair contour line can ensure that the obtained matting region of interest contains a complete face region and includes a complete hair region, making the obtained matting region of interest more accurate, including more complete, Rich detail, which improves the accuracy of the subsequent calculation of the hair mask.

Fig. 5C is a schematic diagram of determining a region of interest in matting in an embodiment. As shown in FIG. 5C , to calculate the hair contour line 540 of the original person image 550 , first determine the face area 552 in the original person image 550 , and use the face area 552 to obtain an initial region of interest (not shown). The hair contour line 540 can be projected on the abscissa axis and the ordinate axis of the original character image respectively to obtain the first projection distribution 542 on the abscissa axis and the second projection distribution 544 on the ordinate axis, and according to the first A projection distribution 542 and a second projection distribution 544 adjust the initial ROI to obtain a matted ROI 554 . It can be guaranteed that the matting region of interest 554 includes the complete human face region 552 and includes the complete hair region.

In one embodiment, before the electronic device determines the matting region of interest in the original character image, if the original character image is a rotated image, the original character image and the segmented portrait image corresponding to the original character image can be respectively Correction. If the original character image is a rotated image, the portrait area in the original character image is not vertical to the horizontal, that is, the portrait area is not positive, then the original character image and the portrait segmented image can be corrected first, so that the original character The portrait area in the image is perpendicular to the horizontal (keeps positive). Optionally, the original person image is a rotated image, which may be caused by the rotation of the original person image after post-image processing, or the rotation of the camera currently collecting the original person image.

After correcting the original person image and the segmented portrait image, the electronic device can determine a corrected matting region of interest according to the corrected original person image and the corrected segmented portrait image. The process of determining the corrected region of interest in matting may be similar to the process of determining the region of interest in matting described in the above embodiments, and will not be repeated here.

Since the corrected matting region of interest does not match the uncorrected original person image, the corrected matting region of interest can be rotated according to the rotation direction of the uncorrected original person image to obtain the uncorrected Matting regions of interest in raw person images. The rotation direction may refer to the relative horizontal rotation direction of the portrait area in the uncorrected original person image.

Fig. 5D is a schematic diagram of correcting the original person image and the segmented image of the person in an embodiment. As shown in Figure 5D, the original character image 562 and the segmented portrait image 564 are rotated images, and the original character image 562 and the segmented portrait image 564 can be corrected first to obtain the corrected original character image 572 and the corrected portrait The segmented image 574, the corrected original person image 572 and the corrected portrait segmented image 574 have a front orientation. The corrected matting region of interest 582 can be determined according to the corrected original character image 572 and the corrected portrait segmentation image 574, and then the corrected matting region of interest 582 is rotated according to the rotation direction of the original character image 562, A matted region of interest 584 in the original person image 562 is obtained. In the embodiment of the present application, before determining the matting region of interest, the original person image and the segmented portrait image are corrected first, which can make the recognized matting region of interest more accurate.

Step 404 , respectively cropping the original person image and the segmented portrait image according to the cutout region of interest, to obtain the region of interest image and the region segmentation image corresponding to the region of interest image.

The area of interest in matting can be used as the cropping area of the original person image and the portrait segmentation image, the original person image is cropped to obtain the region of interest image, the portrait segmentation image is cropped to obtain the region segmentation image, the region segmentation image and the Region of interest image matching.

In the embodiment of the present application, in the preprocessing stage of the original person image, the matting region of interest of the original person image is firstly determined, and based on the matting region of interest, the original person image and the segmented portrait image are cropped to obtain the subsequent The image of the region of interest and the region segmentation image used to generate the hair mask can improve the accuracy of the subsequently generated hair mask, and does not require the entire image to refer to the process of generating the hair mask, which can reduce the amount of calculation and improve image processing efficiency.

As shown in FIG. 6 , in one embodiment, another image processing method is provided, which can be applied to the above-mentioned electronic device. The method may include the steps of:

Step 602, preprocessing the original person image to obtain the ROI image of the original person image and the region segmentation image corresponding to the ROI image.

The description of step 602 may participate in the relevant descriptions about preprocessing in the foregoing embodiments, and will not be repeated here.

Step 604: Input the image of the region of interest and the region segmentation image into the image processing model, and process the region of interest image and the region segmentation image through the image processing model to obtain a first hair mask.

In some embodiments, before inputting the region-of-interest image and the region-segmented image into the image processing model, it may be determined whether the image size of the region-of-interest image and the region-segmented image matches the corresponding input image size of the image processing model. If the image size of the region of interest image and the region segmentation image does not match the input image size corresponding to the image processing model, the image of the region of interest and the region segmentation image can be rotated and scaled first to obtain the image corresponding to the image processing model The input image size matches the ROI image and the region segmentation image.

For example, the input image size corresponding to the image processing model is a vertical input size (the length of the image is smaller than the width), if the ROI image and the region segmentation image are horizontal images (the length of the image is larger than the width), the ROI image and After the region segmentation image is rotated 90 degrees clockwise or counterclockwise, it is then input into the image processing model, so as to ensure that the image size of the input region of interest image and region segmentation image is adapted to the image processing model, and the processing of the image processing model is improved. Accuracy.

Image processing models may include neural network models such as CNN (Convolutional Neural Networks, Convolutional Neural Networks). As a specific implementation, the image processing model can be a neural network architecture of U-NET, which can connect the image of the region of interest and the image of the region segmentation, and input the image processing model. The image processing model can include multiple down-sampling layers and multiple up-sampling layers. The image processing model can perform multiple down-sampling and convolution processing on the image of the region of interest and the region segmentation image through multiple down-sampling layers, and then through multiple down-sampling layers. Upsampling layers perform multiple upsampling processes to obtain the first hair mask that is smaller than the input image or has the same resolution as the input image. In the image processing model, skip connections can be realized between the downsampling layer and upsampling layer between the same resolutions, and the features of the downsampling layer and upsampling layer between the same resolutions are fused to make the upsampling process more accurate .

Fig. 7 is a schematic diagram of generating a first hair mask through an image processing model in an embodiment. As shown in FIG. 7 , the region-of-interest image 712 and the corresponding region segmentation image 714 can be input into the image processing model 720, and the image processing model 720 can process the region-of-interest image 712 and the region segmentation image 714, and output the first hair mask 732 .

The image processing model can be obtained by training according to multiple sets of sample training images. Each set of sample training images can include a sample character image, a sample portrait segmentation image corresponding to the sample character image, and a sample hair mask. Optionally, each The set of sample training images may also include sample person images carrying hair region information and corresponding sample person segmentation images. Furthermore, the sample character image and the sample segmented portrait image may be cropped or scaled images according to a set size, which can ensure that the sizes of the images input to the image processing model remain consistent.

When training the image processing model, a set of sample training images can be input into the image processing model to be trained, and the image processing model to be trained can process the input sample person images and sample portrait segmentation images to obtain the predicted hair mask , the predicted hair mask can be compared with the sample hair mask, and the loss of the predicted hair mask relative to the sample hair mask can be calculated through the loss function, and then the parameters of the image processing model can be adjusted according to the loss until the calculated If the loss is less than the preset loss threshold, or the number of parameter adjustments reaches the number threshold, etc., the convergence condition of the image processing model is satisfied.

Optionally, the above loss function may include at least one of L1 loss function and L2 loss function, etc., the L1 loss function is the sum of the absolute value of the difference between the predicted hair mask and the sample hair mask, L2 The loss function is calculated by computing the sum of the squares of the difference between the predicted hair mask and the sample hair mask.

In some embodiments, since the background area is misjudged as the foreground area and not blurred when performing blurring processing on the person image, it will be more difficult than the situation where the foreground area is misjudged as the background area and blurred. Conspicuous, therefore, when calculating the loss of the predicted hair mask relative to the sample hair mask, you can focus on the situation where the background area is misjudged as the foreground area. When calculating the loss, the background area is misjudged as the loss coefficient corresponding to the foreground area It can be greater than the loss coefficient corresponding to the foreground area being misjudged as the background area. Optionally, the loss function can be formula (2):

L(y,t)=δ(t<α) max(yt,0) ² formula (2);

Among them, L(y, t) represents the loss of the predicted hair mask relative to the sample hair mask, α can be the threshold value set, δ can be a judgment function, which is used to judge whether t is less than α, if it is less than 1, output 1, If not less than, output 0, y may refer to the predicted hair mask, and t may be the sample hair mask. Through this loss function, the situation that the background area is misjudged as the foreground area can be effectively reduced, the accuracy of the image processing model to generate the first hair mask can be improved, and the processing effect of subsequent image blurring processing can be improved.

In some embodiments, since the hair region in the sample character image may have some translucency (such as less hair strands or the effect of hair strands flying up), if in the corresponding sample hair mask Marking the translucent hair area may lead to poor image processing effect after the foreground portrait area and background image are separated. For example, when blurring the background area, if the generated hair mask marks all the semi-transparent hair areas in the foreground portrait area, the background area will pass through the semi-transparent hair area, resulting in ineffective blur effect. nature. Therefore, in the embodiment of the present application, the sample hair mask can be enhanced.

As a specific implementation manner, the sample hair mask may be obtained by performing erosion processing on the background complexity image corresponding to the sample character image. The complex background area can be determined by using the background complexity map of the sample person image, and the hair area around the complex background area in the first hair mask is eroded to reduce the mask area around the complex background area. The image processing model is trained through the enhanced sample hair mask, so that the trained image processing model can reduce the labeling of translucent hair, so as to improve the subsequent image processing effect.

In some embodiments, the image resolution of the first hair mask generated by the electronic device based on the region-of-interest image and the region-segmented image may be relatively small. Therefore, the step optimizes the first hair mask to obtain the original person image The corresponding target hair mask may include

steps

606 and 608 .

Step 606, optimize the first hair mask to obtain a second hair mask.

Since the first hair mask generated by the image processing model is not accurate enough, the first hair mask can be optimized to correct the first hair mask generated by the image processing model to obtain a more accurate and detailed second hair mask. membrane.

As an implementation, if the original character image is a rotated image, since the rotation direction of the first hair mask output by the image processing model may be inconsistent with the original character image, the first The hair mask is rotated so that the direction of the first hair mask is consistent with that of the portrait area in the original person image, and then the rotated first hair mask is optimized.

In some embodiments, the electronic device optimizes the first hair mask to obtain the second hair mask, which may include but not limited to any of the following processing methods, or any of the following processing methods Any combination of processing methods:

Method 1: Calculate the background complexity image corresponding to the ROI image, and perform erosion processing on the first hair mask according to the background complexity image to obtain the second hair mask.

The background complexity image corresponding to the ROI image can include the background complexity of the ROI image, which can be used to describe the complexity of the background area in the ROI image. The more image features the background area contains, the corresponding The complexity can be higher. Because in the image with high background complexity, it is easy for the background area to be mistaken for the foreground area, therefore, the background complexity of the image of the region of interest can be calculated, and the first hair mask can be eroded using the background complexity , to reduce the situation where background regions are mistaken for foreground regions.

As shown in FIG. 8 , in one embodiment, the step of calculating the background complexity image corresponding to the ROI image may include steps 802 - 808 .

Step 802, acquiring a grayscale image of the ROI image.

A grayscale image is an image with only one sampled color per pixel, which appears as a gray scale from black to white. The memory of the electronic device may pre-store the grayscale image corresponding to the original person image, and the grayscale image corresponding to the original person image may be cropped according to the determined region of interest in matting to obtain the grayscale image of the region of interest image. As another implementation manner, after acquiring the image of the region of interest, the electronic device may convert the image of the region of interest from RGB format or YUV format to a grayscale image.

Step 804, performing edge detection on the grayscale image to obtain a first edge image.

The electronic device can use Canny edge detection operator, Laplacian detection operator, DoG detection operator, Sophier detection operator, etc. to perform edge detection on the grayscale image, and obtain the first edge image including all edge information in the grayscale image. It should be noted that the embodiment of the present application does not limit a specific edge detection algorithm.

Step 806: Remove hair edges in the first edge image according to the first hair mask to obtain a second edge image.

The hair region in the first edge image can be determined according to the first hair mask, and the hair edges in the hair region in the first edge image are removed to obtain a second edge image that retains edges other than the hair region. Removing the hair edge in the first edge image can prevent the inaccurate calculation of the background complexity due to the influence of the hair edge on the edge of the background region. Since this application is aimed at the accurate positioning of the hair region, using the first hair mask to remove the hair edge in the first edge image can make the calculated background complexity more accurate and more suitable for the accuracy of the hair region. Positioning scheme.

As another implementation, it is also possible to determine the portrait area in the first edge image according to the region segmentation image corresponding to the region of interest image, and remove the edge in the portrait area in the first edge image to obtain an image that only includes the edge of the background area. Second edge image.

Step 808, perform dilation and blur processing on the second edge image to obtain a background complexity image.

The electronic device can expand and blur the edges in the second edge image, so as to enlarge the edges in the second edge image, make edge features more obvious, and improve the accuracy of background complexity calculation. Wherein, the dilation process is a local maximization operation, and the kernel can be used to perform convolution with the edge in the second edge image, and the pixels covered by the kernel can be calculated to make the edge grow. The blurring processing may adopt Gaussian blurring, mean blurring, median blurring and other processing methods, and the specific dilation processing method and blurring processing method are not limited in the embodiment of the present application.

The background complexity can be calculated according to the dilated and blurred second edge image to obtain a corresponding background complexity image. As a specific implementation, the background complexity can be calculated according to the edges in the background area in the second edge image after dilation and blur processing, and the background area that contains more edges corresponds to a higher background complexity, and contains fewer edges The background region corresponding to the lower background complexity.

In some embodiments, a complexity threshold can be set, and in the entire background area, the area whose background complexity is greater than the complexity threshold can be defined as the background complex area, and the area whose background complexity is less than or equal to the complexity threshold can be defined as A simple area for the background. Different values (such as brightness value, gray value or color value, etc.) can be used to represent the complex background area and the simple background area respectively, so as to obtain the background complexity image.

FIG. 9A is a schematic diagram of calculating background complexity in an embodiment. As shown in Figure 9A, the grayscale image 910 of the image of the region of interest can be obtained first, and edge detection is performed on the grayscale image 910 to obtain the first edge image 920, and then the first hair mask 912 matched with the grayscale image 910 The hair edge in the first edge image 920 is removed to obtain the second edge image 930, and the edge except the hair area is reserved in the second edge image 930. The second edge image 930 may be expanded and blurred to obtain an edge image 940, and then the background complexity is calculated based on the edge image 940 to obtain a background complexity image. Using the edge feature to calculate the background complexity can improve the accuracy of the background complexity, and can further improve the accuracy of the subsequent optimization of the first hair mask by using the background complexity.

The electronic device can perform erosion processing on the first hair mask according to the background complexity image to obtain the second hair mask. As a specific implementation, according to the background complexity image, determine the complex background area in the first hair mask whose complexity is greater than the complexity threshold, and erode the hair area around the complex background area in the first hair mask deal with. In the background complexity image, different values can be used to represent the background complex area and the background simple area. For example, it can be represented by different gray values. The area with a gray value of 255 represents a simple background area, and the area with a gray value of 0 represents Areas with complex backgrounds can also be represented by different color values, white indicates areas with simple backgrounds, and black indicates areas with complex backgrounds, but not limited thereto.

Since the image content contained in the complex background area is relatively rich, it is easy to be mistaken for the foreground area, resulting in inaccurate hair area in the first hair mask. Erosion processing can be performed on the hair area around the complex background area in the first hair mask, so as to reduce the mask around the complex background area and improve the situation that the background area is mistaken for the foreground area. Among them, erosion processing is a local minimum operation, which can be calculated by using the mask around the complex background area in the kernel and the first hair mask, and retaining the pixels covering the kernel, that is, using the complex background area to realize The effect of etching around the mask. Optionally, the first hair mask after corrosion treatment can be directly used as the second hair mask.

As another embodiment, after the first hair mask is subjected to corrosion treatment, the first hair mask before the corrosion treatment (that is, the initially obtained first hair mask) can be combined with the first hair mask after the corrosion treatment. The films are fused to obtain a second hair mask. Optionally, the merging manner may include but not limited to taking an average value for merging, allocating different weight coefficients for merging, and the like.

Specifically, the first hair mask before the corrosion treatment and the first hair mask after the corrosion treatment can be subjected to Alpha fusion processing, and the Alpha fusion treatment can be the first hair mask before the corrosion treatment and the first hair mask after the corrosion treatment. Each pixel in the hair mask is assigned an Alpha value, so that the first hair mask before the erosion process and the first hair mask after the erosion process have different transparency. As an implementation, the background complexity image can be used as the Alpha value of the first hair mask after corrosion processing, and the first hair mask before corrosion processing and the first hair mask after corrosion processing can be compared according to the background complexity image. Each pair of matching pixels in the mask is fused to obtain the second hair mask.

Fig. 9B is a schematic diagram of fusing the first hair mask before the etching treatment and the first hair mask after the etching treatment in one embodiment. As shown in Figure 9B, the first hair mask before the corrosion treatment and the first hair mask after the corrosion treatment can be subjected to Alpha fusion processing, and the formula of Alpha fusion processing can be as shown in formula (3):

I=αI ₁₊ (1-α)I ₂ formula (3);

Wherein, I ₁ represents the first hair mask 954 after the corrosion process, I ₂ represents the first hair mask 952 before the corrosion process, α represents the Alpha value of the first hair mask 954 after the corrosion process, and I represents the result of fusion 958 for the second hair mask. The background complexity image 956 can be used as the Alpha value α of the first hair mask 954 after corrosion processing, and Alpha fusion processing is performed on the first hair mask 954 after corrosion processing and the first hair mask 952 before corrosion processing , to obtain the second hair mask 958 . Fusing the first hair mask before corrosion processing with the first hair mask after corrosion processing, and using the background complexity image as the Alpha value for fusion can improve the accuracy of the obtained second hair mask and improve the background The situation where the area is mistaken for the foreground area improves the effect of subsequent image processing.

Method 2: Fill holes in the hair region of the first hair mask to obtain a second hair mask.

Since there are many strands of human hair, there may be more holes in the hair area. If the image content in the holes is directly determined as the background area, then the background area and the portrait area will be separated and the image processing (such as background virtual processing) will be performed later. etc.), it may cause the problem that the effect of image processing is poor. Therefore, in the embodiment of the present application, holes in the hair region of the first hair mask may be filled.

As a specific implementation manner, the confidence degree of the hair region of the first hair mask may be calculated, and holes in the hair region may be filled according to the confidence degree. Optionally, the first hair mask can be used to determine the hair region in the region of interest, and the first hair mask can be calculated according to the image characteristics (such as edge characteristics, color characteristics, brightness characteristics, etc.) of the hair region in the region of interest. Confidence of the hair region of the membrane. A hair mask region with a higher confidence indicates a higher possibility of belonging to a real hair region and a higher accuracy. It should be noted that other methods may also be used to calculate the confidence level, which is not limited here.

The hair region of the first hair mask can be divided according to the preset confidence threshold, and the hair mask region whose confidence is higher than the confidence threshold is extracted, and the hair mask region is expanded, and further, Erosion processing may also be performed on the hair mask region whose confidence level is not higher than the confidence threshold, so as to achieve the effect of filling holes in the hair region of the first hair mask. Optionally, the first hair mask after the filling treatment can be directly used as the second hair mask.

As another embodiment, the first hair mask after the filling process can also be fused with the first hair mask before the filling process to obtain the second hair mask, and the fusion method can include but not limited to mean fusion, Allocation of different weights for fusion, etc., can also be Alpha fusion, and Alpha fusion is performed on the first hair mask after filling processing and the first hair mask before filling processing according to the set Alpha value. The specific fusion method may be similar to the method of fusing the first hair mask before the corrosion treatment and the first hair mask after the corrosion treatment in the above embodiment, and reference may be made to the relevant description above, which will not be repeated here.

Figure 10 is a schematic diagram of filling holes in the first hair mask in one embodiment. As shown in Figure 10, the holes in the first hair mask 1010 can be filled to obtain a second hair mask with the holes filled, which can improve the subsequent blurring of the hair area when blurring the background area of the original character image. The background inside is also blurred, resulting in blurred portraits, which improves the effect of subsequent image processing.

Mode 3: Enhance the edge of the hair region of the first hair mask to obtain the second hair mask.

The enhancement processing may include but not limited to histogram equalization-based enhancement processing, Laplacian-based enhancement processing, logarithmic-Log transformation-based enhancement processing, etc., which is not limited in this embodiment of the present application. As a specific embodiment, the sigmoid function can be used to enhance the edge of the hair region of the first hair mask, and the sigmoid function is used to calculate the pixels on the edge of the hair region of the first hair mask to obtain the second Hair mask.

Fig. 11 is a schematic diagram of enhancing the hair region of the first hair mask in an embodiment. As shown in FIG. 11 , the edge of the hair region of the first hair mask 1110 may be enhanced to obtain a second hair mask 1120 with clearer edges. By performing enhancement processing on the edge of the hair region of the first hair mask, the subsequently obtained foreground portrait region can be made clearer and the image processing effect can be improved.

Method 4: If the image scene corresponding to the original person image is the target scene, soften the edges of the hair region of the first hair mask to obtain a second hair mask.

In the embodiment of the present application, the target scene is a scene with a scene brightness value lower than a brightness threshold, such as a night scene, a dark indoor scene, and the like. In a darker target scene, when virtualizing the background area of the original character image, if the edge definition of the foreground character image is high, it may cause the blurred edge to look unnatural and affect the image processing effect . Therefore, in the embodiment of the present application, it can first be judged whether the image scene corresponding to the original person image is the target scene, and if it is the target scene, the edge of the hair region of the first hair mask can be softened so that the first The edges of the hair area of the hair mask are blurred to improve the image after subsequent bokeh processing.

Optionally, the softening process may use Gaussian filtering, mean filtering, median filtering and other processing manners, which are not limited herein.

In some embodiments, it can be judged whether the image scene corresponding to the original character image is the target scene through the scene classification model. The scene classification model can be obtained by training according to a large number of sample images of the target scene. The scene classification model can extract the original character image image features, and judge whether the original character image belongs to the target scene according to the image features.

In some embodiments, the electronic device can acquire the sensitivity value (ISO) corresponding to the original character image, and the sensitivity value can be used to measure the sensitivity of the film to light. If the original person image is an image captured by the electronic device in real time through the camera, the current photosensitive value of the camera can be obtained directly; if the original person image is an image stored in the memory, the shooting parameters related to the original person image can be read from the memory , so as to obtain the sensitivity value.

It can judge whether the photosensitive value corresponding to the original character image is greater than the photosensitive threshold. If the photosensitive value is greater than the photosensitive threshold, it means that the photosensitive value of the original character image is higher. To more low light, suitable for use in low-light scenes. Therefore, if the photosensitive value corresponding to the original person image is greater than the photosensitive threshold, the image scene corresponding to the original person image can be determined as the target scene, and the edge of the hair region of the first hair mask can be softened. The photosensitivity threshold may be an empirical value obtained through multiple experiments and tests.

It should be noted that other methods may also be used to determine whether the image scene corresponding to the original character image is the target scene, which is not limited in this embodiment of the present application.

Fig. 12 is a schematic diagram of softening the first hair mask in one embodiment. As shown in FIG. 12 , when the original character image is an image in a dark target scene, the edge of the hair region of the first hair mask 1210 can be softened to obtain a second hair mask 1220 with blurred edges . By softening the edge of the hair area of the first hair mask, it is possible to make the edge transition of the portrait more natural and improve the blur effect when the background area of the original character image in the target scene is subsequently blurred. .

It should be noted that the above several optimization processing methods can be combined at will. For example, the background complexity image corresponding to the image of the region of interest can be calculated first, and the first hair mask is corroded according to the background complexity image, and then Holes in the hair region of the first hair mask after the erosion treatment are filled to obtain a second hair mask.

For another example, the background complexity image corresponding to the region-of-interest image can be calculated first, and the first hair mask is eroded according to the background complexity image, and then the holes in the hair region of the eroded first hair mask are Carry out filling, then carry out enhancement processing to the edge of the hair region of the first hair mask after filling, if the image scene corresponding to the original character image is the target scene, then the hair region of the first hair mask after the enhancement processing can be The edge of the image is softened to obtain a second hair mask. If the image scene corresponding to the original person image is not the target scene, the enhanced first hair mask can be used as the second hair mask.

There may be multiple combinations, and the sequence of different processing methods is not limited in the embodiment of the present application, and the optimization processing methods of various combinations are not listed here one by one.

In the embodiment of the present application, by optimizing the first hair mask, a more detailed and accurate second hair mask can be obtained, which can improve the subsequent image processing of the original person image such as foreground and background separation. image processing effects.

Step 608: Perform upsampling and filtering on the second hair mask to obtain a target hair mask corresponding to the original person image.

Since the resolution of the first hair mask output by the image processing model is small, the resolution of the second hair mask obtained after optimizing the first hair mask is also low, and the second hair mask can be Perform upsampling and filtering processing, enlarge the second hair mask, and obtain the target hair mask matching the original character image, so as to use the target hair mask to accurately locate the hair area in the original character image.

In some embodiments, the grayscale image of the region-of-interest image may be used as a guide image of the guide filter, and the guide filter is used to perform upsampling filtering on the second hair mask to obtain the target hair mask. When the guide filter performs upsampling filtering on the second hair mask, it can refer to the image information of the grayscale image of the region of interest image, so that the texture and edge characteristics of the output target hair mask are similar to the grayscale image .

Fig. 13 is a schematic diagram of performing upsampling filtering on the second hair mask through a guided filter in an embodiment. As shown in Figure 13, the size of the second hair mask 1310 can be enlarged first to obtain the enlarged second hair mask 1320, and then the grayscale image 1330 is used as the guide image, and the enlarged second hair mask 1320 can be obtained through the guide filter. The hair mask 1320 undergoes guided filtering to obtain a target hair mask 1330 .

In an embodiment, the electronic device may also perform upsampling filtering on the second hair mask according to the background complexity of the image of the region of interest. In the case that the background complexity of the region of interest image is low, it means that the background of the region of interest image is relatively simple, then the guided filter can be used to perform upsampling filtering on the second hair mask; in the background of the region of interest image In the case of high complexity, it means that the background of the image of the region of interest is complex, and the bilinear interpolation algorithm can be directly used to perform upsampling and filtering on the second hair mask. This can prevent the problem that the background area is mistaken for the hair area when the background of the image of the region of interest is relatively complex, and improve the accuracy of the target hair mask.

As a specific implementation, the electronic device can first divide the second hair mask into regions according to the background complexity image corresponding to the region of interest image, and obtain the simple background region and the complex background region. Or the background area equal to the complexity threshold, the background complex area is the background area whose complexity is higher than the complexity threshold. For dividing the simple background region and the complex background region, reference may be made to the relevant description in the first method of optimizing the first hair mask in the above embodiment, and details are not repeated here.

For simple background areas and complex background areas, different filtering methods can be used for upsampling filtering processing. For simple background areas, guided filtering can be used for upsampling filtering. The grayscale image of the region of interest image can be used as the guiding image of the guiding filter, and the hair area around the simple background area in the second hair mask is subjected to upsampling filtering through the guiding filter to obtain the first filtering result.

For areas with complex backgrounds, a bilinear interpolation algorithm can be used for upsampling filtering. A bilinear interpolation algorithm may be used to perform upsampling and filtering on the hair region around the complex background region in the second hair mask to obtain a second filtering result. The bilinear interpolation algorithm is a linear interpolation extension of the interpolation function with two variables. Its core idea is to perform a linear interpolation in two directions respectively. The bilinear interpolation algorithm uses the known pixels in the second hair mask Interpolation is performed on the enlarged unknown pixels, and for each pixel that needs to be interpolated, it can be calculated based on four known pixels.

After obtaining the first filtering result and the second filtering result, the electronic device can fuse the first filtering result and the second filtering result to obtain the target hair mask. As an implementation, the first filtering result and the second filtering result can be subjected to Alpha fusion processing, and the background complexity image can be used as the Alpha value of the second filtering result, and the background complexity image can be used to analyze the first filtering result and the second filtering result. Alpha fusion processing is performed on the filtering results to obtain the target mask image.

For the mask area of the second hair mask around the background area of different complexity, different upsampling filtering methods can be used respectively, which can reduce the situation where the background area is mistaken for the hair area and improve the accuracy of the target hair mask . It should be noted that other upsampling filtering processing methods may also be used, such as a bi-cubic interpolation algorithm, a nearest neighbor interpolation algorithm, etc., which are not limited in this embodiment of the present application.

In some embodiments, after obtaining the target hair mask, the electronic device may blur the background area of the original person image according to the target hair mask to obtain the target person image. The hair area of the original person image can be determined according to the target hair mask, so that the image area can be accurately determined and the separation of the image area and the background area can be realized. The separated background area can be blurred, and then the blurred background area and the portrait area can be spliced to obtain the target person image. After the background area is blurred, the portrait area can be highlighted. Because the target hair mask carefully and accurately locates the hair area, it can realize the separation of the hair-level portrait area and the background area, improve the accuracy of the separation of the foreground and the background, and make the image of the target person obtained after the blurring process more natural , which improves the bokeh effect of the image.

In the embodiment of the present application, after obtaining the image of the region of interest and the corresponding region segmentation image, the first hair mask can be generated through the image processing model, and the first hair mask output by the image processing model can be optimized and corrected. Obtain a more refined and accurate second hair mask, and then perform upsampling filtering on the second hair mask to obtain a higher-resolution target hair mask, so that the fineness and accuracy of the target hair mask are higher. The target hair mask can be used to accurately locate the hair region in the original person image, thereby improving the image processing effect of subsequent image processing such as foreground and background separation on the original person image.

As shown in FIG. 14 , in one embodiment, an image processing apparatus 1400 is provided, which can be applied to the above-mentioned electronic equipment. The image processing device 1400 may include a preprocessing module 1410 , a mask generation module 1420 and an optimization module 1430 .

The preprocessing module 1410 is configured to preprocess the original person image to obtain an ROI image of the original person image and a region segmentation image corresponding to the ROI image, where the region segmentation image includes portrait region information of the ROI image.

The mask generation module 1420 is configured to generate a first hair mask according to the ROI image and the region segmentation image.

The optimization module 1430 is configured to optimize the first hair mask to obtain a target hair mask corresponding to the original character image.

In the embodiment of the present application, by preprocessing the original person image, the ROI image of the original person image and the region segmentation image corresponding to the ROI image are obtained, and the ROI image and the region segmentation image are generated according to the ROI image and the region segmentation image. the first hair mask, and optimize the first hair mask to obtain the target hair mask corresponding to the original character image, after generating the first hair mask, optimize and correct the first hair mask, A finer and more accurate target hair mask is obtained, which can be used to accurately locate the hair region in the original person image, thereby improving the image processing effect of subsequent image processing such as foreground and background separation on the original person image .

In one embodiment, the preprocessing module 1410 includes a determining unit and a cropping unit.

The determination unit is configured to determine the matting region of interest in the original person image according to the original person image and the portrait segmentation image corresponding to the original person image, the portrait segmentation image is an image obtained after portrait extraction from the original person image, and the portrait The segmented image includes portrait region information of the original person image.

In one embodiment, the determination unit is further configured to obtain the hair segmentation image corresponding to the original person image, calculate the hair contour line according to the hair segmentation image and the portrait segmentation image, and determine the cutout in the original person image according to the hair contour line area of interest. Wherein, the hair segmentation image is an image obtained by performing hair segmentation on the original person image, and the hair segmentation image includes hair region information of the original person image.

In one embodiment, the determining unit is further configured to determine the face area in the original person image, and obtain an initial region of interest according to the face area, and place the hair contour line on the abscissa axis and the ordinate axis of the original person image respectively. Perform projection to obtain the first projection distribution of the hair contour on the abscissa axis and the second projection distribution on the ordinate axis, and correct the initial region of interest according to the first projection distribution and the second projection distribution to obtain a sense of matting area of interest.

In one embodiment, the preprocessing module 1410 further includes a correction unit.

The correcting unit is configured to correct the original character image and the segmented portrait image corresponding to the original character image if the original character image is a rotated image.

The determination unit is also used to determine the corrected matting region of interest according to the corrected original character image and the corrected portrait segmentation image, and to adjust the corrected matting sense according to the rotation direction of the uncorrected original character image. The region of interest is rotated to obtain the matted region of interest in the uncorrected original person image.

The cropping unit is configured to respectively crop the original person image and the segmented portrait image according to the region of interest in the cutout, to obtain the region of interest image and the region segmentation image corresponding to the region of interest image.

In one embodiment, the mask generation module 1420 is further configured to input the region-of-interest image and the region-segmented image into the image processing model, and process the region-of-interest image and the region-segmented image through the image processing model to obtain the first hair mask The image processing model is obtained by training according to multiple sets of sample training images, and each set of sample training images includes a sample person image, a sample portrait segmentation image corresponding to the sample person image, and a sample hair mask.

In one embodiment, the sample hair mask is obtained by performing erosion processing on the background complexity image corresponding to the sample person image.

In one embodiment, the optimization module 1430 includes an optimization sub-module and a filtering sub-module.

The optimization sub-module is used to optimize the first hair mask to obtain the second hair mask.

The filtering sub-module is configured to perform upsampling and filtering on the second hair mask to obtain a target hair mask corresponding to the original person image.

In an embodiment, the optimization sub-module may include one or more of an erosion unit, a filling unit, an enhancement unit, and a softening unit.

The erosion unit is configured to calculate a background complexity image corresponding to the region of interest image, and perform erosion processing on the first hair mask according to the background complexity image to obtain a second hair mask.

In one embodiment, the erosion unit is further configured to acquire a grayscale image of the region of interest image, perform edge detection on the grayscale image to obtain a first edge image, and remove the grayscale image in the first edge image according to the first hair mask The hair edge is obtained by obtaining a second edge image, and the second edge image is expanded and blurred to obtain a background complexity image.

In one embodiment, the corrosion unit is further configured to determine, according to the background complexity image, a complex background area in the first hair mask whose complexity is greater than a complexity threshold, and to perform a calculation on the hairs around the complex background area in the first hair mask. Erosion processing is performed on the region, and the first hair mask before the etching processing is fused with the first hair mask after the etching processing to obtain a second hair mask.

The filling unit is configured to fill holes in the hair region of the first hair mask to obtain a second hair mask.

The enhancement unit is configured to perform enhancement processing on edges of the hair region of the first hair mask to obtain a second hair mask.

The softening unit is used to soften the edge of the hair region of the first hair mask if the image scene corresponding to the original character image is the target scene to obtain the second hair mask. The target scene is that the scene brightness value is lower than The brightness threshold of the scene.

In one embodiment, the softening unit is further configured to acquire the photosensitive value corresponding to the original person image, and if the photosensitive value is greater than the photosensitive threshold, then determine that the image scene corresponding to the original person image is the target scene, and apply the first hair mask The edges of the hair area are softened to get the second hair mask.

In one embodiment, the optimization module 1430 is also used to calculate the background complexity image corresponding to the image of the region of interest through the erosion unit, and perform erosion processing on the first hair mask according to the background complexity image, and then perform erosion processing on the first hair mask through the filling unit. The holes in the hair region of the processed first hair mask are filled, and then the edge of the hair region of the filled first hair mask is enhanced by increasing the unit, and used to determine the original character if the original character is determined by the softening unit The image scene corresponding to the image is the target scene, then the edge of the hair region of the enhanced first hair mask is softened to obtain the second hair mask, if the image scene corresponding to the original person image is not the target scene , the enhanced first hair mask is used as the second hair mask.

In one embodiment, the filtering sub-module is further configured to use the grayscale image of the region of interest image as the guiding image of the guiding filter, and perform upsampling and filtering processing on the second hair mask through the guiding filter to obtain the original person image The corresponding target hair mask.

In one embodiment, the filtering sub-module is further configured to divide the second hair mask according to the background complexity image corresponding to the region of interest image to obtain simple background regions and complex background regions, and convert the gray area of the region of interest image to The degree image is used as the guide image of the guide filter, and the hair region around the simple background area in the second hair mask is upsampled and filtered by the guide filter to obtain the first filtering result, and the second hair mask is processed by bilinear interpolation algorithm In the second hair mask, the hair region around the complex background region is subjected to upsampling and filtering processing to obtain a second filtering result, and then the first filtering result and the second filtering result are fused to obtain a target hair mask. Wherein, the background simple area is the background area whose complexity is lower than or equal to the complexity threshold, and the background complex area is the background area whose complexity is higher than the complexity threshold.

In one embodiment, the above-mentioned image processing apparatus 1400 includes a blurring module in addition to the preprocessing module 1410 , the mask generation module 1420 and the optimization module 1430 .

The blurring module is configured to blur the background area of the original character image according to the target hair mask to obtain the target character image.

Fig. 15 is a structural block diagram of an electronic device in one embodiment. As shown in FIG. 15 , an electronic device 1500 may include one or more of the following components: a processor 1510, a memory 1520 coupled to the processor 1510, wherein the memory 1520 may store one or more computer programs, one or more computer programs It may be configured to implement the methods described in the foregoing embodiments when executed by one or more processors 1510 .

Processor 1510 may include one or more processing cores. The processor 1510 uses various interfaces and circuits to connect various parts of the entire electronic device 1500, and executes or executes instructions, programs, code sets or instruction sets stored in the memory 1520, and calls data stored in the memory 1520 to execute Various functions of the electronic device 1500 and processing data. Optionally, the processor 1510 may adopt at least one of Digital Signal Processing (Digital Signal Processing, DSP), Field-Programmable Gate Array (Field-Programmable Gate Array, FPGA), and Programmable Logic Array (Programmable Logic Array, PLA). implemented in the form of hardware. The processor 1510 may integrate one or a combination of a central processing unit (Central Processing Unit, CPU), an image processor (Graphics Processing Unit, GPU) and a modem. Among them, the CPU mainly handles the operating system, user interface and application programs, etc.; the GPU is used to render and draw the displayed content; the modem is used to handle wireless communication. It can be understood that, the above-mentioned modem may not be integrated into the processor 1510, but may be realized by a communication chip alone.

The memory 1520 may include random access memory (Random Access Memory, RAM), and may also include read-only memory (Read-Only Memory, ROM). The memory 1520 may be used to store instructions, programs, codes, sets of codes or sets of instructions. The memory 1520 may include a program storage area and a data storage area, wherein the program storage area may store instructions for implementing an operating system and instructions for implementing at least one function (such as a touch function, a sound playback function, an image playback function, etc.) , instructions for implementing the foregoing method embodiments, and the like. The storage data area can also store data created by the electronic device 1500 during use, and the like.

It can be understood that the electronic device 1500 may include more or fewer structural elements than those in the above structural block diagram, for example, including a power module, a physical button, a WiFi (Wireless Fidelity, wireless fidelity) module, a speaker, a Bluetooth module, a sensor, etc. , and may not be limited here.

The embodiment of the present application discloses a computer-readable storage medium, which stores a computer program, wherein, when the computer program is executed by a processor, the methods described in the above-mentioned embodiments are implemented.

The embodiment of the present application discloses a computer program product, which includes a non-transitory computer-readable storage medium storing a computer program, and the computer program can be executed by a processor to implement the methods described in the foregoing embodiments.

Those of ordinary skill in the art can understand that all or part of the processes in the methods of the above embodiments can be realized through computer programs to instruct related hardware, and the programs can be stored in a non-volatile computer-readable storage medium When the program is executed, it may include the processes of the embodiments of the above-mentioned methods. Wherein, the storage medium may be a magnetic disk, an optical disk, a ROM, or the like.

Any reference to memory, storage, database or other medium as used herein may include non-volatile and/or volatile memory. Suitable non-volatile memory may include ROM, Programmable ROM (PROM), Erasable PROM (Erasable PROM, EPROM), Electrically Erasable PROM (Electrically Erasable PROM, EEPROM) or flash memory. Volatile memory can include random access memory (RAM), which acts as external cache memory. By way of illustration and not limitation, RAM can take many forms, such as static RAM (Static RAM, SRAM), dynamic RAM (Dynamic Random Access Memory, DRAM), synchronous DRAM (synchronous DRAM, SDRAM), double data rate SDRAM (Double Data Rate) Data Rate SDRAM, DDR SDRAM), enhanced SDRAM (Enhanced Synchronous DRAM, ESDRAM), synchronous link DRAM (Synchlink DRAM, SLDRAM), memory bus direct RAM (Rambus DRAM, RDRAM) and direct memory bus dynamic RAM (Direct Rambus DRAM) , DRDRAM).

It should be understood that reference throughout this specification to "one embodiment" or "an embodiment" means that a particular feature, structure, or characteristic related to the embodiment is included in at least one embodiment of the present application. Thus, appearances of "in one embodiment" or "in an embodiment" in various places throughout the specification are not necessarily referring to the same embodiment. Furthermore, the particular features, structures or characteristics may be combined in any suitable manner in one or more embodiments. Those skilled in the art should also know that the embodiments described in the specification are all optional embodiments, and the actions and modules involved are not necessarily required by this application.

In various embodiments of the present application, it should be understood that the sequence numbers of the above-mentioned processes do not necessarily mean the order of execution. The implementation of the examples constitutes no limitation.

The units described above as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, located in one place, or distributed to multiple network units. Part or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, each functional unit in each embodiment of the present application may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit. The above-mentioned integrated units can be implemented in the form of hardware or in the form of software functional units.

An image processing method, device, electronic device, and computer-readable storage medium disclosed in the embodiments of the present application have been described above in detail. In this paper, specific examples are used to illustrate the principles and implementation methods of the present application. The above embodiments The description is only used to help understand the method and core idea of the present application. At the same time, for those skilled in the art, based on the idea of this application, there will be changes in the specific implementation and application scope. In summary, the content of this specification should not be construed as limiting the application.

Claims

An image processing method, characterized in that, comprising:

Preprocessing the original person image to obtain a region of interest image of the original person image, and a region segmentation image corresponding to the region of interest image, where the region segmentation image includes portrait region information of the region of interest image ;

generating a first hair mask according to the region-of-interest image and the region-segmented image;

Optimizing the first hair mask to obtain a target hair mask corresponding to the original person image.
The method according to claim 1, wherein the preprocessing of the original person image to obtain the region of interest image of the original person image and the region segmentation image corresponding to the region of interest image includes :

According to the original person image, and the portrait segmentation image corresponding to the original person image, determine the matting region of interest in the original person image, and the portrait segmentation image is obtained after portrait extraction is performed on the original person image image, the portrait segmentation image includes the portrait area information of the original person image;

The original person image and the segmented portrait image are clipped respectively according to the cutout region of interest to obtain a region of interest image and a region segmentation image corresponding to the region of interest image.
The method according to claim 2, characterized in that, according to the original person image, and the portrait segmentation image corresponding to the original person image, determining the matting region of interest in the original person image includes:

Acquiring a hair segmentation image corresponding to the original person image, the hair segmentation image is an image obtained by performing hair segmentation on the original person image, and the hair segmentation image includes hair region information of the original person image;

calculating hair contour lines according to the hair segmentation image and the portrait segmentation image;

A matting region of interest in the original person image is determined according to the hair contour line.
The method according to claim 3, wherein said determining the matting region of interest in the original person image according to the hair contour line comprises:

Determining the face area in the original person image, and obtaining an initial region of interest according to the face area;

Project the hair contour line on the abscissa axis and the ordinate axis of the original character image respectively, to obtain the first projection distribution of the hair contour line on the abscissa axis and the first projected distribution on the ordinate axis. Two-projection distribution;

The initial region of interest is corrected according to the first projection distribution and the second projection distribution to obtain a matted region of interest.
The method according to any one of claims 2 to 4, characterized in that, according to the original character image and the segmented image corresponding to the original character image, it is determined that the cutout in the original character image is of interest Before the region, the method also includes:

If the original character image is a rotated image, correcting the original character image and the segmented portrait image corresponding to the original character image;

The determining the region of interest in matting in the original character image according to the original character image and the segmented portrait image corresponding to the original character image includes:

Determine the corrected matting region of interest according to the corrected original person image and the corrected portrait segmentation image;

The corrected matting region of interest is rotated according to the rotation direction of the uncorrected original person image to obtain the matting region of interest in the uncorrected original person image.
The method according to claim 1, wherein said optimizing the first hair mask to obtain the target hair mask corresponding to the original character image comprises:

Optimizing the first hair mask to obtain a second hair mask;

Perform upsampling and filtering processing on the second hair mask to obtain a target hair mask corresponding to the original person image.
The method according to claim 6, wherein said optimizing said first hair mask to obtain a second hair mask comprises:

Calculating the background complexity image corresponding to the ROI image;

Erosion processing is performed on the first hair mask according to the background complexity image to obtain a second hair mask.
The method according to claim 7, wherein the calculating the background complexity image corresponding to the ROI image comprises:

Acquiring a grayscale image of the image of the region of interest;

performing edge detection on the grayscale image to obtain a first edge image;

removing hair edges in the first edge image according to the first hair mask to obtain a second edge image;

Dilation and blurring are performed on the second edge image to obtain a background complexity image.
The method according to claim 7, wherein said first hair mask is corroded according to said background complexity image to obtain a second hair mask, comprising:

According to the background complexity image, determine a background complex area in the first hair mask whose complexity is greater than a complexity threshold;

Erosion processing is performed on the hair area around the background complex area in the first hair mask;

The first hair mask before the corrosion treatment is fused with the first hair mask after the corrosion treatment to obtain a second hair mask.
The method according to claim 6, wherein said optimizing said first hair mask to obtain a second hair mask comprises:

Filling holes in the hair region of the first hair mask to obtain a second hair mask.
The method according to claim 6, wherein said optimizing said first hair mask to obtain a second hair mask comprises:

The edge of the hair region of the first hair mask is enhanced to obtain a second hair mask.
The method according to claim 6, wherein said optimizing said first hair mask to obtain a second hair mask comprises:

If the image scene corresponding to the original character image is the target scene, soften the edges of the hair region of the first hair mask to obtain a second hair mask, and the target scene is a scene whose brightness value is lower than The brightness threshold of the scene.
The method according to claim 12, wherein if the image scene corresponding to the original person image is the target scene, before blurring the edge of the hair region of the first hair mask, the The method also includes:

Acquiring the photosensitive value corresponding to the original person image;

If the light-sensing value is greater than the light-sensing threshold, it is determined that the image scene corresponding to the original person image is the target scene.
The method according to claim 6, wherein said optimizing said first hair mask to obtain a second hair mask comprises:

Calculating the background complexity image corresponding to the ROI image;

performing erosion processing on the first hair mask according to the background complexity image;

filling holes in the hair region of the first hair mask after the erosion process;

performing enhancement processing on the edges of the hair region of the filled first hair mask;

If the image scene corresponding to the original person image is the target scene, softening the edge of the hair region of the enhanced first hair mask to obtain a second hair mask;

If the image scene corresponding to the original person image is not the target scene, the enhanced first hair mask is used as the second hair mask.
The method according to any one of claims 6-14, wherein the upsampling and filtering process on the second hair mask to obtain the target hair mask corresponding to the original person image comprises:

Using the grayscale image of the region of interest image as the guide image of the guide filter, performing upsampling filtering on the second hair mask through the guide filter, to obtain the target hair mask corresponding to the original person image membrane.
The method according to claim 15, characterized in that, in the guide image using the grayscale image of the region of interest image as a guide filter, the second hair mask is up-sampled through the guide filter Before the filtering process, the method also includes:

According to the background complexity image corresponding to the region of interest image, the second hair mask is divided into regions to obtain a simple background region and a complex background region, and the simple background region is a complexity lower than or equal to a complexity threshold. a background area, the complex background area is a background area whose complexity is higher than the complexity threshold;

The grayscale image of the region of interest image is used as the guide image of the guide filter, and the second hair mask is subjected to upsampling filtering through the guide filter to obtain the target corresponding to the original person image Hair masks, including:

Using the grayscale image of the region of interest image as the guide image of the guide filter, performing upsampling filter processing on the hair region around the background simple region in the second hair mask through the guide filter, Obtain the first filtering result;

Using a bilinear interpolation algorithm to perform upsampling and filtering on the hair area around the complex background area in the second hair mask to obtain a second filtering result;

The first filtering result and the second filtering result are fused to obtain a target hair mask.
The method according to any one of claims 1-4, 6-14, wherein said generating a first hair mask according to said region of interest image and said region segmentation image comprises:

Inputting the region-of-interest image and the region-segmented image into an image processing model, and processing the region-of-interest image and the region-segmented image through the image processing model to obtain a first hair mask, wherein the The image processing model is obtained by training according to multiple sets of sample training images, and each set of sample training images includes a sample person image, a sample portrait segmentation image corresponding to the sample person image, and a sample hair mask.
The method according to claim 17, wherein the sample hair mask is obtained after corrosion processing on the background complexity image corresponding to the sample character image.
The method according to any one of claims 1-4, 6-14, characterized in that the method further comprises:

The background area of the original person image is blurred according to the target hair mask to obtain the target person image.
An image processing device, characterized in that it comprises:

A preprocessing module, configured to preprocess the original person image to obtain an image of a region of interest of the original person image, and a region segmentation image corresponding to the image of the region of interest, the region segmentation image includes the region of interest Portrait area information of the area image;

A mask generating module, configured to generate a first hair mask according to the region-of-interest image and the region-segmented image;

An optimization module, configured to optimize the first hair mask to obtain a target hair mask corresponding to the original person image.
An electronic device, characterized in that it includes a memory and a processor, wherein a computer program is stored in the memory, and when the computer program is executed by the processor, the processor is made to perform the following steps:

Preprocessing the original person image to obtain a region of interest image of the original person image, and a region segmentation image corresponding to the region of interest image, where the region segmentation image includes portrait region information of the region of interest image ;

generating a first hair mask according to the region-of-interest image and the region-segmented image;

Optimizing the first hair mask to obtain a target hair mask corresponding to the original person image.
The electronic device according to claim 21, wherein the preprocessing is performed on the original person image to obtain an ROI image of the original person image and a region segmentation image corresponding to the ROI image, include:

According to the original person image, and the portrait segmentation image corresponding to the original person image, determine the matting region of interest in the original person image, and the portrait segmentation image is obtained after portrait extraction is performed on the original person image image, the portrait segmentation image includes the portrait area information of the original person image;

The original person image and the segmented portrait image are clipped respectively according to the cutout region of interest to obtain a region of interest image and a region segmentation image corresponding to the region of interest image.
The electronic device according to claim 22, characterized in that, according to the original character image and the segmented portrait image corresponding to the original character image, determining the matting region of interest in the original character image comprises:

Acquiring a hair segmentation image corresponding to the original person image, the hair segmentation image is an image obtained by performing hair segmentation on the original person image, and the hair segmentation image includes hair region information of the original person image;

calculating hair contour lines according to the hair segmentation image and the portrait segmentation image;

A matting region of interest in the original person image is determined according to the hair contour line.
The electronic device according to claim 23, wherein the determining the matting region of interest in the original person image according to the hair contour line comprises:

Determining the face area in the original person image according to the hair contour line;

Obtaining an initial region of interest according to the face region;

Project the hair contour line on the abscissa axis and the ordinate axis of the original character image respectively, to obtain the first projection distribution of the hair contour line on the abscissa axis and the first projected distribution on the ordinate axis. Two-projection distribution;

The initial region of interest is corrected according to the first projection distribution and the second projection distribution to obtain a matted region of interest.
The electronic device according to any one of claims 22-24, characterized in that, when the computer program is executed by the processor, the processor is executed according to the original character image, and with the original character image For the segmented portrait image corresponding to the image, before the step of determining the matting region of interest in the original character image, a step is also performed: if the original character image is a rotated image, the original character image and the The portrait segmentation image corresponding to the original person image is corrected;

The determining the region of interest in matting in the original character image according to the original character image and the segmented portrait image corresponding to the original character image includes:

Determine the corrected matting region of interest according to the corrected original person image and the corrected portrait segmentation image;

The corrected matting region of interest is rotated according to the rotation direction of the uncorrected original person image to obtain the matting region of interest in the uncorrected original person image.
The electronic device according to claim 20, wherein said optimizing the first hair mask to obtain a target hair mask corresponding to the original character image comprises:

Optimizing the first hair mask to obtain a second hair mask;

Perform upsampling and filtering processing on the second hair mask to obtain a target hair mask corresponding to the original person image.
The electronic device according to claim 26, wherein said optimizing the first hair mask to obtain a second hair mask comprises:

Calculating the background complexity image corresponding to the ROI image;

Erosion processing is performed on the first hair mask according to the background complexity image to obtain a second hair mask.
The electronic device according to claim 27, wherein the calculating the background complexity image corresponding to the ROI image comprises:

Acquiring a grayscale image of the image of the region of interest;

performing edge detection on the grayscale image to obtain a first edge image;

removing hair edges in the first edge image according to the first hair mask to obtain a second edge image;

Dilation and blurring are performed on the second edge image to obtain a background complexity image.
The electronic device according to claim 27, wherein the etching process is performed on the first hair mask according to the background complexity image to obtain a second hair mask, comprising:

According to the background complexity image, determine a background complex area in the first hair mask whose complexity is greater than a complexity threshold;

Erosion processing is performed on the hair area around the background complex area in the first hair mask;

The first hair mask before the corrosion treatment is fused with the first hair mask after the corrosion treatment to obtain a second hair mask.
The electronic device according to claim 26, wherein said optimizing the first hair mask to obtain a second hair mask comprises:

Filling holes in the hair region of the first hair mask to obtain a second hair mask.
The electronic device according to claim 26, wherein said optimizing the first hair mask to obtain a second hair mask comprises:

The edge of the hair region of the first hair mask is enhanced to obtain a second hair mask.
The electronic device according to claim 26, wherein said optimizing the first hair mask to obtain a second hair mask comprises:

If the image scene corresponding to the original character image is the target scene, soften the edges of the hair region of the first hair mask to obtain a second hair mask, and the target scene is a scene whose brightness value is lower than The brightness threshold of the scene.
The electronic device according to claim 32, characterized in that, when the computer program is executed by the processor, the processor is executed to execute the said if the image scene corresponding to the original character image is the target scene, then Before the step of blurring the edge of the hair region of the first hair mask, the following steps are also performed:

Acquiring the photosensitive value corresponding to the original person image;

If the light-sensing value is greater than the light-sensing threshold, it is determined that the image scene corresponding to the original person image is the target scene.
The electronic device according to claim 26, wherein said optimizing the first hair mask to obtain a second hair mask comprises:

Calculating the background complexity image corresponding to the ROI image;

performing erosion processing on the first hair mask according to the background complexity image;

filling holes in the hair region of the first hair mask after the erosion process;

performing enhancement processing on the edges of the hair region of the filled first hair mask;

If the image scene corresponding to the original person image is the target scene, softening the edge of the hair region of the enhanced first hair mask to obtain a second hair mask;

If the image scene corresponding to the original person image is not the target scene, the enhanced first hair mask is used as the second hair mask.
The electronic device according to any one of claims 26 to 34, wherein the upsampling and filtering of the second hair mask to obtain the target hair mask corresponding to the original person image includes:

Using the grayscale image of the region of interest image as the guide image of the guide filter, performing upsampling filtering on the second hair mask through the guide filter, to obtain the target hair mask corresponding to the original person image membrane.
The electronic device according to claim 35, wherein when the computer program is executed by the processor, the processor makes the grayscale image of the region-of-interest image as a guide filter when executing the Before the step of performing upsampling and filtering on the second hair mask by the guide filter, a step is also performed: according to the background complexity image corresponding to the region of interest image, the second hair mask is The mask is divided into regions to obtain a simple background region and a complex background region. The simple background region is a background region whose complexity is lower than or equal to the complexity threshold, and the complex background region is a background region whose complexity is higher than the complexity threshold. background area;

The grayscale image of the region of interest image is used as the guide image of the guide filter, and the second hair mask is subjected to upsampling filtering through the guide filter to obtain the target corresponding to the original person image Hair masks, including:

Using the grayscale image of the region of interest image as the guide image of the guide filter, performing upsampling filter processing on the hair region around the background simple region in the second hair mask through the guide filter, Obtain the first filtering result;

Using a bilinear interpolation algorithm to perform upsampling and filtering on the hair area around the complex background area in the second hair mask to obtain a second filtering result;

The first filtering result and the second filtering result are fused to obtain a target hair mask.
The electronic device according to any one of claims 21-24, 26-34, wherein said generating a first hair mask according to said region-of-interest image and said region-segmented image comprises:

Inputting the region-of-interest image and the region-segmented image into an image processing model, and processing the region-of-interest image and the region-segmented image through the image processing model to obtain a first hair mask, wherein the The image processing model is obtained by training according to multiple sets of sample training images, and each set of sample training images includes a sample person image, a sample portrait segmentation image corresponding to the sample person image, and a sample hair mask.
The electronic device according to claim 37, wherein the sample hair mask is obtained by performing corrosion processing on the background complexity image corresponding to the sample character image.
The electronic device according to any one of claims 21-24, 26-34, characterized in that, when the computer program is executed by the processor, it also causes the processor to perform the following steps:

The background area of the original person image is blurred according to the target hair mask to obtain the target person image.
A computer-readable storage medium with a computer program stored thereon, wherein when the computer program is executed by a processor, the processor is made to perform the following steps:

Preprocessing the original person image to obtain a region-of-interest image of the original person image, and a region segmentation image corresponding to the region-of-interest image, where the region segmentation image includes portrait region information of the region-of-interest image ;

generating a first hair mask according to the region-of-interest image and the region-segmented image;

Optimizing the first hair mask to obtain a second hair mask;

Perform upsampling and filtering processing on the second hair mask to obtain a target hair mask corresponding to the original person image.
The computer-readable storage medium according to claim 40, wherein the preprocessing is performed on the original character image to obtain the ROI image of the original character image and the region corresponding to the ROI image Segment images, including:

According to the original person image, and the portrait segmentation image corresponding to the original person image, determine the matting region of interest in the original person image, and the portrait segmentation image is obtained after portrait extraction is performed on the original person image image, the portrait segmentation image includes the portrait area information of the original person image;

The original person image and the segmented portrait image are clipped respectively according to the cutout region of interest to obtain a region of interest image and a region segmentation image corresponding to the region of interest image.
The computer-readable storage medium according to claim 41, wherein, according to the original character image and the segmented portrait image corresponding to the original character image, the matting region of interest in the original character image is determined ,include:

Acquiring a hair segmentation image corresponding to the original person image, the hair segmentation image is an image obtained by performing hair segmentation on the original person image, and the hair segmentation image includes hair region information of the original person image;

calculating hair contour lines according to the hair segmentation image and the portrait segmentation image;

A matting region of interest in the original person image is determined according to the hair contour line.
The computer-readable storage medium according to claim 42, wherein the determining the matting region of interest in the original person image according to the hair contour line comprises:

Determining the face area in the original person image according to the hair contour line;

Obtaining an initial region of interest according to the face region;

Project the hair contour line on the abscissa axis and the ordinate axis of the original character image respectively, to obtain the first projection distribution of the hair contour line on the abscissa axis and the first projected distribution on the ordinate axis. Two-projection distribution;

The initial region of interest is corrected according to the first projection distribution and the second projection distribution to obtain a matted region of interest.
The computer-readable storage medium according to any one of claims 41-43, wherein when the computer program is executed by a processor, the processor executes the For the portrait segmentation image corresponding to the person image, before the step of determining the matting region of interest in the original person image, a step is also performed: if the original person image is a rotated image, the original person image and the corresponding The portrait segmentation image corresponding to the original person image is corrected;

The determining the region of interest in matting in the original person image according to the original person image and the image segmentation image corresponding to the original person image includes:

Determine the corrected matting region of interest according to the corrected original person image and the corrected portrait segmentation image;

According to the rotation direction of the uncorrected original person image, the corrected matting region of interest is rotated to obtain the matting region of interest in the uncorrected original person image.
The computer-readable storage medium according to claim 40, wherein said optimizing the first hair mask to obtain the target hair mask corresponding to the original character image comprises:

Optimizing the first hair mask to obtain a second hair mask;

Perform upsampling and filtering processing on the second hair mask to obtain a target hair mask corresponding to the original person image.
The computer-readable storage medium according to claim 45, wherein said optimizing the first hair mask to obtain a second hair mask comprises:

Calculating the background complexity image corresponding to the ROI image;

Erosion processing is performed on the first hair mask according to the background complexity image to obtain a second hair mask.
The computer-readable storage medium according to claim 46, wherein the calculating the background complexity image corresponding to the ROI image comprises:

Acquiring a grayscale image of the image of the region of interest;

performing edge detection on the grayscale image to obtain a first edge image;

removing hair edges in the first edge image according to the first hair mask to obtain a second edge image;

Dilation and blurring are performed on the second edge image to obtain a background complexity image.
The computer-readable storage medium according to claim 46, wherein the etching process is performed on the first hair mask according to the background complexity image to obtain a second hair mask, comprising:

According to the background complexity image, determine a background complex area in the first hair mask whose complexity is greater than a complexity threshold;

Erosion processing is performed on the hair area around the background complex area in the first hair mask;

The first hair mask before the corrosion treatment is fused with the first hair mask after the corrosion treatment to obtain a second hair mask.
The computer-readable storage medium according to claim 45, wherein said optimizing the first hair mask to obtain a second hair mask comprises:

Filling holes in the hair region of the first hair mask to obtain a second hair mask.
The computer-readable storage medium according to claim 45, wherein said optimizing the first hair mask to obtain a second hair mask comprises:

The edge of the hair region of the first hair mask is enhanced to obtain a second hair mask.
The computer-readable storage medium according to claim 45, wherein said optimizing the first hair mask to obtain a second hair mask comprises:

If the image scene corresponding to the original character image is the target scene, soften the edges of the hair region of the first hair mask to obtain a second hair mask, and the target scene is a scene whose brightness value is lower than The brightness threshold of the scene.
The computer-readable storage medium according to claim 51, wherein when the computer program is executed by the processor, the processor executes the if the image scene corresponding to the original character image is the target scene, Then before the step of blurring the edge of the hair region of the first hair mask, the following steps are also performed:

Acquiring the photosensitive value corresponding to the original person image;

If the light-sensing value is greater than the light-sensing threshold, it is determined that the image scene corresponding to the original person image is the target scene.
The computer-readable storage medium according to claim 45, wherein said optimizing the first hair mask to obtain a second hair mask comprises:

Calculating the background complexity image corresponding to the ROI image;

performing erosion processing on the first hair mask according to the background complexity image;

filling holes in the hair region of the first hair mask after the erosion process;

performing enhancement processing on the edges of the hair region of the filled first hair mask;

If the image scene corresponding to the original person image is the target scene, softening the edge of the hair region of the enhanced first hair mask to obtain a second hair mask;

If the image scene corresponding to the original person image is not the target scene, the enhanced first hair mask is used as the second hair mask.
The computer-readable storage medium according to any one of claims 45-53, wherein the upsampling and filtering process is performed on the second hair mask to obtain the target hair mask corresponding to the original person image, include:

Using the grayscale image of the region of interest image as the guide image of the guide filter, performing upsampling filtering on the second hair mask through the guide filter, to obtain the target hair mask corresponding to the original person image membrane.
The computer-readable storage medium according to claim 54, wherein when the computer program is executed by the processor, the processor is executed to use the grayscale image of the region of interest image as a guide filter Before the step of upsampling and filtering the second hair mask through the guiding filter, the step of: performing the step of: The hair mask is divided into regions to obtain a simple background region and a complex background region, the simple background region is a background region whose complexity is lower than or equal to the complexity threshold, and the complex background region is a background region whose complexity is higher than the complexity threshold the background area of

The grayscale image of the region of interest image is used as the guide image of the guide filter, and the second hair mask is subjected to upsampling filtering through the guide filter to obtain the target corresponding to the original person image Hair masks, including:

Using the grayscale image of the region of interest image as the guide image of the guide filter, performing upsampling filter processing on the hair region around the background simple region in the second hair mask through the guide filter, Obtain the first filtering result;

Using a bilinear interpolation algorithm to perform upsampling and filtering on the hair area around the complex background area in the second hair mask to obtain a second filtering result;

The first filtering result and the second filtering result are fused to obtain a target hair mask.
The computer-readable storage medium according to any one of claims 40-44, 45-53, wherein the generating the first hair mask according to the region-of-interest image and the region-segmented image comprises:

Inputting the region-of-interest image and the region-segmented image into an image processing model, and processing the region-of-interest image and the region-segmented image through the image processing model to obtain a first hair mask, wherein the The image processing model is obtained by training according to multiple sets of sample training images, and each set of sample training images includes a sample person image, a sample portrait segmentation image corresponding to the sample person image, and a sample hair mask.
The computer-readable storage medium according to claim 56, wherein the sample hair mask is obtained by performing corrosion processing on the background complexity image corresponding to the sample character image.
The computer-readable storage medium according to any one of claims 40-44, 45-53, characterized in that, when the computer program is executed by the processor, it also causes the processor to perform the following steps:

The background area of the original person image is blurred according to the target hair mask to obtain the target person image.