CN111524087B - Image processing method and device, storage medium and terminal
- Publication number: CN111524087B
- Application number: CN202010337094.7A
- Authority: CN (China)
- Legal status: Active
Classifications
- G06T5/50—Image enhancement or restoration using two or more images, e.g. averaging or subtraction
- G06T7/11—Region-based segmentation
- G06T7/33—Determination of transform parameters for the alignment of images (image registration) using feature-based methods
- G06T7/55—Depth or shape recovery from multiple images
- G06T2207/10004—Still image; photographic image
- G06T2207/10012—Stereo images
- G06T2207/10028—Range image; depth image; 3D point clouds
- G06T2207/20212—Image combination
- G06T2207/20221—Image fusion; image merging
Abstract
An image processing method and device, a storage medium, and a terminal. The method comprises the following steps: providing a main shot image and an auxiliary shot image, wherein the camera shooting the main shot image and the camera shooting the auxiliary shot image are located on the same device; determining a depth map of the main shot image according to the main shot image and the auxiliary shot image, or according to the main shot image alone, wherein the depth map comprises the depth value of each pixel point in the main shot image; performing image segmentation on the main shot image to obtain a portrait region; determining an average depth value according to at least a part of the portrait region; and processing the main shot image or the auxiliary shot image by adopting the average depth value as the depth value of each pixel point in the portrait region. The invention makes it possible to give the portrait region a more suitable depth value, thereby improving the imaging quality of the portrait region.
Description
Technical Field
The present invention relates to the field of computer technologies, and in particular, to an image processing method and apparatus, a storage medium, and a terminal.
Background
With the development of smart phones, the camera function of mobile phones has become more and more important to users, and handsets have gradually evolved from a single camera to dual, triple, or even more cameras in pursuit of effects a single camera cannot deliver. The large-aperture effect is one of the important applications of mobile phone photography, and the large-aperture effect for portraits receives close attention from a large number of users.
A single-camera phone lacks depth information: it can only distinguish foreground from background, and the region classified as background is blurred uniformly, which produces an unnatural blurring effect. Phones based on dual cameras, structured light, or a Time-of-Flight (TOF) system can therefore obtain depth information through stereoscopic vision algorithms, deep learning methods, and the like. However, the obtained depth information depends heavily on the hardware: if the device is dropped or the lens distortion is large, the accuracy of the depth information is greatly affected. Current stereoscopic methods rely on the stability of the hardware assembly and are mostly prone to depth computation errors in high-frequency and low-texture regions, leading to problems such as a clear lower body with a blurred upper body.
There is a need for an image processing method that can effectively avoid depth errors and inaccuracies.
Disclosure of Invention
The invention solves the technical problem of providing an image processing method and device, a storage medium, and a terminal, which make it possible to give a portrait region a more suitable depth value, thereby improving the imaging quality of the portrait region and effectively avoiding depth errors and inaccuracies.
In order to solve the above technical problems, an embodiment of the present invention provides an image processing method, including the following steps: providing a main shot image and a secondary shot image, wherein a camera for shooting the main shot image and a camera for shooting the secondary shot image are positioned on the same device; determining a depth map of the main shooting image according to the main shooting image and the auxiliary shooting image, or determining the depth map of the main shooting image according to the main shooting image, wherein the depth map comprises depth values of all pixel points in the main shooting image; image segmentation is carried out on the main shot image so as to obtain a portrait area; determining an average depth value according to at least a part of the portrait area; and processing the main shot image or the auxiliary shot image by adopting the average depth value as the depth value of each pixel point in the portrait region.
Optionally, determining the depth map of the primary image includes: and matching the main shooting image and the auxiliary shooting image by adopting a stereo matching algorithm so as to determine a depth map of the main shooting image.
Optionally, performing image segmentation on the main shot image to obtain a portrait region includes: performing face detection with the main shot image as a reference to obtain a face region; and, if a face is detected, performing image segmentation on the person to whom the face belongs to obtain the portrait region.
Optionally, one or more of the following is satisfied: performing face detection by adopting a deep learning method based on face characteristics; and adopting a deep learning method to segment the image of the face.
Optionally, determining the average depth value according to at least a part of the portrait area includes: determining one or more face feature points; taking each face feature point as a center, determining a region in a preset range around the face feature point, and marking the region as a first feature region; determining a superposition area of the first characteristic area and the face area, and marking the superposition area as a first calculation image block; and determining the average depth value according to the depth value of each pixel point in the first calculated image block.
Optionally, the face feature points are selected from: face center points and face key feature points.
Optionally, determining the average depth value according to the depth value of each pixel point in the first calculated image block includes: determining a first weight value of each first calculated image block; carrying out weighted average on the depth values of all pixel points in all the first calculated image blocks to obtain the average depth value; the closer the first calculated image block is to the center point of the face area, the larger the first weight value is.
Optionally, determining the average depth value according to the depth value of each pixel point in the first calculated image block includes: determining a second weight value of each first calculated image block and a third weight value of each pixel point in each first calculated image block; according to the second weight value and the third weight value, carrying out weighted average on the depth value of each pixel point in each first calculated image block so as to obtain the average depth value; the closer the first calculated image block is to the characteristic region, the larger the second weight value is; the closer the position of the pixel point in the first computing image block to which the pixel point belongs is to the center of the first computing image block, the larger the third weight value is.
Optionally, determining the average depth value according to at least a part of the portrait area includes: determining one or more human body feature points; taking each human body characteristic point as a center, determining an area within a preset range around the human body characteristic point, and marking the area as a second characteristic area; determining the superposition area of the second characteristic area and the portrait area, and marking the superposition area as a second calculation image block; and determining the average depth value according to the depth value of each pixel point in the second calculated image block.
Optionally, the human feature points are human skeleton nodes.
Optionally, determining the average depth value according to the depth value of each pixel point in the second calculated image block includes: determining a fourth weight value for each second computed image block; carrying out weighted average on the depth values of all pixel points in each second calculated image block to obtain the average depth value; the closer the second calculated image block is to the center point of the portrait area, the larger the fourth weight value is.
Optionally, determining the average depth value according to the depth value of each pixel point in the second calculated image block includes: determining a fifth weight value of each second calculated image block and a sixth weight value of each pixel point in each second calculated image block; according to the fifth weight value and the sixth weight value, carrying out weighted average on the depth values of all pixel points in each second calculated image block to obtain the average depth value; the closer the second calculated image block is to the center point of the portrait area, the larger the fifth weight value is; the closer the pixel point is located in the second computing image block to the center of the second computing image block, the larger the sixth weight value is.
Optionally, the main shot image further includes a border area and a background area, the border area is an area surrounding the portrait area and having a preset width, and areas other than the portrait area and the border area are background areas; the method further comprises the steps of: and carrying out fusion processing on the boundary area between the portrait area and the background area to determine the fusion depth value of each pixel point in the boundary area.
Optionally, the fusing processing of the boundary area between the portrait area and the background area includes: determining a boundary region adjacent to the feature region; carrying out image morphology processing on a boundary area adjacent to the characteristic area by adopting a first template size; carrying out image morphology processing on the boundary area which is not adjacent to the characteristic area by adopting a second template size; wherein the first template size is greater than the second template size; the characteristic region is selected from one or more of the following: the detected human face area, the human face center area, the human face characteristic point area and the human skeleton node area.
Optionally, the image processing method further includes: and filtering the fusion depth value of each pixel point in the boundary region.
Optionally, a guided filtering method is adopted to filter the fusion depth value.
To solve the above technical problem, an embodiment of the present invention provides an image processing apparatus, including: the image providing module is used for providing a main shooting image and a secondary shooting image, and a camera used for shooting the main shooting image and a camera used for shooting the secondary shooting image are positioned on the same device; a depth map determining module, configured to determine a depth map of the main shot image according to the main shot image and the sub shot image, or determine a depth map of the main shot image according to the main shot image, where the depth map includes depth values of each pixel point in the main shot image; the image segmentation module is used for carrying out image segmentation on the main shot image so as to obtain a portrait area; the average depth value determining module is used for determining an average depth value according to at least one part of the human image area; and the image processing module is used for processing the main shot image or the auxiliary shot image or the main and auxiliary image by adopting the average depth value as the depth value of each pixel point in the portrait area.
To solve the above technical problem, an embodiment of the present invention provides a storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the above image processing method.
In order to solve the above technical problems, an embodiment of the present invention provides a terminal, including a memory and a processor, where the memory stores a computer program capable of running on the processor, and the processor executes steps of the image processing method when running the computer program.
Compared with the prior art, the technical scheme of the embodiment of the invention has the following beneficial effects:
in the embodiments of the invention, the depth map of the main shot image is determined, an average depth value is determined according to at least a part of the portrait region, and the average depth value is adopted as the depth value of each pixel point in the portrait region, so that the portrait region obtains a more suitable depth value, the imaging quality of the portrait region is improved, and depth errors and inaccuracies are effectively avoided.
Further, by taking each face feature point as a center, determining the region within a preset range around it, determining the overlap of that feature region with the face region as a first calculated image block, and determining the average depth value according to the depth values of the pixel points in the first calculated image blocks, the average depth value can be determined from the depth values of the more important pixel points, so that a more suitable depth value is selected for the portrait region, the imaging quality of the portrait region is improved, and depth errors and inaccuracies are avoided.
Further, by taking each human body feature point as a center, determining the region within a preset range around it, determining the overlap of that feature region with the portrait region as a second calculated image block, and determining the average depth value according to the depth values of the pixel points in the second calculated image blocks, the average depth value can likewise be determined from the depth values of the more important pixel points, with the same benefits for imaging quality.
Further, a first weight value is determined for each first calculated image block, and the depth values of the pixel points in all first calculated image blocks are weighted and averaged to obtain the average depth value; a higher first weight value is adopted for the more important calculated image blocks, so that a more suitable depth value is selected for the portrait region, the imaging quality of the portrait region is improved, and depth errors and inaccuracies are avoided.
Further, a second weight value is determined for each first calculated image block together with a third weight value for each pixel point within it, and the depth values of the pixel points in each first calculated image block are weighted and averaged to obtain the average depth value; by setting two weight values, the influence of each calculated image block and of each pixel point is reflected more finely, so that a more suitable depth value is selected and the imaging quality of the portrait region is improved.
Further, a fourth weight value is determined for each second calculated image block, and the depth values of the pixel points in all second calculated image blocks are weighted and averaged to obtain the average depth value; a higher fourth weight value is adopted for the more important calculated image blocks, so that a more suitable depth value is selected for the portrait region, the imaging quality of the portrait region is improved, and depth errors and inaccuracies are avoided.
Further, a fifth weight value is determined for each second calculated image block together with a sixth weight value for each pixel point within it, and the depth values of the pixel points in each second calculated image block are weighted and averaged to obtain the average depth value; setting two weight values again reflects the influence of each calculated image block and of each pixel point more finely, so that a more suitable depth value is selected and the imaging quality of the portrait region is improved.
Further, fusion processing is performed on the boundary region between the portrait region and the background region to determine a fusion depth value for each pixel point in the boundary region, which helps correct erroneous regions and improves the depth distribution of the portrait and background regions.
Further, filtering the fusion depth values of the pixel points in the boundary region alleviates the problems of non-uniform and locally erroneous portrait depth information in the computed depth map, improving image quality.
Drawings
FIG. 1 is a schematic view of a working scene of a primary camera and secondary camera combination in the prior art;
FIG. 2 is a flow chart of an image processing method in an embodiment of the present application;
FIG. 3 is a flow chart of a method embodying step S24 of FIG. 2;
FIG. 4 is a flow chart of one implementation method of step S34 in FIG. 3;
FIG. 5 is a flow chart of another implementation method of step S34 in FIG. 3;
fig. 6 is a schematic structural diagram of an image processing apparatus in the embodiment of the present application.
Detailed Description
As described above, camera technology plays an ever-growing role in people's daily lives, and the large-aperture effect is one of the functions most commonly used by photography enthusiasts and mobile phone photographers. Realizing an optical large-aperture effect with dedicated camera hardware yields a sharp subject against a defocused background, but it inevitably increases the camera's size, makes it inconvenient to carry, and greatly raises cost. A smart phone, by contrast, is small, convenient, and quick to shoot with, and its photographic capability has become increasingly important; equipped with a dual-camera module, it can obtain an object's depth-of-field information on the bionic principle of binocular vision and thereby realize the large-aperture effect, which has become one of the important functions of current mobile phone photography, also known as defocused imaging, depth-of-field imaging, or bokeh imaging. Dual-camera imaging is an important research topic in the fields of image processing and computer vision and can be widely applied to mobile phone dual cameras, robot navigation, and other fields.
The depth-of-field image generated by a smart phone's dual cameras simulates the large-aperture depth-of-field effect: the main subject is highlighted while the background is blurred softly and pleasingly with natural color transitions, which can improve the competitiveness of dual-camera phones in the market.
Referring to fig. 1, fig. 1 is a schematic view of a working scene of a combination of a main camera and a sub camera in the prior art.
Specifically, the main camera 11 and the sub camera 12 may be located on the same device 10, and the main image 13 and the sub image 14 are photographed respectively.
Further, the combination of the main camera 11 and the sub camera 12 as shown in fig. 1 may be selected from:
a. a double shot of a full-color (Red-Green-Blue, RGB) main camera and an RGB sub-camera combination;
b. a double shot of an RGB primary camera and a black and white (MONO) secondary camera combination;
c. a combination of a telephoto main camera and a wide-angle secondary camera;
d. a combination of a wide-angle main camera and an ultra-wide-angle secondary camera.
Taking combination a as an example, the field of view (FOV) of the secondary camera is generally larger than that of the main camera, and in view of hardware cost the secondary camera's hardware specifications are lower than the main camera's. Because different cameras have different hardware parameters (intrinsics such as optical center, focal length, FOV, and distortion) and the modules are mounted and arranged differently (baseline, relative angle, position, and so on), the main and secondary cameras on the same module will inevitably produce images with different FOVs, different relative positions, and different occlusions when shooting the same object.
In the existing simple, low-cost large-aperture generation methods, the main and auxiliary shot images can be captured synchronously and their parallax information used for depth calculation, but this relies on the quality of the dual-camera calibration and is prone to problems such as locally erroneous depth information and matching errors in weak-texture or repeated-texture regions. When shooting a portrait, blurring of the upper body readily appears.
Specifically, before the depth of field is calculated, the handset must be calibrated to obtain the intrinsic, extrinsic, and distortion parameters of the main and auxiliary cameras, and the result depends heavily on calibration accuracy and hardware stability. When the hardware characteristics of the main and auxiliary cameras are inconsistent, an inaccurate calibration algorithm or a dropped phone module leaves the main and auxiliary images geometrically misaligned. Moreover, in practice, because the two cameras occupy different positions and the photographed scenes vary widely, and because parallax matching depends on feature-point matching, relying on parallax calculation alone can leave parts of a region inaccurate, make the computed depth of field of a person's upper and lower body inconsistent, introduce errors on body parts, and fail to obtain correct depth information in the presence of repeated texture, weak texture, or occlusion.
In another existing, more advanced large-aperture method, artificial intelligence (AI) deep learning technology can be adopted to extract the position of a person or to apply targeted correction to local information, such as skin-color segmentation, pose segmentation, and the like, with the aim of obtaining a correct, sharp subject and gradual background blurring. Such processing reduces the dependence on hardware calibration but places high demands on processing performance.
The inventors found that in the prior art, because the main and auxiliary images are not geometrically aligned, relying on parallax calculation alone makes parts of a region inaccurate and the computed depth of field of a person's upper and lower body inconsistent, producing stereo-matching parallax errors. If the main and auxiliary images can be made consistent in the depth-of-field calculation, these errors can be reduced.
In the embodiments of the invention, the depth map of the main shot image is determined, an average depth value is determined according to at least a part of the portrait region, and the average depth value is adopted as the depth value of each pixel point in the portrait region, so that the portrait region obtains a more suitable depth value, the imaging quality of the portrait region is improved, and depth errors and inaccuracies are effectively avoided.
In order to make the above objects, features and advantages of the present invention more comprehensible, embodiments accompanied with figures are described in detail below.
Referring to fig. 2, fig. 2 is a flowchart of an image processing method in an embodiment of the present application. The image processing method may include steps S21 to S25, summarized in the code sketch after the list:
step S21: providing a main shot image and a secondary shot image, wherein a camera for shooting the main shot image and a camera for shooting the secondary shot image are positioned on the same device;
step S22: determining a depth map of the main shooting image according to the main shooting image and the auxiliary shooting image, or determining the depth map of the main shooting image according to the main shooting image, wherein the depth map comprises depth values of all pixel points in the main shooting image;
step S23: image segmentation is carried out on the main shot image so as to obtain a portrait area;
step S24: determining an average depth value according to at least a part of the portrait area;
step S25: and processing the main shot image or the auxiliary shot image by adopting the average depth value as the depth value of each pixel point in the portrait region.
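Before examining each step, the overall flow of steps S21 to S25 can be expressed as a minimal Python sketch. This is only an illustration of the sequence described above; the helper callables (compute_depth_map, segment_portrait, average_portrait_depth) are hypothetical placeholders, not functions defined by this application.

```python
import numpy as np

def process_portrait(main_img, sub_img, compute_depth_map,
                     segment_portrait, average_portrait_depth):
    # S22: depth map of the main shot image, from the main/sub pair
    # (or from the main image alone, per the alternative in step S22)
    depth = compute_depth_map(main_img, sub_img)   # HxW float array
    # S23: boolean portrait mask from image segmentation of the main image
    mask = segment_portrait(main_img)              # HxW bool array
    # S24: average depth over at least a part of the portrait region
    avg = average_portrait_depth(depth, mask)
    # S25: adopt the average as the depth of every pixel in the portrait region
    depth[mask] = avg
    return depth
```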
In the implementation of step S21, a main camera and a sub camera are provided, which may be located on the same device, as shown in fig. 1.
It will be appreciated that the primary and secondary cameras may have the same or similar orientation to obtain images with similar angles of view.
Further, the embodiment of the invention can further comprise the step of calibrating the main shot image and the auxiliary shot image so as to geometrically correct the main shot image and the auxiliary shot image.
In the embodiment of the invention, the specific calibration method and steps are not limited.
In the implementation of step S22, a depth map of the main shot image may be determined according to the main shot image and the sub shot image, or a depth map of the main shot image may be determined according to the main shot image.
The depth map may be a depth image of the scene obtained manually or automatically, in which each pixel's value represents the distance (depth) of that pixel. Methods for obtaining disparity/depth map information include, but are not limited to: manual annotation, annotation with image tools, and depth maps computed from the hardware system's characteristics by image processing, machine learning, or deep learning algorithms. The depth map and the disparity map are related approximately by inverse proportion.
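For a rectified stereo pair, this inverse relationship is the standard triangulation relation (a textbook identity, not a formula stated in this application): with focal length $f$, baseline $B$, and disparity $d$, the depth $Z$ of a pixel is

$$Z = \frac{f \cdot B}{d},$$

so large disparities correspond to near objects and small disparities to far ones.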
As can be seen from the above, in another specific embodiment of step S22, the disparity map of the main shot image may be determined according to the main shot image and the sub shot image, or the disparity map of the main shot image may be determined according to the main shot image.
Further, the step of determining a depth map of the primary image may include: and matching the main shooting image and the auxiliary shooting image by adopting a stereo matching algorithm so as to determine a depth map of the main shooting image.
Specifically, a stereo matching algorithm may be used to find, for each pixel point of one image, the corresponding pixel point in the image from the other viewpoint, compute the disparity map, and from it estimate the depth map of the main shot image.
Still further, the stereo matching algorithm may be selected from: segmentation-based algorithms (e.g., color-segment based) and the semi-global matching (SGM) algorithm, to improve the accuracy of the determination.
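As one concrete possibility, the semi-global approach is exposed in OpenCV as StereoSGBM. The sketch below is illustrative only; file names and parameter values are assumptions, and the patent does not prescribe this particular implementation.

```python
import cv2

# Rectified grayscale views from the main and sub cameras (assumed file names).
main_gray = cv2.imread("main.png", cv2.IMREAD_GRAYSCALE)
sub_gray = cv2.imread("sub.png", cv2.IMREAD_GRAYSCALE)

block = 5
stereo = cv2.StereoSGBM_create(
    minDisparity=0,
    numDisparities=64,      # search range; must be divisible by 16
    blockSize=block,
    P1=8 * block * block,   # penalty for small disparity changes
    P2=32 * block * block,  # penalty for large disparity changes
)
# compute() returns fixed-point disparities scaled by 16
disparity = stereo.compute(main_gray, sub_gray).astype("float32") / 16.0
```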
In the implementation of step S23, a portrait area in the main shot image is determined.
Further, the step of performing image segmentation on the main shot image to obtain a portrait region may include: performing face detection with the main shot image as a reference; and, if a face is detected, performing image segmentation on the person to whom the face belongs to obtain the portrait region.
Still further, the image processing method may satisfy one or more of the following: performing face detection by adopting a deep learning method based on face characteristics; and adopting a deep learning method to segment the image of the face.
In the embodiment of the invention, by adopting a proper deep learning method, the accuracy of face detection and image segmentation can be improved, and the quality of subsequent image processing can be improved.
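As an off-the-shelf illustration of deep-learning portrait segmentation, one possible tool (not the method of this application) is MediaPipe's selfie-segmentation solution; the legacy API shown here and the file name are assumptions about the environment.

```python
import cv2
import mediapipe as mp

img_bgr = cv2.imread("main.png")                   # assumed file name
img_rgb = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2RGB)

# model_selection=0 selects the general-purpose segmentation model
with mp.solutions.selfie_segmentation.SelfieSegmentation(model_selection=0) as seg:
    result = seg.process(img_rgb)

portrait_mask = result.segmentation_mask > 0.5     # boolean portrait region
```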
In a specific implementation of step S24, an average depth value may be determined according to the face region.
Referring to fig. 3, fig. 3 is a flowchart of a specific implementation method of step S24 in fig. 2. The step of determining the average depth value according to at least a part of the portrait area may include steps S31 to S34, each of which will be described below.
In step S31, one or more face feature points are determined.
The face feature points can be used to represent the points that matter most to the quality of the portrait, and determining the depth value of the main shot image according to them is more conducive to improving image quality.
Further, the face feature points may be selected from: the face center point and the face key feature points, wherein the face key feature points can be eyes, ears, mouths, noses, eyebrows and the like.
In step S32, a region within a preset range around each face feature point is determined with each face feature point as a center, and is recorded as a first feature region.
Specifically, a region of a preset length and width around the face feature points may be determined to obtain a rectangular first feature region, and a region of a preset radius around the face feature points may be determined to obtain a circular first feature region.
In step S33, a region where the first feature region overlaps the face region is determined and denoted as a first calculation image block.
Specifically, the first calculated image block is located in the face region, and the influence of the background is removed, which is more helpful to improve the quality of the image.
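A minimal NumPy sketch of steps S32 and S33 under these definitions (shapes, sizes, and coordinates are illustrative): a rectangular or circular first feature region is built as a mask around a feature point, and the first calculated image block is its intersection with the face-region mask.

```python
import numpy as np

def rect_region(shape, cx, cy, half_w, half_h):
    """Rectangular region of preset width/height centered on (cx, cy)."""
    mask = np.zeros(shape, dtype=bool)
    mask[max(cy - half_h, 0):cy + half_h + 1,
         max(cx - half_w, 0):cx + half_w + 1] = True
    return mask

def circle_region(shape, cx, cy, radius):
    """Circular region of preset radius centered on (cx, cy)."""
    yy, xx = np.ogrid[:shape[0], :shape[1]]
    return (xx - cx) ** 2 + (yy - cy) ** 2 <= radius ** 2

# S33: first calculated image block = feature region AND face region.
# face_mask stands in for the boolean face-region mask from segmentation.
face_mask = np.zeros((480, 640), dtype=bool)  # placeholder for illustration
block_mask = rect_region(face_mask.shape, cx=320, cy=240,
                         half_w=20, half_h=20) & face_mask
```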
In step S34, the average depth value is determined according to the depth value of each pixel point in the first calculated image block.
Referring to fig. 4, fig. 4 is a flowchart of a specific implementation method of step S34 in fig. 3. The step of determining the average depth value according to the depth value of each pixel point in the first calculated image block may include steps S41 to S42, which will be described below.
In step S41, a first weight value of each first calculated image block is determined.
Further, the first weight value may be affected by the distance between the first calculated image block and the center point of the face region: the closer the first calculated image block is to the center point, the larger the first weight value may be.
In step S42, the depth values of the pixels in each first computed image block are weighted and averaged to obtain the average depth value.
Specifically, the first weight value of each first calculated image block may be set to w1, and the average depth value is determined using the following formula:

$$D = \frac{\sum_{j=1}^{J}\sum_{i=1}^{I_j} w1_{j}\, d_{i,j}}{\sum_{j=1}^{J}\sum_{i=1}^{I_j} w1_{j}}$$

where D denotes the average depth value, j indexes the J first calculated image blocks, i indexes the I_j pixel points of the j-th first calculated image block, d_{i,j} denotes the depth value of the i-th pixel point in the j-th first calculated image block, and w1_j denotes the first weight value of the j-th first calculated image block.
In the embodiment of the invention, a region within a preset range around each face feature point is determined with the feature point as its center, the overlap of that region with the face region is then taken as a first calculated image block, and the average depth value is determined according to the depth values of the pixel points in the first calculated image blocks. The average depth value can thus be determined from the depth values of the more important pixel points, so that a more suitable depth value is selected for the portrait region, the imaging quality of the portrait region is improved, and depth errors and inaccuracies are avoided.
Referring to fig. 5, fig. 5 is a flowchart of another implementation method of step S34 in fig. 3. The step of determining the average depth value according to the depth value of each pixel point in the first calculated image block may include steps S51 to S52, which will be described below.
In step S51, a second weight value of each first calculated image block and a third weight value of each pixel point within each image block are determined.
Further, the second weight value is affected by a distance between the first calculated image block and the feature region, and the second weight value may be larger as the first calculated image block is closer to the feature region.
The third weight value is affected by the distance between the position of the pixel point in the first computing image block and the center of the first computing image block, and the closer the position of the pixel point in the first computing image block is to the center of the first computing image block, the larger the third weight value is.
In step S52, according to the second weight value and the third weight value, the depth values of the pixel points in each first calculated image block are weighted and averaged to obtain the average depth value.
Specifically, the second weight value of each first calculated image block may be set to w2 and the third weight value of each pixel point to w3, and the average depth value is determined using the following formula:

$$D = \frac{\sum_{j=1}^{J}\sum_{i=1}^{I_j} w2_{j}\, w3_{i,j}\, d_{i,j}}{\sum_{j=1}^{J}\sum_{i=1}^{I_j} w2_{j}\, w3_{i,j}}$$

where D denotes the average depth value, j indexes the J first calculated image blocks, i indexes the I_j pixel points of the j-th first calculated image block, d_{i,j} denotes the depth value of the i-th pixel point in the j-th first calculated image block, w2_j denotes the second weight value of the j-th first calculated image block, and w3_{i,j} denotes the third weight value of the i-th pixel point in the j-th first calculated image block.
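A minimal NumPy sketch of this two-level weighted average (variable names and the example values are illustrative); with all per-pixel weights set to 1 it reduces to the single-weight formula of step S42:

```python
import numpy as np

def weighted_average_depth(blocks):
    """blocks: one (depths, w_block, w_pixels) tuple per calculated image
    block, where depths and w_pixels are 1-D arrays of equal length."""
    num = sum(w_block * np.sum(w_pix * d) for d, w_block, w_pix in blocks)
    den = sum(w_block * np.sum(w_pix) for d, w_block, w_pix in blocks)
    return num / den

# Example: two blocks with block weights 2.0 and 1.0, uniform pixel weights.
blocks = [
    (np.array([1.2, 1.3, 1.1]), 2.0, np.ones(3)),
    (np.array([1.8, 1.7]), 1.0, np.ones(2)),
]
avg = weighted_average_depth(blocks)
```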
In the embodiment of the invention, a second weight value of each first calculated image block and a third weight value of each pixel point in each image block are determined; and carrying out weighted average on the depth values of all the pixel points in each first calculated image block to obtain the average depth value, and more finely reflecting the influence of each first calculated image block and each pixel point by setting two weight values, so that more suitable depth values are further selected and used, and the imaging quality of a portrait region is improved.
With continued reference to fig. 2, in an implementation of step S24, an average depth value may also be determined from the portrait area. The portrait area may include a face area, and may further include other portrait areas other than the face area.
Further, determining an average depth value from at least a portion of the portrait area includes: determining one or more human body feature points; taking each human body characteristic point as a center, determining an area within a preset range around the human body characteristic point, and marking the area as a second characteristic area; determining the superposition area of the second characteristic area and the portrait area, and marking the superposition area as a second calculation image block; and determining the average depth value according to the depth value of each pixel point in the second calculated image block.
The human body feature points can be used to represent the points that matter most to the quality of the portrait, and determining the depth value of the main shot image according to them is more conducive to improving image quality.
Further, the human body feature points may be human body skeletal nodes, such as finger joints, wrists, elbows, leg joints, ankles, toe joints, and the like.
Specifically, the area of the preset length and width around the human body feature point may be determined to obtain a rectangular second feature area, and the area of the preset radius around the human body feature point may be determined to obtain a circular second feature area.
In the embodiment of the invention, the second calculated image block is positioned in the portrait area, and the influence of the background is removed, thereby being more beneficial to improving the quality of the image.
In a specific implementation manner of the embodiment of the present invention, determining the average depth value according to the depth value of each pixel point in the second calculated image block includes: determining a fourth weight value for each second calculated image block; and carrying out a weighted average of the depth values of the pixel points in each second calculated image block to obtain the average depth value; the closer the second calculated image block is to the center point of the portrait area, the larger the fourth weight value is.
Specifically, the fourth weight value of each second calculated image block may be set to w4, and the average depth value is determined using the following formula:

$$D = \frac{\sum_{j=1}^{J}\sum_{i=1}^{I_j} w4_{j}\, d_{i,j}}{\sum_{j=1}^{J}\sum_{i=1}^{I_j} w4_{j}}$$

where D denotes the average depth value, j indexes the J second calculated image blocks, i indexes the I_j pixel points of the j-th second calculated image block, d_{i,j} denotes the depth value of the i-th pixel point in the j-th second calculated image block, and w4_j denotes the fourth weight value of the j-th second calculated image block.
Further, the fourth weight value may be affected by the distance between the second calculated image block and the center point of the portrait region: the closer the second calculated image block is to the center point, the larger the fourth weight value may be.
In the embodiment of the invention, a region within a preset range around each human body feature point is determined with the feature point as its center, the overlap of that region with the portrait region is then taken as a second calculated image block, and the average depth value is determined according to the depth values of the pixel points in the second calculated image blocks. The average depth value can thus be determined from the depth values of the more important pixel points, so that a more suitable depth value is selected for the portrait region, the imaging quality of the portrait region is improved, and depth errors and inaccuracies are avoided.
In another specific implementation of the embodiment of the present invention, determining the average depth value according to the depth value of each pixel point in the second calculated image block includes: determining a fifth weight value of each second calculated image block and a sixth weight value of each pixel point in each second calculated image block; according to the fifth weight value and the sixth weight value, carrying out weighted average on the depth values of all pixel points in each second calculated image block to obtain the average depth value; the closer the second calculated image block is to the center point of the portrait area, the larger the fifth weight value is; the closer the pixel point is located in the second computing image block to the center of the second computing image block, the larger the sixth weight value is.
Further, the fifth weight value may be affected by the distance between the second calculated image block and the center point of the portrait region: the closer the second calculated image block is to the center point, the larger the fifth weight value may be.
The sixth weight value is affected by the distance between the position of the pixel point in the second computing image block and the center of the second computing image block, and the closer the position of the pixel point in the second computing image block is to the center of the second computing image block, the larger the sixth weight value is.
Specifically, the fifth weight value of each second calculated image block may be set to w5 and the sixth weight value of each pixel point to w6, and the average depth value is determined using the following formula:

$$D = \frac{\sum_{j=1}^{J}\sum_{i=1}^{I_j} w5_{j}\, w6_{i,j}\, d_{i,j}}{\sum_{j=1}^{J}\sum_{i=1}^{I_j} w5_{j}\, w6_{i,j}}$$

where D denotes the average depth value, j indexes the J second calculated image blocks, i indexes the I_j pixel points of the j-th second calculated image block, d_{i,j} denotes the depth value of the i-th pixel point in the j-th second calculated image block, w5_j denotes the fifth weight value of the j-th second calculated image block, and w6_{i,j} denotes the sixth weight value of the i-th pixel point in the j-th second calculated image block.
In the embodiment of the invention, a fifth weight value of each second calculated image block and a sixth weight value of each pixel point in each image block are determined; and carrying out weighted average on the depth values of all the pixel points in each second calculation image block to obtain the average depth value, and more finely reflecting the influence of each second calculation image block and each pixel point by setting two weight values, thereby further selecting more suitable depth values and improving the imaging quality of a portrait region.
With continued reference to fig. 2, in the implementation of step S25, the average depth value may be used as a depth value of each pixel point in the portrait area, to process the main or sub-photographic image.
Specifically, for the portrait region obtained by segmentation, the depth value of each pixel point may be replaced, pixel by pixel, with the average depth value of the region.
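In array terms this replacement is a single masked assignment; a sketch assuming depth is the float depth map, mask the boolean portrait mask, and avg_depth the result of step S24:

```python
# every portrait pixel takes the region's average depth
depth[mask] = avg_depth
```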
In the embodiments of the invention, the depth map of the main shot image is determined, an average depth value is determined according to at least a part of the portrait region, and the average depth value is adopted as the depth value of each pixel point in the portrait region, so that the portrait region obtains a more suitable depth value, the imaging quality of the portrait region is improved, and depth errors and inaccuracies are effectively avoided.
Further, the main image further includes a border area and a background area, the border area being an area surrounding the portrait area and having a preset width, and an area other than the portrait area and the border area being a background area; the method may further comprise: and carrying out fusion processing on the boundary area between the portrait area and the background area to determine the fusion depth value of each pixel point in the boundary area.
In the embodiment of the invention, fusion processing is performed on the boundary region between the portrait region and the background region to determine the fusion depth value of each pixel point in the boundary region, which helps correct erroneous regions and improves the depth distribution of the portrait and background regions.
Still further, the step of performing the fusion process on the boundary region between the portrait region and the background region may include: determining a boundary region adjacent to the feature region; carrying out image morphology processing on a boundary area adjacent to the characteristic area by adopting a first template size; carrying out image morphology processing on the boundary area which is not adjacent to the characteristic area by adopting a second template size; wherein the first template size is greater than the second template size; the characteristic region is selected from one or more of the following: the detected human face area, the human face center area, the human face characteristic point area and the human skeleton node area.
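A minimal OpenCV sketch of this size-dependent morphological processing (the kernel sizes and image size are illustrative assumptions; the application does not fix concrete template sizes):

```python
import cv2
import numpy as np

h, w = 480, 640                           # illustrative image size
border_near = np.zeros((h, w), np.uint8)  # boundary band adjacent to a feature region
border_far = np.zeros((h, w), np.uint8)   # boundary band away from feature regions

k_first = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (15, 15))  # first (larger) template
k_second = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (7, 7))   # second (smaller) template

near_fused = cv2.morphologyEx(border_near, cv2.MORPH_CLOSE, k_first)
far_fused = cv2.morphologyEx(border_far, cv2.MORPH_CLOSE, k_second)
```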
In the embodiment of the invention, by using a larger template size for boundary regions adjacent to a feature region, the portrait region and the depth map can be fused with the morphological method, which better corrects erroneous regions, improves the depth distribution of the human body and background regions more markedly in those areas, and improves image quality.
In another specific implementation manner of the embodiment of the present invention, a region belonging to a human head in the portrait region may be further determined and marked as a head region, and image morphology processing may be performed on a boundary region adjacent to the head region by using a third template size, where the third template size is larger than the first template size.
In the embodiment of the invention, by using a larger template size for the boundary region adjacent to the head region, the portrait region and the depth map can be fused with the morphological method, which better corrects erroneous regions, improves the depth distribution of the human body and background regions more markedly in that area, and improves image quality.
Still further, the head region includes the hair region.
In the embodiment of the invention, by using larger template sizes for the boundary regions adjacent to the hair-containing area, the portrait region and the depth map can be fused with the morphological method, which better corrects erroneous regions, improves the depth distribution of the human body and background regions more markedly near the hair, and improves the imaging of hair.
Further, the image processing method may further include: and filtering the fusion depth value of each pixel point in the boundary region.
Furthermore, a guided filtering method can be adopted to filter the fusion depth values.
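A minimal sketch of guided filtering with OpenCV's ximgproc module (availability of opencv-contrib is an assumption about the environment, and the radius and eps values are illustrative):

```python
import cv2

def smooth_boundary_depth(main_img, fused_depth):
    """Edge-preserving smoothing of the fused depth, guided by the main image."""
    return cv2.ximgproc.guidedFilter(guide=main_img, src=fused_depth,
                                     radius=8, eps=1e-2)
```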
In the embodiment of the invention, filtering the fusion depth values of the pixel points in the boundary region alleviates the problems of non-uniform and locally erroneous portrait depth information in the computed depth map, improving image quality.
Referring to fig. 6, fig. 6 is a schematic structural diagram of an image processing apparatus in an embodiment of the present application. The image processing apparatus may include:
an image providing module 61 for providing a main shot image and a sub shot image, wherein a camera for shooting the main shot image and a camera for shooting the sub shot image are positioned on the same device;
a depth map determining module 62, configured to determine a depth map of the main shot image according to the main shot image and the sub shot image, or determine a depth map of the main shot image according to the main shot image, where the depth map includes depth values of each pixel point in the main shot image;
An image segmentation module 63, configured to perform image segmentation on the main shot image to obtain a portrait area;
an average depth value determining module 64, configured to determine an average depth value according to at least a part of the image areas;
the image processing module 65 is configured to process the main image or the sub-image or the main and sub-images by using the average depth value as a depth value of each pixel point in the portrait area.
Regarding the principle, implementation and advantageous effects of the image processing apparatus, please refer to the foregoing and the related descriptions of the image processing method shown in fig. 2 to 5, which are not repeated herein.
The embodiment of the invention also provides a storage medium, on which a computer program is stored, which, when being executed by a processor, performs the steps of the above method. The storage medium may be a computer readable storage medium, and may include, for example, a non-volatile memory (non-volatile) or a non-transitory memory (non-transitory) and may also include an optical disc, a mechanical hard disc, a solid state hard disc, and the like.
The embodiment of the invention also provides a terminal which comprises a memory and a processor, wherein the memory stores a computer program capable of running on the processor, and the processor executes the steps of the method when running the computer program. The terminal comprises, but is not limited to, a mobile phone, a computer, a tablet personal computer and other terminal equipment.
Although the present invention is disclosed above, it is not limited thereto. Various changes and modifications may be made by those skilled in the art without departing from the spirit and scope of the invention, and the scope of the invention should therefore be determined by the appended claims.
Claims (19)
1. An image processing method, characterized by comprising the steps of:
providing a main shot image and a secondary shot image, wherein a camera for shooting the main shot image and a camera for shooting the secondary shot image are positioned on the same device;
determining a depth map of the main shooting image according to the main shooting image and the auxiliary shooting image, or determining the depth map of the main shooting image according to the main shooting image, wherein the depth map comprises depth values of all pixel points in the main shooting image;
image segmentation is carried out on the main shot image so as to obtain a portrait area;
determining an average depth value according to an overlap region of at least a part of the portrait area with the face area or with the portrait area;
and processing the main shot image or the auxiliary shot image by adopting the average depth value as the depth value of each pixel point in the portrait region.
2. The image processing method according to claim 1, wherein determining the depth map of the main shot image comprises:
and matching the main shooting image and the auxiliary shooting image by adopting a stereo matching algorithm so as to determine a depth map of the main shooting image.
3. The image processing method according to claim 1, wherein performing image segmentation on the main shot image to obtain a portrait area includes:
performing face detection by taking the main shot image as a reference to obtain the face region;
if the face is detected, image segmentation is carried out on the person to whom the face belongs so as to obtain the portrait region.
4. A method of image processing according to claim 3, wherein one or more of the following is satisfied: performing face detection by adopting a deep learning method based on face characteristics;
and adopting a deep learning method to segment the image of the face.
5. The image processing method according to claim 3, wherein determining an average depth value according to an overlapping area of at least a part of the portrait area and the face area comprises:
determining one or more face feature points;
taking each face feature point as a center and determining an area within a preset range around it, denoted a first feature area;
determining the overlapping area of each first feature area and the face area, denoted a first calculated image block;
and determining the average depth value according to the depth values of the pixel points in the first calculated image blocks.
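A minimal sketch of claim 5, assuming square windows of half-size `r` around the feature points (the window shape, `r`, and the point list are assumptions, not from the patent):

```python
# Illustrative sketch of claim 5: windows around face feature points are
# intersected with the face mask, and depth is averaged over the overlap.
import numpy as np

def avg_depth_from_face_points(depth_map, face_mask, feature_points, r=8):
    h, w = face_mask.shape
    block_mask = np.zeros_like(face_mask)
    for (x, y) in feature_points:         # e.g., eye corners, nose tip
        block_mask[max(y - r, 0):min(y + r, h), max(x - r, 0):min(x + r, w)] = 1
    # Overlap of first feature areas and face area = "first calculated image blocks".
    overlap = (block_mask > 0) & (face_mask > 0)
    return float(depth_map[overlap].mean())
```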
6. The image processing method according to claim 5, wherein the face feature points are selected from: a face center point and face key feature points.
7. The image processing method according to claim 5, wherein determining the average depth value according to the depth values of the pixel points in the first calculated image blocks comprises:
determining a first weight value for each first calculated image block;
and performing a weighted average on the depth values of all pixel points in all the first calculated image blocks to obtain the average depth value;
wherein the closer a first calculated image block is to the center point of the face area, the larger its first weight value.
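A sketch of claim 7's weighting rule follows. The claim only fixes the monotonic relationship between distance and weight; the exponential falloff and `sigma` below are assumptions:

```python
# Illustrative sketch of claim 7: a block's weight decays with its distance
# to the face-area center; the weighted mean over all block pixels gives
# the average depth.
import numpy as np

def weighted_avg_depth(blocks, face_center, sigma=50.0):
    """blocks: list of (block_center_xy, depth_values) per first calculated image block."""
    num = den = 0.0
    for (cx, cy), depths in blocks:
        dist = np.hypot(cx - face_center[0], cy - face_center[1])
        w = np.exp(-dist / sigma)   # closer block -> larger first weight value
        num += w * depths.sum()
        den += w * depths.size
    return num / den
```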
8. The image processing method according to claim 5, wherein determining the average depth value according to the depth values of the pixel points in the first calculated image blocks comprises:
determining a second weight value for each first calculated image block and a third weight value for each pixel point in each first calculated image block;
and performing a weighted average on the depth values of the pixel points in the first calculated image blocks according to the second weight values and the third weight values to obtain the average depth value;
wherein the closer a first calculated image block is to the feature area, the larger its second weight value; and the closer a pixel point is to the center of the first calculated image block to which it belongs, the larger its third weight value.
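Read as a formula (this rendering is an interpretation of the claim, not notation from the patent), the two-level weighting of claim 8 reduces to a single normalized sum:

$$\bar d=\frac{\sum_k \sum_{i\in B_k} w_k\,v_{k,i}\,d_{k,i}}{\sum_k \sum_{i\in B_k} w_k\,v_{k,i}}$$

where $B_k$ is the $k$-th first calculated image block, $w_k$ its second weight value, $v_{k,i}$ the third weight value of pixel $i$ in $B_k$, and $d_{k,i}$ that pixel's depth value.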
9. The image processing method according to claim 1, wherein determining an average depth value according to the portrait area comprises:
determining one or more human body feature points;
taking each human body feature point as a center and determining an area within a preset range around it, denoted a second feature area;
determining the overlapping area of each second feature area and the portrait area, denoted a second calculated image block;
and determining the average depth value according to the depth values of the pixel points in the second calculated image blocks.
10. The image processing method according to claim 9, wherein the human body feature points are human skeleton nodes.
11. The image processing method according to claim 9, wherein determining the average depth value according to the depth values of the pixel points in the second calculated image blocks comprises:
determining a fourth weight value for each second calculated image block;
and performing a weighted average on the depth values of all pixel points in the second calculated image blocks to obtain the average depth value;
wherein the closer a second calculated image block is to the center point of the portrait area, the larger its fourth weight value.
12. The image processing method according to claim 9, wherein determining the average depth value according to the depth values of the pixel points in the second calculated image blocks comprises:
determining a fifth weight value for each second calculated image block and a sixth weight value for each pixel point in each second calculated image block;
and performing a weighted average on the depth values of the pixel points in the second calculated image blocks according to the fifth weight values and the sixth weight values to obtain the average depth value;
wherein the closer a second calculated image block is to the center point of the portrait area, the larger its fifth weight value; and the closer a pixel point is to the center of the second calculated image block to which it belongs, the larger its sixth weight value.
13. The image processing method according to claim 1, wherein the main image further includes a boundary area and a background area, the boundary area being an area of preset width surrounding the portrait area, and the background area being the area other than the portrait area and the boundary area;
the method further comprises:
performing fusion processing on the boundary area between the portrait area and the background area to determine a fusion depth value for each pixel point in the boundary area.
14. The image processing method according to claim 13, wherein performing fusion processing on the boundary area between the portrait area and the background area comprises:
determining the part of the boundary area adjacent to a feature area;
performing image morphological processing with a first template size on the boundary area adjacent to the feature area;
and performing image morphological processing with a second template size on the boundary area not adjacent to the feature area;
wherein the first template size is larger than the second template size;
and the feature area is selected from one or more of the following: the detected face area, the face center area, the face feature point area, and the human skeleton node area.
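The sketch below illustrates claim 14's two-template-size idea. The kernel sizes (15 vs. 5) and the choice of a morphological closing are assumptions; the claim fixes only that the template near a feature area is larger:

```python
# Illustrative sketch of claim 14: within the boundary band, pixels near a
# feature area are smoothed with the larger morphological template.
import cv2
import numpy as np

def fuse_boundary(depth_map, boundary_mask, near_feature_mask,
                  large_ksize=15, small_ksize=5):
    k_large = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (large_ksize, large_ksize))
    k_small = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (small_ksize, small_ksize))
    closed_large = cv2.morphologyEx(depth_map, cv2.MORPH_CLOSE, k_large)
    closed_small = cv2.morphologyEx(depth_map, cv2.MORPH_CLOSE, k_small)
    out = depth_map.copy()
    near = (boundary_mask > 0) & (near_feature_mask > 0)   # first (larger) template
    far = (boundary_mask > 0) & (near_feature_mask == 0)   # second (smaller) template
    out[near] = closed_large[near]
    out[far] = closed_small[far]
    return out
```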
15. The image processing method according to claim 13, further comprising: filtering the fusion depth value of each pixel point in the boundary area.
16. The image processing method according to claim 13, wherein the fusion depth values are filtered using a guided filtering method.
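A minimal sketch of claim 16, assuming the opencv-contrib package (which provides `cv2.ximgproc`) is installed: the fused boundary depth is smoothed with a guided filter, using the main image as the guide so that depth edges follow color edges. The `radius` and `eps` values are assumptions:

```python
# Illustrative sketch of claim 16: guided filtering of the fused depth,
# guided by the main image. Requires opencv-contrib (cv2.ximgproc).
import cv2
import numpy as np

def smooth_boundary_depth(main_bgr: np.ndarray, fused_depth: np.ndarray) -> np.ndarray:
    return cv2.ximgproc.guidedFilter(guide=main_bgr,
                                     src=fused_depth.astype(np.float32),
                                     radius=8, eps=1e-2)
```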
17. An image processing apparatus, characterized by comprising:
an image providing module, configured to provide a main image and an auxiliary image, wherein the camera that captures the main image and the camera that captures the auxiliary image are located on the same device;
a depth map determining module, configured to determine a depth map of the main image according to the main image and the auxiliary image, or according to the main image alone, wherein the depth map comprises depth values of all pixel points in the main image;
an image segmentation module, configured to perform image segmentation on the main image to obtain a portrait area;
an average depth value determining module, configured to determine an average depth value according to an overlapping area of at least a part of the portrait area and a face area, or according to the portrait area;
and an image processing module, configured to process the main image, the auxiliary image, or both, using the average depth value as the depth value of each pixel point in the portrait area.
18. A storage medium having a computer program stored thereon, characterized in that the computer program, when executed by a processor, performs the steps of the image processing method according to any one of claims 1 to 16.
19. A terminal comprising a memory and a processor, the memory storing a computer program executable on the processor, characterized in that the processor performs the steps of the image processing method according to any one of claims 1 to 16 when executing the computer program.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010337094.7A CN111524087B (en) | 2020-04-24 | 2020-04-24 | Image processing method and device, storage medium and terminal |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111524087A (en) | 2020-08-11 |
CN111524087B (en) | 2023-06-20 |
Family
ID=71904519
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010337094.7A | Image processing method and device, storage medium and terminal | 2020-04-24 | 2020-04-24 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111524087B (en) |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
AU2013273830A1 (en) * | 2013-12-23 | 2015-07-09 | Canon Kabushiki Kaisha | Post-processed bokeh rendering using asymmetric recursive Gaussian filters |
CN108024058B (en) * | 2017-11-30 | 2019-08-02 | Oppo广东移动通信有限公司 | Image blurs processing method, device, mobile terminal and storage medium |
CN108154466B (en) * | 2017-12-19 | 2021-12-07 | 北京小米移动软件有限公司 | Image processing method and device |
Similar Documents
Publication | Title |
---|---|
CN109064397B (en) | Image stitching method and system based on camera earphone |
US10915998B2 (en) | Image processing method and device |
CN110493525B (en) | Zoom image determination method and device, storage medium and terminal |
CN106981078B (en) | Sight line correction method and device, intelligent conference terminal and storage medium |
CN110868541B (en) | Visual field fusion method and device, storage medium and terminal |
CN112261387B (en) | Image fusion method and device for multi-camera module, storage medium and mobile terminal |
TWI738196B (en) | Method and electronic device for image depth estimation and storage medium thereof |
CN111292278B (en) | Image fusion method and device, storage medium and terminal |
CN111866523B (en) | Panoramic video synthesis method and device, electronic equipment and computer storage medium |
US11523056B2 (en) | Panoramic photographing method and device, camera and mobile terminal |
CN111325828B (en) | Three-dimensional face acquisition method and device based on three-dimensional camera |
CN113674303B (en) | Image processing method, device, electronic equipment and storage medium |
CN106952247A (en) | Dual camera terminal and image processing method and system thereof |
CN107633497A (en) | Image depth rendering method, system and terminal |
CN112261292B (en) | Image acquisition method, terminal, chip and storage medium |
US20240296531A1 (en) | System and methods for depth-aware video processing and depth perception enhancement |
CN113965664A (en) | Image blurring method, storage medium and terminal device |
CN114119701A (en) | Image processing method and device |
CN113177886B (en) | Image processing method, device, computer equipment and readable storage medium |
CN111597963B (en) | Light supplementing method, system and medium for face in image and electronic equipment |
CN114363522A (en) | Photographing method and related device |
GB2585197A (en) | Method and system for obtaining depth data |
CN108053376A (en) | Semantic segmentation information-guided deep learning method for fisheye image correction |
CN111524087B (en) | Image processing method and device, storage medium and terminal |
CN109118427B (en) | Image light effect processing method and device, electronic equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |