WO2018176929A1 - Image background blurring method and apparatus - Google Patents

Info

Publication number
WO2018176929A1
WO2018176929A1 (application PCT/CN2017/117180)
Authority
WO
WIPO (PCT)
Prior art keywords
image
depth
pixel
reference image
pyramid
Prior art date
Application number
PCT/CN2017/117180
Other languages
French (fr)
Chinese (zh)
Inventor
Song Mingli (宋明黎)
Li Xin (李欣)
Huang Yining (黄一宁)
Original Assignee
Huawei Technologies Co., Ltd. (华为技术有限公司)
Priority date
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd. (华为技术有限公司)
Publication of WO2018176929A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N 23/60 Control of cameras or camera modules
    • H04N 23/62 Control of parameters via user interfaces
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 3/00 Geometric image transformations in the plane of the image
    • G06T 3/04 Context-preserving transformations, e.g. by using an importance map
    • G06T 7/00 Image analysis
    • G06T 7/50 Depth or shape recovery
    • G06T 7/55 Depth or shape recovery from multiple images
    • G06T 7/80 Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • H04N 5/00 Details of television systems
    • H04N 5/222 Studio circuitry; Studio devices; Studio equipment
    • H04N 5/262 Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects; Cameras specially adapted for the electronic generation of special effects
    • H04N 5/2621 Cameras specially adapted for the electronic generation of special effects during image pickup, e.g. digital cameras, camcorders, video cameras having integrated special effects capability

Definitions

  • the embodiments of the present application relate to the field of image processing technologies, and in particular, to an image background blurring method and apparatus.
  • the background blur of an image refers to keeping the focus on the subject of the image while blurring the non-subject elements.
  • for example, if we want to use the mountain as the subject of the whole image, we can focus the camera on the mountain; the image of the mountain will then become clear, and the water will become blurred;
  • an embodiment of the present application provides an image background blurring method and device, so that the mobile terminal can capture an image with a clear foreground and a blurred background.
  • the embodiment of the present application is implemented as follows:
  • an embodiment of the present application provides an image background blurring method, the method comprising: extracting a reference image and m non-reference images from a target video according to an image extraction rule; constructing a first image pyramid using the reference image, and constructing m second image pyramids using the m non-reference images; determining a scene depth map of the reference image using the first image pyramid and the m second image pyramids; dividing the pixel points of the reference image into n depth layers using the scene depth map; determining a target position in the reference image; determining, from the n depth layers, the target depth layer where the pixel point corresponding to the target position is located; and performing blur processing on the pixels to be processed.
  • the target video is a video captured by the mobile terminal according to a predetermined trajectory
  • the predetermined trajectory may be preset
  • the predetermined trajectory is a moving trajectory on the same plane.
  • the predetermined trajectory may be a left-to-right moving trajectory on the same plane, a right-to-left moving trajectory on the same plane, a top-to-bottom moving trajectory on the same plane, or a bottom-to-top moving trajectory on the same plane.
  • the image extraction rule is a preset rule, and the image extraction rule may be: selecting a reference image and m non-reference images in the target video according to the playing duration of the target video, where m is a positive integer greater than or equal to 1.
  • the reference image and the non-reference images are images extracted from different moments in the target video; the reference image and the non-reference images depict the same shooting scene, but their viewing angles and positions are different.
  • the mobile terminal uses the reference image as the bottom image of the first image pyramid. The resolution of the bottom image is then halved to obtain the image of the layer above it, and this step is repeated to obtain successively higher layers. After several repetitions, the first image pyramid, containing the reference image at different resolutions, is obtained.
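The halving-and-repeat construction above can be sketched as follows. This is a minimal illustration in Python, assuming a grayscale image stored as a list of rows and simple 2x2 average-pooling for downsampling; practical pipelines typically apply Gaussian filtering before subsampling.

```python
def downsample(image):
    """Halve the resolution of a grayscale image (list of rows) by
    averaging each 2x2 block of pixels."""
    h, w = len(image), len(image[0])
    return [
        [
            (image[2 * y][2 * x] + image[2 * y][2 * x + 1] +
             image[2 * y + 1][2 * x] + image[2 * y + 1][2 * x + 1]) / 4.0
            for x in range(w // 2)
        ]
        for y in range(h // 2)
    ]

def build_pyramid(image, levels):
    """The bottom image is the full-resolution input; each upper layer
    is the previous layer at half resolution."""
    pyramid = [image]
    for _ in range(levels - 1):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid
```

The m second image pyramids would be built the same way, one per non-reference image.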
  • the scene depth map of the reference image represents the relative distance between any pixel point in the reference image and the mobile terminal, and the pixel value of the pixel point in the scene depth map represents the relative distance between the actual location where the pixel point is located and the mobile terminal.
  • the mobile terminal can acquire the preset value of n and the manner of dividing the depth layers, so that the number of depth layers and the depth range of each depth layer are known.
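As an illustration of such a division, the following sketch assigns each pixel of a depth map to one of n depth layers. The uniformly spaced layer boundaries are an assumption; the patent leaves the manner of dividing the depth layers as a preset.

```python
def divide_into_depth_layers(depth_map, n):
    """Return a label map assigning each pixel to a layer 0..n-1,
    assuming n uniform bands between the min and max depth."""
    values = [d for row in depth_map for d in row]
    d_min, d_max = min(values), max(values)
    span = (d_max - d_min) or 1.0  # avoid division by zero on flat maps
    return [
        [min(int((d - d_min) / span * n), n - 1) for d in row]
        for row in depth_map
    ]
```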
  • the target position is determined in the reference image according to the control command.
  • the control instruction may be an instruction input by the user on the touch screen of the mobile terminal by using a finger.
  • the specific position in the reference image is determined as the target position.
  • the specific position in the reference image is a previously specified position.
  • the face image in the reference image is identified, and the position of the face image in the reference image is determined as the target position.
  • the pixel to be processed is a pixel point included in a depth layer other than the target depth layer among the n depth layers.
  • the embodiment of the present application divides the pixels of the reference image into n depth layers using the obtained scene depth map, determines, among the n depth layers, the target depth layer corresponding to the target position determined in the reference image, and blurs the pixels outside the target depth layer, so that the mobile terminal can capture an image with a clear foreground and a blurred background.
  • determining the scene depth map of the reference image by using the first image pyramid and the m second image pyramids comprises: determining a preliminary depth map of the reference image according to the top image of the first image pyramid and the top images of the m second image pyramids, where the first image pyramid and the m second image pyramids each include a top layer image and lower layer images; and determining the scene depth map of the reference image according to the preliminary depth map, the lower layer images of the first image pyramid, and the lower layer images of the m second image pyramids.
  • the reference images at different resolutions in the first image pyramid and the m second image pyramids are used for depth sampling, and the high-resolution scene depth map is derived from the low-resolution preliminary depth map, thereby speeding up depth recovery; the depth of the reference image can thus be generated more quickly by the embodiment of the present application.
  • determining a preliminary depth map of the reference image according to the top image of the first image pyramid and the top images of the m second image pyramids comprises: calculating a first matching loss body according to the top image of the first image pyramid and the top images of the m second image pyramids; and constructing a Markov random field model according to the first matching loss body to perform global matching loss optimization, obtaining a preliminary depth map of the reference image.
  • the first matching loss body may first be calculated according to the top image of the first image pyramid and the top images of the m second image pyramids; then, the MRF (Markov random field) model is constructed according to the first matching loss body to perform global matching loss optimization, yielding a preliminary depth map of the reference image with smooth detail.
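The global MRF optimization itself is beyond a short example, but the role of the matching loss body can be illustrated with a simplified per-pixel winner-take-all selection: each pixel takes the depth plane with the lowest matching loss. This local rule is an assumption used for illustration only; the patent's method instead performs global optimization to obtain a smoother preliminary depth map.

```python
def wta_depth(cost_volume, depth_values):
    """cost_volume[k][y][x] is the matching loss of pixel (y, x) at
    depth plane k; pick, per pixel, the depth with minimum loss."""
    h, w = len(cost_volume[0]), len(cost_volume[0][0])
    return [
        [
            depth_values[min(range(len(cost_volume)),
                             key=lambda k: cost_volume[k][y][x])]
            for x in range(w)
        ]
        for y in range(h)
    ]
```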
  • calculating the first matching loss body according to the top image of the first image pyramid and the top images of the m second image pyramids includes: acquiring the camera external parameters and camera internal parameters of the mobile terminal at the viewing angles of the reference image and the m non-reference images; determining feature points in the reference image according to a feature point extraction rule; acquiring the three-dimensional coordinates of the feature points of the reference image; determining the minimum depth value and the maximum depth value of the scene in which the reference image is located according to the three-dimensional coordinates of the feature points; determining a plurality of depth planes between the minimum depth value and the maximum depth value; calculating, using the camera internal parameters, the camera external parameters, and a direct linear transformation algorithm, first homography matrices that map the plane of the reference image at each depth plane to the planes of the m non-reference images; projecting, using a plane sweep algorithm and the first homography matrices, each pixel of the top image of the first image pyramid at the plurality of depth planes onto the planes of the top images of the m second image pyramids, to obtain the projected parameter value of each pixel; and determining the matching loss of each pixel at each depth value from the parameter value of each pixel of the top image of the first image pyramid and its projected parameter values.
  • re-projection is used to calculate the matching loss, so that depth recovery can better adapt to the camera pose changes between the reference image and the m non-reference images, improving the reliability of the depth recovery method.
  • determining the plurality of depth planes between the minimum depth value and the maximum depth value comprises: calculating, using the camera internal parameters, the camera external parameters, and a direct linear transformation algorithm, second homography matrices that map the first depth plane, where the minimum depth value is located, from the reference image plane to the m non-reference image planes; calculating, likewise, third homography matrices that map the second depth plane, where the maximum depth value is located, from the reference image plane to the m non-reference image planes; projecting a pixel point of the reference image onto the planes of the m non-reference images according to the second homography matrices to obtain a first projection point; projecting the same pixel point according to the third homography matrices to obtain a second projection point; uniformly sampling a plurality of sampling points on the line segment formed between the first projection point and the second projection point; and back-projecting the plurality of sampling points into the three-dimensional space of the viewing angle of the reference image to obtain the plurality of depth planes corresponding to the depth values of the sampling points.
  • when the matching loss of a pixel of the reference image is calculated for one depth plane, the pixel is re-projected onto the m non-reference image planes; after re-projection over the multiple depth planes, the resulting positions in the m non-reference images help the subsequent steps extract the pixel matching information between the reference image and the m non-reference images more efficiently, thereby improving the accuracy of the scene depth map.
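The uniform sampling step above can be sketched as follows: the projections of a reference pixel at the minimum and maximum depth bound a segment in a non-reference image, and sampling it uniformly yields the candidate positions whose back-projections define the depth planes. The function name and sample count are illustrative, and the back-projection of each sample into 3D space is omitted.

```python
def uniform_samples(p1, p2, k):
    """Return k points evenly spaced on the segment from p1 to p2,
    endpoints included (k >= 2)."""
    (x1, y1), (x2, y2) = p1, p2
    return [
        (x1 + (x2 - x1) * t / (k - 1), y1 + (y2 - y1) * t / (k - 1))
        for t in range(k)
    ]
```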
  • determining the scene depth map of the reference image according to the preliminary depth map, the lower layer image of the first image pyramid, and the lower layer images of the m second image pyramids includes: determining the pixel points of the lower layer image of the first image pyramid corresponding to the pixel points of the top image of the first image pyramid; determining the pixel points of the lower layer images of the m second image pyramids corresponding to the pixel points of the top images of the m second image pyramids; determining the estimated depth values of the pixel points of the lower layer image of the first image pyramid according to the preliminary depth map; determining the minimum depth value and the maximum depth value of the pixel points of the lower layer image of the first image pyramid according to the estimated depth values; determining a plurality of depth planes of the lower layer image of the first image pyramid between the minimum depth value and the maximum depth value; calculating, using the plane sweep algorithm and the plurality of depth planes, a second matching loss body corresponding to the lower layer image of the first image pyramid and the lower layer images of the m second image pyramids; and, using the lower layer image of the first image pyramid as the guide image, performing local optimization on the second matching loss body to obtain the scene depth map of the reference image.
  • the preliminary depth map is used to estimate the minimum depth value and the maximum depth value for the pixel points of the lower layer image of the first image pyramid, thereby determining a relatively small depth search interval; this reduces the amount of calculation and improves the robustness of the depth recovery method to image noise interference.
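The narrowed search interval can be illustrated with a small sketch: the depth estimated at the coarser level bounds the search at the finer level. The symmetric margin factor around the estimate is an assumption for illustration, not a value given in the patent.

```python
def narrowed_interval(estimated_depth, margin=0.25):
    """Return (min_depth, max_depth) around a depth estimate taken
    from the coarser pyramid level; margin is a fractional band."""
    return (estimated_depth * (1.0 - margin),
            estimated_depth * (1.0 + margin))
```

Restricting the plane sweep to this interval is what reduces the per-pixel calculation at full resolution.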
  • determining, from the n depth layers, a target depth layer where a pixel point corresponding to the target position is located includes: acquiring a specified pixel point of a target position of the reference image; determining and specifying the pixel in the scene depth map The corresponding pixel value of the point; determining the target depth layer where the specified pixel point is located in the n depth layers according to the pixel value corresponding to the specified pixel point.
  • after the mobile terminal determines the target position in the reference image, it can directly obtain the specified pixel point at the target position, determine the pixel value corresponding to the specified pixel point in the scene depth map, and from that pixel value identify the corresponding depth layer; in this way, the target depth layer where the pixel point corresponding to the target position is located can be determined among the n depth layers.
  • performing blur processing on the pixels to be processed includes: determining the L depth layers where the pixels to be processed are located, where L is greater than or equal to 2 and less than n; calculating the depth differences between the L depth layers and the target depth layer; and blurring the pixel points of each of the L depth layers by a predetermined ratio according to the depth difference, where the degree of blur of the pixel points of each of the L depth layers is proportional to the depth difference.
  • the depth differences between the L depth layers and the target depth layer can be calculated, and the mobile terminal can then apply a preset ratio of blurring to the pixel points of each of the L depth layers according to the depth difference.
  • the degree of blur of the pixel points of each of the L depth layers is proportional to the depth difference: the larger the depth difference between a depth layer and the target depth layer, the greater the degree of blurring of the pixel points in that depth layer; the smaller the depth difference, the smaller the degree of blurring. This reflects a sense of layering between different distances in the reference image.
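The proportionality between blur degree and depth difference can be sketched as follows, assuming (as an illustration only) a blur radius that grows linearly with the layer difference and a simple 1-D box blur as the blur primitive; the patent only requires that the degree of blur be proportional to the depth difference.

```python
def blur_radius(layer, target_layer, radius_per_layer=1):
    """Blur radius grows linearly with the depth-layer difference;
    the target layer itself stays sharp (radius 0)."""
    return abs(layer - target_layer) * radius_per_layer

def box_blur_1d(row, radius):
    """Minimal 1-D box blur; radius 0 leaves the row unchanged."""
    if radius == 0:
        return list(row)
    n = len(row)
    out = []
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        out.append(sum(row[lo:hi]) / (hi - lo))
    return out
```

Applying `box_blur_1d` per layer with its `blur_radius` keeps the target layer sharp while layers farther from it in depth are blurred more strongly.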
  • an embodiment of the present application provides an image background blurring apparatus, the apparatus including: an extracting module configured to extract a reference image and m non-reference images from a target video according to an image extraction rule, where the target video is a video captured by the mobile terminal according to a predetermined trajectory and m is greater than or equal to 1;
  • a building module configured to construct a first image pyramid by using a reference image, and construct m second image pyramids by using m non-reference images;
  • a first determining module configured to determine a scene depth map of the reference image by using the first image pyramid and the m second image pyramids, where the scene depth map of the reference image represents a relative distance between any pixel point in the reference image and the mobile terminal;
  • a dividing module configured to divide a pixel point of the reference image into n depth layers by using a scene depth map, wherein a depth of the object corresponding to the pixel point in the different depth layer to the mobile terminal is different, where n is greater than or equal to 2;
  • a second determining module configured to determine a target location in the reference image
  • a third determining module configured to determine, from the n depth layers, a target depth layer where the pixel corresponding to the target location is located;
  • a blur processing module configured to perform blur processing on the pixels to be processed, where a pixel to be processed is a pixel point included in a depth layer other than the target depth layer among the n depth layers.
  • the embodiment of the present application divides the pixels of the reference image into n depth layers using the obtained scene depth map, determines, among the n depth layers, the target depth layer corresponding to the target position determined in the reference image, and blurs the pixels outside the target depth layer, so that the mobile terminal can capture an image with a clear foreground and a blurred background.
  • the first determining module is specifically configured to: determine a preliminary depth map of the reference image according to the top image of the first image pyramid and the top images of the m second image pyramids, where the first image pyramid and the m second image pyramids each include a top layer image and lower layer images; and determine the scene depth map of the reference image according to the preliminary depth map, the lower layer images of the first image pyramid, and the lower layer images of the m second image pyramids.
  • the reference images at different resolutions in the first image pyramid and the m second image pyramids are used for depth sampling, and the high-resolution scene depth map is derived from the low-resolution preliminary depth map, thereby speeding up depth recovery; the depth of the reference image can thus be generated more quickly by the embodiment of the present application.
  • the first determining module is configured to calculate a first matching loss body according to the top image of the first image pyramid and the top images of the m second image pyramids, and to construct a Markov random field model according to the first matching loss body to perform global matching loss optimization, obtaining a preliminary depth map of the reference image.
  • the first matching loss body may first be calculated according to the top image of the first image pyramid and the top images of the m second image pyramids; then, the MRF (Markov random field) model is constructed according to the first matching loss body to perform global matching loss optimization, yielding a preliminary depth map of the reference image with smooth detail.
  • the first determining module is specifically configured to: acquire the camera external parameters and camera internal parameters of the mobile terminal at the viewing angles of the reference image and the m non-reference images; determine feature points in the reference image according to a feature point extraction rule; acquire the three-dimensional coordinates of the feature points of the reference image; determine the minimum depth value and the maximum depth value of the scene in which the reference image is located according to the three-dimensional coordinates of the feature points; determine a plurality of depth planes between the minimum depth value and the maximum depth value; calculate, using the camera internal parameters, the camera external parameters, and a direct linear transformation algorithm, first homography matrices that map the plane of the reference image at each depth plane to the planes of the m non-reference images; project, using a plane sweep algorithm and the first homography matrices, each pixel of the top image of the first image pyramid at the plurality of depth planes onto the planes of the top images of the m second image pyramids, to obtain the projected parameter value of each pixel; and determine the matching loss of each pixel at each depth value from the parameter value of each pixel of the top image of the first image pyramid and its projected parameter values;
  • re-projection is used to calculate the matching loss, so that depth recovery can better adapt to the camera pose changes between the reference image and the m non-reference images, improving the reliability of the depth recovery method.
  • the first determining module is specifically configured to: calculate, using the camera internal parameters, the camera external parameters, and a direct linear transformation algorithm, second homography matrices that map the first depth plane, where the minimum depth value is located, from the reference image plane to the m non-reference image planes; calculate, likewise, third homography matrices that map the second depth plane, where the maximum depth value is located, from the reference image plane to the m non-reference image planes; project a pixel point of the reference image onto the planes of the m non-reference images according to the second homography matrices to obtain a first projection point; project the same pixel point according to the third homography matrices to obtain a second projection point; uniformly sample the line segment between the first projection point and the second projection point to obtain a plurality of sampling points; and back-project the plurality of sampling points into the three-dimensional space of the viewing angle of the reference image to obtain the plurality of depth planes corresponding to the depth values of the sampling points.
  • when the matching loss of a pixel of the reference image is calculated for one depth plane, the pixel is re-projected onto the m non-reference image planes; after re-projection over the multiple depth planes, the resulting positions in the m non-reference images help the subsequent steps extract the pixel matching information between the reference image and the m non-reference images more efficiently, thereby improving the accuracy of the scene depth map.
  • the first determining module is specifically configured to: determine the pixel points of the lower layer image of the first image pyramid corresponding to the pixel points of the top image of the first image pyramid; determine the pixel points of the lower layer images of the m second image pyramids corresponding to the pixel points of the top images of the m second image pyramids; determine the estimated depth values of the pixel points of the lower layer image of the first image pyramid according to the preliminary depth map; determine the minimum depth value and the maximum depth value of the pixel points of the lower layer image of the first image pyramid according to the estimated depth values; determine a plurality of depth planes of the lower layer image of the first image pyramid between the minimum depth value and the maximum depth value; calculate, using the plane sweep algorithm and the plurality of depth planes, a second matching loss body corresponding to the lower layer image of the first image pyramid and the lower layer images of the m second image pyramids; and, using the lower layer image of the first image pyramid as the guide image, perform local optimization on the second matching loss body to obtain the scene depth map of the reference image.
  • the preliminary depth map is used to estimate the minimum depth value and the maximum depth value for the pixel points of the lower layer image of the first image pyramid, thereby determining a relatively small depth search interval; this reduces the amount of calculation and improves the robustness of the depth recovery method to image noise interference.
  • the third determining module is specifically configured to: acquire the specified pixel point at the target position of the reference image; determine the pixel value corresponding to the specified pixel point in the scene depth map; and determine, according to the pixel value corresponding to the specified pixel point, the target depth layer where the specified pixel point is located among the n depth layers.
  • after the mobile terminal determines the target position in the reference image, it can directly obtain the specified pixel point at the target position, determine the pixel value corresponding to the specified pixel point in the scene depth map, and from that pixel value identify the corresponding depth layer; in this way, the target depth layer where the pixel point corresponding to the target position is located can be determined among the n depth layers.
  • the blur processing module is specifically configured to: determine the L depth layers where the pixels to be processed are located, where L is greater than or equal to 2 and less than n; calculate the depth differences between the L depth layers and the target depth layer; and blur the pixel points of each of the L depth layers by a predetermined ratio according to the depth difference, where the degree of blur of the pixel points of each of the L depth layers is proportional to the depth difference.
  • the depth differences between the L depth layers and the target depth layer can be calculated, and the mobile terminal can then apply a preset ratio of blurring to the pixel points of each of the L depth layers according to the depth difference.
  • the degree of blur of the pixel points of each of the L depth layers is proportional to the depth difference: the larger the depth difference between a depth layer and the target depth layer, the greater the degree of blurring of the pixel points in that depth layer; the smaller the depth difference, the smaller the degree of blurring. This reflects a sense of layering between different distances in the reference image.
  • an embodiment of the present application provides an image background blurring apparatus, the apparatus including a processor and a memory, where the memory stores operation instructions executable by the processor, and the processor reads the operation instructions in the memory to perform the following operations:
  • extracting a reference image and m non-reference images from the target video according to an image extraction rule, where the target video is a video captured by the mobile terminal according to a predetermined trajectory and m is greater than or equal to 1; constructing a first image pyramid using the reference image, and constructing m second image pyramids using the m non-reference images; determining a scene depth map of the reference image using the first image pyramid and the m second image pyramids, where the scene depth map of the reference image represents the relative distance between any pixel point in the reference image and the mobile terminal; dividing the pixel points of the reference image into n depth layers using the scene depth map, where the depths from the objects corresponding to the pixel points in different depth layers to the mobile terminal are different and n is greater than or equal to 2; determining a target position in the reference image; determining, from the n depth layers, the target depth layer where the pixel point corresponding to the target position is located; and performing blur processing on the pixels to be processed.
  • the embodiment of the present application divides the pixels of the reference image into n depth layers using the obtained scene depth map, determines, among the n depth layers, the target depth layer corresponding to the target position determined in the reference image, and blurs the pixels outside the target depth layer, so that the mobile terminal can capture an image with a clear foreground and a blurred background.
  • FIG. 1 is a flowchart of an image background blurring method provided by an embodiment of the present application.
  • FIG. 2 is a flowchart of another image background blurring method provided by an embodiment of the present application.
  • FIG. 3 is a flowchart of still another image background blurring method provided by an embodiment of the present application.
  • FIG. 4 is a flowchart of still another image background blurring method provided by an embodiment of the present application.
  • FIG. 5 is a flowchart of still another image background blurring method provided by an embodiment of the present application.
  • FIG. 6 is a flowchart of still another image background blurring method provided by an embodiment of the present application.
  • FIG. 7 is a schematic diagram of an image background blurring apparatus provided by an embodiment of the present application.
  • FIG. 8 is a schematic diagram of still another image background blurring device provided by an embodiment of the present application.
  • FIG. 9 is a schematic diagram showing a design structure of an image background blurring device provided by an embodiment of the present application.
  • FIG. 1 is a flowchart of an image background blurring method provided by an embodiment of the present application.
  • the image background blurring method shown in FIG. 1 can cause the mobile terminal to capture an image with a clear foreground and a blurred background.
  • the method includes the following steps.
  • Step S11 Extracting a reference image and m non-reference images in the target video according to an image extraction rule, where the target video is a video captured by the mobile terminal according to a predetermined trajectory, where m is greater than or equal to 1.
  • the method provided by the embodiment of the present application can be applied to a mobile terminal, and the mobile terminal can be a device such as a smart phone.
  • the target video is a video captured by the mobile terminal according to a predetermined trajectory
  • the predetermined trajectory may be preset
  • the predetermined trajectory is a moving trajectory on the same plane.
• The predetermined trajectory may be a left-to-right moving trajectory on the same plane, a right-to-left moving trajectory on the same plane, a top-to-bottom moving trajectory on the same plane, or a bottom-to-top moving trajectory on the same plane.
• When the target video is captured, the camera of the mobile terminal needs to remain aimed at the subject to be photographed.
• When capturing the target video, the user needs to move the mobile terminal slowly and smoothly in a single direction; the moving distance may be 20 cm to 30 cm. While the user holds and moves the mobile terminal, the mobile terminal can judge the moving distance using its gyroscope and select an appropriate reference image and non-reference images from the target video.
• The image extraction rule is a preset rule, and may be: selecting one reference image and m non-reference images from the target video according to the playing duration of the target video, where m is a positive integer greater than or equal to 1. For example, if the length of the target video is 20 seconds, the image extraction rule may be to select 1 reference image and 4 non-reference images from the target video, determining the image at the 10th second of the target video as the reference image and using the images at the 1st, 3rd, 18th, and 20th seconds as the non-reference images.
• The embodiment of the present application does not limit the number of non-reference images; it may be, for example, three, four, or five.
• The reference image and the non-reference images are images extracted from different moments in the target video; the reference image and the non-reference images share the same shooting scene but are captured from different viewpoints.
• For example, the user captures a 10-second target video with the mobile terminal, the shooting scene of the target video being Plant A and Plant B; if the image extraction rule is set in advance to extract the image at the 5th second of the target video as the reference image, that image is used as the reference image.
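As an illustration only, an extraction rule in the spirit of the example above (middle frame as the reference, m non-reference frames spread evenly over the clip) can be sketched as follows. `extract_frame_indices` and the 10 frames-per-second rate are hypothetical assumptions, not the patent's exact rule.

```python
import numpy as np

def extract_frame_indices(num_frames, m):
    # Reference frame: the middle of the clip, mirroring the text's
    # 20-second example where the 10th-second image is the reference.
    ref = num_frames // 2
    # m non-reference frames spread evenly across the clip.
    non_ref = np.linspace(0, num_frames - 1, m, dtype=int).tolist()
    return ref, non_ref

# A 20 s clip at an assumed 10 frames per second, with m = 4:
ref, non_ref = extract_frame_indices(200, 4)
```

Any rule that yields one reference frame plus m frames of the same scene from different viewpoints would serve the later depth-recovery steps equally well.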
  • Step S12 constructing a first image pyramid by using a reference image, and constructing m second image pyramids by using m non-reference images.
  • a first image pyramid can be constructed by using one reference image, and m second image pyramids are constructed by using m non-reference images.
• It should be noted that "first" and "second" in the first image pyramid and the second image pyramid are used only to distinguish image pyramids constructed from different images: the first image pyramid denotes only the image pyramid constructed from the reference image, and the second image pyramid denotes only an image pyramid constructed from a non-reference image.
• The mobile terminal uses the reference image as the bottom layer image of the first image pyramid. The resolution of the bottom layer image is then halved to obtain the layer above it, and this step is repeated to obtain successive upper layers. After several repetitions, a first image pyramid whose layers are the reference image at different resolutions is obtained.
• For example, the mobile terminal uses the reference image as the third (bottom) layer image of the first image pyramid, so the resolution of the third layer image is 1000 × 1000; then the resolution of the third layer image is halved to obtain the second layer image of the first image pyramid, whose resolution is 500 × 500; finally, the resolution of the second layer image is halved again to obtain the first (top) layer image of the first image pyramid, whose resolution is 250 × 250.
• In this case, the first image pyramid includes three layers of images, which are the reference image at different resolutions: the first layer image is the reference image at a resolution of 250 × 250, the second layer image is the reference image at a resolution of 500 × 500, and the third layer image is the reference image at a resolution of 1000 × 1000.
• The construction process of the second image pyramid is the same as that of the first image pyramid, and the second image pyramid has the same number of layers as the first image pyramid; the number of layers of the first and second image pyramids may be set according to actual conditions.
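The pyramid construction described above can be sketched as follows. This is a minimal stand-in that halves the resolution by 2 × 2 block averaging; a production implementation would typically low-pass filter before downsampling (e.g. OpenCV's `pyrDown`).

```python
import numpy as np

def build_pyramid(image, levels):
    """Build an image pyramid: each upper level halves the resolution
    of the level below it by 2x2 block averaging."""
    pyramid = [image]
    for _ in range(levels - 1):
        img = pyramid[-1]
        # Trim odd rows/cols so the image splits into 2x2 blocks.
        h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
        img = img[:h, :w]
        half = (img[0::2, 0::2] + img[1::2, 0::2] +
                img[0::2, 1::2] + img[1::2, 1::2]) / 4.0
        pyramid.append(half)
    # pyramid[0] is the bottom (full-resolution) layer,
    # pyramid[-1] the top (coarsest) layer.
    return pyramid
```

With a 1000 × 1000 reference image and three levels, this yields the 1000 × 1000, 500 × 500, and 250 × 250 layers of the example.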
  • Step S13 Determine a scene depth map of the reference image by using the first image pyramid and the m second image pyramids.
  • the scene depth map of the reference image may be determined by using the first image pyramid and the m second image pyramids.
  • the scene depth map of the reference image represents the relative distance between any pixel point in the reference image and the mobile terminal
  • the pixel value of the pixel point in the scene depth map represents the relative distance between the actual location where the pixel point is located and the mobile terminal.
• A brief example follows. Assuming the resolution of the reference image is 100 × 100, the reference image has 10,000 pixel points; after the scene depth map of the reference image is determined using the first image pyramid and the m second image pyramids, the pixel values of the 10,000 pixel points in the scene depth map represent the relative distances between the actual positions of those pixel points and the mobile terminal.
  • Step S14 dividing the pixel points of the reference image into n depth layers by using the scene depth map.
  • the depth of the object corresponding to the pixel in the different depth layer to the mobile terminal is different, where n is greater than or equal to 2, and each depth layer has a depth range.
• For example, the depth range of a certain depth layer may be 10 meters to 20 meters.
  • the n depth layers constitute the scene depth of the reference image, and the scene depth is the distance between the mobile terminal and the position of the farthest pixel point in the reference image.
  • the scene depth may be 0 to 30 meters.
• The mobile terminal can acquire the preset value of n and the manner of dividing the depth layers, so the number of depth layers and the depth range of each depth layer are known. After the scene depth map of the reference image is obtained, the pixel values of the pixel points in the scene depth map can be determined. Since the pixel value of a pixel point in the scene depth map indicates the relative distance between the actual position of that pixel point and the mobile terminal, the mobile terminal may divide each pixel point of the reference image into the n depth layers according to the pixel values of the pixel points of the scene depth map.
  • the mobile terminal divides the depth of the reference image into three depth layers according to a preset rule, and the depth of the first depth layer ranges from 0 meters to 10 meters.
  • the second depth layer has a depth ranging from 10 meters to 20 meters
  • the third depth layer has a depth ranging from 20 meters to 30 meters.
• If the actual position of pixel point A in the reference image is 15 meters from the mobile terminal, pixel point A is divided into the second depth layer; if the actual position of pixel point B in the reference image is 25 meters from the mobile terminal, pixel point B is divided into the third depth layer; and if the actual position of pixel point C in the reference image is 5 meters from the mobile terminal, pixel point C is divided into the first depth layer.
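The layer assignment in the example above can be sketched with `np.digitize` over the layer boundaries; the 2 × 2 depth map holding the depths of pixels A, B, and C is a toy assumption.

```python
import numpy as np

# Depth layer boundaries from the example: [0,10), [10,20), [20,30) metres.
boundaries = np.array([10.0, 20.0])

# Scene depth map: each pixel value is the distance of that pixel's
# scene point from the mobile terminal (toy 2x2 example).
depth_map = np.array([[15.0, 25.0],
                      [ 5.0, 15.0]])

# np.digitize assigns each pixel the 0-based index of its depth layer:
# 15 m -> layer 1 (second layer), 25 m -> layer 2, 5 m -> layer 0.
layer_map = np.digitize(depth_map, boundaries)
```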
  • Step S15 determining a target position in the reference image.
  • the target position is determined in the reference image according to the control command.
  • the control instruction may be an instruction input by the user on the touch screen of the mobile terminal by using a finger. For example, the user clicks a certain position in the reference image displayed on the touch screen of the mobile terminal by using the finger, and the mobile terminal determines the location clicked by the user as the target position.
  • the specific position in the reference image is determined as the target position.
  • the specific position in the reference image is a previously specified position. For example, by determining the center point of the reference image as a specific position in advance, the mobile terminal can determine the center point of the reference image as the target position. For another example, the location closest to the mobile terminal in the reference image is determined as a specific location in advance, and then the mobile terminal can determine the location closest to the mobile terminal in the reference image as the target location.
• Optionally, the face image in the reference image is identified, and the position of the face image in the reference image is determined as the target position. Since the face image is not necessarily at a fixed position in the reference image, the mobile terminal needs to first recognize the face image in the reference image; after the face image is recognized, the position where it is located is determined and taken as the target position.
  • Step S16 Determine, from the n depth layers, a target depth layer where the pixel point corresponding to the target position is located.
• Optionally, determining, from the n depth layers, the target depth layer where the pixel corresponding to the target position is located may include the following steps: first, acquiring the specified pixel point at the target position of the reference image; second, determining the pixel value corresponding to the specified pixel point in the scene depth map; third, determining, among the n depth layers, the target depth layer where the specified pixel point is located according to the pixel value corresponding to the specified pixel point.
• After the mobile terminal determines the target position in the reference image, it can directly obtain the specified pixel point at the target position and then determine the pixel value corresponding to that pixel point in the scene depth map; from that pixel value, the corresponding depth layer can be known, and thus the target depth layer where the pixel point corresponding to the target position is located can be determined among the n depth layers.
• For example, the mobile terminal divides the depth of the reference image into three depth layers according to a preset rule: the depth of the first depth layer ranges from 0 meters to 10 meters, the second from 10 meters to 20 meters, and the third from 20 meters to 30 meters. If the pixel value of the specified pixel point A is 15 meters, the corresponding target depth layer is the second depth layer, because 15 meters falls within the second depth layer's range of 10 meters to 20 meters; therefore the target depth layer where pixel point A is located is the second depth layer.
• The pixel points corresponding to the target depth layer may belong to one object, or they may belong to multiple objects.
  • the object formed by the pixel points corresponding to the target depth layer is only one flower.
  • the object formed by the pixel corresponding to the target depth layer includes a flower and a tree.
  • the object formed by the pixel corresponding to the target depth layer is a part of a tree.
  • the object formed by the pixel corresponding to the target depth layer includes a part of a flower and a part of a tree.
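The three steps of S16 amount to a direct lookup, which can be sketched as below. `target_depth_layer` is a hypothetical helper, and the (column, row) coordinate convention for the tapped position is an assumption.

```python
import numpy as np

def target_depth_layer(depth_map, boundaries, target_xy):
    col, row = target_xy
    # Steps 1-2: take the specified pixel at the target position and
    # read its value from the scene depth map.
    pixel_depth = depth_map[row, col]
    # Step 3: map that depth value to its depth layer index.
    return int(np.digitize(pixel_depth, boundaries))

# Toy example matching the text: layers [0,10), [10,20), [20,30) metres,
# tapped pixel at depth 15 m -> second layer (index 1).
layer = target_depth_layer(np.array([[15.0]]), np.array([10.0, 20.0]), (0, 0))
```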
  • Step S17 Perform blur processing on the pixel to be processed.
  • the pixel to be processed is a pixel point included in a depth layer other than the target depth layer among the n depth layers.
• After the mobile terminal determines, from the n depth layers, the target depth layer where the pixel corresponding to the target position is located, it knows that the pixel points in the target depth layer need to be kept sharp, while the pixel points included in the depth layers other than the target depth layer need to be blurred; the pixels to be processed are exactly those pixels, so blur processing is performed on them. In this way, the reference image becomes an image in which the pixel points of the target depth layer are sharp and the pixels to be processed are blurred.
• Optionally, the pixels to be processed can be blurred using a Gaussian blur algorithm; other blur algorithms may also be used.
• For example, the mobile terminal divides the depth of the reference image into three depth layers according to a preset rule: the first depth layer ranges from 0 meters to 10 meters, the second from 10 meters to 20 meters, and the third from 20 meters to 30 meters. If the pixel value of the specified pixel point A is 15 meters, the target depth layer is the second depth layer, so the pixels to be processed contained in the first and third depth layers need to be blurred while the pixels in the second depth layer are kept sharp. The reference image thus becomes an image in which the pixels of the second depth layer are sharp and the pixels to be processed in the first and third depth layers are blurred.
• Optionally, in order to give the pixels to be processed different degrees of blurring, and thereby convey the sense of distance layering in the reference image, step S17 may further include the following steps: first, determining the L depth layers where the pixels to be processed are located, where L is greater than or equal to 2 and less than n; second, calculating the depth difference between each of the L depth layers and the target depth layer; third, blurring the pixel points of each of the L depth layers by a preset ratio according to the depth difference, the degree of blur of each layer's pixel points being proportional to its depth difference.
• Since the pixels to be processed are distributed in different depth layers, it is necessary to determine the L depth layers where the pixels to be processed are located, and then calculate the depth difference between each of the L depth layers and the target depth layer.
• The depth difference is the distance between two depth layers. For example, if the first depth layer ranges from 0 meters to 10 meters, the second from 10 meters to 20 meters, and the third from 20 meters to 30 meters, then the depth difference between the first and second depth layers is 10 meters, and that between the first and third depth layers is 20 meters.
• Optionally, the pixel points of each of the L depth layers may be blurred by a preset ratio according to the depth difference. For example, suppose the first depth layer is the target depth layer and the second and third depth layers are the two depth layers where the pixels to be processed are located; the depth difference between the first and second depth layers is 10 meters, and that between the first and third depth layers is 20 meters, so the pixel points of the second depth layer are blurred by 25% and those of the third depth layer by 50%.
• In the embodiment of the present application, the depth difference between each of the L depth layers and the target depth layer can be calculated, and the mobile terminal can then blur the pixel points of each of the L depth layers by a preset ratio according to that depth difference. The degree of blur of each layer's pixel points is proportional to its depth difference: the larger the depth difference between a depth layer and the target depth layer, the greater the degree of blurring of the pixel points in that layer; the smaller the depth difference, the smaller the degree of blurring. This conveys a sense of layering at different distances in the reference image.
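The depth-proportional blurring can be sketched as follows: each non-target layer is blurred with a Gaussian whose strength grows with its layer distance from the target layer. The separable-convolution blur and the `sigma_per_layer` knob are illustrative assumptions, not the patent's exact ratios.

```python
import numpy as np

def gaussian_blur(img, sigma):
    """Separable Gaussian blur of a grayscale image (edges replicated)."""
    if sigma <= 0:
        return img.copy()
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    k /= k.sum()
    pad = np.pad(img, radius, mode='edge')
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, 'valid'), 1, pad)
    return np.apply_along_axis(lambda c: np.convolve(c, k, 'valid'), 0, rows)

def blur_by_depth(img, layer_map, target_layer, sigma_per_layer=2.0):
    """Blur strength grows with |layer - target_layer|; the target
    layer itself stays sharp. sigma_per_layer is an assumed knob."""
    out = img.astype(float).copy()
    for layer in np.unique(layer_map):
        diff = abs(int(layer) - target_layer)
        if diff == 0:
            continue  # keep the target depth layer sharp
        mask = layer_map == layer
        out[mask] = gaussian_blur(img.astype(float),
                                  diff * sigma_per_layer)[mask]
    return out
```

A fixed blur-percentage table per depth difference (25%, 50%, ...) as in the example would be a drop-in replacement for the `diff * sigma_per_layer` mapping.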
• In summary, the embodiment of the present application divides each pixel point of the reference image into n depth layers using the obtained scene depth map, and then determines, among the n depth layers, the target depth layer where the pixel of the determined target position of the reference image is located. The pixels to be processed contained in the depth layers other than the target depth layer can then be blurred, yielding an image in which the pixels of the target depth layer are sharp and the pixels to be processed are blurred. Therefore, the embodiment of the present application enables the mobile terminal to capture an image with a clear foreground and a blurred background.
  • FIG. 2 is a flowchart of another image background blurring method provided by an embodiment of the present application.
  • the embodiment shown in FIG. 2 is an embodiment based on the refinement of step S12 in FIG. 1, so that the same contents as in FIG. 1 can be referred to the embodiment shown in FIG. 1.
  • the method shown in Figure 2 includes the following steps.
  • Step S21 Determine a preliminary depth map of the reference image according to the top image of the first image pyramid and the top image of the m second image pyramids, the first image pyramid and the m second image pyramids each including a top image and a lower layer image.
• In the embodiment of the present application, the first layer image of the first image pyramid is referred to as the top layer image, the second layer image through the last layer image of the first image pyramid are collectively referred to as the lower layer images, and the last layer image of the first image pyramid is called the bottom layer image. Similarly, the first layer image of the second image pyramid is referred to as the top layer image, the second layer image through the last layer image are collectively referred to as the lower layer images, and the last layer image of the second image pyramid is referred to as the bottom layer image.
  • Step S22 Determine a scene depth map of the reference image according to the preliminary depth map, the lower layer image of the first image pyramid, and the lower layer image of the m second image pyramids.
• In the embodiment of the present application, depth sampling is performed on the reference image at different resolutions in the first image pyramid and the m second image pyramids, and the high-resolution scene depth map is derived using the low-resolution preliminary depth map, thereby speeding up depth recovery. The embodiment of the present application can therefore use the image pyramids to generate the scene depth map of the reference image more quickly.
  • FIG. 3 is a flowchart of still another image background blurring method provided by an embodiment of the present application.
  • the embodiment shown in FIG. 3 is based on the refined embodiment of step S21 in FIG. 2, so the same content as FIG. 2 can be seen in the embodiment shown in FIG. 2.
  • the method shown in Figure 3 includes the following steps.
  • Step S31 Calculate a first matching loss body according to the top image of the first image pyramid and the top image of the m second image pyramids.
• Step S32: Construct an MRF (Markov Random Field) model according to the first matching loss body to perform global matching loss optimization, and obtain a preliminary depth map of the reference image.
• In the embodiment of the present application, the first matching loss body may first be calculated according to the top image of the first image pyramid and the top images of the m second image pyramids; an MRF model is then constructed according to the first matching loss body to optimize the global matching loss, so that a preliminary depth map of the reference image with smooth detail can be obtained.
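As a loose, much-simplified stand-in for the MRF global optimization (which the patent performs over the whole image), the following dynamic-programming pass optimizes one scanline of a matching-loss volume with a label-difference smoothness penalty; the linear penalty and the single-scanline scope are assumptions made here for brevity.

```python
import numpy as np

def scanline_optimize(cost, penalty=1.0):
    """cost[x, d]: matching loss of pixel x at depth label d along one
    scanline. Returns the per-pixel label minimizing data cost plus
    penalty * |label jump| between neighbours (Viterbi-style DP)."""
    n, labels = cost.shape
    d = np.arange(labels)
    trans = penalty * np.abs(d[:, None] - d[None, :])  # trans[cur, prev]
    acc = cost.copy()
    back = np.zeros((n, labels), dtype=int)
    for x in range(1, n):
        total = acc[x - 1][None, :] + trans
        back[x] = total.argmin(axis=1)
        acc[x] = cost[x] + total.min(axis=1)
    # Backtrack the optimal label path.
    path = np.empty(n, dtype=int)
    path[-1] = int(acc[-1].argmin())
    for x in range(n - 1, 0, -1):
        path[x - 1] = back[x, path[x]]
    return path
```

The smoothness term plays the same role as the MRF's pairwise potentials: an isolated pixel that weakly prefers an outlier depth is pulled back to the depth of its neighbours.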
  • FIG. 4 is a flowchart of still another image background blurring method provided by an embodiment of the present application.
  • the embodiment shown in FIG. 4 is based on the refined embodiment of step S31 in FIG. 3, so the same content as FIG. 3 can be seen in the embodiment shown in FIG.
  • the method shown in Figure 4 includes the following steps.
  • Step S41 Acquire a camera external parameter and a camera internal parameter of the mobile terminal in the perspective of the reference image and the m non-reference images.
• In the embodiment of the present application, the mobile terminal can calculate, from the coordinates of the feature points of the reference image and the non-reference images, the correspondence between those feature points, and an SFM (Structure from Motion) algorithm, the camera external parameters of the mobile terminal at the viewing angles corresponding to the reference image and the non-reference images. The camera external parameters of the mobile terminal include the camera optical center coordinates and the camera optical axis orientation.
• The camera internal parameters are obtained by pre-calibrating the camera. For example, the mobile terminal can determine the camera internal parameters with a camera calibration toolbox using a checkerboard pattern.
  • Step S42 Determine feature points in the reference image according to the feature point extraction rule.
  • Step S43 Obtain three-dimensional coordinates of feature points of the reference image.
• In the embodiment of the present application, the mobile terminal can perform feature point tracking on the target video using the KLT (Kanade-Lucas-Tomasi feature tracker) algorithm to obtain several feature points of the reference image and their three-dimensional coordinates.
  • Step S44 Determine a minimum depth value and a maximum depth value in the scene where the reference image is located according to the three-dimensional coordinates of the feature points of the reference image.
• Optionally, the minimum and maximum depth values of the feature points in the reference image may first be determined according to the three-dimensional coordinates; the depth range formed by these values is then expanded by a preset value to obtain the minimum depth value and the maximum depth value in the scene where the reference image is located. The preset value can be a predetermined empirical value.
  • Step S45 Collect a plurality of depth planes between the minimum depth value and the maximum depth value.
  • the number of depth planes to be collected and the manner in which the depth planes are collected may be preset. For example, 11 depth planes are uniformly collected between the minimum depth value and the maximum depth value.
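Steps S44-S45 can be sketched in a few lines; the feature-point depths and the 10% expansion margin are toy assumptions (the patent only says the margin is a preset empirical value).

```python
import numpy as np

# Depths of tracked feature points in the reference view (toy values).
feature_depths = np.array([2.0, 3.5, 8.0])

d_min, d_max = feature_depths.min(), feature_depths.max()
# S44: expand the feature-point depth range by a preset margin
# (assumed 10% here) to cover the whole scene.
margin = 0.1 * (d_max - d_min)
d_min, d_max = d_min - margin, d_max + margin

# S45: uniformly collect 11 depth planes between the two bounds,
# matching the example in the text.
depth_planes = np.linspace(d_min, d_max, 11)
```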
  • Step S46 Calculate, by using a camera internal parameter, a camera external parameter, and a direct linear transformation algorithm, a first homography matrix of a plurality of depth planes from a plane where the reference image is located to a plane where the m non-reference images are located.
• Since a first homography matrix is calculated for each combination of depth plane and non-reference image, a plurality of first homography matrices are obtained here.
• Step S47: Using a PS (plane sweep) algorithm and the first homography matrices, project each pixel point of the top image of the first image pyramid, through the multiple depth planes, onto the planes where the top images of the m second image pyramids are located, and obtain the parameter value of each pixel point after projection.
  • the parameter value can be the color and texture of each pixel.
• Step S48: Determine the matching loss of each pixel point at each depth value according to the parameter value of each pixel point of the top image of the first image pyramid and the parameter value of each pixel point after projection.
  • the matching loss can be defined as the absolute difference of the parameter values before and after the re-projection, and the parameter value can be a pixel color gradient.
  • Step S49 determining a matching loss of each pixel point of the top image of the first image pyramid in the plurality of depth planes as the first matching loss body.
• In the embodiment of the present application, the image is not rectified before calculating the matching loss as in the conventional method; instead, multiple depth planes are obtained and the matching loss is calculated by re-projection, so that depth recovery can better adapt to changes in the camera poses corresponding to the viewing angles of the reference image and the m non-reference images, improving the reliability of the depth recovery method.
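The re-projection-based matching loss of steps S46-S49 can be sketched for a single non-reference view with toy calibration values. The plane-induced homography H = K (R + t nᵀ / d) K⁻¹ is used here in place of the patent's DLT computation, and nearest-neighbour sampling with absolute intensity difference stands in for its parameter values; all calibration numbers below are assumptions.

```python
import numpy as np

f, cx, cy = 100.0, 16.0, 16.0
K = np.array([[f, 0, cx], [0, f, cy], [0, 0, 1.0]])
K_inv = np.linalg.inv(K)
R, t = np.eye(3), np.array([-0.1, 0.0, 0.0])   # pure sideways motion
n = np.array([0.0, 0.0, 1.0])                   # fronto-parallel planes

def plane_homography(d):
    # Homography induced by the plane at depth d in the reference view.
    return K @ (R + np.outer(t, n) / d) @ K_inv

def match_cost(ref, non_ref, depths):
    """Per-pixel matching loss over the depth planes: re-project each
    reference pixel into the non-reference view with H(d) and take the
    absolute intensity difference (nearest-neighbour sampling)."""
    h, w = ref.shape
    cost = np.full((len(depths), h, w), np.inf)
    ys, xs = np.mgrid[0:h, 0:w]
    pix = np.stack([xs.ravel(), ys.ravel(), np.ones(h * w)])
    for i, d in enumerate(depths):
        proj = plane_homography(d) @ pix
        u = np.rint(proj[0] / proj[2]).astype(int)
        v = np.rint(proj[1] / proj[2]).astype(int)
        ok = (u >= 0) & (u < w) & (v >= 0) & (v < h)
        c = np.full(h * w, np.inf)
        c[ok] = np.abs(ref.ravel()[ok] - non_ref[v[ok], u[ok]])
        cost[i] = c.reshape(h, w)
    return cost
```

For a scene lying at a single true depth, the cost slice at that depth is (near) zero and winner-take-all over the depth axis recovers it.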
  • FIG. 5 is a flowchart of still another image background blurring method provided by an embodiment of the present application.
  • the embodiment shown in FIG. 5 is based on the refined embodiment of step S45 in FIG. 4, so the same content as FIG. 4 can be seen in the embodiment shown in FIG.
  • the method shown in Figure 5 includes the following steps.
• Step S51: Using the camera internal parameters, the camera external parameters, and a DLT (Direct Linear Transform) algorithm, calculate second homography matrices of the first depth plane, where the minimum depth value is located, from the plane where the reference image is located to the planes where the m non-reference images are located.
  • Step S52 Calculate, by using a camera internal parameter, a camera external parameter, and a direct linear transformation algorithm, a third homography matrix of the second depth plane where the maximum depth value is located, from the reference image plane to the m non-reference image planes.
• One second homography matrix and one third homography matrix are calculated for each of the m non-reference images, so a plurality of second homography matrices and third homography matrices are obtained here.
  • Step S53 Projecting a pixel point in the reference image onto the plane where the m non-reference images are located according to the second homography matrix, to obtain a first projection point.
  • Step S54 Projecting one pixel point in the reference image onto the plane where the m non-reference images are located according to the third homography matrix, to obtain a second projection point.
  • Step S55 uniformly sampling a line formed between the first projection point and the second projection point to obtain a plurality of sampling points.
  • Step S56 Backprojecting a plurality of sampling points into a three-dimensional space of a viewing angle of the reference image to obtain a plurality of depth planes corresponding to depth values of the plurality of sampling points.
• In the embodiment of the present application, when the matching loss of a pixel of the reference image is calculated for a given depth plane, the pixel needs to be re-projected onto the m non-reference image planes. After re-projection through the multiple depth planes obtained above, the resulting positions in the m non-reference images are equally spaced, so the embodiment of the present application helps the subsequent steps extract the pixel matching information between the reference image and the m non-reference images more efficiently, thereby improving the precision of the scene depth map.
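Steps S51-S56 can be sketched with a toy one-dimensional geometry: project a reference pixel at the two bounding depths, uniformly sample the segment between the two projections, and back-project each sample to the depth that produced it. With a pure sideways baseline b and focal length f (both assumed values), a pixel u1 at depth d re-projects to u2 = u1 - f·b/d, which makes the equal image-space spacing equivalent to equal steps in inverse depth.

```python
import numpy as np

# Assumed toy geometry: pure sideways baseline b, focal length f.
f, b, u1 = 100.0, 0.1, 50.0
d_min, d_max = 1.0, 10.0

# S53/S54: project the reference pixel at the two bounding depth planes.
u_near = u1 - f * b / d_min
u_far = u1 - f * b / d_max

# S55: uniformly sample the segment between the two projections.
samples = np.linspace(u_near, u_far, 11)

# S56: back-project each sample to the depth plane that produced it.
depth_planes = f * b / (u1 - samples)

# Equal image-space spacing corresponds to equal *inverse* depth steps.
inv = 1.0 / depth_planes
```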
  • FIG. 6 is a flowchart of still another image background blurring method provided by an embodiment of the present application.
  • the embodiment shown in FIG. 6 is based on the refined embodiment of step S22 in FIG. 2, so the same content as FIG. 2 can be seen in the embodiment shown in FIG. 2.
  • the method shown in Figure 6 includes the following steps.
  • Step S61 Determine pixel points of the lower layer image of the first image pyramid corresponding to the pixel points of the top image of the first image pyramid.
  • Step S62 Determine pixel points of the lower layer images of the m second image pyramids corresponding to the pixel points of the top image of the m second image pyramids.
  • Step S63 Determine an estimated depth value of a pixel point of the lower layer image of the first image pyramid according to the preliminary depth map.
  • Step S64 Determine a minimum depth value and a maximum depth value of the pixel points of the lower layer image of the first image pyramid according to the estimated depth value.
  • Step S65 Determine a plurality of depth planes of the lower layer image of the first image pyramid between the minimum depth value and the maximum depth value.
  • Step S66 Calculate a second matching loss body corresponding to the lower layer image of the first image pyramid and the lower layer image of the m second image pyramids by using the plane scanning algorithm and the plurality of depth planes.
• Step S67: Using the lower layer image of the first image pyramid as the guide image, locally optimize the second matching loss body with a guided filtering algorithm to obtain a third matching loss body.
  • Step S68 Select a depth value with a minimum matching loss in the second matching loss body for each pixel of the lower layer image of the first image pyramid according to the third matching loss body, to obtain a scene depth map of the reference image.
• In the embodiment of the present application, the preliminary depth map is used to estimate the minimum and maximum depth values of the pixel points of the lower layer image of the first image pyramid, thereby determining a relatively small depth search interval. This reduces the amount of calculation and improves the robustness of the depth recovery method against interference such as image noise.
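Steps S66-S68 can be sketched with a minimal gray-scale guided filter aggregating each slice of the matching-loss volume, followed by a winner-take-all depth pick. The box-filter implementation, window radius, and `eps` are assumptions; a production version would use an optimized guided filter.

```python
import numpy as np

def box_filter(img, r):
    """Mean filter over a (2r+1)^2 window, edges handled by clamping
    (replicate padding + 2-D cumulative sums)."""
    h, w = img.shape
    pad = np.pad(img, r, mode='edge')
    c = pad.cumsum(0).cumsum(1)
    c = np.pad(c, ((1, 0), (1, 0)))
    s = c[2*r+1:, 2*r+1:] - c[:h, 2*r+1:] - c[2*r+1:, :w] + c[:h, :w]
    return s / (2 * r + 1) ** 2

def guided_filter(guide, src, r=2, eps=1e-3):
    """Gray-scale guided filter: edge-preserving local smoothing of
    `src` steered by the structure of `guide`."""
    mean_g, mean_s = box_filter(guide, r), box_filter(src, r)
    cov = box_filter(guide * src, r) - mean_g * mean_s
    var = box_filter(guide * guide, r) - mean_g ** 2
    a = cov / (var + eps)
    b = mean_s - a * mean_g
    return box_filter(a, r) * guide + box_filter(b, r)

def refine_depth(cost_volume, guide, depths, r=2):
    """S66-S68 sketch: filter each depth slice of the matching-loss
    volume with the layer image as guide, then take the per-pixel
    winner-take-all depth."""
    filtered = np.stack([guided_filter(guide, c, r) for c in cost_volume])
    return depths[np.argmin(filtered, axis=0)]
```

Because the guide image supplies the edges, the aggregated losses stay crisp at depth discontinuities while noise within smooth regions is averaged away, which is the local optimization the step describes.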
  • FIG. 7 is a schematic diagram of an image background blurring apparatus provided by an embodiment of the present application.
  • FIG. 7 is an embodiment of the apparatus corresponding to FIG. 1.
  • the terminal device includes the following modules:
• the extraction module 11 is configured to extract a reference image and m non-reference images in the target video according to an image extraction rule, where the target video is a video captured by the mobile terminal according to a predetermined trajectory, m is greater than or equal to 1;
  • a building module 12 configured to construct a first image pyramid by using a reference image, and construct m second image pyramids by using m non-reference images;
  • a first determining module 13 configured to determine a scene depth map of the reference image by using the first image pyramid and the m second image pyramids, where the scene depth map of the reference image represents a relative distance between any pixel point and the mobile terminal in the reference image ;
  • the dividing module 14 is configured to divide the pixels of the reference image into n depth layers using the scene depth map, where objects corresponding to pixels in different depth layers are at different depths from the mobile terminal, and n is greater than or equal to 2;
  • a second determining module 15 configured to determine a target location in the reference image
  • the third determining module 16 is configured to determine, from the n depth layers, a target depth layer where the pixel point corresponding to the target location is located;
  • the blur processing module 17 is configured to blur the pixels to be processed, where a pixel to be processed is a pixel included in a depth layer other than the target depth layer among the n depth layers.
  • the first determining module 13 is configured to determine a preliminary depth map of the reference image according to the top-layer image of the first image pyramid and the top-layer images of the m second image pyramids, where the first image pyramid and the m second image pyramids each include a top-layer image and lower-layer images; and to determine the scene depth map of the reference image according to the preliminary depth map, the lower-layer image of the first image pyramid, and the lower-layer images of the m second image pyramids.
  • the first determining module 13 is configured to calculate a first matching loss body according to the top-layer image of the first image pyramid and the top-layer images of the m second image pyramids; and to construct a Markov random field (MRF) model according to the first matching loss body and perform global matching loss optimization, to obtain the preliminary depth map of the reference image.
  • the first determining module 13 is specifically configured to: acquire the camera extrinsic parameters and camera intrinsic parameters of the mobile terminal at the viewing angles of the reference image and the m non-reference images; determine feature points in the reference image according to a feature point extraction rule; acquire the three-dimensional coordinates of the feature points of the reference image; determine, according to the three-dimensional coordinates of the feature points, the minimum depth value and the maximum depth value of the scene of the reference image; determine a plurality of depth planes between the minimum depth value and the maximum depth value; calculate, using the camera intrinsic parameters, the camera extrinsic parameters, and a direct linear transformation algorithm, first homography matrices that map the plurality of depth planes from the plane of the reference image to the planes of the m non-reference images; using the plane scanning algorithm and the first homography matrices, project each pixel of the top-layer image of the first image pyramid, at the plurality of depth planes, onto the planes of the top-layer images of the m second image pyramids to obtain a projected parameter value for each pixel; determine, according to the parameter value of each pixel of the top-layer image of the first image pyramid and its projected parameter values, the matching loss of each pixel at each depth value; and determine the matching losses of each pixel over the plurality of depth planes as the first matching loss body.
  • the first determining module 13 is specifically configured to: calculate, using the camera intrinsic parameters, the camera extrinsic parameters, and a direct linear transformation algorithm, a second homography matrix that maps the first depth plane, where the minimum depth value lies, from the reference image plane to the m non-reference image planes; calculate, likewise, a third homography matrix that maps the second depth plane, where the maximum depth value lies, from the reference image plane to the m non-reference image planes; project a pixel of the reference image onto the planes of the m non-reference images according to the second homography matrix to obtain a first projection point; project the same pixel onto the planes of the m non-reference images according to the third homography matrix to obtain a second projection point; uniformly sample a plurality of sampling points on the line segment between the first projection point and the second projection point; and back-project the plurality of sampling points into the three-dimensional space of the viewing angle of the reference image to obtain a plurality of depth planes corresponding to the depth values of the sampling points.
  • the first determining module 13 is specifically configured to: determine the pixels of the lower-layer image of the first image pyramid that correspond to the pixels of the top-layer image of the first image pyramid; determine the pixels of the lower-layer images of the m second image pyramids that correspond to the pixels of the top-layer images of the m second image pyramids; determine estimated depth values of the pixels of the lower-layer image of the first image pyramid according to the preliminary depth map; determine, according to the estimated depth values, the minimum depth value and the maximum depth value of the pixels of the lower-layer image of the first image pyramid; determine a plurality of depth planes of the lower-layer image of the first image pyramid between the minimum depth value and the maximum depth value; calculate, using the plane scanning algorithm and the plurality of depth planes, the second matching loss body corresponding to the lower-layer image of the first image pyramid and the lower-layer images of the m second image pyramids; using the lower-layer image of the first image pyramid as the guide image, locally optimize the second matching loss body with a guided filtering algorithm to obtain the third matching loss body; and, according to the third matching loss body, select for each pixel of the lower-layer image of the first image pyramid the depth value with the minimum matching loss, to obtain the scene depth map of the reference image.
  • the third determining module 16 is specifically configured to: acquire the specified pixel of the target position of the reference image; determine the pixel value corresponding to the specified pixel in the scene depth map; and determine, according to the pixel value corresponding to the specified pixel, the target depth layer where the specified pixel is located among the n depth layers.
  • the blur processing module 17 is specifically configured to: determine the L depth layers where the pixels to be processed are located, where L is greater than or equal to 2 and less than n; calculate the depth differences between the L depth layers and the target depth layer; and blur the pixels of each of the L depth layers by a preset ratio according to the depth difference, where the degree of blur of the pixels of each of the L depth layers is proportional to its depth difference.
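The depth-difference-proportional blurring performed by the blur processing module can be sketched as follows (an illustrative numpy version; the box-blur kernel and the radius-per-layer step are assumptions of this sketch, not parameters fixed by the patent):

```python
import numpy as np

def box_blur(img, radius):
    """Separable box (mean) blur with edge padding; radius 0 returns a copy."""
    if radius == 0:
        return img.copy()
    k = 2 * radius + 1
    kernel = np.ones(k) / k
    padded = np.pad(img, radius, mode='edge')
    tmp = np.apply_along_axis(lambda r: np.convolve(r, kernel, mode='valid'), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, kernel, mode='valid'), 0, tmp)

def blur_by_layer(img, layer_map, target_layer, radius_per_step=1):
    """Blur each pixel with a box radius proportional to its depth-layer
    difference from the target layer; target-layer pixels stay sharp."""
    out = img.astype(float).copy()
    for layer in np.unique(layer_map):
        diff = abs(int(layer) - int(target_layer))
        if diff == 0:
            continue  # target depth layer is left untouched
        blurred = box_blur(img.astype(float), diff * radius_per_step)
        mask = layer_map == layer
        out[mask] = blurred[mask]
    return out

img = np.arange(25, dtype=float).reshape(5, 5)  # stand-in reference image
layers = np.zeros((5, 5), dtype=int)            # layer 0 = target depth layer
layers[:, 3:] = 2                               # right side: 2 layers away
result = blur_by_layer(img, layers, target_layer=0)
# pixels in the target depth layer stay sharp; others are box-blurred
```

A production implementation would more likely use a Gaussian or lens-shaped kernel, but the proportionality between blur strength and depth difference is the same.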
  • FIG. 8 is a schematic diagram of still another image background blurring device provided by an embodiment of the present application.
  • the apparatus includes a processor 21 and a memory 22, where the memory 22 stores operation instructions executable by the processor 21, and the processor 21 reads the operation instructions in the memory 22 to implement the methods in the foregoing method embodiments.
  • FIG. 9 is a schematic diagram showing a design structure of an image background blurring device provided by an embodiment of the present application.
  • the image background blurring device includes a transmitter 1101, a receiver 1102, a controller/processor 1103, a memory 1104, and a modem processor 1105.
  • Transmitter 1101 conditions (e.g., performs analog conversion, filtering, amplification, and upconversion on) the output samples and generates an uplink signal, which is transmitted to the base station via the antenna.
  • the antenna receives the downlink signal transmitted by the base station.
  • Receiver 1102 conditions (eg, filters, amplifies, downconverts, digitizes, etc.) the signals received from the antenna and provides input samples.
  • encoder 1106 receives the traffic data and signaling messages to be transmitted on the uplink and processes (e.g., formats, codes, and interleaves) the traffic data and signaling messages.
  • Modulator 1107 further processes (e.g., symbol maps and modulates) the encoded traffic data and signaling messages and provides output samples.
  • Demodulator 1109 processes (e.g., demodulates) the input samples and provides symbol estimates.
  • the decoder 1108 processes (e.g., deinterleaves and decodes) the symbol estimate and provides decoded data and signaling messages that are sent to the terminal.
  • Encoder 1106, modulator 1107, demodulator 1109, and decoder 1108 may be implemented by a combined modem processor 1105. These units perform processing according to the radio access technology employed by the radio access network (e.g., the access technologies of LTE and other evolved systems).
  • the controller/processor 1103 is configured to: extract a reference image and m non-reference images from the target video according to an image extraction rule, where the target video is a video captured by the mobile terminal along a predetermined trajectory, and m is greater than or equal to 1; construct a first image pyramid using the reference image, and construct m second image pyramids using the m non-reference images; determine a scene depth map of the reference image using the first image pyramid and the m second image pyramids, where the scene depth map of the reference image represents the relative distance between any pixel in the reference image and the mobile terminal; divide the pixels of the reference image into n depth layers using the scene depth map, where objects corresponding to pixels in different depth layers are at different depths from the mobile terminal, and n is greater than or equal to 2; determine a target position in the reference image; determine, from the n depth layers, the target depth layer where the pixel corresponding to the target position is located; and blur the pixels to be processed, where a pixel to be processed is a pixel included in a depth layer other than the target depth layer among the n depth layers.
  • the disclosed systems, devices, and methods may be implemented in other manners.
  • the device embodiments described above are merely illustrative.
  • the division of units is only a logical function division.
  • multiple units or components may be combined or integrated into another system, or some features may be ignored or not performed.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device or unit, and may be in an electrical, mechanical or other form.
  • the units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Human Computer Interaction (AREA)
  • Image Processing (AREA)
  • Image Analysis (AREA)

Abstract

Embodiments of the present application disclose an image background blurring method and apparatus. The method comprises: extracting a reference image and m non-reference images from a target video according to an image extraction rule; constructing a first image pyramid by using the reference image, and constructing m second image pyramids by using the m non-reference images; determining a scene depth map of the reference image by using the first image pyramid and the m second image pyramids; dividing pixel points of the reference image into n depth layers by using the scene depth map; determining target positions in the reference image; determining, from the n depth layers, a target depth layer at which pixel points corresponding to the target positions are located; and blurring pixels to be processed. In the embodiments of the present application, pixels to be processed comprised in a depth layer of n depth layers other than a target depth layer can be blurred, so as to obtain an image in which pixels of the target depth layer are clear and the pixels to be processed are blurred.

Description

Image background blurring method and apparatus

Technical Field
The embodiments of the present application relate to the field of image processing technologies, and in particular, to an image background blurring method and apparatus.
Background
Background blurring of an image is a shooting technique in which the focus is placed on the subject of the image while non-subject elements are blurred. For example, when taking a landscape photograph with the mountain as the subject, the camera is focused on the mountain; the mountain is then imaged sharply while the water surface becomes blurred. Conversely, if the water surface is to be the subject, the camera is focused on the water surface, so that the water surface is imaged sharply while the mountain becomes blurred.
At present, photographs with a background blurring effect usually require a single-lens reflex camera with a large aperture. Widely used smartphones, however, are constrained by size, cost, and usage environment, and are therefore equipped with lenses of basically small aperture; because of this hardware limitation, smartphones with digital camera functions cannot achieve a background blurring effect.
Therefore, how to use a smartphone to capture a visually pleasing image with a clear foreground and a blurred background has become a technical problem that urgently needs to be solved.
Summary of the Invention
The embodiments of the present application provide an image background blurring method and apparatus, so that a mobile terminal can capture an image with a clear foreground and a blurred background. The embodiments of the present application are implemented as follows. In a first aspect, an embodiment of the present application provides an image background blurring method, the method including: extracting a reference image and m non-reference images from a target video according to an image extraction rule; constructing a first image pyramid using the reference image, and constructing m second image pyramids using the m non-reference images; determining a scene depth map of the reference image using the first image pyramid and the m second image pyramids; dividing the pixels of the reference image into n depth layers using the scene depth map; determining a target position in the reference image; determining, from the n depth layers, a target depth layer where the pixel corresponding to the target position is located; and blurring the pixels to be processed.
The target video is a video captured by the mobile terminal along a predetermined trajectory. The predetermined trajectory may be preset and is a movement trajectory within a single plane: it may run from left to right, from right to left, from top to bottom, or from bottom to top within that plane.
The image extraction rule is a preset rule. For example, the image extraction rule may be to select one reference image and m non-reference images from the target video according to the playing duration of the target video, where m is a positive integer greater than or equal to 1.
Both the reference image and the non-reference images are extracted from the target video at different moments. The reference image and the non-reference images capture the same scene, but the viewing angle of the reference image differs from the positions from which the non-reference images are captured.
When constructing the first image pyramid from the reference image, the mobile terminal uses the reference image as the bottom-layer image of the first image pyramid. The resolution of the bottom-layer image is then halved to produce the image of the next layer up, and this step is repeated to obtain each successive upper layer. After several repetitions, a first image pyramid containing the reference image at different resolutions is obtained.
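The pyramid construction described above can be sketched as follows (a minimal numpy illustration using 2x2 average pooling as the halving step; the patent does not prescribe a particular downsampling filter):

```python
import numpy as np

def build_pyramid(image, levels):
    """Build an image pyramid: level 0 is the full-resolution bottom image;
    each higher level halves the resolution (here via 2x2 average pooling,
    an illustrative choice of downsampling filter)."""
    pyramid = [np.asarray(image, dtype=float)]
    for _ in range(levels - 1):
        prev = pyramid[-1]
        h, w = prev.shape[0] // 2 * 2, prev.shape[1] // 2 * 2  # crop to even
        pooled = prev[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
        pyramid.append(pooled)
    return pyramid

ref = np.arange(16, dtype=float).reshape(4, 4)  # stand-in reference image
pyr = build_pyramid(ref, levels=3)
# resolutions: [(4, 4), (2, 2), (1, 1)]
```

The same routine would be applied to the reference image and to each of the m non-reference images to build the first and second image pyramids.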
The scene depth map of the reference image represents the relative distance between any pixel in the reference image and the mobile terminal; the pixel value of a pixel in the scene depth map represents the relative distance between the actual location of that pixel and the mobile terminal.
The mobile terminal can acquire the preset n and the manner of dividing the depth layers, and can therefore determine the number of depth layers and the depth range of each depth layer.
There are several ways to determine the target position in the reference image; they are briefly introduced below.
In the first way, the target position is determined in the reference image according to a control instruction. The control instruction may be an instruction input by the user with a finger on the touch screen of the mobile terminal.
In the second way, a specific position in the reference image is determined as the target position. The specific position in the reference image is a position specified in advance.
In the third way, a face image in the reference image is recognized, and the position of the face image in the reference image is determined as the target position.
Here, n is greater than or equal to 2, and the pixels to be processed are the pixels included in the depth layers other than the target depth layer among the n depth layers.
In the first aspect, the embodiment of the present application divides each pixel of the reference image into the n depth layers using the obtained scene depth map, and then uses the determined target position of the reference image to identify, among the n depth layers, the target depth layer where the pixel at the target position is located. The pixels to be processed, included in the depth layers other than the target depth layer, can therefore be blurred to obtain an image in which the pixels of the target depth layer are sharp and the pixels to be processed are blurred.
In a possible implementation, determining the scene depth map of the reference image using the first image pyramid and the m second image pyramids includes: determining a preliminary depth map of the reference image according to the top-layer image of the first image pyramid and the top-layer images of the m second image pyramids, where the first image pyramid and the m second image pyramids each include a top-layer image and lower-layer images; and determining the scene depth map of the reference image according to the preliminary depth map, the lower-layer image of the first image pyramid, and the lower-layer images of the m second image pyramids.
Here, depth sampling is performed on the reference image at different resolutions in the first image pyramid and the m second image pyramids, and the high-resolution scene depth map is derived from the low-resolution preliminary depth map, which speeds up depth recovery. The embodiment of the present application can therefore use image pyramids to generate the scene depth map of the reference image more quickly.
In a possible implementation, determining the preliminary depth map of the reference image according to the top-layer image of the first image pyramid and the top-layer images of the m second image pyramids includes: calculating a first matching loss body according to the top-layer image of the first image pyramid and the top-layer images of the m second image pyramids; and constructing a Markov random field (MRF) model according to the first matching loss body to perform global matching loss optimization, to obtain the preliminary depth map of the reference image.
Here, the first matching loss body may first be calculated according to the top-layer image of the first image pyramid and the top-layer images of the m second image pyramids; then an MRF model is constructed according to the first matching loss body to perform global matching loss optimization, so that a preliminary depth map of the reference image with smooth details can be obtained.
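A full 2-D MRF over the image requires approximate solvers such as graph cuts or belief propagation. As a hedged illustration of how a smoothness term turns noisy per-pixel matching losses into a coherent labeling, the following sketch solves the 1-D chain case exactly by dynamic programming (the chain restriction and the absolute-difference pairwise term are simplifying assumptions of this sketch, not the patent's formulation):

```python
import numpy as np

def chain_mrf_depth(unary, smooth_weight=1.0):
    """Exact MAP labeling of a 1-D chain MRF via Viterbi dynamic programming.

    unary : (N, D) matching loss of each of N pixels for each of D depth labels
    returns: (N,) selected depth label per pixel
    """
    n, d = unary.shape
    labels = np.arange(d)
    # pairwise smoothness: penalize neighboring pixels with different depths
    pairwise = smooth_weight * np.abs(labels[:, None] - labels[None, :])
    cost = unary[0].copy()
    back = np.zeros((n, d), dtype=int)
    for i in range(1, n):
        total = cost[:, None] + pairwise        # (prev_label, cur_label)
        back[i] = np.argmin(total, axis=0)      # best predecessor per label
        cost = total[back[i], labels] + unary[i]
    best = np.zeros(n, dtype=int)
    best[-1] = int(np.argmin(cost))
    for i in range(n - 1, 0, -1):               # backtrack the optimal path
        best[i - 1] = back[i, best[i]]
    return best

unary = np.array([[0., 1.], [0., 1.], [1., 0.], [0., 1.]])  # pixel 2 is noisy
labels_smooth = chain_mrf_depth(unary, smooth_weight=0.6)   # -> [0, 0, 0, 0]
```

With the smoothness term, the noisy pixel is pulled back to the depth of its neighbors; with `smooth_weight=0` the result degenerates to a per-pixel argmin.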
In a possible implementation, calculating the first matching loss body according to the top-layer image of the first image pyramid and the top-layer images of the m second image pyramids includes: acquiring the camera extrinsic parameters and camera intrinsic parameters of the mobile terminal at the viewing angles of the reference image and the m non-reference images; determining feature points in the reference image according to a feature point extraction rule; acquiring the three-dimensional coordinates of the feature points of the reference image; determining the minimum depth value and the maximum depth value of the scene of the reference image according to the three-dimensional coordinates of the feature points; determining a plurality of depth planes between the minimum depth value and the maximum depth value; calculating, using the camera intrinsic parameters, the camera extrinsic parameters, and a direct linear transformation algorithm, first homography matrices that map the plurality of depth planes from the plane of the reference image to the planes of the m non-reference images; using the plane scanning algorithm and the first homography matrices, projecting each pixel of the top-layer image of the first image pyramid, at the plurality of depth planes, onto the planes of the top-layer images of the m second image pyramids, to obtain a projected parameter value for each pixel; determining, according to the parameter value of each pixel of the top-layer image of the first image pyramid and its projected parameter values, the matching loss of each pixel at each depth value; and determining the matching losses of each pixel of the top-layer image of the first image pyramid over the plurality of depth planes as the first matching loss body.
Here, multiple depth planes are obtained and the matching loss is calculated by reprojection, so that during depth recovery the method can better adapt to the camera pose changes between the viewing angles of the reference image and the m non-reference images, improving the reliability of the depth recovery method.
In a possible implementation, determining the plurality of depth planes between the minimum depth value and the maximum depth value includes: calculating, using the camera intrinsic parameters, the camera extrinsic parameters, and a direct linear transformation algorithm, a second homography matrix that maps the first depth plane, where the minimum depth value lies, from the reference image plane to the m non-reference image planes; calculating, likewise, a third homography matrix that maps the second depth plane, where the maximum depth value lies, from the reference image plane to the m non-reference image planes; projecting a pixel of the reference image onto the planes of the m non-reference images according to the second homography matrix to obtain a first projection point; projecting the same pixel onto the planes of the m non-reference images according to the third homography matrix to obtain a second projection point; uniformly sampling a plurality of sampling points on the line segment between the first projection point and the second projection point; and back-projecting the plurality of sampling points into the three-dimensional space of the viewing angle of the reference image to obtain a plurality of depth planes corresponding to the depth values of the sampling points.
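The equal-spacing property of this sampling can be illustrated with a simplified camera pair (a minimal sketch assuming a rectified, translation-only configuration in which the projection offset of a pixel at depth d is focal*baseline/d; the patent's general case uses the homographies above instead, and the focal length and baseline here are made-up values):

```python
import numpy as np

def sample_depth_planes(d_min, d_max, num_planes, focal=500.0, baseline=0.1):
    """Choose depth planes so that their reprojections into a non-reference
    view are equally spaced. In this simplified translational setup the
    projection offset (disparity) of a pixel at depth d is focal*baseline/d,
    so uniform sampling between the projections of the nearest and farthest
    planes means uniform sampling in inverse depth."""
    disp_near = focal * baseline / d_min          # first projection point
    disp_far = focal * baseline / d_max           # second projection point
    disparities = np.linspace(disp_near, disp_far, num_planes)
    return focal * baseline / disparities         # back-project to depths

planes = sample_depth_planes(1.0, 10.0, num_planes=5)
# planes -> [1.0, 1.29..., 1.81..., 3.07..., 10.0] (denser near the camera)
```

Note the resulting depth planes are denser near the camera, which is exactly what equal image-space spacing implies.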
Here, when calculating the matching loss of a pixel of the reference image for one depth plane, the pixel needs to be reprojected onto the m non-reference image planes. With the sampled depth planes, the reprojected positions in the m non-reference images are equally spaced, which helps the subsequent steps extract the pixel matching information between the reference image and the m non-reference images more efficiently, thereby improving the accuracy of the scene depth map.
In a possible implementation, determining the scene depth map of the reference image according to the preliminary depth map, the lower-layer image of the first image pyramid, and the lower-layer images of the m second image pyramids includes: determining the pixels of the lower-layer image of the first image pyramid that correspond to the pixels of the top-layer image of the first image pyramid; determining the pixels of the lower-layer images of the m second image pyramids that correspond to the pixels of the top-layer images of the m second image pyramids; determining estimated depth values of the pixels of the lower-layer image of the first image pyramid according to the preliminary depth map; determining the minimum depth value and the maximum depth value of the pixels of the lower-layer image of the first image pyramid according to the estimated depth values; determining a plurality of depth planes of the lower-layer image of the first image pyramid between the minimum depth value and the maximum depth value; calculating, using the plane scanning algorithm and the plurality of depth planes, a second matching loss body corresponding to the lower-layer image of the first image pyramid and the lower-layer images of the m second image pyramids; using the lower-layer image of the first image pyramid as the guide image, locally optimizing the second matching loss body with a guided filtering algorithm to obtain a third matching loss body; and, according to the third matching loss body, selecting for each pixel of the lower-layer image of the first image pyramid the depth value with the minimum matching loss in the second matching loss body, to obtain the scene depth map of the reference image.
Here, the preliminary depth map is used to estimate the minimum and maximum depth values of the pixels of the lower-layer image of the first image pyramid, thereby determining a relatively small depth search interval, which reduces the amount of computation and improves the robustness of the depth recovery method against interference such as image noise.
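The local optimization step above uses guided filtering. Below is a minimal numpy sketch of the standard guided filter (He et al.) applied to one depth slice of a loss volume, with the lower-layer reference image as the guide (the window radius and regularization eps are illustrative choices of this sketch):

```python
import numpy as np

def box_mean(a, r):
    """Mean over a (2r+1)x(2r+1) window, edge-padded, via cumulative sums."""
    k = 2 * r + 1
    p = np.pad(a, r, mode='edge')
    c = p.cumsum(axis=0)
    rows = np.vstack([c[k - 1:k], c[k:] - c[:-k]])        # windowed row sums
    c = rows.cumsum(axis=1)
    out = np.hstack([c[:, k - 1:k], c[:, k:] - c[:, :-k]])
    return out / (k * k)

def guided_filter(guide, src, r=2, eps=1e-4):
    """Standard guided filter: smooth `src` while following the edges of
    `guide`. Here `src` would be one depth slice of the matching loss body
    and `guide` the reference image at the same pyramid level."""
    mean_I = box_mean(guide, r)
    mean_p = box_mean(src, r)
    cov_Ip = box_mean(guide * src, r) - mean_I * mean_p
    var_I = box_mean(guide * guide, r) - mean_I ** 2
    a = cov_Ip / (var_I + eps)
    b = mean_p - a * mean_I
    return box_mean(a, r) * guide + box_mean(b, r)

rng = np.random.default_rng(0)
guide = rng.random((8, 8))        # lower-layer reference image (guide)
cost_slice = rng.random((8, 8))   # one depth slice of the loss volume
smoothed = guided_filter(guide, cost_slice)
```

In the full method this filter runs once per depth plane, producing the third matching loss body from the second.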
In a possible implementation, determining, from the n depth layers, the target depth layer where the pixel corresponding to the target position is located includes: acquiring the specified pixel of the target position of the reference image; determining the pixel value corresponding to the specified pixel in the scene depth map; and determining, according to the pixel value corresponding to the specified pixel, the target depth layer where the specified pixel is located among the n depth layers.

Here, after the mobile terminal determines the target position in the reference image, it can directly obtain the specified pixel of the target position and then determine the pixel value corresponding to the specified pixel in the scene depth map. From this pixel value, the corresponding depth layer is known, so the target depth layer where the pixel corresponding to the target position is located can be determined among the n depth layers.
In a possible implementation, blurring the pixels to be processed includes: determining the L depth layers where the pixels to be processed are located, where L is greater than or equal to 2 and less than n; calculating the depth differences between the L depth layers and the target depth layer; and blurring the pixels of each of the L depth layers by a preset ratio according to the depth difference, where the degree of blur of the pixels of each of the L depth layers is proportional to its depth difference.
Here, since both the target depth layer and the L depth layers are available, the depth difference between each of the L depth layers and the target depth layer can be calculated, and the mobile terminal can then blur the pixels of each of the L depth layers by a preset ratio according to the depth difference. The degree of blur of the pixels in each of the L depth layers is proportional to the depth difference: the larger the depth difference between a depth layer and the target depth layer, the more strongly the pixels in that layer are blurred; the smaller the depth difference, the more weakly they are blurred. The result reflects the sense of depth between different distances in the reference image.
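The proportional blurring described above can be sketched as follows. This is an illustrative sketch, not the patent's implementation: the box filter, the kernel-size formula, and the array-based inputs (`image`, `layer_map`) are all assumptions introduced for the example.

```python
import numpy as np

def box_blur(image, k):
    """Simple box filter (a stand-in for whatever blur the terminal uses)."""
    pad = k // 2
    padded = np.pad(image, ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    acc = np.zeros_like(image, dtype=float)
    for dy in range(k):
        for dx in range(k):
            acc += padded[dy:dy + image.shape[0], dx:dx + image.shape[1]]
    return acc / (k * k)

def blur_by_depth_difference(image, layer_map, target_layer, max_kernel=9):
    """Blur each depth layer by a preset ratio proportional to its depth
    difference from the target depth layer; the target layer stays sharp."""
    out = image.astype(float).copy()
    layers = np.unique(layer_map)
    max_diff = max(abs(int(l) - target_layer) for l in layers) or 1
    for l in layers:
        diff = abs(int(l) - target_layer)
        if diff == 0:
            continue  # pixels of the target depth layer are left untouched
        # odd kernel size grows linearly with the depth difference
        k = 1 + 2 * int(round((max_kernel // 2) * diff / max_diff))
        mask = layer_map == l
        out[mask] = box_blur(image.astype(float), k)[mask]
    return out
```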
In a second aspect, an embodiment of this application provides an image background blurring apparatus. The apparatus includes: an extraction module, configured to extract one reference image and m non-reference images from a target video according to an image extraction rule, where the target video is a video captured by a mobile terminal along a predetermined trajectory, and m is greater than or equal to 1;
a construction module, configured to construct a first image pyramid from the reference image and m second image pyramids from the m non-reference images;
a first determining module, configured to determine a scene depth map of the reference image by using the first image pyramid and the m second image pyramids, where the scene depth map of the reference image represents the relative distance between any pixel in the reference image and the mobile terminal;
a division module, configured to divide the pixels of the reference image into n depth layers by using the scene depth map, where objects corresponding to pixels in different depth layers are at different depths from the mobile terminal, and n is greater than or equal to 2;
a second determining module, configured to determine a target position in the reference image;
a third determining module, configured to determine, from the n depth layers, the target depth layer in which the pixel corresponding to the target position is located; and
a blur processing module, configured to blur the to-be-processed pixels, where the to-be-processed pixels are the pixels contained in the depth layers other than the target depth layer among the n depth layers.
In the second aspect, the embodiment of this application uses the obtained scene depth map to divide each pixel of the reference image into n depth layers, and then uses the determined target position of the reference image to identify, among the n depth layers, the target depth layer in which the pixel at the target position is located. The embodiment can therefore blur the to-be-processed pixels contained in the depth layers other than the target depth layer among the n depth layers, obtaining an image in which the pixels of the target depth layer are sharp and the to-be-processed pixels are blurred.
In a possible implementation, the first determining module is specifically configured to: determine a preliminary depth map of the reference image according to the top-layer image of the first image pyramid and the top-layer images of the m second image pyramids, where the first image pyramid and the m second image pyramids each include a top-layer image and lower-layer images; and determine the scene depth map of the reference image according to the preliminary depth map, the lower-layer images of the first image pyramid, and the lower-layer images of the m second image pyramids.
Here, the reference image is depth-sampled at different resolutions within the first image pyramid and the m second image pyramids, and the high-resolution scene depth map is derived from the low-resolution preliminary depth map, which speeds up depth recovery. This embodiment of this application can therefore generate the scene depth map of the reference image more quickly by using image pyramids.
In a possible implementation, the first determining module is specifically configured to: calculate a first matching cost volume according to the top-layer image of the first image pyramid and the top-layer images of the m second image pyramids; and construct a Markov random field (MRF) model based on the first matching cost volume to perform global matching cost optimization, obtaining the preliminary depth map of the reference image.
Here, the first matching cost volume may first be calculated according to the top-layer image of the first image pyramid and the top-layer images of the m second image pyramids; the MRF model is then constructed from the first matching cost volume to perform global matching cost optimization, so that a preliminary depth map of the reference image with smooth detail can be obtained.
In a possible implementation, the first determining module is specifically configured to: acquire the camera extrinsic parameters and camera intrinsic parameters of the mobile terminal at the viewpoints of the reference image and the m non-reference images; determine feature points in the reference image according to a feature-point extraction rule; acquire the three-dimensional coordinates of the feature points of the reference image; determine the minimum depth value and the maximum depth value within the scene of the reference image according to the three-dimensional coordinates of the feature points; determine multiple depth planes between the minimum depth value and the maximum depth value; calculate, by using the camera intrinsics, the camera extrinsics, and a direct linear transformation (DLT) algorithm, the first homography matrices that map the multiple depth planes from the plane of the reference image to the planes of the m non-reference images; project, by using a plane-sweep algorithm and the first homography matrices, each pixel of the top-layer image of the first image pyramid at the multiple depth planes onto the planes of the top-layer images of the m second image pyramids, obtaining the projected parameter value of each pixel; determine the matching cost of each pixel at each depth value according to the parameter value of each pixel of the top-layer image of the first image pyramid and its projected parameter value; and determine the matching costs of every pixel of the top-layer image of the first image pyramid over the multiple depth planes as the first matching cost volume.
Here, multiple depth planes are obtained and the matching cost is computed via reprojection, so that during depth recovery the method better adapts to the camera pose changes between the viewpoints of the reference image and the m non-reference images, improving the reliability of the depth recovery method.
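The per-pixel, per-depth-plane matching cost described above can be sketched as follows. This is a simplified photo-consistency cost (mean absolute intensity difference under the per-plane homographies, nearest-neighbour sampling), assumed for illustration only; the patent does not fix the exact cost function, and the `homographies[v][d]` layout (one 3×3 matrix per non-reference view v and depth plane d) is a hypothetical convention.

```python
import numpy as np

def matching_cost_volume(ref, non_refs, homographies, oob_cost=255.0):
    """cost[d, y, x] = mean, over non-reference views, of the absolute
    intensity difference between reference pixel (x, y) and its
    reprojection under the homography of depth plane d."""
    num_views = len(non_refs)
    num_planes = len(homographies[0])
    h, w = ref.shape
    cost = np.zeros((num_planes, h, w))
    ys, xs = np.mgrid[0:h, 0:w]
    pts = np.stack([xs.ravel(), ys.ravel(), np.ones(h * w)])  # homogeneous pixels
    for d in range(num_planes):
        acc = np.zeros(h * w)
        for v in range(num_views):
            proj = homographies[v][d] @ pts
            px = np.rint(proj[0] / proj[2]).astype(int)
            py = np.rint(proj[1] / proj[2]).astype(int)
            inside = (px >= 0) & (px < w) & (py >= 0) & (py < h)
            diff = np.full(h * w, oob_cost)  # penalize out-of-bounds projections
            diff[inside] = np.abs(ref.ravel()[inside] -
                                  non_refs[v][py[inside], px[inside]])
            acc += diff
        cost[d] = (acc / num_views).reshape(h, w)
    return cost
```

With identity homographies and identical views the cost is zero everywhere, which is a quick sanity check of the reprojection indexing.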
In a possible implementation, the first determining module is specifically configured to: calculate, by using the camera intrinsics, the camera extrinsics, and the DLT algorithm, the second homography matrices that map the first depth plane, located at the minimum depth value, from the reference image plane to the m non-reference image planes; calculate, likewise, the third homography matrices that map the second depth plane, located at the maximum depth value, from the reference image plane to the m non-reference image planes; project a pixel of the reference image onto the planes of the m non-reference images according to the second homography matrices to obtain a first projection point; project the same pixel according to the third homography matrices to obtain a second projection point; sample uniformly along the straight line between the first projection point and the second projection point to obtain multiple sampling points; and back-project the sampling points into the three-dimensional space of the reference image viewpoint, obtaining the multiple depth planes corresponding to the depth values of the sampling points.
Here, when the matching cost of a pixel of the reference image at one depth plane is calculated, the pixel needs to be reprojected onto the m non-reference image planes. Because the reprojections of the sampled depth planes are equally spaced in the m non-reference images, this embodiment of this application helps the subsequent steps extract the pixel matching information between the reference image and the m non-reference images more efficiently, which improves the accuracy of the scene depth map.
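Under the additional assumption of a rectified two-view geometry, where a pixel's displacement between views (disparity) is proportional to inverse depth, sampling uniformly along the segment between the two projection points reduces to sampling uniformly in 1/d. A minimal sketch under that assumption, not the patent's general back-projection procedure:

```python
def sample_depth_planes(d_min, d_max, num_planes):
    """Depth values whose reprojections are equally spaced in the image:
    uniform steps in inverse depth between 1/d_min and 1/d_max."""
    step = (1.0 / d_max - 1.0 / d_min) / (num_planes - 1)
    return [1.0 / (1.0 / d_min + i * step) for i in range(num_planes)]
```

Note that the resulting depth values are dense near d_min and sparse near d_max, unlike uniform sampling in depth itself.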
In a possible implementation, the first determining module is specifically configured to: determine the pixels of the lower-layer image of the first image pyramid corresponding to the pixels of the top-layer image of the first image pyramid; determine the pixels of the lower-layer images of the m second image pyramids corresponding to the pixels of the top-layer images of the m second image pyramids; determine estimated depth values of the pixels of the lower-layer image of the first image pyramid according to the preliminary depth map; determine the minimum depth value and the maximum depth value of the pixels of the lower-layer image of the first image pyramid according to the estimated depth values; determine multiple depth planes of the lower-layer image of the first image pyramid between the minimum depth value and the maximum depth value; calculate, by using the plane-sweep algorithm and the multiple depth planes, a second matching cost volume corresponding to the lower-layer image of the first image pyramid and the lower-layer images of the m second image pyramids; locally optimize the second matching cost volume by using a guided filtering algorithm with the lower-layer image of the first image pyramid as the guide image, obtaining a third matching cost volume; and, according to the third matching cost volume, select for each pixel of the lower-layer image of the first image pyramid the depth value with the smallest matching cost, obtaining the scene depth map of the reference image.
Here, the preliminary depth map is used to estimate the minimum and maximum depth values of the pixels of the lower-layer image of the first image pyramid, which in turn determines a relatively small depth search interval. This reduces the amount of computation and improves the robustness of the depth recovery method against interference such as image noise.
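The restricted per-pixel depth search interval can be sketched as follows; the ±20% margin and the default plane count are illustrative choices, not values from the patent:

```python
def local_depth_planes(est_depth, num_planes=8, margin=0.2):
    """Candidate depth planes for one lower-layer pixel, centred on the
    depth estimate propagated from the preliminary (coarse) depth map."""
    d_min = est_depth * (1.0 - margin)
    d_max = est_depth * (1.0 + margin)
    step = (d_max - d_min) / (num_planes - 1)
    return [d_min + i * step for i in range(num_planes)]
```

Searching only this narrow interval around the coarse estimate, instead of the full scene depth range, is what reduces the computation at the finer pyramid levels.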
In a possible implementation, the third determining module is specifically configured to: acquire a specified pixel at the target position of the reference image; determine, in the scene depth map, the pixel value corresponding to the specified pixel; and determine, among the n depth layers according to the pixel value corresponding to the specified pixel, the target depth layer in which the specified pixel is located.
Here, after the mobile terminal determines the target position in the reference image, it can directly obtain the specified pixel at the target position, and then determine, in the scene depth map, the pixel value corresponding to the specified pixel, from which the corresponding target depth layer is known. At this point, the target depth layer in which the pixel corresponding to the target position is located can be determined among the n depth layers.
In a possible implementation, the blur processing module is specifically configured to: determine the L depth layers in which the to-be-processed pixels are located, where L is greater than or equal to 2 and less than n; calculate the depth difference between each of the L depth layers and the target depth layer; and blur the pixels of each of the L depth layers by a preset ratio according to the depth difference, where the degree of blur of the pixels in each of the L depth layers is proportional to the depth difference.
Here, since both the target depth layer and the L depth layers are available, the depth difference between each of the L depth layers and the target depth layer can be calculated, and the mobile terminal can then blur the pixels of each of the L depth layers by a preset ratio according to the depth difference. The degree of blur of the pixels in each of the L depth layers is proportional to the depth difference: the larger the depth difference between a depth layer and the target depth layer, the more strongly the pixels in that layer are blurred; the smaller the depth difference, the more weakly they are blurred. The result reflects the sense of depth between different distances in the reference image.
In a third aspect, an embodiment of this application provides an image background blurring apparatus. The apparatus includes a processor and a memory, where the memory stores operation instructions executable by the processor, and the processor reads the operation instructions in the memory to implement the following method: extracting one reference image and m non-reference images from a target video according to an image extraction rule, where the target video is a video captured by a mobile terminal along a predetermined trajectory, and m is greater than or equal to 1; constructing a first image pyramid from the reference image and m second image pyramids from the m non-reference images; determining a scene depth map of the reference image by using the first image pyramid and the m second image pyramids, where the scene depth map of the reference image represents the relative distance between any pixel in the reference image and the mobile terminal; dividing the pixels of the reference image into n depth layers by using the scene depth map, where objects corresponding to pixels in different depth layers are at different depths from the mobile terminal, and n is greater than or equal to 2; determining a target position in the reference image; determining, from the n depth layers, the target depth layer in which the pixel corresponding to the target position is located; and blurring the to-be-processed pixels, where the to-be-processed pixels are the pixels contained in the depth layers other than the target depth layer among the n depth layers.
In the third aspect, the embodiment of this application uses the obtained scene depth map to divide each pixel of the reference image into n depth layers, and then uses the determined target position of the reference image to identify, among the n depth layers, the target depth layer in which the pixel at the target position is located. The embodiment can therefore blur the to-be-processed pixels contained in the depth layers other than the target depth layer among the n depth layers, obtaining an image in which the pixels of the target depth layer are sharp and the to-be-processed pixels are blurred.
BRIEF DESCRIPTION OF DRAWINGS
To describe the technical solutions in the embodiments of this application more clearly, the following briefly introduces the accompanying drawings required in the embodiments. Apparently, the accompanying drawings in the following description show merely some embodiments of this application, and a person of ordinary skill in the art may still derive other drawings from these accompanying drawings without creative efforts.
FIG. 1 is a flowchart of an image background blurring method according to an embodiment of this application;
FIG. 2 is a flowchart of another image background blurring method according to an embodiment of this application;
FIG. 3 is a flowchart of still another image background blurring method according to an embodiment of this application;
FIG. 4 is a flowchart of still another image background blurring method according to an embodiment of this application;
FIG. 5 is a flowchart of still another image background blurring method according to an embodiment of this application;
FIG. 6 is a flowchart of still another image background blurring method according to an embodiment of this application;
FIG. 7 is a schematic diagram of an image background blurring apparatus according to an embodiment of this application;
FIG. 8 is a schematic diagram of still another image background blurring apparatus according to an embodiment of this application;
FIG. 9 is a schematic diagram of a design structure of an image background blurring apparatus according to an embodiment of this application.
DETAILED DESCRIPTION
The following describes the technical solutions in the embodiments of this application with reference to the accompanying drawings in the embodiments of this application.
FIG. 1 is a flowchart of an image background blurring method according to an embodiment of this application. The image background blurring method shown in FIG. 1 enables a mobile terminal to capture an image with a sharp foreground and a blurred background. The method includes the following steps.
Step S11: Extract one reference image and m non-reference images from a target video according to an image extraction rule, where the target video is a video captured by a mobile terminal along a predetermined trajectory, and m is greater than or equal to 1.
The method provided in this embodiment of this application may be applied in a mobile terminal, and the mobile terminal may be a device such as a smartphone.
The target video is a video captured by the mobile terminal along a predetermined trajectory. The predetermined trajectory may be preset and is a movement trajectory within a single plane. For example, the predetermined trajectory may be a left-to-right movement in one plane, a right-to-left movement in one plane, a top-to-bottom movement in one plane, or a bottom-to-top movement in one plane. Certainly, whichever predetermined trajectory is used to capture the video, the camera of the mobile terminal needs to stay aimed at the position to be photographed throughout.
When capturing the target video with the mobile terminal, the user needs to hold the mobile terminal and move it slowly and steadily in a single direction; the moving distance may be 20 cm to 30 cm. While the user moves with the mobile terminal, the mobile terminal can judge the moving distance according to its gyroscope and select a suitable reference image and suitable non-reference images from the target video.
The image extraction rule is a preset rule. The image extraction rule may select one reference image and m non-reference images from the target video according to the playback duration of the target video, where m is a positive integer greater than or equal to 1. For example, assuming the target video is 20 seconds long, the image extraction rule may select 1 reference image and 4 non-reference images from the target video, taking the image at the 10th second as the reference image and the images at the 1st, 3rd, 18th, and 20th seconds as the non-reference images.
Certainly, this embodiment of this application does not limit the number of non-reference images. For example, the number of non-reference images may be 3, 4, or 5.
Both the reference image and the non-reference images are images extracted from the target video at different moments, and the reference image and the non-reference images capture the same scene; however, the viewpoints from which the reference image and the non-reference images were captured are different. For example, suppose the user captures a 10-second target video with the mobile terminal, and the scene of the target video is plant A and plant B. The user presets the image extraction rule to take the image at the 5th second of the target video as the reference image, and the four images at the 1st, 3rd, 8th, and 10th seconds as the non-reference images. The scenes of the reference image and the non-reference images are all plant A and plant B, but the positions from which the reference image and the non-reference images were captured differ.
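One plausible extraction rule along the lines of the discussion above can be sketched as follows; the choice of timestamps (middle of the video for the reference, evenly spread for the non-references) is an assumption for illustration, not the rule mandated by the text:

```python
def extraction_times(duration_s, m):
    """Reference frame at the middle of the video; m non-reference frames
    spread evenly over the full duration."""
    ref_t = duration_s / 2.0
    if m == 1:
        non_ref_ts = [0.0]
    else:
        non_ref_ts = [duration_s * i / (m - 1) for i in range(m)]
    return ref_t, non_ref_ts
```

For a 20-second video with m = 4, this yields a reference frame at 10 s and non-reference frames at 0 s, 6.67 s, 13.33 s, and 20 s.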
Step S12: Construct a first image pyramid from the reference image, and construct m second image pyramids from the m non-reference images.
After the mobile terminal extracts the reference image and the non-reference images from the target video, it can construct one first image pyramid from the one reference image and m second image pyramids from the m non-reference images. The terms "first" and "second" in "first image pyramid" and "second image pyramid" merely distinguish image pyramids constructed from different images: the first image pyramid denotes only the image pyramid constructed from the reference image, and a second image pyramid denotes only an image pyramid constructed from a non-reference image.
When the mobile terminal constructs the first image pyramid from the reference image, it takes the reference image as the bottom-layer image of the first image pyramid. It then halves the resolution of the bottom-layer image of the first image pyramid to obtain the layer above it, and keeps repeating this step, each time obtaining the next layer up of the first image pyramid. After several repetitions, a first image pyramid containing the reference image at different resolutions is obtained.
The following example briefly illustrates the construction of the first image pyramid. Suppose the number of layers of the first image pyramid is limited to three in advance, and the resolution of the reference image is 1000×1000. The mobile terminal takes the reference image as the third-layer image of the first image pyramid, so the third-layer image has a resolution of 1000×1000. It then halves the resolution of the third-layer image of the first image pyramid to obtain the second-layer image, whose resolution is 500×500. Finally, it halves the resolution of the second-layer image of the first image pyramid again to obtain the first-layer image, whose resolution is 250×250. At this point, the first image pyramid contains three layers, each a reference image at a different resolution: the first-layer image is a reference image at 250×250, the second-layer image is a reference image at 500×500, and the third-layer image is a reference image at 1000×1000.
Certainly, a second image pyramid is constructed in the same way as the first image pyramid, and the second image pyramids have the same number of layers as the first image pyramid; the number of layers of the first and second image pyramids may be set according to the actual situation.
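The pyramid construction described above can be sketched as follows. The 2×2 averaging used to halve the resolution is an assumed downsampling filter, which the text does not specify:

```python
import numpy as np

def build_pyramid(image, levels):
    """Bottom level is the full-resolution image; each level above halves
    the resolution of the level below it by 2x2 averaging."""
    pyramid = [image]
    for _ in range(levels - 1):
        prev = pyramid[-1]
        h, w = prev.shape[0] // 2, prev.shape[1] // 2
        prev = prev[:2 * h, :2 * w]  # crop odd rows/columns if needed
        down = (prev[0::2, 0::2] + prev[1::2, 0::2] +
                prev[0::2, 1::2] + prev[1::2, 1::2]) / 4.0
        pyramid.append(down)
    return pyramid  # pyramid[-1] is the top (lowest-resolution) layer
```

For a 1000×1000 reference image and three levels, this yields layers of 1000×1000, 500×500, and 250×250, matching the example above.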
Step S13: Determine a scene depth map of the reference image by using the first image pyramid and the m second image pyramids.
After the one first image pyramid and the m second image pyramids have been constructed, the scene depth map of the reference image can be determined by using the first image pyramid and the m second image pyramids.
The scene depth map of the reference image represents the relative distance between any pixel in the reference image and the mobile terminal; the pixel value of a pixel in the scene depth map represents the relative distance between the actual location of that pixel and the mobile terminal. To better explain the scene depth map, a brief example follows. Suppose the resolution of the reference image is 100×100, so the reference image has 10000 pixels. After the scene depth map of the reference image is determined by using the first image pyramid and the m second image pyramids, the pixel values of the 10000 pixels in the scene depth map represent the relative distances between the actual locations of the 10000 pixels and the mobile terminal.
Step S14: Divide the pixels of the reference image into n depth layers by using the scene depth map.
Objects corresponding to pixels in different depth layers are at different depths from the mobile terminal, where n is greater than or equal to 2, and each depth layer has a depth range; for example, the depth range of a depth layer may be 10 m to 20 m. The n depth layers make up the scene depth of the reference image, where the scene depth is the distance between the mobile terminal and the location of the farthest pixel in the reference image; for example, the scene depth may be 0 m to 30 m.
移动终端可以获取到预先设定好的n和划分深度层的方式,从而可以得知深度层的数量以及每个深度层的深度范围。在得到参考图像的场景深度图以后,场景深度图中的像素点的像素值便可以确定。由于场景深度图中的像素点的像素值表示像素点所在的实际位置与移动终端的相对距离,所以移动终端可以按照场景深度图的像素点的像素值将参考图像的每个像素点划分到n个深度层中。The mobile terminal can acquire the preset n and the manner of dividing the depth layer, so that the number of depth layers and the depth range of each depth layer can be known. After the scene depth map of the reference image is obtained, the pixel values of the pixel points in the scene depth map can be determined. Since the pixel value of the pixel point in the scene depth map indicates the relative distance between the actual position where the pixel point is located and the mobile terminal, the mobile terminal may divide each pixel point of the reference image into n according to the pixel value of the pixel point of the scene depth map. In the depth layer.
例如,假设参考图像的分辨率为100×100,那么参考图像拥有的像素点为10000个,场景深度图中的10000个像素点的像素值表示10000个像素点所在的实际位置与移动终端的相对距离。假设参考图像的场景深度为0米至30米,移动终端按照预先设定好的规则将参考图像的场景深度平均的划分3个深度层,那么第一个深度层的深度范围为0米至10米,第二个深度层的深度范围为10米至20米,第三个深度层的深度范围为20米至30米。假设参考图像中的像素点A所在的实际位置与移动终端的相对距离为15米,那么像素点A就会被划分到第二个深度层;假设参考图像中的像素点B所在的实际位置与移动终端的相对距离为25米,那么像素点B就会被划分到第三个深度层;假设参考图像中的像素点C所在的实际位置与移动终端的相对距离为5米,那么像素点C就会被划分到第一个深度层。For example, if the resolution of the reference image is 100×100, then the reference image has 10,000 pixels, and the pixel value of 10000 pixels in the scene depth map indicates the actual position of 10000 pixels and the relative position of the mobile terminal. distance. Assuming that the depth of the reference image is 0 to 30 meters, the mobile terminal divides the depth of the reference image into three depth layers according to a preset rule, and the depth of the first depth layer ranges from 0 meters to 10 meters. The second depth layer has a depth ranging from 10 meters to 20 meters, and the third depth layer has a depth ranging from 20 meters to 30 meters. Assuming that the actual position of the pixel A in the reference image is 15 meters relative to the mobile terminal, then the pixel A is divided into the second depth layer; assuming that the actual position of the pixel B in the reference image is The relative distance of the mobile terminal is 25 meters, then the pixel point B is divided into the third depth layer; if the actual position of the pixel point C in the reference image is 5 meters relative to the mobile terminal, then the pixel point C It will be divided into the first depth layer.
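The even partition of step S14 can be sketched as follows. This is a minimal illustration of the 0-to-30-meter, 3-layer example above; the function name and signature are illustrative only and not part of the claimed embodiment.

```python
def layer_index(depth, scene_min=0.0, scene_max=30.0, n=3):
    """Map a depth value (meters) to the index of its depth layer.

    The scene depth [scene_min, scene_max) is divided evenly into n
    layers, as in the 3-layer example above."""
    step = (scene_max - scene_min) / n
    idx = int((depth - scene_min) // step)
    return min(max(idx, 0), n - 1)  # clamp values on the boundaries

# Pixels A, B, C from the example: 15 m, 25 m, 5 m
print([layer_index(d) for d in (15.0, 25.0, 5.0)])  # [1, 2, 0]
```

Pixel A (15 m) falls in the second layer (index 1), B (25 m) in the third, and C (5 m) in the first, matching the example.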
Step S15: Determine a target position in the reference image.
There are many ways to determine the target position in the reference image; several of them are briefly introduced below.
In the first manner, the target position is determined in the reference image according to a control instruction. The control instruction may be an instruction entered by the user with a finger on the touchscreen of the mobile terminal. For example, when the user taps a position in the reference image displayed on the touchscreen of the mobile terminal, the mobile terminal determines the tapped position as the target position.
In the second manner, a specific position in the reference image is determined as the target position. The specific position in the reference image is a position specified in advance. For example, if the center point of the reference image is specified in advance as the specific position, the mobile terminal determines the center point of the reference image as the target position. For another example, if the position in the reference image closest to the mobile terminal is specified in advance as the specific position, the mobile terminal determines the position in the reference image closest to the mobile terminal as the target position.
In the third manner, a face image in the reference image is recognized, and the position of the face image in the reference image is determined as the target position. Because the location of a face image within the reference image is not known in advance, the mobile terminal needs to first recognize the face image in the reference image. After the face image in the reference image is recognized, the position of the face image is determined, and that position is then determined as the target position.
Certainly, the embodiments of this application are not limited to the foregoing manners, and the target position may alternatively be determined in the reference image in other manners.
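The three manners of step S15 can be combined into a simple fallback chain, sketched below. The priority order, the function name, and the box format are assumptions for illustration; the embodiment does not prescribe how the manners are combined.

```python
def pick_target_position(tap=None, face_box=None, image_size=(100, 100)):
    """Choose a target position (x, y) in the reference image.

    Priority follows the manners above: a user tap wins, then the
    center of a detected face box, then the image center as the
    preset default position."""
    if tap is not None:                          # first manner: control instruction
        return tap
    if face_box is not None:                     # third manner: face recognition
        x0, y0, x1, y1 = face_box
        return ((x0 + x1) // 2, (y0 + y1) // 2)
    w, h = image_size                            # second manner: preset position
    return (w // 2, h // 2)

print(pick_target_position(tap=(12, 34)))               # (12, 34)
print(pick_target_position(face_box=(10, 10, 30, 50)))  # (20, 30)
print(pick_target_position())                           # (50, 50)
```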
Step S16: Determine, from the n depth layers, the target depth layer in which the pixel corresponding to the target position is located.
Because there are many ways to determine, from the n depth layers, the target depth layer in which the pixel corresponding to the target position is located, one of them is briefly introduced below.
Optionally, determining, from the n depth layers, the target depth layer in which the pixel corresponding to the target position is located may include the following steps: first, obtain a specified pixel at the target position of the reference image; second, determine, in the scene depth map, the pixel value corresponding to the specified pixel; and third, determine, from the n depth layers according to the pixel value corresponding to the specified pixel, the target depth layer in which the specified pixel is located.
After determining the target position in the reference image, the mobile terminal can directly obtain the specified pixel at the target position. Then, by determining in the scene depth map the pixel value corresponding to the specified pixel, the depth layer corresponding to that pixel value is known, and the target depth layer in which the pixel corresponding to the target position is located can thus be determined from the n depth layers.
For example, assuming the scene depth of the reference image is 0 meters to 30 meters, and the mobile terminal evenly divides the scene depth of the reference image into 3 depth layers according to a preset rule, the depth range of the first depth layer is 0 to 10 meters, that of the second depth layer is 10 to 20 meters, and that of the third depth layer is 20 to 30 meters. Assuming the specified pixel at the target position of the reference image is pixel A, and the pixel value corresponding to pixel A determined in the scene depth map is 15 meters, the target depth layer corresponding to the pixel value of 15 meters is the second depth layer: because the pixel value of 15 meters falls within the 10-to-20-meter depth range of the second depth layer, the target depth layer in which pixel A is located is the second depth layer.
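The three-step lookup of step S16 can be sketched as follows. The depth map is modeled here as a simple position-to-depth mapping, and the names are illustrative only.

```python
def target_depth_layer(depth_map, target_pos, layer_ranges):
    """Step S16 in miniature: read the specified pixel at the target
    position, take its depth value from the scene depth map, and find
    the layer whose [low, high) range contains that value."""
    depth = depth_map[target_pos]                # steps 1 and 2
    for i, (low, high) in enumerate(layer_ranges):
        if low <= depth < high:                  # step 3
            return i
    return len(layer_ranges) - 1                 # clamp to the farthest layer

layers = [(0, 10), (10, 20), (20, 30)]           # the 3-layer example above
depth_map = {(40, 40): 15.0}                     # pixel A at 15 m
print(target_depth_layer(depth_map, (40, 40), layers))  # 1 (second layer)
```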
The pixels corresponding to the target depth layer may be the pixels of one object, or may be the pixels of multiple objects. For example, the object formed by the pixels corresponding to the target depth layer may be only a flower. For another example, the objects formed by those pixels may include a flower and a tree. For another example, the object formed by those pixels may be part of a tree. For yet another example, the objects formed by those pixels may include part of a flower and part of a tree.
Step S17: Perform blur processing on the pixels to be processed.
The pixels to be processed are the pixels included in the depth layers other than the target depth layer among the n depth layers.
After the mobile terminal determines, from the n depth layers, the target depth layer in which the pixel corresponding to the target position is located, it is known that the pixels in the target depth layer need to remain sharp, whereas the pixels included in the depth layers other than the target depth layer among the n depth layers all need blur processing. The pixels to be processed are exactly the pixels that need blur processing, so blur processing is performed on them. After the pixels to be processed are blurred, because the pixels in the target depth layer remain sharp, the reference image becomes an image in which the pixels of the target depth layer are sharp and the pixels to be processed are blurred.
There are many ways to blur the pixels to be processed; for example, a Gaussian blur algorithm may be used. Certainly, other blur algorithms may also be used.
For example, assuming the scene depth of the reference image is 0 meters to 30 meters, and the mobile terminal evenly divides the scene depth of the reference image into 3 depth layers according to a preset rule, the depth range of the first depth layer is 0 to 10 meters, that of the second depth layer is 10 to 20 meters, and that of the third depth layer is 20 to 30 meters. Assuming the specified pixel at the target position of the reference image is pixel A, and the pixel value corresponding to pixel A determined in the scene depth map is 15 meters, the target depth layer corresponding to the pixel value of 15 meters is the second depth layer. Therefore, the pixels to be processed that are included in the first depth layer and the third depth layer need blur processing, and the pixels in the second depth layer need to remain sharp. After the pixels to be processed included in the first and third depth layers are blurred, the reference image becomes an image in which the pixels of the second depth layer are sharp and the pixels to be processed in the first and third depth layers are blurred.
Optionally, in step S17, to give the pixels to be processed different degrees of blur and thereby convey the sense of distance layering in the reference image, the following manner may be used. Step S17 may therefore further include the following steps: first, determine the L depth layers in which the pixels to be processed are located, where L is greater than or equal to 2 and less than n; second, calculate the depth differences between the L depth layers and the target depth layer; and third, blur the pixels of each of the L depth layers by a preset ratio according to the depth difference, where the degree of blur of the pixels of each of the L depth layers is proportional to its depth difference.
Because the pixels to be processed are distributed across different depth layers, the L depth layers in which they are located need to be determined first, and then the depth differences between the L depth layers and the target depth layer are calculated.
The depth difference is the distance between two depth layers. For example, if the depth range of the first depth layer is 0 to 10 meters, that of the second depth layer is 10 to 20 meters, and that of the third depth layer is 20 to 30 meters, then the depth difference between the first and second depth layers is 10 meters, and the depth difference between the first and third depth layers is 20 meters.
After the depth differences between the L depth layers and the target depth layer are obtained, the pixels of each of the L depth layers can be blurred by a preset ratio according to the depth difference. For example, assuming the first depth layer is the target depth layer, the second and third depth layers are the 2 depth layers in which the pixels to be processed are located, the depth difference between the first and second depth layers is 10 meters, and the depth difference between the first and third depth layers is 20 meters, then the pixels of the second depth layer are blurred at a ratio of 25%, and the pixels of the third depth layer are blurred at a ratio of 50%.
Because both the target depth layer and the L depth layers are available, the depth differences between the L depth layers and the target depth layer can be calculated, and the mobile terminal can then blur the pixels of each of the L depth layers by a preset ratio according to the depth difference. The degree of blur of the pixels of each of the L depth layers is proportional to its depth difference: the larger the depth difference between a depth layer and the target depth layer, the greater the degree of blur of the pixels in that depth layer; the smaller the depth difference, the smaller the degree of blur. In this way, the sense of layering at different distances in the reference image can be conveyed.
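The depth-proportional blur ratios of the optional refinement above can be sketched as follows. The proportionality constant `scale` is an assumption chosen so that the sketch reproduces the 25% / 50% example; the embodiment only requires that blur degree be proportional to depth difference.

```python
def blur_ratios(layer_ranges, target_layer, scale=0.025):
    """Assign each non-target layer a blur ratio proportional to its
    depth difference from the target layer.

    scale is an assumed constant (ratio per meter); 0.025 reproduces
    the 25% / 50% example above. Ratios are capped at 100%."""
    def center(r):
        return (r[0] + r[1]) / 2.0
    target_c = center(layer_ranges[target_layer])
    ratios = {}
    for i, r in enumerate(layer_ranges):
        if i == target_layer:
            continue                         # target layer stays sharp
        depth_diff = abs(center(r) - target_c)
        ratios[i] = min(depth_diff * scale, 1.0)
    return ratios

layers = [(0, 10), (10, 20), (20, 30)]
print(blur_ratios(layers, target_layer=0))   # {1: 0.25, 2: 0.5}
```

Each ratio could then drive, for example, the kernel size of the Gaussian blur applied to that layer's pixels.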
In the embodiment shown in FIG. 1, each pixel of the reference image is divided into n depth layers by using the obtained scene depth map, and then the target depth layer in which the pixel at the determined target position is located is identified among the n depth layers. Therefore, in this embodiment of this application, the pixels to be processed that are included in the depth layers other than the target depth layer among the n depth layers can be blurred, to obtain an image in which the pixels of the target depth layer are sharp and the pixels to be processed are blurred. Thus, this embodiment of this application enables the mobile terminal to capture an image with a sharp foreground and a blurred background.
Referring to FIG. 2, FIG. 2 is a flowchart of another image background blurring method according to an embodiment of this application. The embodiment shown in FIG. 2 is a refinement of step S12 in FIG. 1; for content that is the same as in FIG. 1, refer to the embodiment shown in FIG. 1. The method shown in FIG. 2 includes the following steps.
Step S21: Determine a preliminary depth map of the reference image according to the top-layer image of the first image pyramid and the top-layer images of the m second image pyramids, where the first image pyramid and the m second image pyramids each include a top-layer image and lower-layer images.
In this embodiment of this application, the first-layer image of the first image pyramid is referred to as the top-layer image, the second-layer image through the last-layer image of the first image pyramid are collectively referred to as the lower-layer images, and the last-layer image of the first image pyramid is referred to as the bottom-layer image. Likewise, the first-layer image of a second image pyramid is referred to as the top-layer image, the second-layer image through the last-layer image of the second image pyramid are collectively referred to as the lower-layer images, and the last-layer image of the second image pyramid is referred to as the bottom-layer image.
Because there are many ways to determine the preliminary depth map of the reference image according to the top-layer image of the first image pyramid and the top-layer images of the m second image pyramids, one implementation is described below and is not detailed here.
Step S22: Determine the scene depth map of the reference image according to the preliminary depth map, the lower-layer images of the first image pyramid, and the lower-layer images of the m second image pyramids.
Because there are many ways to do so according to the preliminary depth map, the lower-layer images of the first image pyramid, and the lower-layer images of the m second image pyramids, one implementation is described below and is not detailed here.
In the embodiment shown in FIG. 2, depth is sampled for the reference image at different resolutions in the first image pyramid and the m second image pyramids, and the high-resolution scene depth map is derived from the low-resolution preliminary depth map, which speeds up depth recovery. Therefore, this embodiment of this application can use the image pyramids to generate the scene depth map of the reference image more quickly.
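The image pyramids on which steps S21 and S22 operate can be sketched as repeated downsampling of a grayscale image. This toy version uses plain 2×2 average pooling on nested lists; a practical implementation would low-pass filter before downsampling, and the level count is an assumption.

```python
def build_pyramid(image, levels=3):
    """Build a simple image pyramid by repeated 2x2 average downsampling.

    image is a grayscale image as a list of rows; index 0 of the result
    is the bottom (full-resolution) layer and index -1 the top layer."""
    pyramid = [image]
    for _ in range(levels - 1):
        prev = pyramid[-1]
        h, w = len(prev) // 2, len(prev[0]) // 2
        down = [[(prev[2*y][2*x] + prev[2*y][2*x+1] +
                  prev[2*y+1][2*x] + prev[2*y+1][2*x+1]) / 4.0
                 for x in range(w)] for y in range(h)]
        pyramid.append(down)
    return pyramid

img = [[float((x + y) % 4) for x in range(8)] for y in range(8)]
pyr = build_pyramid(img, levels=3)
print([len(level) for level in pyr])  # [8, 4, 2]
```

A coarse depth map is then recovered on the 2×2-scale top layer and refined level by level toward the bottom layer, which is what makes the coarse-to-fine scheme of FIG. 2 fast.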
Referring to FIG. 3, FIG. 3 is a flowchart of still another image background blurring method according to an embodiment of this application. The embodiment shown in FIG. 3 is a refinement of step S21 in FIG. 2; for content that is the same as in FIG. 2, refer to the embodiment shown in FIG. 2. The method shown in FIG. 3 includes the following steps.
Step S31: Calculate a first matching loss volume according to the top-layer image of the first image pyramid and the top-layer images of the m second image pyramids.
The specific details of calculating the first matching loss volume are described in subsequent steps and are not repeated here.
Step S32: Construct an MRF (Markov Random Field) model according to the first matching loss volume and perform global matching loss optimization, to obtain the preliminary depth map of the reference image.
Because the details of the preliminary depth map of the reference image obtained in this way are not yet smooth or fine enough, subsequent steps are still needed to smooth the preliminary depth map of the reference image.
The embodiment shown in FIG. 3 gives one specific way to generate the preliminary depth map of the reference image; certainly, other means may also be used, and they are not detailed here. With the embodiment shown in FIG. 3, the first matching loss volume is first calculated according to the top-layer image of the first image pyramid and the top-layer images of the m second image pyramids; then, an MRF model is constructed according to the first matching loss volume for global matching loss optimization, so that a preliminary depth map of the reference image with smooth details can be obtained.
Referring to FIG. 4, FIG. 4 is a flowchart of still another image background blurring method according to an embodiment of this application. The embodiment shown in FIG. 4 is a refinement of step S31 in FIG. 3; for content that is the same as in FIG. 3, refer to the embodiment shown in FIG. 3. The method shown in FIG. 4 includes the following steps.
Step S41: Obtain the extrinsic camera parameters and intrinsic camera parameters of the mobile terminal at the viewing angles of the reference image and the m non-reference images.
The mobile terminal can calculate the extrinsic camera parameters at the viewing angles of the reference image and the non-reference images by using the coordinates of the feature points of the reference image and the non-reference images, the correspondences between the feature points, and an SFM (Structure from Motion) algorithm, which recovers the three-dimensional structure of the scene from motion information. The extrinsic camera parameters of the mobile terminal include the camera optical center coordinates and the camera optical axis orientation. The intrinsic camera parameters are obtained in advance through camera calibration; for example, the mobile terminal can determine the intrinsic camera parameters by using a camera calibration toolbox with a checkerboard pattern.
Step S42: Determine the feature points in the reference image according to a feature point extraction rule.
Step S43: Obtain the three-dimensional coordinates of the feature points of the reference image.
The mobile terminal can perform feature point tracking on the target video by using the KLT (Kanade-Lucas-Tomasi feature tracker) algorithm, an optical flow method for feature point tracking, to obtain several feature points of the reference image and the three-dimensional coordinates of those feature points.
Step S44: Determine the minimum depth value and the maximum depth value in the scene of the reference image according to the three-dimensional coordinates of the feature points of the reference image.
The minimum and maximum depth values of the feature points in the reference image may first be determined according to the three-dimensional coordinates; then, the depth range formed by the minimum and maximum depth values of the feature points is expanded by a preset value, to obtain the minimum depth value and the maximum depth value in the scene of the reference image. The preset value may be a predetermined empirical value.
Step S45: Sample a plurality of depth planes between the minimum depth value and the maximum depth value.
The number of depth planes to be sampled and the manner of sampling them may be preset. For example, 11 depth planes may be sampled uniformly between the minimum depth value and the maximum depth value.
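Steps S44 and S45 together can be sketched as follows: expand the feature-point depth range by a preset value, then sample depth planes uniformly within the expanded range. The 1-meter margin and the function name are assumptions for illustration; the embodiment only states that the preset value is an empirical one.

```python
def sample_depth_planes(feat_min, feat_max, margin=1.0, count=11):
    """Steps S44-S45: expand the feature-point depth range by a preset
    value (an assumed 1-meter margin on each side here), then sample
    count depth planes uniformly between the resulting min and max."""
    d_min = max(feat_min - margin, 0.0)   # depth cannot be negative
    d_max = feat_max + margin
    step = (d_max - d_min) / (count - 1)
    return [d_min + i * step for i in range(count)]

planes = sample_depth_planes(1.0, 19.0)
print(len(planes), planes[0], planes[-1])  # 11 0.0 20.0
```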
Step S46: Calculate, by using the intrinsic camera parameters, the extrinsic camera parameters, and a direct linear transformation algorithm, first homography matrices that map the plurality of depth planes from the plane of the reference image to the planes of the m non-reference images.
The number of first homography matrices depends on the calculation, so a plurality of first homography matrices are obtained here.
Step S47: Project, by using a PS (plane sweep) algorithm and the first homography matrices, each pixel of the top-layer image of the first image pyramid onto the planes of the top-layer images of the m second image pyramids at each of the plurality of depth planes, to obtain the projected parameter value of each pixel.
The parameter value may be the color and texture of each pixel.
Step S48: Determine the matching loss of each pixel at each depth value according to the parameter value of each pixel of the top-layer image of the first image pyramid and the projected parameter value of each pixel.
The matching loss may be defined as the absolute difference between the parameter values before and after reprojection, and the parameter value may be a pixel color gradient.
Step S49: Determine the matching losses of each pixel of the top-layer image of the first image pyramid over the plurality of depth planes as the first matching loss volume.
The embodiment shown in FIG. 4 gives one specific way to generate the first matching loss volume; certainly, other means may also be used, and they are not detailed here. This embodiment does not follow the conventional method of rectifying the images before calculating the matching loss; instead, it samples a plurality of depth planes and calculates the matching loss by reprojection. During depth recovery, this better accommodates the camera pose changes between the viewing angles of the reference image and the m non-reference images, improving the reliability of the depth recovery method.
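How steps S47 through S49 fill a matching loss volume can be illustrated with a deliberately simplified, one-dimensional plane sweep. This toy version replaces the per-plane homography warp of the embodiment with fronto-parallel disparity shifts on a single scanline, and the intensity difference stands in for the color/gradient parameter values; all names and the focal-baseline constant are assumptions.

```python
def matching_loss_volume(ref_row, other_row, depth_planes, focal_baseline=20.0):
    """Toy 1-D plane sweep: for each pixel of a reference scanline and
    each depth plane, 'project' the pixel into the other view via the
    disparity d = focal_baseline / depth, and take the absolute
    intensity difference as the matching loss.

    volume[x][k] is the loss of pixel x at depth plane k; a lower loss
    means that depth plane explains the pixel better."""
    width = len(ref_row)
    volume = []
    for x in range(width):
        losses = []
        for depth in depth_planes:
            disparity = int(round(focal_baseline / depth))
            xo = x - disparity
            if 0 <= xo < width:
                losses.append(abs(ref_row[x] - other_row[xo]))
            else:
                losses.append(float('inf'))  # projected outside the image
        volume.append(losses)
    return volume

ref = [0, 0, 9, 0, 0, 0]
oth = [0, 9, 0, 0, 0, 0]     # same scene point, shifted by 1 pixel
vol = matching_loss_volume(ref, oth, depth_planes=[10.0, 20.0])
print(vol[2])  # [9, 0] -> the 20 m plane (disparity 1) matches best
```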
Referring to FIG. 5, FIG. 5 is a flowchart of still another image background blurring method according to an embodiment of this application. The embodiment shown in FIG. 5 is a refinement of step S45 in FIG. 4; for content that is the same as in FIG. 4, refer to the embodiment shown in FIG. 4. The method shown in FIG. 5 includes the following steps.
Step S51: Calculate, by using the intrinsic camera parameters, the extrinsic camera parameters, and a DLT (Direct Linear Transform) algorithm, second homography matrices that map the first depth plane, in which the minimum depth value is located, from the reference image plane to the m non-reference image planes.
Step S52: Calculate, by using the intrinsic camera parameters, the extrinsic camera parameters, and the direct linear transformation algorithm, third homography matrices that map the second depth plane, in which the maximum depth value is located, from the reference image plane to the m non-reference image planes.
The number of second homography matrices depends on the calculation, so a plurality of second homography matrices are obtained here.
Step S53: Project a pixel in the reference image onto the planes of the m non-reference images according to the second homography matrices, to obtain a first projection point.
Step S54: Project the same pixel in the reference image onto the planes of the m non-reference images according to the third homography matrices, to obtain a second projection point.
Step S55: Sample uniformly along the line segment formed between the first projection point and the second projection point, to obtain a plurality of sampling points.
Step S56: Back-project the plurality of sampling points into the three-dimensional space of the viewing angle of the reference image, to obtain a plurality of depth planes corresponding to the depth values of the plurality of sampling points.
In the embodiment shown in FIG. 5, when the matching loss of a pixel of the reference image is calculated for a depth plane, the pixel needs to be reprojected onto the m non-reference image planes. Because the depth planes are sampled so that, after reprojection, the positions in the m non-reference images are equally spaced, this embodiment of this application helps subsequent steps extract the pixel matching information between the reference image and the m non-reference images more efficiently, thereby improving the precision of the scene depth map.
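The sampling of steps S53 through S56 can be sketched in two pieces: uniform sampling on the segment between the two projection points, and the corresponding depth values. The second helper relies on the standard epipolar-geometry property that points spaced uniformly along the epipolar segment correspond to depths spaced uniformly in inverse depth (1/d); this property is an assumption of the sketch, as the embodiment itself obtains the depths by back-projection.

```python
def sample_between_projections(p_near, p_far, count=5):
    """Steps S53-S55: sample count points uniformly on the segment
    between the projection of a reference pixel at the minimum depth
    (p_near) and at the maximum depth (p_far) in a non-reference view."""
    (x0, y0), (x1, y1) = p_near, p_far
    pts = []
    for i in range(count):
        t = i / (count - 1)
        pts.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return pts

def inverse_depth_planes(d_min, d_max, count=5):
    """Step S56 under the assumption above: uniformly spaced image
    samples back-project to depths uniform in inverse depth (1/d),
    not uniform in depth itself."""
    inv = [1.0 / d_max + i * (1.0 / d_min - 1.0 / d_max) / (count - 1)
           for i in range(count)]
    return [1.0 / v for v in inv]

print(sample_between_projections((0.0, 0.0), (8.0, 4.0), count=3))
print([round(d, 2) for d in inverse_depth_planes(10.0, 40.0, count=4)])
```

Note how the resulting depth planes bunch up near the camera and spread out far away, which is exactly what keeps the reprojected positions equally spaced in the non-reference images.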
请参见图6所示,图6所示的为本申请实施例提供的又一种图像背景虚化方法的流程图。图6所示的实施例为基于图2中步骤S22的细化的实施例,所以与图2相同的内容可参见图2所示的实施例。图6所示方法包括以下步骤。Referring to FIG. 6, FIG. 6 is a flowchart of still another image background blurring method provided by an embodiment of the present application. The embodiment shown in FIG. 6 is based on the refined embodiment of step S22 in FIG. 2, so the same content as FIG. 2 can be seen in the embodiment shown in FIG. 2. The method shown in Figure 6 includes the following steps.
Step S61: Determine pixel points of the lower-layer image of the first image pyramid that correspond to pixel points of the top-layer image of the first image pyramid.
Step S62: Determine pixel points of the lower-layer images of the m second image pyramids that correspond to pixel points of the top-layer images of the m second image pyramids.
Step S63: Determine estimated depth values of the pixel points of the lower-layer image of the first image pyramid according to the preliminary depth map.
Step S64: Determine a minimum depth value and a maximum depth value of the pixel points of the lower-layer image of the first image pyramid according to the estimated depth values.
Step S65: Determine a plurality of depth planes of the lower-layer image of the first image pyramid between the minimum depth value and the maximum depth value.
For a specific implementation of determining the plurality of depth planes of the lower-layer image of the first image pyramid between the minimum depth value and the maximum depth value, refer to the embodiment shown in FIG. 4; details are not described herein again.
Step S66: Calculate, by using a plane sweep algorithm and the plurality of depth planes, a second matching cost volume corresponding to the lower-layer image of the first image pyramid and the lower-layer images of the m second image pyramids.
Step S67: Using the lower-layer image of the first image pyramid as a guide image, locally optimize the second matching cost volume by using a guided filtering algorithm to obtain a third matching cost volume.
Step S68: According to the third matching cost volume, select, for each pixel point of the lower-layer image of the first image pyramid, the depth value with the minimum matching cost in the second matching cost volume, to obtain the scene depth map of the reference image.
In the embodiment shown in FIG. 6, the preliminary depth map is used to estimate the minimum depth value and the maximum depth value of the pixel points of the lower-layer image of the first image pyramid, thereby determining a relatively small depth search interval. This reduces the amount of computation and improves the robustness of the depth recovery method against interference such as image noise.
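The coarse-to-fine refinement of steps S61 to S68 can be sketched minimally as below. This sketch assumes rectified views (so warping reduces to a horizontal disparity of baseline_f / depth), a 2x downsampling factor between pyramid levels, an absolute-difference photometric cost, and a box filter standing in for the guided filter of step S67; all of these are illustrative choices, not the patent's implementation:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def warp(img, depth, baseline_f=100.0):
    """Warp a non-reference view toward the reference view assuming
    rectified cameras: horizontal disparity = baseline_f / depth."""
    h, w = img.shape
    xs = np.arange(w)[None, :] - baseline_f / depth
    xs = np.clip(np.round(xs).astype(int), 0, w - 1)
    return img[np.arange(h)[:, None], xs]

def refine_depth(ref, non_refs, prelim_depth, n_planes=8, margin=0.2, win=5):
    """Sketch of steps S61-S68: each lower-layer pixel searches only a small
    depth interval around the estimate upsampled from the preliminary map."""
    h, w = ref.shape
    est = np.kron(prelim_depth, np.ones((2, 2)))[:h, :w]     # S61-S63: upsampled estimate
    d_lo, d_hi = est * (1.0 - margin), est * (1.0 + margin)  # S64: per-pixel depth range
    costs = np.empty((n_planes, h, w))
    for i, s in enumerate(np.linspace(0.0, 1.0, n_planes)):  # S65: depth planes
        d = d_lo + s * (d_hi - d_lo)
        c = sum(np.abs(ref - warp(non, d)) for non in non_refs)  # S66: cost volume
        costs[i] = uniform_filter(c, size=win)               # S67: local cost aggregation
    s_best = np.argmin(costs, axis=0) / (n_planes - 1.0)     # S68: min-cost plane per pixel
    return d_lo + s_best * (d_hi - d_lo)
```

Restricting the search to a narrow per-pixel interval is what keeps the number of depth planes, and hence the cost-volume size, small at the high-resolution pyramid levels.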
FIG. 7 is a schematic diagram of an image background blurring apparatus according to an embodiment of this application. FIG. 7 is the apparatus embodiment corresponding to FIG. 1; for content in FIG. 7 that is the same as in FIG. 1, refer to the embodiment corresponding to FIG. 1. Referring to FIG. 7, the terminal device includes the following modules:
An extraction module 11, configured to extract a reference image and m non-reference images from a target video according to an image extraction rule, where the target video is a video captured by a mobile terminal along a predetermined trajectory, and m is greater than or equal to 1;
A construction module 12, configured to construct a first image pyramid by using the reference image, and construct m second image pyramids by using the m non-reference images;
A first determining module 13, configured to determine a scene depth map of the reference image by using the first image pyramid and the m second image pyramids, where the scene depth map of the reference image represents a relative distance between any pixel point in the reference image and the mobile terminal;
A dividing module 14, configured to divide pixel points of the reference image into n depth layers by using the scene depth map, where objects corresponding to pixel points in different depth layers have different depths relative to the mobile terminal, and n is greater than or equal to 2;
A second determining module 15, configured to determine a target position in the reference image;
A third determining module 16, configured to determine, from the n depth layers, a target depth layer in which a pixel point corresponding to the target position is located; and
A blur processing module 17, configured to blur to-be-processed pixel points, where the to-be-processed pixel points are pixel points included in depth layers other than the target depth layer among the n depth layers.
Optionally, the first determining module 13 is specifically configured to: determine a preliminary depth map of the reference image according to the top-layer image of the first image pyramid and the top-layer images of the m second image pyramids, where the first image pyramid and the m second image pyramids each include a top-layer image and lower-layer images; and determine the scene depth map of the reference image according to the preliminary depth map, the lower-layer image of the first image pyramid, and the lower-layer images of the m second image pyramids.
Optionally, the first determining module 13 is specifically configured to: calculate a first matching cost volume according to the top-layer image of the first image pyramid and the top-layer images of the m second image pyramids; and construct a Markov random field model according to the first matching cost volume and perform global matching cost optimization to obtain the preliminary depth map of the reference image.
Optionally, the first determining module 13 is specifically configured to: obtain camera extrinsic parameters and camera intrinsic parameters of the mobile terminal at the viewing angles of the reference image and the m non-reference images; determine feature points in the reference image according to a feature point extraction rule; obtain three-dimensional coordinates of the feature points of the reference image; determine a minimum depth value and a maximum depth value within the scene of the reference image according to the three-dimensional coordinates of the feature points of the reference image; determine a plurality of depth planes between the minimum depth value and the maximum depth value; calculate, by using the camera intrinsic parameters, the camera extrinsic parameters, and a direct linear transformation algorithm, first homography matrices that map the plurality of depth planes from the plane of the reference image to the planes of the m non-reference images; project, by using a plane sweep algorithm and the first homography matrices, each pixel point of the top-layer image of the first image pyramid at the plurality of depth planes onto the planes of the top-layer images of the m second image pyramids to obtain projected parameter values of each pixel point; determine a matching cost of each pixel point at each depth value according to a parameter value of each pixel point of the top-layer image of the first image pyramid and the projected parameter values of the pixel point; and determine the matching costs of each pixel point of the top-layer image of the first image pyramid at the plurality of depth planes as the first matching cost volume.
Optionally, the first determining module 13 is specifically configured to: calculate, by using the camera intrinsic parameters, the camera extrinsic parameters, and a direct linear transformation algorithm, a second homography matrix that maps the first depth plane in which the minimum depth value is located from the reference image plane to the m non-reference image planes; calculate, by using the camera intrinsic parameters, the camera extrinsic parameters, and the direct linear transformation algorithm, a third homography matrix that maps the second depth plane in which the maximum depth value is located from the reference image plane to the m non-reference image planes; project a pixel point in the reference image onto the planes of the m non-reference images according to the second homography matrix to obtain a first projection point; project the pixel point in the reference image onto the planes of the m non-reference images according to the third homography matrix to obtain a second projection point; sample uniformly along the line formed between the first projection point and the second projection point to obtain a plurality of sampling points; and back-project the plurality of sampling points into the three-dimensional space of the viewing angle of the reference image to obtain a plurality of depth planes corresponding to the depth values of the plurality of sampling points.
Optionally, the first determining module 13 is specifically configured to: determine pixel points of the lower-layer image of the first image pyramid that correspond to pixel points of the top-layer image of the first image pyramid; determine pixel points of the lower-layer images of the m second image pyramids that correspond to pixel points of the top-layer images of the m second image pyramids; determine estimated depth values of the pixel points of the lower-layer image of the first image pyramid according to the preliminary depth map; determine a minimum depth value and a maximum depth value of the pixel points of the lower-layer image of the first image pyramid according to the estimated depth values; determine a plurality of depth planes of the lower-layer image of the first image pyramid between the minimum depth value and the maximum depth value; calculate, by using a plane sweep algorithm and the plurality of depth planes, a second matching cost volume corresponding to the lower-layer image of the first image pyramid and the lower-layer images of the m second image pyramids; locally optimize the second matching cost volume by using a guided filtering algorithm with the lower-layer image of the first image pyramid as a guide image to obtain a third matching cost volume; and select, for each pixel point of the lower-layer image of the first image pyramid according to the third matching cost volume, the depth value with the minimum matching cost in the second matching cost volume to obtain the scene depth map of the reference image.
Optionally, the third determining module 16 is specifically configured to: obtain a specified pixel point at the target position of the reference image; determine a pixel value corresponding to the specified pixel point in the scene depth map; and determine, among the n depth layers according to the pixel value corresponding to the specified pixel point, the target depth layer in which the specified pixel point is located.
Optionally, the blur processing module 17 is specifically configured to: determine L depth layers in which the to-be-processed pixel points are located, where L is greater than or equal to 2 and less than n; calculate depth differences between the L depth layers and the target depth layer; and blur the pixel points of each of the L depth layers by a preset ratio according to the depth differences, where the degree of blur of the pixel points of each of the L depth layers is proportional to its depth difference.
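The proportional blurring performed by the blur processing module 17 can be sketched as follows. The Gaussian kernel and the base_sigma parameter are illustrative assumptions; the text above only requires the degree of blur to be proportional to the depth difference from the target layer:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def blur_background(image, layer_map, target_layer, base_sigma=1.5):
    """Sketch of the blur processing: each depth layer other than the target
    layer is blurred in proportion to its distance from the target layer."""
    out = image.astype(float).copy()
    for layer in np.unique(layer_map):
        diff = abs(int(layer) - int(target_layer))   # depth difference to target
        if diff == 0:
            continue                                 # target layer stays sharp
        blurred = gaussian_filter(image.astype(float), sigma=base_sigma * diff)
        mask = layer_map == layer
        out[mask] = blurred[mask]                    # replace only this layer's pixels
    return out
```

Blurring the whole image once per layer and then masking keeps layer boundaries simple; a production implementation would also have to handle color channels and bleeding between layers.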
FIG. 8 is a schematic diagram of still another image background blurring apparatus according to an embodiment of this application. Referring to FIG. 8, the apparatus includes a processor 21 and a memory 22, where the memory 22 stores operation instructions executable by the processor 21, and the processor 21 reads the operation instructions in the memory 22 to implement the methods in the foregoing method embodiments.
FIG. 9 is a schematic diagram of a design structure of an image background blurring apparatus according to an embodiment of this application. The image background blurring apparatus includes a transmitter 1101, a receiver 1102, a controller/processor 1103, a memory 1104, and a modem processor 1105.
The transmitter 1101 conditions (for example, performs analog conversion, filtering, amplification, and up-conversion on) output samples and generates an uplink signal, which is transmitted to a base station via an antenna. On the downlink, the antenna receives a downlink signal transmitted by the base station. The receiver 1102 conditions (for example, filters, amplifies, down-converts, and digitizes) the signal received from the antenna and provides input samples. In the modem processor 1105, an encoder 1106 receives service data and signaling messages to be sent on the uplink and processes (for example, formats, encodes, and interleaves) the service data and signaling messages. A modulator 1107 further processes (for example, performs symbol mapping and modulation on) the encoded service data and signaling messages and provides output samples. A demodulator 1109 processes (for example, demodulates) the input samples and provides symbol estimates. A decoder 1108 processes (for example, de-interleaves and decodes) the symbol estimates and provides decoded data and signaling messages to be sent to the terminal. The encoder 1106, the modulator 1107, the demodulator 1109, and the decoder 1108 may be implemented by an integrated modem processor 1105. These units perform processing according to the radio access technology used by the radio access network (for example, the access technologies of LTE and other evolved systems).
The controller/processor 1103 is configured to: extract a reference image and m non-reference images from a target video according to an image extraction rule, where the target video is a video captured by a mobile terminal along a predetermined trajectory, and m is greater than or equal to 1; construct a first image pyramid by using the reference image, and construct m second image pyramids by using the m non-reference images; determine a scene depth map of the reference image by using the first image pyramid and the m second image pyramids, where the scene depth map of the reference image represents a relative distance between any pixel point in the reference image and the mobile terminal; divide pixel points of the reference image into n depth layers by using the scene depth map, where objects corresponding to pixel points in different depth layers have different depths relative to the mobile terminal, and n is greater than or equal to 2; determine a target position in the reference image; determine, from the n depth layers, a target depth layer in which a pixel point corresponding to the target position is located; and blur to-be-processed pixel points, where the to-be-processed pixel points are pixel points included in depth layers other than the target depth layer among the n depth layers.
It should be noted that the embodiments provided in this application are merely optional embodiments of this application. On this basis, a person skilled in the art can readily design more embodiments, which are therefore not described here.
A person of ordinary skill in the art may be aware that the units and algorithm steps of the examples described with reference to the embodiments disclosed herein can be implemented by electronic hardware or a combination of computer software and electronic hardware. Whether these functions are performed by hardware or software depends on the particular application and design constraints of the technical solution. A person skilled in the art may use different methods to implement the described functions for each particular application, but such implementation should not be considered beyond the scope of this application.
A person skilled in the art can clearly understand that, for convenience and brevity of description, for the specific working processes of the foregoing systems, apparatuses, and units, reference may be made to the corresponding processes in the foregoing method embodiments; details are not described herein again.
In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative. For example, the division into units is merely a division of logical functions; in actual implementation, there may be other division manners. For example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual couplings or direct couplings or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, apparatuses, or units, and may be in electrical, mechanical, or other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
In addition, the functional units in the embodiments of this application may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit.
If the functions are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium. Based on such an understanding, the technical solutions of this application essentially, or the part contributing to the prior art, or part of the technical solutions, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for enabling a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to perform all or some of the steps of the methods in the embodiments of this application. The foregoing storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The foregoing descriptions are merely specific implementations of this application, but the protection scope of this application is not limited thereto. Any variation or replacement readily conceivable by a person skilled in the art within the technical scope disclosed in this application shall fall within the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.

Claims (19)

  1. An image background blurring method, wherein the method comprises:
    extracting a reference image and m non-reference images from a target video according to an image extraction rule, wherein the target video is a video captured by a mobile terminal along a predetermined trajectory, and m is greater than or equal to 1;
    constructing a first image pyramid by using the reference image, and constructing m second image pyramids by using the m non-reference images;
    determining a scene depth map of the reference image by using the first image pyramid and the m second image pyramids, wherein the scene depth map of the reference image represents a relative distance between any pixel point in the reference image and the mobile terminal;
    dividing pixel points of the reference image into n depth layers by using the scene depth map, wherein objects corresponding to pixel points in different depth layers have different depths relative to the mobile terminal, and n is greater than or equal to 2;
    determining a target position in the reference image;
    determining, from the n depth layers, a target depth layer in which a pixel point corresponding to the target position is located; and
    blurring to-be-processed pixel points, wherein the to-be-processed pixel points are pixel points included in depth layers other than the target depth layer among the n depth layers.
  2. The image background blurring method according to claim 1, wherein determining the scene depth map of the reference image by using the first image pyramid and the m second image pyramids comprises:
    determining a preliminary depth map of the reference image according to a top-layer image of the first image pyramid and top-layer images of the m second image pyramids, wherein the first image pyramid and the m second image pyramids each comprise a top-layer image and lower-layer images; and
    determining the scene depth map of the reference image according to the preliminary depth map, a lower-layer image of the first image pyramid, and lower-layer images of the m second image pyramids.
  3. The image background blurring method according to claim 2, wherein determining the preliminary depth map of the reference image according to the top-layer image of the first image pyramid and the top-layer images of the m second image pyramids comprises:
    calculating a first matching cost volume according to the top-layer image of the first image pyramid and the top-layer images of the m second image pyramids; and
    constructing a Markov random field model according to the first matching cost volume and performing global matching cost optimization to obtain the preliminary depth map of the reference image.
  4. The image background blurring method according to claim 3, wherein calculating the first matching cost volume according to the top-layer image of the first image pyramid and the top-layer images of the m second image pyramids comprises:
    obtaining camera extrinsic parameters and camera intrinsic parameters of the mobile terminal at the viewing angles of the reference image and the m non-reference images;
    determining feature points in the reference image according to a feature point extraction rule;
    obtaining three-dimensional coordinates of the feature points of the reference image;
    determining a minimum depth value and a maximum depth value within the scene of the reference image according to the three-dimensional coordinates of the feature points of the reference image;
    determining a plurality of depth planes between the minimum depth value and the maximum depth value;
    calculating, by using the camera intrinsic parameters, the camera extrinsic parameters, and a direct linear transformation algorithm, first homography matrices that map the plurality of depth planes from the plane of the reference image to the planes of the m non-reference images;
    projecting, by using a plane sweep algorithm and the first homography matrices, each pixel point of the top-layer image of the first image pyramid at the plurality of depth planes onto the planes of the top-layer images of the m second image pyramids, to obtain projected parameter values of each pixel point;
    determining a matching cost of each pixel point at each depth value according to a parameter value of each pixel point of the top-layer image of the first image pyramid and the projected parameter values of the pixel point; and
    determining the matching costs of each pixel point of the top-layer image of the first image pyramid at the plurality of depth planes as the first matching cost volume.
  5. The image background blurring method according to claim 4, wherein determining the plurality of depth planes between the minimum depth value and the maximum depth value comprises:
    calculating, by using the camera intrinsic parameters, the camera extrinsic parameters, and the direct linear transformation algorithm, a second homography matrix that maps a first depth plane in which the minimum depth value is located from the reference image plane to the m non-reference image planes;
    calculating, by using the camera intrinsic parameters, the camera extrinsic parameters, and the direct linear transformation algorithm, a third homography matrix that maps a second depth plane in which the maximum depth value is located from the reference image plane to the m non-reference image planes;
    projecting a pixel point in the reference image onto the planes of the m non-reference images according to the second homography matrix to obtain a first projection point;
    projecting the pixel point in the reference image onto the planes of the m non-reference images according to the third homography matrix to obtain a second projection point;
    sampling uniformly along the line formed between the first projection point and the second projection point to obtain a plurality of sampling points; and
    back-projecting the plurality of sampling points into the three-dimensional space of the viewing angle of the reference image to obtain a plurality of depth planes corresponding to depth values of the plurality of sampling points.
  6. The image background blurring method according to claim 2, wherein determining the scene depth map of the reference image according to the preliminary depth map, the lower-layer image of the first image pyramid, and the lower-layer images of the m second image pyramids comprises:
    determining pixels of the lower-layer image of the first image pyramid that correspond to pixels of the top-layer image of the first image pyramid;
    determining pixels of the lower-layer images of the m second image pyramids that correspond to pixels of the top-layer images of the m second image pyramids;
    determining estimated depth values of the pixels of the lower-layer image of the first image pyramid according to the preliminary depth map;
    determining a minimum depth value and a maximum depth value for the pixels of the lower-layer image of the first image pyramid according to the estimated depth values;
    determining a plurality of depth planes of the lower-layer image of the first image pyramid between the minimum depth value and the maximum depth value;
    computing, by using a plane-sweep algorithm and the plurality of depth planes, a second matching cost volume corresponding to the lower-layer image of the first image pyramid and the lower-layer images of the m second image pyramids;
    locally optimizing the second matching cost volume by using a guided filtering algorithm with the lower-layer image of the first image pyramid as the guide image, to obtain a third matching cost volume; and
    selecting, according to the third matching cost volume, for each pixel of the lower-layer image of the first image pyramid, the depth value with the smallest matching cost, to obtain the scene depth map of the reference image.
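The last two steps of claim 6 — smooth the matching cost volume with a guided filter, then pick the minimum-cost depth per pixel (winner-take-all) — can be sketched as follows. This assumes a grayscale guide image and a precomputed cost volume of shape (H, W, D); the filter is the grayscale form of He et al.'s guided filter, and all names are illustrative, not from the patent.

```python
import numpy as np

def box_filter(a, r):
    """Mean filter over a (2r+1)x(2r+1) window, edge-clamped, via an integral image."""
    k = 2 * r + 1
    pad = np.pad(a.astype(float), r, mode='edge')
    ii = np.pad(np.cumsum(np.cumsum(pad, axis=0), axis=1), ((1, 0), (1, 0)))
    H, W = a.shape
    return (ii[k:k + H, k:k + W] - ii[:H, k:k + W]
            - ii[k:k + H, :W] + ii[:H, :W]) / (k * k)

def guided_filter(I, p, r, eps):
    """Grayscale guided filter: edge-preserving smoothing of p guided by I."""
    mI, mp = box_filter(I, r), box_filter(p, r)
    var_I = box_filter(I * I, r) - mI * mI
    cov_Ip = box_filter(I * p, r) - mI * mp
    a = cov_Ip / (var_I + eps)      # local linear coefficient w.r.t. the guide
    b = mp - a * mI
    return box_filter(a, r) * I + box_filter(b, r)

def select_depth_map(cost_volume, depth_values, guide, r=1, eps=1e-4):
    """Filter each cost slice with the guide image, then take the
    minimum-cost depth per pixel (winner-take-all)."""
    filtered = np.stack([guided_filter(guide, cost_volume[..., i], r, eps)
                         for i in range(cost_volume.shape[-1])], axis=-1)
    return depth_values[np.argmin(filtered, axis=-1)]
```

Filtering each slice with the reference image as guide keeps depth edges aligned with image edges, which is the point of the local optimization step in the claim.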
  7. The image background blurring method according to any one of claims 1 to 6, wherein determining, from the n depth layers, the target depth layer in which the pixel corresponding to the target position is located comprises:
    obtaining a specified pixel at the target position of the reference image;
    determining, in the scene depth map, the pixel value corresponding to the specified pixel; and
    determining, among the n depth layers and according to the pixel value corresponding to the specified pixel, the target depth layer in which the specified pixel is located.
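The lookup in claim 7 amounts to quantizing the scene depth map into n layers and reading off the layer of the selected pixel. A hedged sketch — uniform bin edges are an assumption here, since the patent does not specify how the layers are spaced:

```python
import numpy as np

def depth_layers_and_target(depth_map, n_layers, tap_yx):
    """Quantize a scene depth map into n depth layers and return the layer
    index of a user-selected (tapped) pixel. Illustrative sketch only."""
    lo, hi = depth_map.min(), depth_map.max()
    edges = np.linspace(lo, hi, n_layers + 1)[1:-1]  # interior bin edges
    layer_map = np.digitize(depth_map, edges)        # values in 0..n_layers-1
    target_layer = layer_map[tap_yx]                 # layer of the tapped pixel
    return layer_map, target_layer
```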
  8. The image background blurring method according to any one of claims 1 to 6, wherein blurring the pixels to be processed comprises:
    determining L depth layers in which the pixels to be processed are located, where L is greater than or equal to 2 and less than n;
    computing depth differences between the L depth layers and the target depth layer; and
    blurring the pixels of each of the L depth layers by a preset proportion according to the depth differences, where the degree of blur of the pixels of each of the L depth layers is proportional to its depth difference.
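The layered blur of claim 8 can be sketched as follows: every layer except the target layer is blurred, with a radius that grows in proportion to the layer's distance from the target layer. This is a simplified grayscale illustration using a box blur as the stand-in for the blur kernel; the kernel choice and the linear radius schedule are assumptions, not the patent's specification.

```python
import numpy as np

def box_blur(a, r):
    """Mean filter over a (2r+1)x(2r+1) window with edge-clamped borders."""
    k = 2 * r + 1
    pad = np.pad(a.astype(float), r, mode='edge')
    ii = np.pad(np.cumsum(np.cumsum(pad, axis=0), axis=1), ((1, 0), (1, 0)))
    H, W = a.shape
    return (ii[k:k + H, k:k + W] - ii[:H, k:k + W]
            - ii[k:k + H, :W] + ii[:H, :W]) / (k * k)

def layered_blur(image, layer_map, target_layer, base_radius=1):
    """Blur every depth layer except the target one; the blur radius grows
    in proportion to the layer's distance from the target layer."""
    out = image.astype(float).copy()
    for layer in np.unique(layer_map):
        diff = abs(int(layer) - int(target_layer))
        if diff == 0:
            continue                        # the focused subject stays sharp
        blurred = box_blur(image, base_radius * diff)
        mask = layer_map == layer
        out[mask] = blurred[mask]           # composite this layer's blurred pixels
    return out
```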
  9. An image background blurring apparatus, wherein the apparatus comprises:
    an extraction module, configured to extract one reference image and m non-reference images from a target video according to an image extraction rule, where the target video is a video captured by a mobile terminal along a predetermined trajectory and m is greater than or equal to 9;
    a construction module, configured to construct a first image pyramid from the reference image and m second image pyramids from the m non-reference images;
    a first determining module, configured to determine a scene depth map of the reference image by using the first image pyramid and the m second image pyramids, where the scene depth map of the reference image represents the relative distance between any pixel in the reference image and the mobile terminal;
    a division module, configured to divide the pixels of the reference image into n depth layers by using the scene depth map, where objects corresponding to pixels in different depth layers are at different depths from the mobile terminal and n is greater than or equal to 2;
    a second determining module, configured to determine a target position in the reference image;
    a third determining module, configured to determine, from the n depth layers, a target depth layer in which the pixel corresponding to the target position is located; and
    a blurring module, configured to blur pixels to be processed, where the pixels to be processed are the pixels contained in the depth layers other than the target depth layer among the n depth layers.
  10. The image background blurring apparatus according to claim 9, wherein:
    the first determining module is specifically configured to determine a preliminary depth map of the reference image according to the top-layer image of the first image pyramid and the top-layer images of the m second image pyramids, where the first image pyramid and the m second image pyramids each comprise a top-layer image and lower-layer images; and to determine the scene depth map of the reference image according to the preliminary depth map, the lower-layer image of the first image pyramid, and the lower-layer images of the m second image pyramids.
  11. The image background blurring apparatus according to claim 10, wherein:
    the first determining module is specifically configured to compute a first matching cost volume according to the top-layer image of the first image pyramid and the top-layer images of the m second image pyramids, and to construct a Markov random field model from the first matching cost volume for global matching cost optimization, to obtain the preliminary depth map of the reference image.
  12. The image background blurring apparatus according to claim 11, wherein:
    the first determining module is specifically configured to: obtain the camera extrinsic parameters and camera intrinsic parameters of the mobile terminal at the viewpoints of the reference image and the m non-reference images; determine feature points in the reference image according to a feature point extraction rule; obtain the three-dimensional coordinates of the feature points of the reference image; determine a minimum depth value and a maximum depth value of the scene of the reference image according to the three-dimensional coordinates of the feature points; determine a plurality of depth planes between the minimum depth value and the maximum depth value; compute, by using the camera intrinsic parameters, the camera extrinsic parameters, and a direct linear transformation algorithm, first homography matrices that map the plurality of depth planes from the plane of the reference image to the planes of the m non-reference images; project, by using a plane-sweep algorithm and the first homography matrices, each pixel of the top-layer image of the first image pyramid at the plurality of depth planes onto the planes of the top-layer images of the m second image pyramids, to obtain the projected parameter value of each pixel; determine the matching cost of each pixel at each depth value according to the parameter value of each pixel of the top-layer image of the first image pyramid and its projected parameter value; and determine the matching costs of every pixel of the top-layer image of the first image pyramid at the plurality of depth planes as the first matching cost volume.
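The homography induced by a depth plane, as used in claim 12 to map the reference image plane onto a non-reference image plane, has a standard closed form. A minimal sketch under assumed conventions: fronto-parallel planes n·X = d in the reference camera frame, and extrinsics (R, t) taking reference-camera coordinates to non-reference-camera coordinates; all names are illustrative.

```python
import numpy as np

def plane_homography(K_ref, K_src, R, t, d, n=np.array([0.0, 0.0, 1.0])):
    """Homography induced by the plane n.X = d, mapping reference-image
    pixels to source-image pixels: H = K_src (R + t n^T / d) K_ref^{-1}.
    Sign convention follows the plane equation n.X = d assumed above."""
    return K_src @ (R + np.outer(t, n) / d) @ np.linalg.inv(K_ref)
```

Sweeping d over the sampled depth planes and warping the reference image with each resulting H is exactly the plane-sweep projection step the claim describes.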
  13. The image background blurring apparatus according to claim 12, wherein:
    the first determining module is specifically configured to: compute, by using the camera intrinsic parameters, the camera extrinsic parameters, and the direct linear transformation algorithm, second homography matrices that map the first depth plane at the minimum depth value from the reference image plane to the planes of the m non-reference images; compute, by using the camera intrinsic parameters, the camera extrinsic parameters, and the direct linear transformation algorithm, third homography matrices that map the second depth plane at the maximum depth value from the reference image plane to the planes of the m non-reference images; project a pixel of the reference image onto the planes of the m non-reference images according to the second homography matrices, to obtain a first projection point; project the same pixel of the reference image onto the planes of the m non-reference images according to the third homography matrices, to obtain a second projection point; uniformly sample a plurality of sampling points on the line segment formed between the first projection point and the second projection point; and back-project the plurality of sampling points into the three-dimensional space of the viewpoint of the reference image, to obtain a plurality of depth planes corresponding to the depth values of the plurality of sampling points.
  14. The image background blurring apparatus according to claim 10, wherein:
    the first determining module is specifically configured to: determine pixels of the lower-layer image of the first image pyramid that correspond to pixels of the top-layer image of the first image pyramid; determine pixels of the lower-layer images of the m second image pyramids that correspond to pixels of the top-layer images of the m second image pyramids; determine estimated depth values of the pixels of the lower-layer image of the first image pyramid according to the preliminary depth map; determine a minimum depth value and a maximum depth value for the pixels of the lower-layer image of the first image pyramid according to the estimated depth values; determine a plurality of depth planes of the lower-layer image of the first image pyramid between the minimum depth value and the maximum depth value; compute, by using a plane-sweep algorithm and the plurality of depth planes, a second matching cost volume corresponding to the lower-layer image of the first image pyramid and the lower-layer images of the m second image pyramids; locally optimize the second matching cost volume by using a guided filtering algorithm with the lower-layer image of the first image pyramid as the guide image, to obtain a third matching cost volume; and select, according to the third matching cost volume, for each pixel of the lower-layer image of the first image pyramid, the depth value with the smallest matching cost, to obtain the scene depth map of the reference image.
  15. The image background blurring apparatus according to any one of claims 9 to 14, wherein:
    the third determining module is specifically configured to: obtain a specified pixel at the target position of the reference image; determine, in the scene depth map, the pixel value corresponding to the specified pixel; and determine, among the n depth layers and according to the pixel value corresponding to the specified pixel, the target depth layer in which the specified pixel is located.
  16. The image background blurring apparatus according to any one of claims 9 to 14, wherein:
    the blurring module is specifically configured to: determine L depth layers in which the pixels to be processed are located, where L is greater than or equal to 2 and less than n; compute depth differences between the L depth layers and the target depth layer; and blur the pixels of each of the L depth layers by a preset proportion according to the depth differences, where the degree of blur of the pixels of each of the L depth layers is proportional to its depth difference.
  17. An image background blurring apparatus, comprising a processor and a memory, wherein the memory stores operation instructions executable by the processor, and the processor reads the operation instructions in the memory to implement the method according to any one of claims 1 to 8.
  18. A computer-readable storage medium, comprising instructions that, when run on a computer, cause the computer to perform the method according to any one of claims 1 to 8.
  19. A computer program product that, when run on a computer, causes the computer to perform the method according to any one of claims 1 to 8.
PCT/CN2017/117180 2017-03-27 2017-12-19 Image background blurring method and apparatus WO2018176929A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710189167.0A CN108668069B (en) 2017-03-27 2017-03-27 Image background blurring method and device
CN201710189167.0 2017-03-27

Publications (1)

Publication Number Publication Date
WO2018176929A1 true WO2018176929A1 (en) 2018-10-04

Family

ID=63674131

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/117180 WO2018176929A1 (en) 2017-03-27 2017-12-19 Image background blurring method and apparatus

Country Status (2)

Country Link
CN (1) CN108668069B (en)
WO (1) WO2018176929A1 (en)


Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110992412B (en) * 2019-12-09 2023-02-28 Oppo广东移动通信有限公司 Image processing method, image processing device, storage medium and electronic equipment
CN112948814A (en) * 2021-03-19 2021-06-11 合肥京东方光电科技有限公司 Account password management method and device and storage medium
CN115760986B (en) * 2022-11-30 2023-07-25 北京中环高科环境治理有限公司 Image processing method and device based on neural network model

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102156997A (en) * 2010-01-19 2011-08-17 索尼公司 Image processing apparatus and image processing method
US8284258B1 (en) * 2008-09-18 2012-10-09 Grandeye, Ltd. Unusual event detection in wide-angle video (based on moving object trajectories)
CN102801910A (en) * 2011-05-27 2012-11-28 三洋电机株式会社 Image sensing device
CN103037075A (en) * 2011-10-07 2013-04-10 Lg电子株式会社 Mobile terminal and method for generating an out-of-focus image
CN104424640A (en) * 2013-09-06 2015-03-18 格科微电子(上海)有限公司 Method and device for carrying out blurring processing on images
CN105578026A (en) * 2015-07-10 2016-05-11 宇龙计算机通信科技(深圳)有限公司 Photographing method and user terminal
CN106060423A (en) * 2016-06-02 2016-10-26 广东欧珀移动通信有限公司 Bokeh photograph generation method and device, and mobile terminal
CN106331492A (en) * 2016-08-29 2017-01-11 广东欧珀移动通信有限公司 Image processing method and terminal
CN106530241A (en) * 2016-10-31 2017-03-22 努比亚技术有限公司 Image blurring processing method and apparatus


Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110120009A (en) * 2019-05-09 2019-08-13 西北工业大学 Background blurring implementation method based on obvious object detection and depth estimation algorithm
CN110120009B (en) * 2019-05-09 2022-06-07 西北工业大学 Background blurring implementation method based on salient object detection and depth estimation algorithm
CN110910304A (en) * 2019-11-08 2020-03-24 北京达佳互联信息技术有限公司 Image processing method, image processing apparatus, electronic device, and medium
CN110910304B (en) * 2019-11-08 2023-12-22 北京达佳互联信息技术有限公司 Image processing method, device, electronic equipment and medium
CN111222514A (en) * 2019-12-31 2020-06-02 西安航天华迅科技有限公司 Local map optimization method based on visual positioning
CN111222514B (en) * 2019-12-31 2023-06-27 上海星思半导体有限责任公司 Local map optimization method based on visual positioning

Also Published As

Publication number Publication date
CN108668069B (en) 2020-04-14
CN108668069A (en) 2018-10-16

Similar Documents

Publication Publication Date Title
WO2018176929A1 (en) Image background blurring method and apparatus
US20220353432A1 (en) Augmented reality self-portraits
US20230410424A1 (en) System and method for virtual modeling of indoor scenes from imagery
US9729787B2 (en) Camera calibration and automatic adjustment of images
CN105721853B (en) Generate method, system and the computer readable storage devices of image capture instruction
US11816810B2 (en) 3-D reconstruction using augmented reality frameworks
US9558557B2 (en) Online reference generation and tracking for multi-user augmented reality
CN105283905B (en) Use the robust tracking of Points And lines feature
CN108492316A (en) A kind of localization method and device of terminal
CN106875431B (en) Image tracking method with movement prediction and augmented reality implementation method
US20110285810A1 (en) Visual Tracking Using Panoramas on Mobile Devices
US20050265453A1 (en) Image processing apparatus and method, recording medium, and program
CN109887003A (en) A kind of method and apparatus initialized for carrying out three-dimensional tracking
US10545215B2 (en) 4D camera tracking and optical stabilization
CN104835138A (en) Aligning ground based images and aerial imagery
CN109361880A (en) A kind of method and system showing the corresponding dynamic picture of static images or video
CN112085031A (en) Target detection method and system
CN108961182B (en) Vertical direction vanishing point detection method and video correction method for video image
CN103617631B (en) A kind of tracking based on Spot detection
US20240005587A1 (en) Machine learning based controllable animation of still images
CN111192308B (en) Image processing method and device, electronic equipment and computer storage medium
CN114095780A (en) Panoramic video editing method, device, storage medium and equipment
US11315346B2 (en) Method for producing augmented reality image
US20230394749A1 (en) Lighting model
EP3594900A1 (en) Tracking an object in a sequence of panoramic images

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17902751

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17902751

Country of ref document: EP

Kind code of ref document: A1