CN116206114A - Portrait extraction method and device under complex background - Google Patents

Portrait extraction method and device under complex background

Info

Publication number
CN116206114A
CN116206114A
Authority
CN
China
Prior art keywords
resolution
portrait
blocks
characteristic
feature map
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310474489.5A
Other languages
Chinese (zh)
Other versions
CN116206114B (en)
Inventor
向雷
吕磊
黄德頔
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu Yundun Technology Co ltd
Original Assignee
Chengdu Yundun Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chengdu Yundun Technology Co ltd filed Critical Chengdu Yundun Technology Co ltd
Priority to CN202310474489.5A priority Critical patent/CN116206114B/en
Publication of CN116206114A publication Critical patent/CN116206114A/en
Application granted granted Critical
Publication of CN116206114B publication Critical patent/CN116206114B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/26Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0481Interaction techniques based on graphical user interfaces [GUI] based on specific properties of the displayed interaction object or a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour or appearance
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a method and a device for extracting a portrait from a complex background, relating to the technical field of picture processing. The invention can automatically and accurately extract a portrait from a complex background without requiring the user to extract it manually, and the picture to be extracted is not restricted by location or shooting position, so the application range is wide. The key points of the scheme are as follows: dividing the picture to be extracted into first non-overlapping blocks; mapping the first non-overlapping blocks to an arbitrary dimension to obtain dimension parameters; extracting the feature quantities in the dimension parameters; splicing blocks according to the feature quantities to obtain a first-resolution feature map; segmenting a portrait feature map according to the first-resolution feature map; dividing the blocks of the portrait feature map into second non-overlapping blocks; extracting the portrait feature quantities of the second non-overlapping blocks and fusing them with the multi-dimensional feature quantities obtained through skip connections; converting the portrait feature quantities into image output parameters; and outputting the portrait picture according to the image output parameters. The method is mainly used for portrait extraction.

Description

Portrait extraction method and device under complex background
Technical Field
The invention relates to the technical field of picture processing, in particular to a method and a device for extracting a portrait under a complex background.
Background
Portrait extraction, as the name suggests, extracts the portrait from a picture. Despite recent advances in science and technology, portrait extraction still suffers from many problems: inaccurate matting, in which the person cannot be separated cleanly from the background; rough edge detail; strong constraints on the subject's position when photographing; lack of face tilt correction; and so on.
Existing recognition methods based on geometric features also face the difficulty that no unified, high-quality feature extraction standard has been formed. Because adult facial appearance is highly variable, even facial images of the same person are difficult to express in a uniform pattern due to time, illumination, camera angle and the like, which makes feature extraction difficult.
Disclosure of Invention
The invention provides a method and a device for portrait extraction under a complex background. The method comprises: dividing the picture to be extracted into first non-overlapping blocks; mapping the first non-overlapping blocks to an arbitrary dimension to obtain dimension parameters; extracting the feature quantities in the dimension parameters; splicing blocks according to the feature quantities to obtain a first-resolution feature map; segmenting a portrait feature map according to the first-resolution feature map; dividing the blocks of the portrait feature map into second non-overlapping blocks; extracting the portrait feature quantities of the second non-overlapping blocks and fusing them with the multi-dimensional feature quantities obtained through skip connections; converting the portrait feature quantities into image output parameters; and outputting the portrait picture according to the image output parameters. Compared with the prior art, the invention can automatically and accurately extract a portrait from a complex background without requiring the user to extract it manually; the picture to be extracted is not restricted by location or shooting position, so the application range is wide.
In order to achieve the above purpose, the invention adopts the following technical scheme:
the first aspect of the present invention provides a method for extracting a portrait in a complex background, including:
and dividing the picture to be extracted into a first non-overlapping block.
Mapping the first non-overlapping block to any dimension to obtain a dimension parameter.
And extracting the characteristic quantity in the dimension parameter.
And splicing the characteristic quantities into blocks to obtain a first resolution characteristic diagram.
And dividing the portrait characteristic diagram according to the first resolution characteristic diagram.
Dividing the blocks of the portrait characteristic map into second non-overlapping blocks.
And extracting the portrait characteristic quantity of the second non-overlapping block, and fusing the portrait characteristic quantity with the multidimensional characteristic quantity obtained by a jump connection algorithm.
And converting the portrait characteristic quantity into an image output parameter.
And outputting the portrait picture according to the image output parameters.
Further, after splicing blocks according to the feature quantities to obtain the first-resolution feature map, the portrait extraction method under a complex background further comprises:
extracting the feature quantities in the dimension parameters;
splicing blocks according to the feature quantities to obtain a second-resolution feature map, the resolution of the second-resolution feature map being greater than that of the first-resolution feature map.
Further, after splicing blocks according to the feature quantities to obtain the second-resolution feature map, the method further comprises:
extracting the feature quantities in the dimension parameters;
splicing blocks according to the feature quantities to obtain a third-resolution feature map, the resolution of the third-resolution feature map being greater than that of the second-resolution feature map.
Further, in the portrait extraction method under a complex background, dividing the blocks of the portrait feature map into second non-overlapping blocks comprises:
dividing the blocks of the first-resolution feature map into blocks of a second-resolution feature map;
dividing the blocks of the second-resolution feature map into blocks of a third-resolution feature map;
and dividing the blocks of the third-resolution feature map into the second non-overlapping blocks.
A second aspect of the present invention provides a portrait extraction device under a complex background, comprising:
a first segmentation unit, configured to divide the picture to be extracted into first non-overlapping blocks;
a mapping unit, configured to map the first non-overlapping blocks to an arbitrary dimension to obtain dimension parameters;
a first extraction unit, configured to extract the feature quantities in the dimension parameters;
a first splicing unit, configured to splice blocks according to the feature quantities to obtain a first-resolution feature map;
a second segmentation unit, configured to segment a portrait feature map according to the first-resolution feature map;
a third segmentation unit, configured to divide the blocks of the portrait feature map into second non-overlapping blocks;
a second extraction unit, configured to extract the portrait feature quantities of the second non-overlapping blocks and fuse them with the multi-dimensional feature quantities obtained through skip connections;
a conversion unit, configured to convert the portrait feature quantities into image output parameters;
and an output unit, configured to output the portrait picture according to the image output parameters.
Further, the portrait extraction device under a complex background further comprises:
a third extraction unit, configured to extract the feature quantities in the dimension parameters;
a second splicing unit, configured to splice blocks according to the feature quantities to obtain a second-resolution feature map, the resolution of the second-resolution feature map being greater than that of the first-resolution feature map.
Further, the portrait extraction device under a complex background further comprises:
a fourth extraction unit, configured to extract the feature quantities in the dimension parameters;
a third splicing unit, configured to splice blocks according to the feature quantities to obtain a third-resolution feature map, the resolution of the third-resolution feature map being greater than that of the second-resolution feature map.
Further, in the portrait extraction device under a complex background, the third segmentation unit comprises:
a first segmentation module, configured to divide the blocks of the first-resolution feature map into blocks of a second-resolution feature map;
a second segmentation module, configured to divide the blocks of the second-resolution feature map into blocks of a third-resolution feature map;
and a third segmentation module, configured to divide the blocks of the third-resolution feature map into the second non-overlapping blocks.
The invention provides a method and a device for portrait extraction under a complex background. The method comprises: dividing the picture to be extracted into first non-overlapping blocks; mapping the first non-overlapping blocks to an arbitrary dimension to obtain dimension parameters; extracting the feature quantities in the dimension parameters; splicing blocks according to the feature quantities to obtain a first-resolution feature map; segmenting a portrait feature map according to the first-resolution feature map; dividing the blocks of the portrait feature map into second non-overlapping blocks; extracting the portrait feature quantities of the second non-overlapping blocks and fusing them with the multi-dimensional feature quantities obtained through skip connections; converting the portrait feature quantities into image output parameters; and outputting the portrait picture according to the image output parameters. Compared with the prior art, the invention can automatically and accurately extract a portrait from a complex background without requiring the user to extract it manually; the picture to be extracted is not restricted by location or shooting position, so the application range is wide.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are used in the description of the embodiments will be briefly described below, which are only for the purpose of illustrating the embodiments and are not to be construed as limiting the present invention.
FIG. 1 is a schematic flow chart of a method for extracting images under a complex background in an embodiment of the invention;
FIG. 2 is a schematic flow chart of a method for extracting images under another complex background according to an embodiment of the invention;
FIG. 3 is a schematic diagram of a configuration of a portrait extraction device under a complex background according to an embodiment of the present invention;
fig. 4 is a schematic diagram of a configuration of a portrait extraction device under another complex background according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention. It will be apparent that the described embodiments are only some, but not all, embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used in the description of the invention is for the purpose of describing particular embodiments only and is not intended to limit the invention; the terms "comprising" and "having", and any variations thereof, as used in the specification, claims and the above description of the drawings, are intended to cover a non-exclusive inclusion.
In the description of embodiments of the present invention, the technical terms "first," "second," and the like are used merely to distinguish between different objects and are not to be construed as indicating or implying a relative importance or implicitly indicating the number of technical features indicated, a particular order or a primary or secondary relationship. In the description of the embodiments of the present invention, the meaning of "plurality" is two or more unless explicitly defined otherwise.
In the description of the embodiments of the present invention, the term "and/or" is merely an association relationship describing an association object, and indicates that three relationships may exist, for example, a and/or B may indicate: a exists alone, A and B exist together, and B exists alone. In addition, the character "/" herein generally indicates that the front and rear associated objects are an "or" relationship.
In the description of the embodiments of the present invention, the term "plurality" means two or more (including two), and similarly, "plural sets" means two or more (including two), and "plural sheets" means two or more (including two).
In the description of the embodiments of the present invention, the orientation or positional relationship indicated by the technical terms "center", "longitudinal", "transverse", "length", "width", "thickness", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", "outer", "clockwise", "counterclockwise", "axial", "radial", "circumferential", etc. are based on the orientation or positional relationship shown in the drawings, and are merely for convenience of describing the embodiments of the present invention and for simplifying the description, rather than indicating or implying that the apparatus or component to be referred to must have a specific orientation, be constructed and operated in a specific orientation, and therefore should not be construed as limiting the embodiments of the present invention.
In the description of the embodiments of the present invention, unless explicitly specified and limited otherwise, the terms "mounted," "connected," "secured" and the like should be construed broadly and may be, for example, fixedly connected, detachably connected, or integrally formed; or may be mechanically or electrically connected; can be directly connected or indirectly connected through an intermediate medium, and can be communication between the two components or interaction relationship between the two components. The specific meaning of the above terms in the examples of the present invention will be understood by those skilled in the art according to the specific circumstances.
Example 1
An embodiment of the invention provides a portrait extraction method under a complex background, as shown in fig. 1, comprising the following steps:
s1, dividing a picture to be extracted into a first non-overlapping block.
What needs to be explained here is: the embodiment of the invention does not limit the picture to be extracted, but the picture to be extracted necessarily contains the portrait.
Wherein, the non-overlapping blocks, as the name implies, are non-overlapping picture blocks, the size of the first non-overlapping block after division is not limited in the embodiments of the present invention.
S2, mapping the first non-overlapping blocks to an arbitrary dimension to obtain dimension parameters.
Here, dimension (also called dimensionality) is the number of independent parameters in mathematics; in the fields of physics and philosophy, it refers to the number of independent space-time coordinates.
S3, extracting the feature quantities in the dimension parameters.
S4, splicing blocks according to the feature quantities to obtain a first-resolution feature map.
S5, segmenting a portrait feature map according to the first-resolution feature map.
S6, dividing the blocks of the portrait feature map into second non-overlapping blocks.
S7, extracting the portrait feature quantities of the second non-overlapping blocks, and fusing them with the multi-dimensional feature quantities obtained through skip connections.
S8, converting the portrait feature quantities into image output parameters.
S9, outputting the portrait picture according to the image output parameters.
The present invention implements the above steps with FT-UNet, which consists of an encoder, a bottleneck, a decoder and skip connections. The basic unit of FT-UNet is the Focal Transformer block. In the encoder, to convert the input into a sequence embedding, the image is divided into non-overlapping blocks of size 4×4. The patch merging layer is responsible for downsampling and increasing the feature dimension, and the Focal Transformer blocks are responsible for learning the feature representations. The extracted context features are fused with the multi-scale features from the encoder via skip connections to compensate for the spatial information lost through downsampling. In the decoder, the patch expanding layer reshapes the feature map of the adjacent dimension into a larger feature map with twice the resolution, i.e. 2× upsampling. Finally, a patch expanding layer performs 4× upsampling to restore the feature map to the input resolution W×H, and a linear projection layer is applied to these upsampled features to produce the extraction output.
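To make this encoder-bottleneck-decoder flow concrete, the following is a minimal PyTorch sketch of a U-shaped network with patch embedding, patch merging, patch expanding and skip connections. All class names and sizes, and the use of a standard TransformerEncoderLayer in place of the Focal Transformer block, are illustrative assumptions, not the patented implementation.

```python
# Minimal sketch of the FT-UNet data flow described above (assumed names/sizes).
import torch
import torch.nn as nn

class PatchMerging(nn.Module):
    """Downsample 2x: concatenate each 2x2 neighbourhood, then reduce channels."""
    def __init__(self, dim):
        super().__init__()
        self.reduce = nn.Linear(4 * dim, 2 * dim)
    def forward(self, x):                       # x: (B, H, W, C)
        b, h, w, c = x.shape
        x = x.view(b, h // 2, 2, w // 2, 2, c).permute(0, 1, 3, 2, 4, 5)
        x = x.reshape(b, h // 2, w // 2, 4 * c)
        return self.reduce(x)                   # (B, H/2, W/2, 2C)

class PatchExpanding(nn.Module):
    """Upsample 2x: expand channels, then rearrange them into space."""
    def __init__(self, dim):
        super().__init__()
        self.expand = nn.Linear(dim, 2 * dim)
    def forward(self, x):                       # x: (B, H, W, C)
        b, h, w, c = x.shape
        x = self.expand(x).view(b, h, w, 2, 2, c // 2)
        return x.permute(0, 1, 3, 2, 4, 5).reshape(b, 2 * h, 2 * w, c // 2)

class FTUNet(nn.Module):
    def __init__(self, in_ch=3, dim=96, num_classes=1):
        super().__init__()
        self.patch_embed = nn.Conv2d(in_ch, dim, kernel_size=4, stride=4)
        blk = lambda d: nn.TransformerEncoderLayer(d, nhead=4, batch_first=True)
        self.enc1, self.enc2 = blk(dim), blk(2 * dim)            # stand-ins for
        self.bottleneck = blk(4 * dim)                           # Focal Transformer
        self.down1, self.down2 = PatchMerging(dim), PatchMerging(2 * dim)
        self.up1, self.up2 = PatchExpanding(4 * dim), PatchExpanding(2 * dim)
        self.fuse1, self.fuse2 = nn.Linear(4 * dim, 2 * dim), nn.Linear(2 * dim, dim)
        self.dec1, self.dec2 = blk(2 * dim), blk(dim)
        self.head = nn.Conv2d(dim, num_classes, kernel_size=1)   # linear projection

    def run_block(self, blk, x):                # (B, H, W, C) -> tokens -> (B, H, W, C)
        b, h, w, c = x.shape
        return blk(x.reshape(b, h * w, c)).reshape(b, h, w, c)

    def forward(self, x):                       # x: (B, 3, H, W), H and W divisible by 16
        x = self.patch_embed(x).permute(0, 2, 3, 1)              # (B, H/4, W/4, C)
        s1 = self.run_block(self.enc1, x)
        s2 = self.run_block(self.enc2, self.down1(s1))
        y = self.run_block(self.bottleneck, self.down2(s2))
        # skip connections: concatenate encoder features, then fuse
        y = self.run_block(self.dec1, self.fuse1(torch.cat([self.up1(y), s2], -1)))
        y = self.run_block(self.dec2, self.fuse2(torch.cat([self.up2(y), s1], -1)))
        y = nn.functional.interpolate(y.permute(0, 3, 1, 2), scale_factor=4)  # 4x up
        return self.head(y)                     # (B, num_classes, H, W)
```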
The Focal Transformer structure comprises a patch partition layer, a linear embedding layer, Focal Transformer blocks and a patch merging layer. The patch partition layer performs block-wise dimension reduction, the linear embedding layer performs a linear transformation, and the patch merging layer performs downsampling. The core module is the Focal Transformer block, which contains window-based focal self-attention.
Wherein the focal self-attention model is as follows:
Assume an input feature map $x \in \mathbb{R}^{M \times N \times d}$, where $M \times N$ is the spatial dimension and $d$ is the feature dimension, and pool windows at all $L$ levels, where $L$ is the number of granularity levels at which focal self-attention extracts tokens. For focal level $l$, the input feature map $x$ is first divided into sub-windows of size $s_w^l \times s_w^l$, where $s_w^l$ denotes the size of the sub-window from which a summary token is obtained at level $l$. A linear layer $f_p^l$ then pools the sub-windows spatially:
$x^l = f_p^l(\hat{x}) \in \mathbb{R}^{\frac{M}{s_w^l} \times \frac{N}{s_w^l} \times d}, \qquad \hat{x} = \mathrm{Reshape}(x)$    (1)
where $x$ denotes the input feature map; $l$ the focal level; $M, N$ the spatial dimensions; $d$ the feature dimension; $s_w^l$ the size of the level-$l$ sub-window that yields a summary token; and $f_p^l$ a linear layer.
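As a sketch of formula (1): a reshape gathers each $s_w^l \times s_w^l$ sub-window, and a linear layer summarizes it into one token. The function name and tensor shapes below are illustrative assumptions for a single focal level.

```python
import torch
import torch.nn as nn

def pool_level(x, sw, f_p):
    """Formula (1): summarize each sw x sw sub-window into one token.
    x: (B, M, N, d) feature map; f_p: nn.Linear(sw*sw*d, d)."""
    b, m, n, d = x.shape
    x_hat = (x.view(b, m // sw, sw, n // sw, sw, d)        # split into sub-windows
              .permute(0, 1, 3, 2, 4, 5)                   # (B, M/sw, N/sw, sw, sw, d)
              .reshape(b, m // sw, n // sw, sw * sw * d))  # flatten each sub-window
    return f_p(x_hat)                                      # (B, M/sw, N/sw, d)

# Example: pool a 56x56, 96-d map with a level-l sub-window size of 4.
x = torch.randn(2, 56, 56, 96)
f_p = nn.Linear(4 * 4 * 96, 96)
print(pool_level(x, 4, f_p).shape)   # torch.Size([2, 14, 14, 96])
```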
When the pooled feature maps $\{x^l\}_{l=1}^{L}$ of all $L$ levels have been obtained, three linear projection layers $f_q$, $f_k$ and $f_v$ are used to compute the query at the first level and the keys and values at all levels:
$Q = f_q(x^1), \qquad K = \{K^l\}_{l=1}^{L} = f_k(\{x^1, \dots, x^L\}), \qquad V = \{V^l\}_{l=1}^{L} = f_v(\{x^1, \dots, x^L\})$    (2)
where $Q$, $K$ and $V$ are the query, key and value matrices, respectively; $x$ denotes the input feature map; and $f_q$, $f_k$ and $f_v$ are linear projection layers.
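Formula (2) amounts to three shared linear projections, with the query taken from the finest level only; a minimal sketch under assumed shapes:

```python
import torch
import torch.nn as nn

d = 96
f_q, f_k, f_v = nn.Linear(d, d), nn.Linear(d, d), nn.Linear(d, d)

# x_levels[0] is the finest (level-1) token map; coarser pooled maps follow.
x_levels = [torch.randn(2, 56 * 56, d), torch.randn(2, 14 * 14, d)]

Q = f_q(x_levels[0])                    # queries from the first level only
K = [f_k(x) for x in x_levels]          # keys for every level
V = [f_v(x) for x in x_levels]          # values for every level
```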
To perform focal self-attention, the surrounding tokens of each query token in the feature map are first extracted. For the queries $Q_i \in \mathbb{R}^{s_p \times s_p \times d}$ inside the $i$-th window, where $s_p$ denotes the window partition size and $s_r^l$ denotes the number of horizontal and vertical sub-windows in the level-$l$ attended region, $s_r^l \times s_r^l$ keys and values are extracted from $K^l$ and $V^l$, and the keys and values from all $L$ levels are then gathered into $K_i = \{K_i^1, \dots, K_i^L\} \in \mathbb{R}^{s \times d}$ and $V_i = \{V_i^1, \dots, V_i^L\} \in \mathbb{R}^{s \times d}$, where $s$ is the sum of the focal regions over all levels, i.e. $s = \sum_{l=1}^{L} (s_r^l)^2$. Finally, the focal self-attention for $Q_i$ is computed as:
$\mathrm{Attention}(Q_i, K_i, V_i) = \mathrm{SoftMax}\!\left(\frac{Q_i K_i^{T}}{\sqrt{d}} + B\right) V_i$    (3)
where $Q_i$, $K_i$ and $V_i$ denote the query, key and value matrices, respectively; $d$ is the vector dimension; $B$ is the relative position bias matrix; and SoftMax is the multi-class activation function.
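Formula (3) itself is scaled dot-product attention over the gathered keys and values plus a positional bias $B$; the per-window gathering is omitted here and the bias is a zero placeholder, so this is only an illustrative sketch:

```python
import math
import torch

def focal_attention(q_i, k_i, v_i, bias):
    """Formula (3) for one window: q_i (sp*sp, d), k_i/v_i (s, d), bias (sp*sp, s)."""
    d = q_i.shape[-1]
    scores = q_i @ k_i.transpose(-2, -1) / math.sqrt(d) + bias
    return torch.softmax(scores, dim=-1) @ v_i

sp, s, d = 7, 7 * 7 + 6 * 6, 96           # window size, total focal region, dim
out = focal_attention(torch.randn(sp * sp, d), torch.randn(s, d),
                      torch.randn(s, d), torch.zeros(sp * sp, s))
print(out.shape)                          # torch.Size([49, 96])
```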
The invention provides a portrait extraction method under a complex background, comprising: dividing the picture to be extracted into first non-overlapping blocks; mapping the first non-overlapping blocks to an arbitrary dimension to obtain dimension parameters; extracting the feature quantities in the dimension parameters; splicing blocks according to the feature quantities to obtain a first-resolution feature map; segmenting a portrait feature map according to the first-resolution feature map; dividing the blocks of the portrait feature map into second non-overlapping blocks; extracting the portrait feature quantities of the second non-overlapping blocks and fusing them with the multi-dimensional feature quantities obtained through skip connections; converting the portrait feature quantities into image output parameters; and outputting the portrait picture according to the image output parameters. Compared with the prior art, the invention can automatically and accurately extract a portrait from a complex background without requiring the user to extract it manually; the picture to be extracted is not restricted by location or shooting position, so the application range is wide.
Example 2
An embodiment of the invention provides a portrait extraction method under a complex background, as shown in fig. 2, comprising the following steps:
S201, dividing the picture to be extracted into first non-overlapping blocks.
Specifically, the picture to be extracted is divided into first non-overlapping blocks of equal size, for example non-overlapping blocks of 4×4 pixels. It should be noted that the embodiment of the invention does not limit the size of the non-overlapping blocks; an implementer can choose it according to the size of the picture to be extracted.
S202, mapping the first non-overlapping blocks to an arbitrary dimension to obtain dimension parameters.
Specifically, the first non-overlapping blocks are mapped to three dimensions to obtain their parameters; for example, non-overlapping blocks of size 4×4 are mapped to three dimensions, giving dimension parameters of 4×4×3, as in the sketch below.
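A sketch of this partition and its 4×4×3 dimension parameters, assuming a 224×224 RGB input (the sizes are illustrative):

```python
import torch

img = torch.randn(3, 224, 224)             # (C, H, W) picture to be extracted
p = 4                                      # first non-overlapping block size
c, h, w = img.shape
# Split into non-overlapping 4x4 blocks, each carrying 4*4*3 = 48 values.
blocks = (img.view(c, h // p, p, w // p, p)
             .permute(1, 3, 2, 4, 0)       # (H/4, W/4, 4, 4, 3): 4x4x3 per block
             .reshape(h // p * (w // p), p * p * c))
print(blocks.shape)                        # torch.Size([3136, 48])
```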
S203, extracting the feature quantities in the dimension parameters.
S204, splicing blocks according to the feature quantities to obtain a first-resolution feature map.
S205, extracting the feature quantities in the dimension parameters.
S206, splicing blocks according to the feature quantities to obtain a second-resolution feature map; the resolution of the second-resolution feature map is greater than that of the first-resolution feature map.
S207, extracting the feature quantities in the dimension parameters.
S208, splicing blocks according to the feature quantities to obtain a third-resolution feature map; the resolution of the third-resolution feature map is greater than that of the second-resolution feature map.
S209, segmenting a portrait feature map according to the first-resolution feature map.
S210, dividing the blocks of the portrait feature map into second non-overlapping blocks.
S2101, dividing the blocks of the first-resolution feature map into blocks of a second-resolution feature map.
S2102, dividing the blocks of the second-resolution feature map into blocks of a third-resolution feature map.
S2103, dividing the blocks of the third-resolution feature map into the second non-overlapping blocks.
When the raw data is insufficient, the data set is expanded through data augmentation. A common way to augment a data set is to collect new data, but in practice this is difficult. An alternative is to apply flipping, rotation, shearing and similar operations to the existing data, i.e., data augmentation, which is easy to implement and operate. Augmentation is performed by a pixel-to-pixel spatial transformation of the image, using the following coordinate transformation:
$(x, y) = T\{(v, w)\}$    (4)
where $(v, w)$ denotes the coordinates of a pixel in the original image and $(x, y)$ the coordinates in the transformed image.
Affine transformation is a common transformation, the general form of which is:
$[x \;\; y \;\; 1] = [v \;\; w \;\; 1]\,T = [v \;\; w \;\; 1] \begin{bmatrix} t_{11} & t_{12} & 0 \\ t_{21} & t_{22} & 0 \\ t_{31} & t_{32} & 1 \end{bmatrix}$    (5)
where $(v, w)$ denotes the coordinates of a pixel in the original image and $(x, y)$ the coordinates in the transformed image.
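A small numeric sketch of formulas (4)-(5), applying an example affine (rotation) matrix to a pixel coordinate; the matrix entries are arbitrary illustrative values:

```python
import numpy as np

theta = np.deg2rad(15)                     # example: a 15-degree rotation
T = np.array([[ np.cos(theta), np.sin(theta), 0],
              [-np.sin(theta), np.cos(theta), 0],
              [ 0,             0,             1]])

vw1 = np.array([10.0, 20.0, 1.0])          # (v, w, 1): pixel in the original image
x, y, _ = vw1 @ T                          # formula (5): [x y 1] = [v w 1] T
print(round(x, 2), round(y, 2))
```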
The embodiment of the invention provides a search-based data augmentation method, Auto-segment. The basic idea is to use reinforcement learning to find the best image transformation policies from the data itself, evaluating the quality of a particular policy directly on the data set of interest. In the invention, a search space is designed in which each policy consists of several sub-policies, and one sub-policy is randomly selected for each image in each batch. A sub-policy contains two operations, each of which is an image processing method such as translation, rotation or shearing, and each operation carries a probability and a magnitude that characterize how the operation is applied.
The search algorithm has two components: a controller, which is a recurrent neural network, and a training algorithm, the Proximal Policy Optimization (PPO) algorithm. At each step, the controller predicts a decision through its SoftMax output, and the resulting feature vector is fed into the next step as an embedding. In total, the controller makes 30 SoftMax predictions to generate 5 sub-policies, each with 2 operations, and each operation requires an operation type, a magnitude and a probability.
Training of the controller: the controller is trained with a reward signal that reflects how much a policy improves the generalization of a child model, a neural network trained as part of the search process. In the embodiment of the invention, the generalization of the child model is measured on a held-out validation set. The child model is trained on augmented data generated by applying the 5 sub-policies to the training set; for each mini-batch, one of the 5 sub-policies is chosen at random to augment the pictures. The child model's accuracy on the validation set is then used as the reward signal to train the recurrent neural network controller. On each data set, the controller samples 15,000 policies.
At the end of the search, the sub-policies of the 5 best policies are concatenated into a single policy with 25 sub-policies, and this final policy is used to train the model for each data set. A sketch of the sub-policy structure follows below.
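The policy structure described above can be sketched as plain data plus random sampling. The operation set, probabilities and magnitudes below are invented for illustration and are not the searched policy itself:

```python
import random
from PIL import Image, ImageOps

# One sub-policy = two (operation, probability, magnitude) triples.
OPS = {
    "rotate":    lambda im, m: im.rotate(m),                       # degrees
    "translate": lambda im, m: im.transform(im.size, Image.AFFINE,
                                            (1, 0, m, 0, 1, 0)),   # shift along x
    "mirror":    lambda im, m: ImageOps.mirror(im),
}

policy = [  # e.g. sub-policies found by the search (values invented here)
    [("rotate", 0.7, 10), ("mirror", 0.3, 0)],
    [("translate", 0.5, 8), ("rotate", 0.4, -5)],
]

def augment(im):
    """Randomly pick one sub-policy per image and apply its two operations."""
    sub = random.choice(policy)
    for name, prob, mag in sub:
        if random.random() < prob:
            im = OPS[name](im, mag)
    return im
```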
S211, extracting the portrait feature quantities of the second non-overlapping blocks, and fusing them with the multi-dimensional feature quantities obtained through skip connections.
S212, converting the portrait feature quantities into image output parameters.
S213, outputting the portrait picture according to the image output parameters.
The embodiment of the invention provides an improved structure segmentation loss function:
$L_{\mathrm{struct}} = w_{\mathrm{Dice}} L_{\mathrm{Dice}} + w_{\mathrm{Cross}} L_{\mathrm{Cross}}$    (6)
where $L_{\mathrm{struct}}$ denotes the structure segmentation loss; $L_{\mathrm{Dice}}$ the exponential logarithmic Dice loss and $w_{\mathrm{Dice}}$ its weight; and $L_{\mathrm{Cross}}$ the weighted exponential cross-entropy and $w_{\mathrm{Cross}}$ its weight.
The exponential logarithmic Dice loss ($L_{\mathrm{Dice}}$) and the weighted exponential cross-entropy ($L_{\mathrm{Cross}}$) are computed as:
$L_{\mathrm{Dice}} = \mathbf{E}_i\!\left[\left(-\ln \mathrm{Dice}_i\right)^{\gamma_{\mathrm{Dice}}}\right]$    (7)
where $i$ denotes a label and $\mathbf{E}_i[\,\cdot\,]$ the mean over $i$, with
$\mathrm{Dice}_i = \dfrac{2\left(\sum_{X} \delta_{il}(X)\, p_i(X)\right) + \epsilon}{\left(\sum_{X} \left(\delta_{il}(X) + p_i(X)\right)\right) + \epsilon}$    (8)
where $X$ denotes the pixel position; $i$ a label; $l$ the ground-truth label at $X$; $\delta_{il}(X)$ the Kronecker delta, which equals 1 when $i = l$ and 0 otherwise; $p_i(X)$ the SoftMax probability; and $\epsilon$ a pseudo count added for smoothing training samples with missing labels; and
$L_{\mathrm{Cross}} = \mathbf{E}_X\!\left[w_l \left(-\ln p_l(X)\right)^{\gamma_{\mathrm{Cross}}}\right]$    (9)
where $X$ denotes the pixel position, $l$ the ground-truth label at $X$, $\mathbf{E}_X[\,\cdot\,]$ the mean over $X$, and $p_l(X)$ the SoftMax probability of the true label.
When computing $\mathrm{Dice}_i$, $\delta_{il}(X)$ acts as the portion of pixel $X$ owned by label $i$, and $\epsilon$ is the pseudo count of missing labels used for additive smoothing of the training samples. The label weight $w_l = \left(\left(\sum_k f_k\right)/f_l\right)^{0.5}$, where $f_k$ is the frequency of label $k$, is used to reduce the influence of more frequently occurring labels. Introducing the exponents $\gamma_{\mathrm{Dice}}$ and $\gamma_{\mathrm{Cross}}$ further controls the nonlinearity of the loss; for simplicity, $\gamma_{\mathrm{Dice}} = \gamma_{\mathrm{Cross}} = \gamma$ is used here.
The loss function uses the label weights $w_l$ to balance the label frequencies, and this focal loss also balances simple and difficult samples. Combining the exponential and logarithmic conversions of the focal and Dice losses forces the network to focus on inaccurately predicted regions, yielding finer segmentation boundaries that match the true data distribution.
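A PyTorch sketch of the combined loss (6)-(9) as reconstructed above; the default weights, γ, smoothing constant and class-frequency tensor are illustrative assumptions:

```python
import torch

def exp_log_loss(probs, target, freq, w_dice=0.8, w_cross=0.2, gamma=0.3, eps=1.0):
    """probs: (B, C, H, W) SoftMax outputs; target: (B, H, W) integer labels;
    freq: (C,) label frequencies. Implements formulas (6)-(9) as reconstructed."""
    onehot = torch.nn.functional.one_hot(target, probs.shape[1])
    onehot = onehot.permute(0, 3, 1, 2).float()
    # (8) smoothed Dice per label i, then (7) exponential-logarithmic Dice loss
    inter = (onehot * probs).sum(dim=(0, 2, 3))
    denom = (onehot + probs).sum(dim=(0, 2, 3))
    dice_i = (2 * inter + eps) / (denom + eps)
    l_dice = ((-torch.log(dice_i)) ** gamma).mean()
    # (9) weighted exponential cross-entropy with w_l = (sum_k f_k / f_l)^0.5
    w = (freq.sum() / freq) ** 0.5
    p_true = probs.gather(1, target.unsqueeze(1)).squeeze(1).clamp_min(1e-7)
    l_cross = (w[target] * (-torch.log(p_true)) ** gamma).mean()
    return w_dice * l_dice + w_cross * l_cross     # (6)
```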
The invention provides a portrait extraction method under a complex background, comprising: dividing the picture to be extracted into first non-overlapping blocks; mapping the first non-overlapping blocks to an arbitrary dimension to obtain dimension parameters; extracting the feature quantities in the dimension parameters; splicing blocks according to the feature quantities to obtain a first-resolution feature map; segmenting a portrait feature map according to the first-resolution feature map; dividing the blocks of the portrait feature map into second non-overlapping blocks; extracting the portrait feature quantities of the second non-overlapping blocks and fusing them with the multi-dimensional feature quantities obtained through skip connections; converting the portrait feature quantities into image output parameters; and outputting the portrait picture according to the image output parameters. Compared with the prior art, the invention can automatically and accurately extract a portrait from a complex background without requiring the user to extract it manually; the picture to be extracted is not restricted by location or shooting position, so the application range is wide.
Example 3
An embodiment of the present invention provides a device for extracting a portrait in a complex background, as shown in fig. 3, including:
a first dividing unit 31, configured to divide the picture to be extracted into first non-overlapping blocks.
And a mapping unit 32, configured to map the first non-overlapping block to any dimension, so as to obtain a dimension parameter.
A first extraction unit 33, configured to extract a feature quantity in the dimension parameter.
And the first splicing unit 34 is used for splicing the feature quantities into blocks to obtain a first resolution feature map.
And a second segmentation unit 35, configured to segment a portrait feature map according to the first resolution feature map.
A third segmentation unit 36 for segmenting the blocks of the portrait feature map into second non-overlapping blocks.
A second extracting unit 37 for extracting the portrait characteristic amount of the second non-overlapping block, and fusing the portrait characteristic amount with the multi-dimensional characteristic amount obtained by the jump connection algorithm.
A conversion unit 38 for converting the portrait characteristic amount into an image output parameter.
An output unit 39 for outputting a portrait picture according to the image output parameter.
It should be noted that the detailed description of each part of this embodiment may refer to the corresponding parts of the other embodiments and will not be repeated here.
The invention provides a portrait extraction device under a complex background, which divides the picture to be extracted into first non-overlapping blocks; maps the first non-overlapping blocks to an arbitrary dimension to obtain dimension parameters; extracts the feature quantities in the dimension parameters; splices blocks according to the feature quantities to obtain a first-resolution feature map; segments a portrait feature map according to the first-resolution feature map; divides the blocks of the portrait feature map into second non-overlapping blocks; extracts the portrait feature quantities of the second non-overlapping blocks and fuses them with the multi-dimensional feature quantities obtained through skip connections; converts the portrait feature quantities into image output parameters; and outputs the portrait picture according to the image output parameters. Compared with the prior art, the invention can automatically and accurately extract a portrait from a complex background without requiring the user to extract it manually; the picture to be extracted is not restricted by location or shooting position, so the application range is wide.
Example 4
The embodiment of the invention provides a portrait extraction device under a complex background, as shown in fig. 4, comprising:
a first dividing unit 41 for dividing the picture to be extracted into first non-overlapping blocks.
And a mapping unit 42, configured to map the first non-overlapping block to any dimension, to obtain a dimension parameter.
A first extraction unit 43, configured to extract a feature quantity in the dimension parameter.
And a first stitching unit 44, configured to stitch the blocks according to the feature values to obtain a first resolution feature map.
A third extracting unit 45 for extracting the feature quantity in the dimension parameter.
And a second splicing unit 46, configured to splice the blocks according to the feature values, and obtain a second resolution feature map. The resolution of the second resolution feature map is greater than the resolution of the first resolution feature map.
A fourth extraction unit 47 for extracting feature amounts in the dimensional parameters.
A third splicing unit 48, configured to splice the features into blocks according to the feature values, so as to obtain a third resolution feature map; the resolution of the third resolution feature map is greater than the resolution of the second resolution feature map.
And a second segmentation unit 49, configured to segment the portrait feature map according to the first resolution feature map.
A third segmentation unit 410, configured to segment the block of the portrait characteristic map into a second non-overlapping block.
A first segmentation module 4101 for segmenting the block of the first resolution feature map into blocks of a second resolution feature map.
A second segmentation module 4102 for segmenting the block of the second resolution profile into blocks of a third resolution profile.
A third partitioning module 4103 for partitioning a block of the third resolution feature map into the second non-overlapping block.
A second extracting unit 411 for extracting the portrait feature quantity of the second non-overlapping block, and fusing the portrait feature quantity with the multi-dimensional feature quantity obtained by the jump connection algorithm.
A conversion unit 412 for converting the portrait characteristic amount into an image output parameter.
An output unit 413 for outputting a portrait picture according to the image output parameter.
It should be noted that the detailed description of each part of this embodiment may refer to the corresponding parts of the other embodiments and will not be repeated here.
The invention provides a portrait extraction device under a complex background, which divides the picture to be extracted into first non-overlapping blocks; maps the first non-overlapping blocks to an arbitrary dimension to obtain dimension parameters; extracts the feature quantities in the dimension parameters; splices blocks according to the feature quantities to obtain a first-resolution feature map; segments a portrait feature map according to the first-resolution feature map; divides the blocks of the portrait feature map into second non-overlapping blocks; extracts the portrait feature quantities of the second non-overlapping blocks and fuses them with the multi-dimensional feature quantities obtained through skip connections; converts the portrait feature quantities into image output parameters; and outputs the portrait picture according to the image output parameters. Compared with the prior art, the invention can automatically and accurately extract a portrait from a complex background without requiring the user to extract it manually; the picture to be extracted is not restricted by location or shooting position, so the application range is wide.
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solution of the present invention, not to limit it. Although the invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical schemes described in the foregoing embodiments may still be modified, or some or all of their technical features may be replaced by equivalents; such modifications and substitutions do not depart from the spirit of the invention and are intended to be included within the scope of the appended claims and the description. In particular, the technical features mentioned in the respective embodiments may be combined in any manner as long as there is no structural conflict. The present invention is not limited to the specific embodiments disclosed herein, but encompasses all technical solutions falling within the scope of the claims.

Claims (8)

1. A portrait extraction method under a complex background, characterized by comprising the following steps:
dividing the picture to be extracted into first non-overlapping blocks;
mapping the first non-overlapping blocks to an arbitrary dimension to obtain dimension parameters;
extracting the feature quantities in the dimension parameters;
splicing blocks according to the feature quantities to obtain a first-resolution feature map;
segmenting a portrait feature map according to the first-resolution feature map;
dividing the blocks of the portrait feature map into second non-overlapping blocks;
extracting the portrait feature quantities of the second non-overlapping blocks, and fusing them with the multi-dimensional feature quantities obtained through skip connections;
converting the portrait feature quantities into image output parameters;
and outputting the portrait picture according to the image output parameters.
2. The portrait extraction method under a complex background according to claim 1, further comprising, after splicing blocks according to the feature quantities to obtain the first-resolution feature map:
extracting the feature quantities in the dimension parameters;
splicing blocks according to the feature quantities to obtain a second-resolution feature map, the resolution of the second-resolution feature map being greater than that of the first-resolution feature map.
3. The portrait extraction method under a complex background according to claim 2, further comprising, after splicing blocks according to the feature quantities to obtain the second-resolution feature map:
extracting the feature quantities in the dimension parameters;
splicing blocks according to the feature quantities to obtain a third-resolution feature map, the resolution of the third-resolution feature map being greater than that of the second-resolution feature map.
4. The portrait extraction method under a complex background according to claim 1, wherein dividing the blocks of the portrait feature map into second non-overlapping blocks comprises:
dividing the blocks of the first-resolution feature map into blocks of a second-resolution feature map;
dividing the blocks of the second-resolution feature map into blocks of a third-resolution feature map;
and dividing the blocks of the third-resolution feature map into the second non-overlapping blocks.
5. A portrait extraction device under a complex background, characterized by comprising:
a first segmentation unit, configured to divide the picture to be extracted into first non-overlapping blocks;
a mapping unit, configured to map the first non-overlapping blocks to an arbitrary dimension to obtain dimension parameters;
a first extraction unit, configured to extract the feature quantities in the dimension parameters;
a first splicing unit, configured to splice blocks according to the feature quantities to obtain a first-resolution feature map;
a second segmentation unit, configured to segment a portrait feature map according to the first-resolution feature map;
a third segmentation unit, configured to divide the blocks of the portrait feature map into second non-overlapping blocks;
a second extraction unit, configured to extract the portrait feature quantities of the second non-overlapping blocks and fuse them with the multi-dimensional feature quantities obtained through skip connections;
a conversion unit, configured to convert the portrait feature quantities into image output parameters;
and an output unit, configured to output the portrait picture according to the image output parameters.
6. The portrait extraction device under a complex background according to claim 5, further comprising:
a third extraction unit, configured to extract the feature quantities in the dimension parameters;
a second splicing unit, configured to splice blocks according to the feature quantities to obtain a second-resolution feature map, the resolution of the second-resolution feature map being greater than that of the first-resolution feature map.
7. The portrait extraction device under a complex background according to claim 6, further comprising:
a fourth extraction unit, configured to extract the feature quantities in the dimension parameters;
a third splicing unit, configured to splice blocks according to the feature quantities to obtain a third-resolution feature map, the resolution of the third-resolution feature map being greater than that of the second-resolution feature map.
8. The portrait extraction device under a complex background according to claim 5, wherein the third segmentation unit comprises:
a first segmentation module, configured to divide the blocks of the first-resolution feature map into blocks of a second-resolution feature map;
a second segmentation module, configured to divide the blocks of the second-resolution feature map into blocks of a third-resolution feature map;
and a third segmentation module, configured to divide the blocks of the third-resolution feature map into the second non-overlapping blocks.
CN202310474489.5A 2023-04-28 2023-04-28 Portrait extraction method and device under complex background Active CN116206114B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310474489.5A CN116206114B (en) 2023-04-28 2023-04-28 Portrait extraction method and device under complex background

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310474489.5A CN116206114B (en) 2023-04-28 2023-04-28 Portrait extraction method and device under complex background

Publications (2)

Publication Number Publication Date
CN116206114A true CN116206114A (en) 2023-06-02
CN116206114B CN116206114B (en) 2023-08-01

Family

ID=86509785

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310474489.5A Active CN116206114B (en) 2023-04-28 2023-04-28 Portrait extraction method and device under complex background

Country Status (1)

Country Link
CN (1) CN116206114B (en)

Citations (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020043296A1 (en) * 2018-08-30 2020-03-05 Huawei Technologies Co., Ltd. Device and method for separating a picture into foreground and background using deep learning
CN113191953A (en) * 2021-06-04 2021-07-30 山东财经大学 Transformer-based face image super-resolution method
WO2021169128A1 (en) * 2020-02-29 2021-09-02 平安科技(深圳)有限公司 Method and apparatus for recognizing and quantifying fundus retina vessel, and device and storage medium
US20210350168A1 (en) * 2019-03-01 2021-11-11 Huawei Technologies Co., Ltd. Image segmentation method and image processing apparatus
CN113870283A (en) * 2021-09-29 2021-12-31 深圳万兴软件有限公司 Image matting method and device, computer equipment and readable storage medium
CN114494296A (en) * 2022-01-27 2022-05-13 复旦大学 Brain glioma segmentation method and system based on fusion of Unet and Transformer
CN114511703A (en) * 2022-01-21 2022-05-17 苏州医智影科技有限公司 Migration learning method and system for fusing Swin Transformer and UNet and oriented to segmentation task
CN114565763A (en) * 2022-02-28 2022-05-31 北京百度网讯科技有限公司 Image segmentation method, apparatus, device, medium, and program product
CN114972746A (en) * 2022-04-13 2022-08-30 湖南大学 Medical image segmentation method based on multi-resolution overlapping attention mechanism
US20220319155A1 (en) * 2020-02-21 2022-10-06 Boe Technology Group Co., Ltd. Image Processing Method, Image Processing Apparatus, and Device
CN115330817A (en) * 2022-08-25 2022-11-11 上海健康医学院 Pneumothorax lesion segmentation method based on Swin-Unet and morphological processing and related equipment
CN115457043A (en) * 2022-03-23 2022-12-09 苏州迭代智能医疗科技有限公司 Image segmentation network based on overlapped self-attention deformer framework U-shaped network
CN115471470A (en) * 2022-09-14 2022-12-13 安徽大学 Esophageal cancer CT image segmentation method
CN115482382A (en) * 2022-09-17 2022-12-16 北京工业大学 Image semantic segmentation method based on Transformer architecture
CN115984560A (en) * 2022-12-26 2023-04-18 杭州电子科技大学 Image segmentation method based on CNN and Transformer
CN115994914A (en) * 2022-09-06 2023-04-21 中南民族大学 ATFormer architecture for medical image segmentation and corresponding method

Patent Citations (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020043296A1 (en) * 2018-08-30 2020-03-05 Huawei Technologies Co., Ltd. Device and method for separating a picture into foreground and background using deep learning
US20210350168A1 (en) * 2019-03-01 2021-11-11 Huawei Technologies Co., Ltd. Image segmentation method and image processing apparatus
US20220319155A1 (en) * 2020-02-21 2022-10-06 Boe Technology Group Co., Ltd. Image Processing Method, Image Processing Apparatus, and Device
WO2021169128A1 (en) * 2020-02-29 2021-09-02 平安科技(深圳)有限公司 Method and apparatus for recognizing and quantifying fundus retina vessel, and device and storage medium
CN113191953A (en) * 2021-06-04 2021-07-30 山东财经大学 Transformer-based face image super-resolution method
CN113870283A (en) * 2021-09-29 2021-12-31 深圳万兴软件有限公司 Image matting method and device, computer equipment and readable storage medium
CN114511703A (en) * 2022-01-21 2022-05-17 苏州医智影科技有限公司 Migration learning method and system for fusing Swin Transformer and UNet and oriented to segmentation task
CN114494296A (en) * 2022-01-27 2022-05-13 复旦大学 Brain glioma segmentation method and system based on fusion of Unet and Transformer
CN114565763A (en) * 2022-02-28 2022-05-31 北京百度网讯科技有限公司 Image segmentation method, apparatus, device, medium, and program product
CN115457043A (en) * 2022-03-23 2022-12-09 苏州迭代智能医疗科技有限公司 Image segmentation network based on overlapped self-attention deformer framework U-shaped network
CN114972746A (en) * 2022-04-13 2022-08-30 湖南大学 Medical image segmentation method based on multi-resolution overlapping attention mechanism
CN115330817A (en) * 2022-08-25 2022-11-11 上海健康医学院 Pneumothorax lesion segmentation method based on Swin-Unet and morphological processing and related equipment
CN115994914A (en) * 2022-09-06 2023-04-21 中南民族大学 ATFormer architecture for medical image segmentation and corresponding method
CN115471470A (en) * 2022-09-14 2022-12-13 安徽大学 Esophageal cancer CT image segmentation method
CN115482382A (en) * 2022-09-17 2022-12-16 北京工业大学 Image semantic segmentation method based on Transformer architecture
CN115984560A (en) * 2022-12-26 2023-04-18 杭州电子科技大学 Image segmentation method based on CNN and Transformer

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
CHENGYIN LI等: "FocalUNETR: A Focal Transformer for Boundary-aware Segmentation of CT Images", 《ARXIV:2210.03189V1》, pages 1 - 13 *
JIANWEI YANG等: "Focal Self-attention for Local-Global Interactions in Vision Transformers", 《ARXIV:2107.00641V1》, pages 1 - 21 *
王恒: "基于深度学习的乳腺癌病理图像分类研究", 《中国优秀硕士学位论文全文数据库 医药卫生科技辑》, no. 2023, pages 072 - 1719 *
王欣: "基于深度学习的人像分割方法研究", 《中国优秀硕士学位论文全文数据库 信息科技辑》, no. 2023, pages 138 - 1939 *
辛沐霖: "基于多尺度分析与数据均衡化的多角度肝内血管分割", 《中国优秀硕士学位论文全文数据库 医药卫生科技辑》, no. 2022, pages 064 - 12 *

Also Published As

Publication number Publication date
CN116206114B (en) 2023-08-01

Similar Documents

Publication Publication Date Title
CN111191663B (en) License plate number recognition method and device, electronic equipment and storage medium
CN113362329B (en) Method for training focus detection model and method for recognizing focus in image
CN113936256A (en) Image target detection method, device, equipment and storage medium
CN110717851A (en) Image processing method and device, neural network training method and storage medium
CN110555433A (en) Image processing method, image processing device, electronic equipment and computer readable storage medium
CN116645592B (en) Crack detection method based on image processing and storage medium
CN114758288A (en) Power distribution network engineering safety control detection method and device
CN105701489A (en) Novel digital extraction and identification method and system thereof
CN112836625A (en) Face living body detection method and device and electronic equipment
CN112036260A (en) Expression recognition method and system for multi-scale sub-block aggregation in natural environment
CN113378897A (en) Neural network-based remote sensing image classification method, computing device and storage medium
CN113012177A (en) Three-dimensional point cloud segmentation method based on geometric feature extraction and edge perception coding
Chen et al. SARAS-net: scale and relation aware siamese network for change detection
CN116740422A (en) Remote sensing image classification method and device based on multi-mode attention fusion technology
CN116977674A (en) Image matching method, related device, storage medium and program product
CN115690797A (en) Character recognition method, device, equipment and storage medium
Pan et al. An adaptive multifeature method for semiautomatic road extraction from high-resolution stereo mapping satellite images
Li et al. Maskformer with improved encoder-decoder module for semantic segmentation of fine-resolution remote sensing images
CN116206114B (en) Portrait extraction method and device under complex background
Chen et al. Towards deep and efficient: A deep Siamese self-attention fully efficient convolutional network for change detection in VHR images
Chacon-Murguia et al. Moving object detection in video sequences based on a two-frame temporal information CNN
CN116310832A (en) Remote sensing image processing method, device, equipment, medium and product
CN115713624A (en) Self-adaptive fusion semantic segmentation method for enhancing multi-scale features of remote sensing image
Chen et al. Exploring efficient and effective generative adversarial network for thermal infrared image colorization
CN113486879A (en) Image area suggestion frame detection method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant