CN111047604A - Transparency mask extraction method and device for high-definition image and storage medium


Info

Publication number
CN111047604A
Authority
CN
China
Prior art keywords
pixel
region
node
value
foreground
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911203685.9A
Other languages
Chinese (zh)
Other versions
CN111047604B (en)
Inventor
冯夫健
王林
黄翰
谭棉
刘爽
魏嘉银
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guizhou Minzu University
Original Assignee
Guizhou Minzu University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guizhou Minzu University filed Critical Guizhou Minzu University
Priority to CN201911203685.9A priority Critical patent/CN111047604B/en
Publication of CN111047604A publication Critical patent/CN111047604A/en
Application granted granted Critical
Publication of CN111047604B publication Critical patent/CN111047604B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/136Segmentation; Edge detection involving thresholding
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/90Determination of colour characteristics
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Abstract

The invention provides a transparency mask extraction method, device and storage medium for high-definition images. The method comprises: marking an unknown region in the high-definition image; dividing the unknown region into a plurality of sub-regions according to the pixel information in the unknown region; converting each sub-region into a node of a graph structure, calculating edge weights between adjacent nodes, and generating the graph structure according to the edge weights; and generating a node optimization queue according to the edge weights between the nodes, determining the foreground region and background region according to the queue, selecting pixel values, and solving for their optimal value. The method divides the high-definition image into regions by pixel point, expresses the divided regions as nodes of a graph structure, calculates edge weights, obtains a node optimization queue from the edge weights, rapidly determines the foreground and background regions within the queue, and performs optimization solving to finally obtain the optimal foreground mask value, achieving both high calculation precision and high speed.

Description

Transparency mask extraction method and device for high-definition image and storage medium
Technical Field
The invention mainly relates to the technical field of image processing, in particular to a transparency mask extraction method and device for a high-definition image and a storage medium.
Background
At present, the resolution of images shot by mobile devices such as mobile phones and cameras is increasingly high. Transparency mask extraction for high-definition images is mainly applied to film and television special effects, where different foreground objects are composited into a specified scene; the higher the extraction precision, the better the visual effect of the composite image. Existing methods for extracting the transparency mask of a high-definition image suffer from excessively long computation time and low computation precision.
Disclosure of Invention
The invention provides a transparency mask extraction method and device for a high-definition image and a storage medium, aiming at the defects of the prior art.
The technical scheme for solving the technical problems is as follows: a transparency mask extraction method of a high-definition image comprises the following steps:
inputting a high-definition image, and marking an unknown region, a foreground region and a background region in the high-definition image;
dividing the unknown region into a plurality of sub-regions according to the pixel information in the unknown region;
converting each subregion into a node of a graph structure, calculating edge weights between adjacent nodes, and generating the graph structure according to each edge weight;
and generating a node optimization queue according to the edge weights among the nodes, selecting pixel values in the sub-regions, the foreground region and the background region, solving the optimal value of the selected pixel values according to the node optimization queue, and taking the optimal value obtained by the solution as the optimal foreground mask value.
Another technical solution of the present invention for solving the above technical problems is as follows: a transparency mask extraction device for high definition images comprises:
the calibration module is used for inputting a high-definition image and calibrating an unknown region, a foreground region and a background region in the high-definition image;
the region segmentation module is used for segmenting the unknown region into a plurality of sub-regions according to the pixel information in the unknown region;
the graph structure generating module is used for converting each subregion into a node of the graph structure, calculating edge weights between adjacent nodes and generating the graph structure according to each edge weight;
and the optimization module is used for generating a node optimization queue according to the edge weight among the nodes, selecting pixel values in the sub-regions, the foreground region and the background region, carrying out optimal value solution on the selected pixel values according to the node optimization queue, and taking the optimal value obtained by the solution as an optimal foreground mask value.
Another technical solution of the present invention for solving the above technical problems is as follows: a transparency mask extraction apparatus for high definition images comprises a memory, a processor and a computer program stored in the memory and executable on the processor, and when the processor executes the computer program, the transparency mask extraction method for high definition images as described above is realized.
Another technical solution of the present invention for solving the above technical problems is as follows: a computer-readable storage medium, storing a computer program which, when executed by a processor, implements a transparency mask extraction method for high definition images as described above.
The invention has the beneficial effects that the high-definition image is divided into regions by pixel point, the divided regions are expressed as nodes of a graph structure, edge weights are calculated, a node optimization queue is obtained from the edge weights, the foreground and background regions are rapidly determined within the queue, and the pixel values in the region are optimized and solved to finally obtain the optimal foreground mask value.
Drawings
Fig. 1 is a schematic flowchart of a transparency mask extraction method for a high definition image according to an embodiment of the present invention;
fig. 2 is a schematic diagram of functional modules of a transparency mask extraction apparatus for high definition images according to an embodiment of the present invention;
fig. 3 is a schematic node diagram of a graph structure according to an embodiment of the present invention.
Detailed Description
The principles and features of this invention are described below in conjunction with the following drawings, which are set forth by way of illustration only and are not intended to limit the scope of the invention.
Fig. 1 is a schematic flowchart of a transparency mask extraction method for a high definition image according to an embodiment of the present invention.
As shown in fig. 1, a method for extracting a transparency mask of a high definition image includes the following steps:
inputting a high-definition image, and marking an unknown area in the high-definition image;
dividing the unknown region into a plurality of sub-regions according to the pixel information in the unknown region;
converting each subregion into a node of a graph structure, calculating edge weights between adjacent nodes, and generating the graph structure according to each edge weight;
and generating a node optimization queue according to the edge weights among the nodes, selecting pixel values in the sub-regions, the foreground region and the background region, solving the optimal value of the selected pixel values according to the node optimization queue, and taking the optimal value obtained by the solution as the optimal foreground mask value.
Specifically, the node optimization queue is generated by applying a minimum spanning tree method to the edge weights.
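By way of illustration, the following is a minimal sketch of deriving such a queue, assuming the minimum spanning tree is built over the edge weights and traversed breadth-first from an arbitrary root node; the 4-node weight matrix is invented for the example.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import minimum_spanning_tree, breadth_first_order

# Symmetric adjacency of four region nodes; zeros mean "not adjacent".
weights = np.array([[0.0, 2.0, 0.9, 0.0],
                    [2.0, 0.0, 1.4, 0.5],
                    [0.9, 1.4, 0.0, 1.1],
                    [0.0, 0.5, 1.1, 0.0]])

# Keep the lightest set of edges that still connects every node.
mst = minimum_spanning_tree(csr_matrix(weights))

# Traversing the tree from node 0 yields a visit order usable as the
# node optimization queue (the choice of root is an assumption).
queue, _ = breadth_first_order(mst + mst.T, i_start=0, directed=False)
print(queue)  # [0 2 3 1] for the matrix above
```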
It should be understood that a high-definition (HD) image refers to a high-resolution image, i.e., an image with a vertical resolution of at least 720 lines (720p).
Specifically, the unknown region, the foreground region and the background region are calibrated in the high-definition image as follows: the texture edge of the target in the high-definition image is dilated with a preset template, the dilated band is taken as the unknown region, the part of the target region not covered by the dilated band is taken as the foreground region, and the rest of the image outside the target region is taken as the background region.
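By way of illustration, the following is a minimal sketch of this calibration step, assuming the preset template is a morphological structuring element and the target is supplied as a binary mask; the band radius is an assumed parameter.

```python
import cv2
import numpy as np

def make_trimap(target_mask: np.ndarray, band: int = 10) -> np.ndarray:
    """target_mask: uint8 binary mask of the target object (255 = target).
    Returns a trimap: 255 = foreground, 128 = unknown, 0 = background."""
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE,
                                       (2 * band + 1, 2 * band + 1))
    dilated = cv2.dilate(target_mask, kernel)  # grows the target outward
    eroded = cv2.erode(target_mask, kernel)    # shrinks it inward
    trimap = np.zeros_like(target_mask)        # start as all background
    trimap[dilated > 0] = 128                  # band around the edge: unknown
    trimap[eroded > 0] = 255                   # target interior: foreground
    return trimap
```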
In this embodiment, the high-definition image is divided into regions by pixel point, the divided regions are expressed as nodes of a graph structure, edge weights are calculated, a node optimization queue is obtained from the edge weights, the foreground and background regions are rapidly determined within the queue, and the pixel values in the region are optimized and solved to finally obtain the optimal foreground mask value, achieving both high calculation precision and high speed.
Optionally, as an embodiment of the present invention, the process of dividing the unknown region into a plurality of sub-regions according to pixel information in the unknown region includes:

let the i-th pixel point in the unknown region be $p_i$, $i = 1, 2, \ldots, n$, where n is a positive integer;

calculate the mean shift vector $m(p_i)$ corresponding to each pixel from the pixel information in the unknown region according to the pixel mean shift formula:

$$m(p_i) = \frac{\sum_{j=1}^{n} p_j \, g\!\left(\left\|\frac{p_i - p_j}{h}\right\|_2^2\right)}{\sum_{j=1}^{n} g\!\left(\left\|\frac{p_i - p_j}{h}\right\|_2^2\right)} - p_i$$

where any pixel $p_i$ consists of five dimensions R, G, B, x, y: R, G and B are the color of the pixel point $p_i$ in RGB space, and x and y are its plane coordinates on the high-definition image; $g(\cdot)$ is the kernel profile function; $h$ is the bandwidth, $h > 0$; and $\|\cdot\|_2$ denotes the Euclidean distance;

iterate the calculation of the mean shift vector $m(p_i)$ until the five-dimensional data points converge, so that each point reaches its local density maximum;

divide the mean-shift results corresponding to the n pixel points into w classes, wherein the Euclidean distance in the five-dimensional space between any two pixel points within each class is less than the bandwidth h;

merge the classes whose pixel count is smaller than the preset pixel-number threshold M into adjacent classes, generating w' classes, each of which represents a sub-region.
It should be understood that the five-dimensional space represents five dimensions, namely R, G, B, x, y.
In the above embodiment, the local density is obtained from the color of each pixel and the distances between pixels; pixels whose pairwise Euclidean distance in the five-dimensional space is below the bandwidth are grouped into classes, and classes are merged according to their pixel counts, yielding a number of classes, each of which represents a sub-region.
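By way of illustration, the following is a minimal sketch of this grouping step, using scikit-learn's MeanShift over the five-dimensional (R, G, B, x, y) features in place of the iteration described above; the bandwidth value is an assumption, and the merging of small classes is indicated only as a comment.

```python
import numpy as np
from sklearn.cluster import MeanShift

def segment_unknown(image: np.ndarray, unknown_mask: np.ndarray,
                    h: float = 20.0) -> np.ndarray:
    """Cluster unknown pixels in (R, G, B, x, y) space.
    image: HxWx3 array; unknown_mask: HxW boolean/uint8 mask.
    Returns one class label per unknown pixel."""
    ys, xs = np.nonzero(unknown_mask)
    feats = np.column_stack([image[ys, xs].astype(float),        # R, G, B
                             xs.astype(float), ys.astype(float)])  # x, y
    labels = MeanShift(bandwidth=h, bin_seeding=True).fit_predict(feats)
    # Classes with fewer than a preset threshold M pixels would then be
    # merged into an adjacent class, yielding the final w' sub-regions.
    return labels
```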
Optionally, as an embodiment of the present invention, the process of calculating edge weights between adjacent nodes and generating a graph structure according to each edge weight includes:

defining the edge weight $b_{i,j}$ between nodes as

$$b_{i,j} = \left\|C_i - C_j\right\|_2 + \left\|S_i - S_j\right\|_2$$

where $C_i$ is the standardized color information of the pixel point,

$$C_i = \frac{X'_{Wi} - \bar{X}_W}{\sigma_W}$$

and $S_i$ is the standardized plane-coordinate information of the pixel point,

$$S_i = \frac{X'_{Si} - \bar{X}_S}{\sigma_S}$$

here $X'_{Wi}$ denotes the RGB three-dimensional color information of the midpoint pixel of the i-th class region, $\bar{X}_W$ and $\sigma_W^2$ denote the mean and variance of the RGB three-dimensional color information of the midpoint pixels of the respective class regions, $X'_{Si}$ denotes the plane coordinate information of the midpoint pixel of the i-th class region, and $\bar{X}_S$ and $\sigma_S^2$ denote the mean and variance of the plane coordinate information of the midpoint pixels of the respective class regions;

generating the graph structure according to the edge weights $b_{i,j}$ between the nodes.
The midpoint pixel of the i-th class region is defined as the pixel located at the mean plane coordinates $(\bar{x}_i, \bar{y}_i)$ of the region, where

$$\bar{x}_i = \frac{1}{N_i}\sum_{j \in \Omega_i} x_j, \qquad \bar{y}_i = \frac{1}{N_i}\sum_{j \in \Omega_i} y_j$$

$N_i$ denotes the number of pixels of the i-th class region, $\Omega_i$ denotes the set of pixels of the i-th class region, $\bar{x}_i$ and $\bar{y}_i$ are the plane x and y coordinate values of the midpoint pixel of the i-th class region, and $x_j$ and $y_j$ are the plane x and y coordinate values of the j-th pixel of the i-th class region.
The process of converting each sub-region into a node of a graph structure is as follows: the sub-regions are labeled, each label representing a node of the graph structure.
Specifically, before the edge weights are defined, each node must be labeled. As shown in fig. 3, the regions of the unknown region U of the original image that correspond to the w' finally generated classes are labeled, where each label represents one divided region (w' regions in total) and each label is a node of the graph structure.
In the above embodiment, each divided region is converted into a graph-node representation, and the edge-weight relationship between regions is defined by color and distance, which facilitates generation of the node optimization queue.
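By way of illustration, the following is a minimal sketch of the feature standardization and weight computation; because the closed form of b_{i,j} appears in the original only as an equation image, the Euclidean combination of standardized color and spatial distances used here is an assumption consistent with the surrounding definitions.

```python
import numpy as np

def edge_weights(colors: np.ndarray, centers: np.ndarray) -> np.ndarray:
    """colors: (w', 3) RGB of each region's midpoint pixel;
    centers: (w', 2) plane coordinates of those midpoint pixels.
    Returns the full weight matrix b[i, j] (adjacency filtering omitted)."""
    eps = 1e-8  # guards against zero variance across regions
    C = (colors - colors.mean(axis=0)) / (colors.std(axis=0) + eps)    # C_i
    S = (centers - centers.mean(axis=0)) / (centers.std(axis=0) + eps)  # S_i
    w = len(colors)
    b = np.zeros((w, w))
    for i in range(w):
        for j in range(w):
            # Assumed combination: standardized color distance plus
            # standardized spatial distance between midpoint pixels.
            b[i, j] = np.linalg.norm(C[i] - C[j]) + np.linalg.norm(S[i] - S[j])
    return b
```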
Optionally, as an embodiment of the present invention, the process of determining the foreground region and the background region according to the node optimization queue includes:
optimizing the sub-region, foreground region and background region corresponding to the i-th node of the node optimization queue to obtain the optimal value of the i-th node, and taking this optimal value as the optimal foreground mask value of the i-th node;

taking the optimal foreground mask value of the i-th node as the initial solution information of the (i+1)-th node region, and optimizing the foreground region and background region of the (i+1)-th node according to this initial solution information to obtain the optimal foreground mask value of the (i+1)-th node;

and so on, until all node regions in the node optimization queue have been optimized and the optimal foreground mask value of the entire unknown region is obtained.
In the above embodiment, each node in the node optimization queue is optimized to obtain the optimal foreground mask value of the whole unknown region, so that the extracted transparency mask result is more accurate.
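By way of illustration, the following is a minimal sketch of this queue-driven sweep; solve_node is a hypothetical stand-in for the per-node sample-pair optimization detailed below, accepting a warm-start solution and returning the node's optimal foreground mask values.

```python
def sweep(queue, solve_node):
    """Optimize nodes in queue order, warm-starting each node
    from the result of its predecessor in the queue."""
    result, init = {}, None
    for node in queue:
        init = solve_node(node, init)  # optimal foreground mask values for this node
        result[node] = init
    return result
```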
Optionally, as an embodiment of the present invention, the process of optimizing the sub-region, foreground region and background region corresponding to the i-th node of the node optimization queue to obtain the optimal value of the i-th node includes:

calculating, for each unknown pixel of the i-th node, a transparency estimate from sampled foreground and background pixels according to the pixel calculation formula

$$\hat{\alpha}_k = \frac{\left(C_u^k - B^k\right)\cdot\left(F^k - B^k\right)}{\left\|F^k - B^k\right\|_2^2}$$

where $C_u^k$ represents the color value of the k-th unknown pixel in the unknown region, $B^k$ represents the k-th background value selected in the background region, and $F^k$ represents the k-th foreground value selected from the foreground region;

taking all pixels in the foreground region and the background region corresponding to the i-th node as optimization variables X, randomly selecting pixel values from the foreground region and the background region, and assigning the optimization variables X according to the selected pixel values to obtain a solution set $P = (X_1, X_2, \ldots, X_N)$, where N represents the number of solutions;

evaluating each solution in the solution set P to obtain the optimal value of the i-th node, wherein the evaluation process includes: if $f(X_i) > f(X_j)$, then $X_j$ learns from $X_i$ according to the learning formula $X_j = X_j + \lambda(X_i - X_j)$; $X_i$ then continues to be compared with the next solution in the solution set P to obtain a comparison error value; the comparison stops when the comparison error values of the N solutions are all smaller than a preset error value, and the resulting optimal solution is taken as the optimal value of the i-th node.
In the above embodiment, candidate solutions over the pixel values are evaluated and compared with one another to obtain comparison error values, and these errors are checked against a preset error threshold to obtain the optimal value, so that a more accurate foreground mask value can be obtained.
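By way of illustration, the following is a minimal sketch of the evaluation-and-learning step for a single unknown pixel. The transparency estimate follows the standard compositing-equation formula (an assumption, since the pixel formula appears in the original only as an equation image), the fitness f is taken as the negative reconstruction error, and the learning step is X_j = X_j + λ(X_i − X_j) as stated above.

```python
import numpy as np

rng = np.random.default_rng(0)

def alpha_hat(c_u, f, b):
    """Estimate transparency for unknown color c_u from a (F, B) sample pair."""
    fb = f - b
    return float(np.clip(np.dot(c_u - b, fb) / (np.dot(fb, fb) + 1e-8), 0.0, 1.0))

def fitness(c_u, f, b):
    """Negative reconstruction error of the compositing equation (higher is better)."""
    a = alpha_hat(c_u, f, b)
    return -np.linalg.norm(c_u - (a * f + (1.0 - a) * b))

def learn(solutions, c_u, lam=0.5, iters=50):
    """solutions: (N, 6) float array, each row a candidate (F, B) pair.
    Worse solutions repeatedly learn from better ones; returns the best row."""
    for _ in range(iters):
        i, j = rng.integers(len(solutions), size=2)
        fi = fitness(c_u, solutions[i, :3], solutions[i, 3:])
        fj = fitness(c_u, solutions[j, :3], solutions[j, 3:])
        if fi > fj:  # X_j learns from the better solution X_i
            solutions[j] += lam * (solutions[i] - solutions[j])
    best = max(range(len(solutions)),
               key=lambda k: fitness(c_u, solutions[k, :3], solutions[k, 3:]))
    return solutions[best]
```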
Fig. 2 is a schematic diagram of functional modules of a transparency mask extraction apparatus for high definition images according to an embodiment of the present invention.
Optionally, as another embodiment of the present invention, as shown in fig. 2, a transparency mask extracting apparatus for a high definition image includes:
the calibration module is used for inputting a high-definition image and calibrating an unknown region, a foreground region and a background region in the high-definition image;
the region segmentation module is used for segmenting the unknown region into a plurality of sub-regions according to the pixel information in the unknown region;
the graph structure generating module is used for converting each subregion into a node of the graph structure, calculating edge weights between adjacent nodes and generating the graph structure according to each edge weight;
and the optimization module is used for generating a node optimization queue according to the edge weights among the nodes, selecting pixel values in the sub-regions, the foreground region and the background region, performing optimal value solving on the selected pixel values according to the node optimization queue, and taking the optimal value obtained by the solving as the optimal foreground mask value.
Optionally, as an embodiment of the present invention, the region segmentation module is specifically configured to:

let the i-th pixel point in the unknown region be $p_i$, $i = 1, 2, \ldots, n$, where n is a positive integer;

calculate the mean shift vector $m(p_i)$ corresponding to each pixel from the pixel information in the unknown region according to the pixel mean shift formula:

$$m(p_i) = \frac{\sum_{j=1}^{n} p_j \, g\!\left(\left\|\frac{p_i - p_j}{h}\right\|_2^2\right)}{\sum_{j=1}^{n} g\!\left(\left\|\frac{p_i - p_j}{h}\right\|_2^2\right)} - p_i$$

where any pixel $p_i$ consists of five dimensions R, G, B, x, y: R, G and B are the color of the pixel point $p_i$ in RGB space, and x and y are its plane coordinates on the high-definition image; $g(\cdot)$ is the kernel profile function; $h$ is the bandwidth, $h > 0$; and $\|\cdot\|_2$ denotes the Euclidean distance;

iterate the calculation of the mean shift vector $m(p_i)$ until the five-dimensional data points converge, so that each point reaches its local density maximum;

divide the mean-shift results corresponding to the n pixel points into w classes, wherein the Euclidean distance in the five-dimensional space between any two pixel points within each class is less than the bandwidth h;

merge the classes whose pixel count is smaller than the preset pixel-number threshold M into adjacent classes, generating w' classes, each of which represents a sub-region.
Optionally, as an embodiment of the present invention, the graph structure generating module is specifically configured to:

define the edge weight $b_{i,j}$ between nodes as

$$b_{i,j} = \left\|C_i - C_j\right\|_2 + \left\|S_i - S_j\right\|_2$$

where $C_i$ is the standardized color information of the pixel point,

$$C_i = \frac{X'_{Wi} - \bar{X}_W}{\sigma_W}$$

and $S_i$ is the standardized plane-coordinate information of the pixel point,

$$S_i = \frac{X'_{Si} - \bar{X}_S}{\sigma_S}$$

here $X'_{Wi}$ denotes the RGB three-dimensional color information of the midpoint pixel of the i-th class region, $\bar{X}_W$ and $\sigma_W^2$ denote the mean and variance of the RGB three-dimensional color information of the midpoint pixels of the respective class regions, $X'_{Si}$ denotes the plane coordinate information of the midpoint pixel of the i-th class region, and $\bar{X}_S$ and $\sigma_S^2$ denote the mean and variance of the plane coordinate information of the midpoint pixels of the respective class regions;

generate the graph structure according to the edge weights $b_{i,j}$ between the nodes.
Optionally, as another embodiment of the present invention, a transparency mask extracting apparatus for a high definition image includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and when the computer program is executed by the processor, the transparency mask extracting method for a high definition image as described above is implemented.
Alternatively, as an embodiment of the present invention, a computer-readable storage medium stores a computer program which, when executed by a processor, implements the transparency mask extraction method for high definition images as described above.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims (10)

1. A transparency mask extraction method of a high-definition image is characterized by comprising the following steps:
inputting a high-definition image, and marking an unknown region, a foreground region and a background region in the high-definition image;
dividing the unknown region into a plurality of sub-regions according to the pixel information in the unknown region;
converting each subregion into a node of a graph structure, calculating edge weights between adjacent nodes, and generating the graph structure according to each edge weight;
and generating a node optimization queue according to the edge weights among the nodes, selecting pixel values in the sub-regions, the foreground region and the background region, solving the optimal value of the selected pixel values according to the node optimization queue, and taking the optimal value obtained by the solution as the optimal foreground mask value.
2. The transparency mask extraction method for high-definition images according to claim 1, wherein the process of dividing the unknown region into a plurality of sub-regions according to the pixel information in the unknown region comprises:

letting the i-th pixel point in the unknown region be $p_i$, $i = 1, 2, \ldots, n$, where n is a positive integer;

calculating the mean shift vector $m(p_i)$ corresponding to each pixel from the pixel information in the unknown region according to the pixel mean shift formula:

$$m(p_i) = \frac{\sum_{j=1}^{n} p_j \, g\!\left(\left\|\frac{p_i - p_j}{h}\right\|_2^2\right)}{\sum_{j=1}^{n} g\!\left(\left\|\frac{p_i - p_j}{h}\right\|_2^2\right)} - p_i$$

where any pixel $p_i$ consists of five dimensions R, G, B, x, y: R, G and B are the color of the pixel point $p_i$ in RGB space, and x and y are its plane coordinates on the high-definition image; $g(\cdot)$ is the kernel profile function; $h$ is the bandwidth, $h > 0$; and $\|\cdot\|_2$ denotes the Euclidean distance;

iterating the calculation of the mean shift vector $m(p_i)$ until the five-dimensional data points converge, so that each point reaches its local density maximum;

dividing the mean-shift results corresponding to the n pixel points into w classes, wherein the Euclidean distance in the five-dimensional space between any two pixel points within each class is less than the bandwidth h;

merging the classes whose pixel count is smaller than the preset pixel-number threshold M into adjacent classes, generating w' classes, each of which represents a sub-region.
3. The transparency mask extraction method for high-definition images according to claim 1, wherein the process of calculating edge weights between adjacent nodes and generating the graph structure according to the respective edge weights comprises:

defining the edge weight $b_{i,j}$ between nodes as

$$b_{i,j} = \left\|C_i - C_j\right\|_2 + \left\|S_i - S_j\right\|_2$$

where $C_i$ is the standardized color information of the pixel point,

$$C_i = \frac{X'_{Wi} - \bar{X}_W}{\sigma_W}$$

and $S_i$ is the standardized plane-coordinate information of the pixel point,

$$S_i = \frac{X'_{Si} - \bar{X}_S}{\sigma_S}$$

here $X'_{Wi}$ denotes the RGB three-dimensional color information of the midpoint pixel of the i-th class region, $\bar{X}_W$ and $\sigma_W^2$ denote the mean and variance of the RGB three-dimensional color information of the midpoint pixels of the respective class regions, $X'_{Si}$ denotes the plane coordinate information of the midpoint pixel of the i-th class region, and $\bar{X}_S$ and $\sigma_S^2$ denote the mean and variance of the plane coordinate information of the midpoint pixels of the respective class regions;

generating the graph structure according to the edge weights $b_{i,j}$ between the nodes.
4. The method for extracting transparency masks of high-definition images according to claim 1, wherein the process of solving the optimal value of the selected pixel values according to the node optimization queue comprises:
optimizing the sub-region, foreground region and background region corresponding to the i-th node of the node optimization queue to obtain the optimal value of the i-th node, and taking this optimal value as the optimal foreground mask value of the i-th node;

taking the optimal foreground mask value of the i-th node as the initial solution information of the (i+1)-th node region, and optimizing the foreground region and background region of the (i+1)-th node according to this initial solution information to obtain the optimal foreground mask value of the (i+1)-th node;

and so on, until all node regions in the node optimization queue have been optimized and the optimal foreground mask value of the entire unknown region is obtained.
5. The method for extracting transparency masks of high-definition images according to claim 4, wherein the process of optimizing the sub-region, the foreground region and the background region corresponding to the i-th node of the node optimization queue to obtain the optimal value of the i-th node comprises:

calculating, for each unknown pixel of the i-th node, a transparency estimate from sampled foreground and background pixels according to the pixel calculation formula

$$\hat{\alpha}_k = \frac{\left(C_u^k - B^k\right)\cdot\left(F^k - B^k\right)}{\left\|F^k - B^k\right\|_2^2}$$

where $C_u^k$ represents the color value of the k-th unknown pixel in the unknown region, $B^k$ represents the k-th background value selected in the background region, and $F^k$ represents the k-th foreground value selected from the foreground region;

taking all pixels in the foreground region and the background region corresponding to the i-th node as optimization variables X, randomly selecting pixel values from the foreground region and the background region, and assigning the optimization variables X according to the selected pixel values to obtain a solution set $P = (X_1, X_2, \ldots, X_N)$, where N represents the number of solutions;

evaluating each solution in the solution set P to obtain the optimal value of the i-th node, wherein the evaluation process comprises: if $f(X_i) > f(X_j)$, then $X_j$ learns from $X_i$ according to the learning formula $X_j = X_j + \lambda(X_i - X_j)$; $X_i$ then continues to be compared with the next solution in the solution set P to obtain a comparison error value; the comparison stops when the comparison error values of the N solutions are all smaller than a preset error value, and the resulting optimal solution is taken as the optimal value of the i-th node.
6. A transparency mask extraction device for high-definition images, characterized by comprising:
the calibration module is used for inputting a high-definition image and calibrating an unknown region, a foreground region and a background region in the high-definition image;
the region segmentation module is used for segmenting the unknown region into a plurality of sub-regions according to the pixel information in the unknown region;
the graph structure generating module is used for converting each subregion into a node of the graph structure, calculating edge weights between adjacent nodes and generating the graph structure according to each edge weight;
and the optimization module is used for generating a node optimization queue according to the edge weight among the nodes, selecting pixel values in the sub-regions, the foreground region and the background region, carrying out optimal value solution on the selected pixel values according to the node optimization queue, and taking the optimal value obtained by the solution as an optimal foreground mask value.
7. The apparatus according to claim 6, wherein the region segmentation module is specifically configured to:

let the i-th pixel point in the unknown region be $p_i$, $i = 1, 2, \ldots, n$, where n is a positive integer;

calculate the mean shift vector $m(p_i)$ corresponding to each pixel from the pixel information in the unknown region according to the pixel mean shift formula:

$$m(p_i) = \frac{\sum_{j=1}^{n} p_j \, g\!\left(\left\|\frac{p_i - p_j}{h}\right\|_2^2\right)}{\sum_{j=1}^{n} g\!\left(\left\|\frac{p_i - p_j}{h}\right\|_2^2\right)} - p_i$$

where any pixel $p_i$ consists of five dimensions R, G, B, x, y: R, G and B are the color of the pixel point $p_i$ in RGB space, and x and y are its plane coordinates on the high-definition image; $g(\cdot)$ is the kernel profile function; $h$ is the bandwidth, $h > 0$; and $\|\cdot\|_2$ denotes the Euclidean distance;

iterate the calculation of the mean shift vector $m(p_i)$ until the five-dimensional data points converge, so that each point reaches its local density maximum;

divide the mean-shift results corresponding to the n pixel points into w classes, wherein the Euclidean distance in the five-dimensional space between any two pixel points within each class is less than the bandwidth h;

merge the classes whose pixel count is smaller than the preset pixel-number threshold M into adjacent classes, generating w' classes, each of which represents a sub-region.
8. The apparatus according to claim 6, wherein the graph structure generating module is specifically configured to:

define the edge weight $b_{i,j}$ between nodes as

$$b_{i,j} = \left\|C_i - C_j\right\|_2 + \left\|S_i - S_j\right\|_2$$

where $C_i$ is the standardized color information of the pixel point,

$$C_i = \frac{X'_{Wi} - \bar{X}_W}{\sigma_W}$$

and $S_i$ is the standardized plane-coordinate information of the pixel point,

$$S_i = \frac{X'_{Si} - \bar{X}_S}{\sigma_S}$$

here $X'_{Wi}$ denotes the RGB three-dimensional color information of the midpoint pixel of the i-th class region, $\bar{X}_W$ and $\sigma_W^2$ denote the mean and variance of the RGB three-dimensional color information of the midpoint pixels of the respective class regions, $X'_{Si}$ denotes the plane coordinate information of the midpoint pixel of the i-th class region, and $\bar{X}_S$ and $\sigma_S^2$ denote the mean and variance of the plane coordinate information of the midpoint pixels of the respective class regions;

generate the graph structure according to the edge weights $b_{i,j}$ between the nodes.
9. A transparency mask extraction device for high definition images, comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that when the computer program is executed by the processor, the transparency mask extraction method for high definition images according to any of claims 1 to 5 is implemented.
10. A computer-readable storage medium, in which a computer program is stored, which, when being executed by a processor, implements the transparency mask extraction method for high definition images according to any one of claims 1 to 5.
CN201911203685.9A 2019-11-29 2019-11-29 Transparency mask extraction method and device for high-definition image and storage medium Active CN111047604B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911203685.9A CN111047604B (en) 2019-11-29 2019-11-29 Transparency mask extraction method and device for high-definition image and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911203685.9A CN111047604B (en) 2019-11-29 2019-11-29 Transparency mask extraction method and device for high-definition image and storage medium

Publications (2)

Publication Number Publication Date
CN111047604A (en) 2020-04-21
CN111047604B CN111047604B (en) 2023-04-28

Family

ID=70233222

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911203685.9A Active CN111047604B (en) 2019-11-29 2019-11-29 Transparency mask extraction method and device for high-definition image and storage medium

Country Status (1)

Country Link
CN (1) CN111047604B (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101989353A (en) * 2010-12-10 2011-03-23 中国科学院深圳先进技术研究院 Image matting method
CN102291520A (en) * 2006-05-26 2011-12-21 佳能株式会社 Image processing method and image processing apparatus
CN102651135A (en) * 2012-04-10 2012-08-29 电子科技大学 Optimized direction sampling-based natural image matting method
CN103942794A (en) * 2014-04-16 2014-07-23 南京大学 Image collaborative cutout method based on confidence level
CN104134192A (en) * 2014-07-23 2014-11-05 中国科学院深圳先进技术研究院 Image defogging method and system
CN105931244A (en) * 2016-04-29 2016-09-07 中科院成都信息技术股份有限公司 Supervision-free image matting method and apparatus
CN106056606A (en) * 2016-05-30 2016-10-26 乐视控股(北京)有限公司 Image processing method and device
CN110400323A (en) * 2019-07-30 2019-11-01 上海艾麒信息科技有限公司 It is a kind of to scratch drawing system, method and device automatically
CN110503704A (en) * 2019-08-27 2019-11-26 北京迈格威科技有限公司 Building method, device and the electronic equipment of three components


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
ARUNAVA DE et al.: "Masking Based Segmentation of Diseased MRI Images" *
颜学名: "Research on Heuristic Algorithms Based on Bipartite Graph Structure Information" *

Also Published As

Publication number Publication date
CN111047604B (en) 2023-04-28

Similar Documents

Publication Publication Date Title
CN109816012B (en) Multi-scale target detection method fusing context information
JP6088792B2 (en) Image detection apparatus, control program, and image detection method
CN109829398B (en) Target detection method in video based on three-dimensional convolution network
US20210042929A1 (en) Three-dimensional object detection method and system based on weighted channel features of a point cloud
CN109903331B (en) Convolutional neural network target detection method based on RGB-D camera
CN108986152B (en) Foreign matter detection method and device based on difference image
CN111462207A (en) RGB-D simultaneous positioning and map creation method integrating direct method and feature method
CN112967341B (en) Indoor visual positioning method, system, equipment and storage medium based on live-action image
CN111144213B (en) Object detection method and related equipment
CN112184759A (en) Moving target detection and tracking method and system based on video
CN110910421B (en) Weak and small moving object detection method based on block characterization and variable neighborhood clustering
WO2016165064A1 (en) Robust foreground detection method based on multi-view learning
CN104766065B (en) Robustness foreground detection method based on various visual angles study
JP2018124890A (en) Image processing apparatus, image processing method, and image processing program
CN110147816B (en) Method and device for acquiring color depth image and computer storage medium
US11741615B2 (en) Map segmentation method and device, motion estimation method, and device terminal
CN112465021B (en) Pose track estimation method based on image frame interpolation method
JP2018195084A (en) Image processing apparatus, image processing method, program, and storage medium
CN111274964B (en) Detection method for analyzing water surface pollutants based on visual saliency of unmanned aerial vehicle
CN111047604A (en) Transparency mask extraction method and device for high-definition image and storage medium
CN112417961B (en) Sea surface target detection method based on scene prior knowledge
CN112418344A (en) Training method, target detection method, medium and electronic device
JP2018010359A (en) Information processor, information processing method, and program
JP2020042608A (en) Detection apparatus and program
JP2011113177A (en) Method and program for structuring three-dimensional object model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant