CN112634278B - Super-pixel-based just noticeable distortion method - Google Patents

Super-pixel-based just noticeable distortion method

Info

Publication number
CN112634278B
CN112634278B (application CN202011188873.1A)
Authority
CN
China
Prior art keywords
model
region
texture
roughness
pixel
Prior art date
Legal status
Active
Application number
CN202011188873.1A
Other languages
Chinese (zh)
Other versions
CN112634278A (en)
Inventor
Wang Yongfang (王永芳)
Wang Chuang (王闯)
Current Assignee
University of Shanghai for Science and Technology
Original Assignee
University of Shanghai for Science and Technology
Priority date
Filing date
Publication date
Application filed by University of Shanghai for Science and Technology
Priority to CN202011188873.1A
Publication of CN112634278A
Application granted
Publication of CN112634278B

Classifications

    • G06T 7/11: Region-based segmentation
    • G06T 5/50: Image enhancement or restoration by the use of more than one image, e.g. averaging, subtraction
    • G06T 7/90: Determination of colour characteristics
    • G06T 2207/10024: Color image
    • G06T 2207/20221: Image fusion; Image merging

(All under G PHYSICS; G06 COMPUTING, CALCULATING OR COUNTING; G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL.)

Abstract

The invention discloses a novel superpixel-based just noticeable distortion method. The method comprises the following steps: 1) preprocessing an input image; 2) calculating regional characteristics; 3) establishing a region weight model; 4) establishing a contrast masking model with joint texture roughness; 5) obtaining a superpixel-based perceptual model. The invention is tested on relevant test images, and the experimental results show that the method has good accuracy and robustness. The method introduces a region weight model and a region roughness modulation model: the region weight model combines region-level color contrast with the concave (foveal) modulation effect to estimate the visual importance of each region, while the region roughness modulation model refines the selection of the window size on the basis of Tamura texture roughness and considers the influence of the average gray difference between windows to estimate the texture condition of each region.

Description

Super-pixel-based just noticeable distortion method
Technical Field
The invention relates to an image visual redundancy estimation method, in particular to a Just Noticeable Distortion (JND) model based on superpixels, belonging to the field of image processing and computer vision.
Background
With the rapid development of the internet and multimedia information technology, images and videos have become the main media for transmitting information, and the ever-increasing data volume poses huge challenges to existing coding technology. Since the human eye is the final receiver of images and video, and the human eye cannot perceive changes in pixel value below a certain threshold (known as the JND threshold), the characteristics of the Human Visual System (HVS) can be exploited to remove visual redundancy in an image, improving coding efficiency and saving precious network bandwidth. The JND model is an effective method for estimating visual redundancy and has wide applications, such as video coding, subjective quality evaluation, and digital watermarking.
JND models are generally classified into two categories: pixel-domain JND models, built from the characteristics of image pixel values, and transform-domain JND models, built from the characteristics of pixel values in a transform domain. JND models developed from research on the human visual system mainly exploit, in the pixel domain, the Luminance Masking (LM) effect and the Contrast Masking (CM) effect, and, in the transform domain, the spatial Contrast Sensitivity Function (CSF), the background Luminance Adaptation (LA) effect, and the Texture Masking (TM) effect. With the continued exploration of the HVS, more visual characteristics have been proposed and applied to new JND models, such as the free-energy principle, pattern complexity, and the disorder masking effect. More and more JND models have been proposed in recent years, but they are built entirely at the pixel level and do not account, at the region level, for the different contributions different regions make to vision. Therefore, given that human attention and sensitivity differ from region to region, how to establish region-level modulation factors from the color and texture features of regions, and how to use them to improve a perceptual model's accuracy in predicting the human visual system, remain problems to be solved.
Disclosure of Invention
The invention aims to provide a superpixel-based just noticeable distortion method that accurately estimates the visual redundancy information of an image. Because the recognition capability of the human eye is limited, and because regions differ in color, texture and other characteristics, the attention and sensitivity of the human eye to each region of an observed image differ as well.
Image segmentation is performed using the Simple Linear Iterative Clustering (SLIC) algorithm; the pixel points within each segmented superpixel share the same or similar color, texture and other characteristics, forming a pixel block that carries certain visual information. According to the difference in human attention to different regions, the average spatial coordinates and average color components of all pixel points in a superpixel are calculated to establish a region-level color contrast model: the higher the contrast, the more easily a region attracts attention. Meanwhile, considering the foveal characteristic of the human eye, namely that the human visual system is most sensitive to the area mapped onto the fovea and that sensitivity drops rapidly with distance from the fovea, the regions with the highest contrast are selected as foveal regions and a region-level concave modulation model is established: the closer to a foveal region, the more easily a region attracts attention. In addition, since texture has a masking effect on the human eye, the texture roughness of each region is calculated and incorporated into the perceptual model as a modulation factor. The method accurately estimates the visual redundancy information of test images, conforms well to human visual characteristics, and effectively removes more visual redundancy while preserving subjective perceptual quality. It also provides a useful reference for studying the application of superpixel features in human vision and perceptual models.
In order to achieve the purpose, the invention has the following conception:
First, the input image is divided into regions by the superpixel segmentation algorithm SLIC, and the spatial coordinates and color components of each region are represented by the averages of the spatial coordinates X, Y and of the color components L, A, B over all pixels in the region. Then, since regions with higher contrast are more likely to be noticed by the human eye, the color contrast is calculated from the color components and spatial coordinates of the superpixels. According to the foveal characteristics of the human eye, a region-based concave modulation model is established so that the visual importance of each region is accurately estimated. Further, considering the masking effect of region texture on distortion, a region-level texture roughness modulation model is designed. Processing the input image according to this idea yields the superpixel-based JND model of the present invention.
According to the conception, the invention adopts the following technical scheme:
a JND method based on superpixels, comprising the steps of:
step 1, preprocessing an input image: including superpixel segmentation and color space conversion. The input image is segmented into regions using the superpixel segmentation algorithm SLIC and simultaneously converted into the perceptually uniform LAB color space;
step 2, calculating regional characteristics: including color features and texture features. For color features, firstly, in each super-pixel region, an average value of spatial coordinates X, Y and an average value of color components L, A, B of all pixel points in the region are obtained to represent the spatial coordinates and the color components of the super-pixel region, and then spatial distances and color differences between the super-pixel region and other super-pixels are calculated in sequence. Obviously, if the color of a certain area is greatly different from that of the surrounding area, the color contrast of the area is high; if super-pixels with high color contrast are adjacent or concentrated, the part is considered to be more attractive to human eyes, and therefore the color contrast is calculated by combining the color difference and the spatial distance. In addition, 5 areas with the highest color contrast are selected as concave areas, and a concave modulation model is established according to the concave characteristics of human eyes. For texture features, considering the masking effect of regional textures on vision, further refining the size of a window on the basis of Tamura texture roughness, considering the influence of average gray difference between windows, establishing a texture roughness model based on the regions, and taking the texture roughness model as a modulation factor to improve the accuracy of the model;
step 3, establishing a region weight model: according to the visual characteristics of human eyes, the attention and the sensitivity of human eyes to different areas in the same image are different. In step 2, a contrast model and a concave modulation model based on the regions are established through the color components and the space coordinates of the regions, so as to estimate the visual importance degree of the regions and establish a region weight model;
step 4, contrast masking model based on region texture roughness: since texture has a masking capability for distortion, in step 2 the texture condition of each region is estimated by a region-level texture roughness model. Considering that the contrast masking model still needs further improvement, a contrast masking model with joint texture roughness is established to improve the accuracy of the model;
step 5, superpixel-based perceptual model: finally, on the basis of steps 3 and 4, the contrast masking model with joint texture roughness is fused with the luminance masking model under the guidance of the Nonlinear Additivity Model for Masking (NAMM), and the final perceptual model is obtained under the weighting of the region weight model.
Preferably, in the step 1, the image is divided into K regions by using a superpixel segmentation algorithm SLIC;
K=[w·h/n] (1)
wherein K is a positive integer, w and h are the width and height of the image, respectively, and n is the expected number of pixel points per superpixel; a superpixel is composed of adjacent pixel points with the same or similar color, texture and other characteristics, and can be regarded as a visual input unit.
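As an illustration of step 1, the following is a minimal preprocessing sketch in Python, assuming scikit-image is available; the per-superpixel pixel count n in Eq. (1) is not fixed by the text, so the value used here is illustrative, and the astronaut test image merely stands in for the input image.

```python
import numpy as np
from skimage import data
from skimage.color import rgb2lab
from skimage.segmentation import slic

img = data.astronaut()                      # stand-in for the input image
h, w = img.shape[:2]
n = 400                                     # assumed target pixels per superpixel
K = (w * h) // n                            # Eq. (1): K = [w*h/n]

labels = slic(img, n_segments=K, compactness=10, start_label=0)  # SLIC regions
lab = rgb2lab(img)                          # perceptually uniform LAB space
print(labels.max() + 1, "superpixels over a", lab.shape, "LAB image")
```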
Preferably, in the step 2, considering the relationship between region features and human visual characteristics, three components are involved: region-based color contrast, the region-based concave modulation effect, and region-based texture roughness:
(1) region-based color contrast:
for color features, the average of the spatial coordinates X, Y and the average of the color components L, A, B of all pixels are calculated for each superpixel region in turn as the spatial coordinates (x_k, y_k) and color component values (l_k, a_k, b_k) of that superpixel, and the color difference and spatial distance between superpixel regions are calculated:

d_color(k, i) = √((l_k - l_i)² + (a_k - a_i)² + (b_k - b_i)²) (2)

d_space(k, i) = √((x_k - x_i)² + (y_k - y_i)²) (3)

where k and i are superpixel indices, (l_k, a_k, b_k, x_k, y_k) and (l_i, a_i, b_i, x_i, y_i) represent superpixels k and i respectively, d_color(k, i) is the color difference between superpixel k and superpixel i, and d_space(k, i) is the spatial distance between superpixel k and superpixel i;
according to HVS characteristics, if the color of a region differs greatly from that of its surrounding regions, the color contrast of that region is large, and if superpixels with high color contrast are adjacent or concentrated, that part is considered to attract human attention; the color contrast is therefore proportional to the Euclidean distance of the colors and inversely proportional to the Euclidean distance of the locations, and is calculated as follows:

c(k, i) = d_color(k, i) / (1 + c1·d_space(k, i)) (4)

c_k = Σ_{i≠k} c(k, i) (5)

where c_k is the total color contrast of superpixel k, c(k, i) is the color contrast between regions k and i, d_color(k, i) and d_space(k, i) have been normalized to [0, 1], and c1 is set to 3;
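A sketch of the region-level color contrast follows. Because Eqs. (4)-(5) survive only as image placeholders in the source, the pairwise fusion below, c(k, i) = d_color / (1 + c1·d_space) summed over all other regions, matches the reconstruction above, which is itself inferred from the stated properties (proportional to color distance, inversely proportional to spatial distance, distances normalized to [0, 1], c1 = 3).

```python
import numpy as np

def region_color_contrast(lab, labels, c1=3.0):
    """Mean LAB color and centroid per superpixel, then total contrast c_k."""
    K = labels.max() + 1
    ys, xs = np.mgrid[0:lab.shape[0], 0:lab.shape[1]]
    feats = np.zeros((K, 5))                     # (l, a, b, x, y) per region
    for k in range(K):
        m = labels == k
        feats[k] = [lab[..., 0][m].mean(), lab[..., 1][m].mean(),
                    lab[..., 2][m].mean(), xs[m].mean(), ys[m].mean()]
    d_col = np.linalg.norm(feats[:, None, :3] - feats[None, :, :3], axis=-1)  # Eq. (2)
    d_sp = np.linalg.norm(feats[:, None, 3:] - feats[None, :, 3:], axis=-1)   # Eq. (3)
    d_col /= max(d_col.max(), 1e-12)             # normalize to [0, 1]
    d_sp /= max(d_sp.max(), 1e-12)
    c = d_col / (1.0 + c1 * d_sp)                # assumed form of Eq. (4)
    np.fill_diagonal(c, 0.0)
    return c.sum(axis=1), feats                  # Eq. (5): total contrast c_k
```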
(2) region-based concave modulation model
According to the human eye concave feature, the human visual system is most sensitive to the area mapped to the human eye foveal area, and the sensitivity of the human eye decreases rapidly with increasing distance from the foveal area; selecting 5 areas with highest color contrast as central concave areas; the eccentricity is defined as follows:
e_i(x, y) = arctan(√((x - x_i)² + (y - y_i)²) / d) (6)

where i = 1, 2, …, 5, x_i and y_i are the spatial coordinates of the selected region, x and y are the spatial coordinates of other positions, and d is the observation distance;
the closer to the fovea, the smaller the eccentricity, and the more easily the human eye pays attention to, defining a concave modulation model:
f_fov = (1/m)·Σ_{i=1}^{m} f_i (7)

f_i = ln(λ·e_i) (8)

where m is the total number of selected foveal regions and has a value of 5, f_i is the modulation model of a single foveal region, e_i is the eccentricity, and λ is an adjustment parameter, here set to 1;
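A sketch of the foveal (concave) modulation follows. Eq. (6) is reconstructed above as an arctangent eccentricity and Eq. (7) as an average over the m = 5 foveae; both forms, and the viewing distance d, are assumptions rather than the patent's verbatim formulas.

```python
import numpy as np

def foveal_modulation(shape, fovea_xy, d=None, lam=1.0):
    """Pixel-wise concave modulation map from the m selected foveal centers."""
    if d is None:
        d = 3.0 * shape[0]                 # viewing distance in pixels (assumed)
    ys, xs = np.mgrid[0:shape[0], 0:shape[1]].astype(float)
    f = np.zeros(shape)
    for (fx, fy) in fovea_xy:              # the m = 5 highest-contrast regions
        ecc = np.arctan(np.hypot(xs - fx, ys - fy) / d)       # assumed Eq. (6)
        f += np.log(lam * np.maximum(ecc, 1e-6))              # Eq. (8)
    return f / len(fovea_xy)               # assumed Eq. (7): average over foveae
```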
(3) region-based roughness model
According to the relationship between the texture and the visual characteristic, the more complex the texture is, the stronger the masking effect on the distortion is; texture roughness is an important texture feature, and the masking capability of the region texture on noise is estimated by using the texture roughness.
Calculating roughness by using the optimal size of each pixel point in the target image according to Tamura texture roughness, wherein the optimal size refers to the window size corresponding to the maximum average gray difference between adjacent active windows; calculating the texture roughness of the area on the basis of Tamura texture roughness, wherein the calculation process is as follows:
first, the average gray value within a multi-size window centered at each pixel is calculated, with window size 2n × 2n; here the selection of the window size is refined:

A_n(x, y) = (1/(2n)²)·Σ_{i=x-n}^{x+n-1} Σ_{j=y-n}^{y+n-1} I(i, j) (9)

where (x, y) is the pixel coordinate and I(x, y) is the pixel value at pixel (x, y); n = 1, 2, …, 5, and this finer window division allows the texture variation near a pixel point to be estimated more accurately;
then, the average gray difference between two windows in the horizontal direction and the vertical direction of each pixel is calculated respectively,
E_n,h(x, y) = |A_n(x + n, y) - A_n(x - n, y)| (10)

E_n,v(x, y) = |A_n(x, y + n) - A_n(x, y - n)| (11)

where E_n,h and E_n,v represent the average gray differences in the horizontal and vertical directions respectively, and A_n represents the average gray value within the window;
then, selecting the window size corresponding to the maximum average gray scale difference value to calculate the roughness,
S(x, y) = 2n (12)

E_n = E_max = max(E_1, E_2, …, E_N) (13)

where n is the value corresponding to the maximum average gray difference, taken over both the horizontal and vertical directions;
finally, the roughness of the superpixel is calculated:
F_crs = (1/n_k)·Σ_{(x,y)∈k} S(x, y) (14)

where n_k is the total number of pixel points in the superpixel, and S(x, y) represents the optimal size corresponding to pixel point (x, y);
the Tamura roughness describes the average roughness of a superpixel region: the larger the optimal size corresponding to a pixel point, i.e., the larger the texture elements, the larger the roughness value; however, such regions have sparse texture structure, so their visual masking effect is weak; since the JND threshold reflects the maximum allowable distortion level, the final modulation model is inversely proportional to the roughness value; considering the influence of the average gray difference, the roughness modulation model is defined as follows:

F_k = σ·E_max / F_crs (15)

where E_max is the maximum average gray difference, F_crs is the roughness value, and σ is an adjustment parameter less than 1, related to the window size.
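The region roughness modulation of Eqs. (9)-(15) can be sketched as follows. The refined window sizes 2n × 2n for n = 1, …, 5 follow the text; σ is only described as an adjustment parameter below 1 related to the window size, so the default of 0.5 is an assumption, as is the use of the region mean of E_max in Eq. (15).

```python
import numpy as np
from scipy.ndimage import uniform_filter

def roughness_modulation(gray, labels, sigma=0.5, n_max=5):
    """Per-region roughness modulation F_k from Tamura-style optimal sizes."""
    E = []                                   # per-n directional gray differences
    for n in range(1, n_max + 1):
        A = uniform_filter(gray.astype(float), size=2 * n)     # Eq. (9)
        Eh = np.abs(np.roll(A, -n, 1) - np.roll(A, n, 1))      # Eq. (10)
        Ev = np.abs(np.roll(A, -n, 0) - np.roll(A, n, 0))      # Eq. (11)
        E.append(np.maximum(Eh, Ev))
    E = np.stack(E)                          # shape (n_max, H, W)
    n_best = E.argmax(axis=0) + 1            # n giving the largest difference
    S = 2.0 * n_best                         # Eq. (12): optimal size per pixel
    Emax = E.max(axis=0)                     # Eq. (13)
    F = np.zeros(labels.max() + 1)
    for k in range(labels.max() + 1):
        m = labels == k
        Fcrs = S[m].mean()                   # Eq. (14): mean optimal size
        F[k] = sigma * Emax[m].mean() / Fcrs # Eq. (15), region-mean E_max assumed
    return F
```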
Preferably, in step 3, the color contrast and the concave modulation model obtained in step 2 are used to establish a region weight model:
W_k = (c2 - c_k)·f_fov (16)

where c_k is the color contrast of the region, f_fov is the concave modulation model, and c2 is a constant parameter, here set to 1.75.
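Eq. (16) then reduces to a few lines; how the pixel-wise foveal map is reduced to a per-region value is not stated, so sampling f_fov at each region centroid is an assumption.

```python
import numpy as np

def region_weights(c_k, f_map, feats, c2=1.75):
    """Eq. (16): W_k = (c2 - c_k) * f_fov, with f_fov sampled at centroids."""
    xs = feats[:, 3].round().astype(int)   # centroid x from the color-contrast step
    ys = feats[:, 4].round().astype(int)   # centroid y
    f_fov = f_map[ys, xs]                  # per-region concave modulation
    return (c2 - c_k) * f_fov
```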
Preferably, in the step 4, texture roughness is used to improve the accuracy of the model, and is combined with the contrast masking model to establish a more accurate visual masking model:
CM′ = CM + F_k (17)

where CM′ is the contrast masking model with joint texture roughness, CM is the original contrast masking model, and F_k is the region-based roughness modulation model:
CM(x,y)=β·G(x,y)·W(x,y) (18)
where β is a control parameter, and takes a value of 0.117, G (x, y) is a maximum weighted average gradient value at the pixel coordinate (x, y), and W (x, y) represents an edge weighting factor at the pixel coordinate (x, y):
G(x, y) = max_{k=1,…,4} |grad_k(x, y)| (19)

grad_k(x, y) = (1/16)·Σ_{i=1}^{5} Σ_{j=1}^{5} I(x - 3 + i, y - 3 + j)·g_k(i, j) (20)
W(x,y)=L·h (21)
where g_k(i, j) are the four directional high-pass filters, L is the image after the Canny operator edge detector, and h is a k × k Gaussian low-pass filter.
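A sketch of the contrast masking model of Eqs. (17)-(21) follows. The four directional high-pass kernels g_k are not printed in the patent; the 5 × 5 kernels below are the ones standard in pixel-domain JND models (Yang et al.) and are used here as an assumption, as are the Canny settings and the Gaussian width standing in for the k × k low-pass filter h.

```python
import numpy as np
from scipy.ndimage import convolve, gaussian_filter
from skimage.feature import canny

g1 = np.array([[ 0, 0, 0, 0, 0],
               [ 1, 3, 8, 3, 1],
               [ 0, 0, 0, 0, 0],
               [-1,-3,-8,-3,-1],
               [ 0, 0, 0, 0, 0]], float)
g2 = np.array([[ 0, 0, 1, 0, 0],
               [ 0, 8, 3, 0, 0],
               [ 1, 3, 0,-3,-1],
               [ 0, 0,-3,-8, 0],
               [ 0, 0,-1, 0, 0]], float)
g3 = np.array([[ 0, 0, 1, 0, 0],
               [ 0, 0, 3, 8, 0],
               [-1,-3, 0, 3, 1],
               [ 0,-8,-3, 0, 0],
               [ 0, 0,-1, 0, 0]], float)
g4 = g1.T                                   # vertical counterpart of g1

def contrast_masking(gray, F_k, labels, beta=0.117):
    """CM' = CM + F_k (Eq. (17)); F_k broadcast per region (assumed)."""
    img = gray.astype(float)
    grads = [np.abs(convolve(img, g)) / 16.0 for g in (g1, g2, g3, g4)]
    G = np.maximum.reduce(grads)            # Eqs. (19)-(20): max directional gradient
    L = canny(img / 255.0).astype(float)    # Canny edge map
    W = gaussian_filter(L, sigma=1.0)       # Eq. (21): edge map smoothed by h
    return beta * G * W + F_k[labels]       # Eqs. (17)-(18)
```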
Preferably, in the above steps 3 and 4, a region weight model and a contrast masking model based on texture roughness are obtained respectively, used for estimating the visual importance of regions and improving the visual masking effect of the model, and the two are merged into the perceptual model to obtain a final region-based perceptual model:
R_JND = JND_NAMM·W_k (22)

where W_k is the region weight model and JND_NAMM is the JND model with joint texture roughness:

JND_NAMM = LA + CM′ - α·min{LA, CM′} (23)

LA(x, y) = 17·(1 - √(bg(x, y)/127)) + 3, if bg(x, y) ≤ 127
LA(x, y) = (3/128)·(bg(x, y) - 127) + 3, otherwise (24)

where LA is the background brightness masking model, CM′ is the contrast masking model with joint texture roughness, α is an adjustment parameter, taken as 0.3, and bg(x, y) is the average background luminance at pixel (x, y).
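Finally, the NAMM fusion of Eqs. (22)-(24) is a few lines. The piecewise luminance-adaptation threshold below is the classical Chou-Li form assumed for Eq. (24), and the 5 × 5 mean filter approximating the average background luminance bg(x, y) is likewise an assumption.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def luminance_adaptation(gray):
    """Assumed Chou-Li background luminance masking threshold, Eq. (24)."""
    bg = uniform_filter(gray.astype(float), size=5)    # average background luminance
    return np.where(bg <= 127,
                    17.0 * (1.0 - np.sqrt(bg / 127.0)) + 3.0,
                    3.0 / 128.0 * (bg - 127.0) + 3.0)

def superpixel_jnd(LA, CMp, W_k, labels, alpha=0.3):
    jnd_namm = LA + CMp - alpha * np.minimum(LA, CMp)  # Eq. (23): NAMM fusion
    return jnd_namm * W_k[labels]                      # Eq. (22): R_JND = JND_NAMM * W_k
```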
The method mainly considers the relationship between the color and texture features of different regions of the same image and the response of the human visual system to visual information. Pixel points within the same superpixel share the same or similar color, texture and other characteristics, so a superpixel can be regarded as a processing unit; since human attention and sensitivity differ between regions, the superpixel features are used to estimate the visual importance of each region. A region-level color contrast model is established from the averages of the spatial coordinates X, Y and the color components L, A, B of all pixel points in each superpixel; the regions with the highest contrast are then selected as foveal regions, and a region-level concave modulation model is established. Meanwhile, the method improves on Tamura texture roughness by refining the window size and adding the influence of the average gray difference, computes the region-based texture roughness, and combines it with the contrast masking model to improve accuracy. Further, the contrast masking model with joint texture roughness is nonlinearly superimposed with the brightness masking model and finally multiplied by the region weight model, fully accounting for the sensitivity and attention of the human eye to different regions and yielding the superpixel-based JND model.
Compared with the prior art, the invention has the following obvious and prominent substantive characteristics and remarkable advantages:
1. the method uses the super-pixels for establishing the JND model, fully considers the relation between the color and the textural features of the regional hierarchy and the visual characteristics of human eyes, and provides the JND model based on the super-pixels;
2. the method fully considers the relation between the regional characteristics and the attention of human eyes, not only calculates the color contrast between the regions, but also considers the fovea effect of the regional hierarchy, and the two jointly establish a regional weight model for estimating the visual importance degree of each region;
3. the method introduces a texture roughness modulation model of a region level, improves on the basis of Tamura texture roughness, further refines the selection of window sizes, considers the influence of average gray level difference between windows and is used for estimating the texture condition of each region;
the invention performs experiments on related test images, and the experimental result shows that the method has good accuracy and robustness.
Drawings
FIG. 1 is a block diagram of the process of the present invention.
Fig. 2 is a schematic diagram of a method of the superpixel-based JND model according to the present invention.
Fig. 3 is a superpixel segmentation diagram of the test image "BasketballDrill".
Fig. 4 is a schematic diagram of color contrast based on regions.
Fig. 5 shows the 5 regions with the highest contrast selected in the present invention.
FIG. 6 is a schematic diagram of a concave modulation centered on a single area and a general concave modulation diagram.
FIG. 7 is a schematic of texture roughness based on region.
FIG. 8 is a schematic diagram of a region weight model.
Fig. 9 is a comparison between noise-added and noise-not-added image areas in the original.
Fig. 10 is a comparison between models.
Fig. 11 is a comparison between image regions.
Detailed Description
Preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings:
the first embodiment is as follows:
referring to fig. 1, the method for super-pixel based just noticeable distortion comprises the following steps:
step 1, preprocessing an input image:
the method comprises the steps of super-pixel segmentation and color space conversion, wherein simple linear iterative clustering in a super-pixel segmentation algorithm is used for carrying out region segmentation on an input image, and super-pixels are taken as visual input units; simultaneously converting the input image into an LAB color space with uniform visual perception so as to facilitate subsequent operation processing;
step 2, calculating regional characteristics:
the method comprises the steps of including color features and texture features, for the color features, firstly calculating the average value of spatial coordinates X, Y and the average value of color components L, A, B of all pixels in a super-pixel region, then sequentially calculating the spatial distance and color difference with other super-pixels, and calculating the color contrast between the super-pixels; in addition, in consideration of the concave characteristic of human eyes, 5 areas with high color contrast are selected to establish a concave modulation model based on the areas; for texture features, further refining the selection of window sizes on the basis of Tamura texture roughness according to the masking effect of the texture roughness on human eyes, and meanwhile, considering the influence of average gray difference between windows to establish a texture roughness model based on regions;
step 3, establishing a region weight model:
according to the visual characteristics of human eyes, the attention degrees of the human eyes to different regions in the same image are different, in step 2, a region-based contrast model and a concave modulation model are established through the color components and the space coordinates of the regions, the attention degree of the regions is estimated, and a region weight model is established; the higher the contrast, the more easily the attention is paid; the closer to the foveal region, the more susceptible to attention;
step 4, contrast masking model with joint texture roughness:
in step 2, a region-based texture roughness model is established according to the visual masking effect of texture on human eyes; considering that the contrast masking model still needs further improvement, texture roughness is used to improve the accuracy of the model, and a contrast masking model with joint texture roughness is established;
step 5, a perception model based on the super pixels:
and on the basis of steps 3 and 4, the contrast masking model with joint texture roughness is fused with the brightness masking model, and the final perceptual model is obtained under the weighting of the region weight model.
The method introduces a region weight model and a region roughness modulation model, wherein the region weight model combines the color contrast and the concave modulation effect of a region level to estimate the visual importance degree of each region, and the region roughness modulation model refines the selection of the window size on the basis of Tamura texture roughness and considers the influence of the average gray difference between windows to estimate the texture condition of each region; the invention performs experiments on related test images, and the experimental result shows that the method has good accuracy and robustness.
Embodiment two:
The basic flowchart of the JND model based on superpixel and region features of this embodiment is shown in fig. 2. The method was implemented in a programming simulation under Windows 10 and MATLAB 2016. First, the input image is preprocessed, including superpixel segmentation and color space conversion. Then, the features of each region are calculated, including color and texture features: the color contrast is computed from the average pixel values and average coordinates of the superpixels, a concave modulation model is established according to the foveal characteristics of the human eye, and the two are combined into a region weight model. In addition, a region texture roughness model is established to strengthen the masking of distortion by region texture, and is combined with the contrast masking model. Finally, the contrast masking model with joint texture roughness is nonlinearly superimposed with the brightness masking model, and the result is multiplied by the region weight model to obtain the final new JND model, which accounts for the attention and sensitivity of the human eye to each region. This is the superpixel-based JND model of the invention.
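Putting the pieces together, a sketch of the full pipeline of fig. 2 might look as follows; the function names come from the sketches in the description above (they are this document's illustrations, not the patent's code), and the original implementation was in MATLAB rather than Python.

```python
import numpy as np
from skimage import data
from skimage.color import rgb2lab, rgb2gray
from skimage.segmentation import slic

img = data.astronaut()                                   # stand-in input image
gray = rgb2gray(img) * 255.0
labels = slic(img, n_segments=(img.shape[0] * img.shape[1]) // 400,
              compactness=10, start_label=0)             # step 1: SLIC regions
lab = rgb2lab(img)                                       # step 1: LAB conversion

c_k, feats = region_color_contrast(lab, labels)          # step 2: color contrast
top5 = np.argsort(c_k)[-5:]                              # 5 highest-contrast regions
f_map = foveal_modulation(gray.shape, feats[top5, 3:5])  # step 2: foveal map
W_k = region_weights(c_k, f_map, feats)                  # step 3: Eq. (16)
F_k = roughness_modulation(gray, labels)                 # step 2: Eq. (15)
CMp = contrast_masking(gray, F_k, labels)                # step 4: Eq. (17)
LA = luminance_adaptation(gray)                          # Eq. (24)
rjnd = superpixel_jnd(LA, CMp, W_k, labels)              # step 5: Eq. (22)
```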
The method of the embodiment specifically comprises the following steps:
step 1, preprocessing an input image:
the method comprises the steps of super-pixel segmentation and color space conversion, wherein a super-pixel segmentation algorithm SLIC is used for segmenting an image into K regions;
K=[w·h/n] (1)
wherein K is a positive integer, w and h are the width and height of the image, respectively, and n is the expected number of pixel points per superpixel; a superpixel is composed of adjacent pixel points with the same or similar color, texture and other characteristics and is taken as a visual input unit; a superpixel segmentation diagram is shown in fig. 3;
converting the input image into the LAB color space for subsequent operation processing, because the perceived uniform LAB color space is more consistent with human perception characteristics;
step 2, calculating regional characteristics:
the method comprises color features and texture features, according to HVS characteristics, due to the difference of the color, texture and other features of each region, the attention degree and sensitivity degree of human eyes to different regions are different, the attention degree to salient regions is higher, and the sensitivity to texture complex regions is lower; considering the relationship between the region characteristics and the human visual characteristics, the three links of region-based color contrast, region-based concave modulation effect and region-based texture roughness are mainly used:
1. region-based color contrast
For color features, the average of the spatial coordinates X, Y and the average of the color components L, A, B of all pixels are calculated for each superpixel region in turn as the spatial coordinates (x_k, y_k) and color component values (l_k, a_k, b_k) of that superpixel, and the color difference and spatial distance between superpixel regions are calculated:

d_color(k, i) = √((l_k - l_i)² + (a_k - a_i)² + (b_k - b_i)²) (2)

d_space(k, i) = √((x_k - x_i)² + (y_k - y_i)²) (3)

where k and i are superpixel indices, (l_k, a_k, b_k, x_k, y_k) and (l_i, a_i, b_i, x_i, y_i) represent superpixels k and i respectively, d_color(k, i) is the color difference between superpixel k and superpixel i, and d_space(k, i) is the spatial distance between superpixel k and superpixel i.
According to HVS characteristics, if the color of a region differs greatly from the color of its surrounding regions, the color contrast of that region is large, and if regions with high color contrast are adjacent or concentrated, that part is considered to attract more human attention. The color contrast is therefore proportional to the Euclidean distance of the colors and inversely proportional to the Euclidean distance of the locations, and is calculated as:

c(k, i) = d_color(k, i) / (1 + c1·d_space(k, i)) (4)

c_k = Σ_{i≠k} c(k, i) (5)

where c_k is the total color contrast of superpixel k, c(k, i) is the color contrast between regions k and i, d_color(k, i) and d_space(k, i) have been normalized to [0, 1], and c1 is set to 3. The resulting color contrast map is shown in fig. 4.
2. Region-based concave modulation model
According to the human eye's concave features, the human visual system is most sensitive to areas mapped to the human eye's fovea, and the sensitivity of the human eye decreases rapidly with increasing distance from the fovea. The 5 regions with the highest color contrast were chosen as the foveal region, as shown in fig. 5. The eccentricity is defined as follows:
e_i(x, y) = arctan(√((x - x_i)² + (y - y_i)²) / d) (6)

where i = 1, 2, …, 5, x_i and y_i are the spatial coordinates of the selected region, x and y are the spatial coordinates of other positions, and d is the observation distance.
The closer to the fovea, the smaller the eccentricity, and the more easily the human eye pays attention to, defining a concave modulation model:
f_fov = (1/m)·Σ_{i=1}^{m} f_i (7)

f_i = ln(λ·e_i) (8)

where m is the total number of selected foveal regions, f_i is the modulation model of a single foveal region, e_i is the eccentricity, and λ is an adjustment parameter, here set to 1. Fig. 6 shows a schematic diagram of concave modulation with one region as the central concave region, together with the overall concave modulation map.
3. Region-based roughness model
According to the relationship between the texture and the visual characteristics, the more complex the texture is, the stronger the masking effect on the distortion is. Texture roughness is an important texture feature, and the masking capability of the region texture on noise is estimated by using the texture roughness.
Following Tamura texture roughness, the roughness is calculated using the optimal size at each pixel point of the target image, where the optimal size is the window size corresponding to the maximum average gray difference between adjacent sliding windows. It is worth noting that the larger the optimal size at a pixel point, the sparser the texture near that pixel point. The texture roughness of the regions is calculated here on the basis of Tamura texture roughness, as follows:
first, the average gray value within a multi-size window centered at each pixel is calculated, with window size 2n × 2n; here the selection of the window size is refined:

A_n(x, y) = (1/(2n)²)·Σ_{i=x-n}^{x+n-1} Σ_{j=y-n}^{y+n-1} I(i, j) (9)

where (x, y) is the pixel coordinate and I(x, y) is the pixel value at pixel (x, y); n = 1, 2, …, 5, and this finer window division allows the texture variation near a pixel point to be estimated more accurately.
Then, the average gray difference between two windows in the horizontal direction and the vertical direction of each pixel is calculated respectively,
E_n,h(x, y) = |A_n(x + n, y) - A_n(x - n, y)| (10)

E_n,v(x, y) = |A_n(x, y + n) - A_n(x, y - n)| (11)

where E_n,h and E_n,v represent the average gray differences in the horizontal and vertical directions respectively, and A_n represents the average gray value within the window;
then, selecting the window size corresponding to the maximum average gray scale difference value to calculate the roughness,
S(x, y) = 2n (12)

E_n = E_max = max(E_1, E_2, …, E_N) (13)

where n is the value corresponding to the maximum average gray difference, taken over both the horizontal and vertical directions.
Finally, the roughness of the superpixel is calculated.
F_crs = (1/n_k)·Σ_{(x,y)∈k} S(x, y) (14)

where n_k is the total number of pixel points in the superpixel, and S(x, y) represents the optimal size corresponding to pixel point (x, y).
The larger the optimal window size, the sparser the texture near the pixel point, i.e., the larger the output value of S, the visually smoother the region. Considering that the JND threshold reflects the maximum allowable distortion level, and that the roughness value of a superpixel is proportional to the optimal window size, the final modulation model should be inversely proportional to the roughness value. In addition, considering the influence of the average gray difference, the roughness modulation model is defined as follows:

F_k = σ·E_max / F_crs (15)

where E_max is the maximum average gray difference, F_crs is the roughness value, and σ is an adjustment parameter less than 1, related to the window size. Fig. 7 shows the resulting region texture roughness map.
Step 3, regional weight model
According to the difference in human attention and sensitivity to different regions, i.e., the different visual importance of regions, a region weight model is established using the color contrast and concave modulation model obtained in step 2.
W_k = (c2 - c_k)·f_fov (16)

where c_k is the color contrast of the region, f_fov is the concave modulation model, and c2 is a constant parameter, set here to 1.75. The larger the value of c_k, the higher the color contrast and the more human attention the region attracts; the larger the value of f_fov, the farther the region is from the foveal region and the lower the attention. Human attention is inversely proportional to the size of the JND threshold: the more easily a region attracts attention, the lower its tolerance to distortion. The weight value is therefore inversely proportional to the color contrast output and directly proportional to the concave modulation output, i.e., the more visually important a region, the lower its JND threshold and the lower its weight value should be, as shown in fig. 8, in keeping with human visual characteristics.
Step 4, combining the contrast masking model of the texture roughness
According to the visual characteristics of the human eye, texture has a visual masking effect, and regions with greater texture roughness can tolerate more distortion. As shown in fig. 9, the texture region in fig. 9(b) is more tolerant to distortion than those in fig. 9(c) and 9(d). Texture roughness is used to improve the accuracy of the model and is combined with the contrast masking model to establish a more accurate visual masking model:
CM′ = CM + F_k (17)

where CM′ is the contrast masking model with joint texture roughness, CM is the original contrast masking model, and F_k is the region-based roughness model.
CM(x,y)=β·G(x,y)·W(x,y) (18)
Where β is a control parameter, and takes a value of 0.117, G (x, y) is the maximum weighted average gradient value at the pixel coordinate (x, y), and W (x, y) represents the edge weighting factor at the pixel coordinate (x, y).
G(x, y) = max_{k=1,…,4} |grad_k(x, y)| (19)

grad_k(x, y) = (1/16)·Σ_{i=1}^{5} Σ_{j=1}^{5} I(x - 3 + i, y - 3 + j)·g_k(i, j) (20)
W(x,y)=L·h (21)
where g_k(i, j) are the four directional high-pass filters; L is the image after the Canny operator edge detector, and h is a k × k Gaussian low-pass filter.
Step 5, perception model based on region
In the above steps 3 and 4, a region weight model and a contrast masking model based on texture roughness were obtained, used respectively for estimating the visual importance of regions and improving the visual masking effect of the model; the two are merged into the perceptual model to obtain the final region-based perceptual model.
R_JND = JND_NAMM·W_k (22)

where W_k is the region weight model and JND_NAMM is the JND model based on texture roughness:

JND_NAMM = LA + CM′ - α·min{LA, CM′} (23)

LA(x, y) = 17·(1 - √(bg(x, y)/127)) + 3, if bg(x, y) ≤ 127
LA(x, y) = (3/128)·(bg(x, y) - 127) + 3, otherwise (24)

where LA is the background brightness masking model, CM′ is the contrast masking model with joint texture roughness, α is an adjustment parameter, taken as 0.3, and bg(x, y) is the average background luminance at pixel (x, y).
The following experiment evaluates the superpixel-based JND model proposed by the invention on test images. The experimental environment is the MATLAB 2016 platform under the Windows 10 operating system, with 16 GB of memory and an Intel(R) Core(TM) i7-8700 CPU. Two indexes are used to evaluate the performance of the JND model: Peak Signal-to-Noise Ratio (PSNR) and Mean Opinion Score (MOS). PSNR measures the objective distortion of an image, while MOS measures its subjective perceptual quality; 20 trained observers were invited to score the images, with scores from 0 to 5 representing subjective quality from poor to good. A higher PSNR indicates less distortion, and a higher MOS indicates better subjective quality; at the same subjective quality, a lower PSNR indicates better model performance. Three different JND models (Liu, Wu's 2013, Wu's 2017) were selected as comparison models. Fig. 10 compares the subjective quality and PSNR of output images after processing test images with the JND models. Fig. 11 compares different areas in the output images.
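The evaluation protocol can be illustrated with the common JND noise-injection test (an assumption; the patent does not print its injection rule): noise shaped by the JND map is added with random sign and PSNR is computed, so at equal MOS a lower PSNR means the model hides more noise, i.e., finds more visual redundancy.

```python
import numpy as np

def inject_and_psnr(img, jnd, seed=0):
    """Add JND-shaped noise with random sign, return the noisy image and PSNR."""
    rng = np.random.default_rng(seed)
    sign = rng.choice([-1.0, 1.0], size=img.shape)
    noisy = np.clip(img.astype(float) + sign * jnd, 0.0, 255.0)
    mse = np.mean((noisy - img.astype(float)) ** 2)
    return noisy, 10.0 * np.log10(255.0 ** 2 / max(mse, 1e-12))
```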
TABLE 1 comparison of model Performance
[Table 1 is rendered as images in the source; its PSNR and MOS entries for the proposed model and the three reference models on each test image are not recoverable from this extraction.]
In Table 1, the averaged experimental results are bolded in red, and entries where a model performs better on individual test images are bolded in blue. The averages in the last column show that, compared with the selected reference models, the invention effectively reduces PSNR and removes as much visual redundancy as possible while maintaining subjective quality. The experiments show that the method offers good robustness and accuracy in image visual redundancy estimation; it uses superpixel and region features in JND threshold estimation, is well supported by biological principles, and is well suited to estimating the perceptual redundancy of images.
The embodiments of the present invention have been described with reference to the accompanying drawings, but the present invention is not limited to the embodiments, and various changes and modifications can be made according to the purpose of the invention, and any changes, modifications, substitutions, combinations or simplifications made according to the spirit and principle of the technical solution of the present invention shall be equivalent substitutions, as long as the purpose of the present invention is met, and the present invention shall fall within the protection scope of the present invention without departing from the technical principle and inventive concept of the present invention.

Claims (6)

1. A method for super-pixel based just noticeable distortion, comprising the steps of:
step 1, preprocessing an input image:
the method comprises the steps of super-pixel segmentation and color space conversion, wherein simple linear iterative clustering in a super-pixel segmentation algorithm is used for carrying out region segmentation on an input image, and super-pixels are taken as visual input units; simultaneously converting the input image into an LAB color space with uniform visual perception so as to facilitate subsequent operation processing;
step 2, calculating regional characteristics:
the method comprises the steps of including color features and texture features, for the color features, firstly calculating the average value of spatial coordinates X, Y and the average value of color components L, A, B of all pixels in a super-pixel region, then sequentially calculating the spatial distance and color difference with other super-pixels, and calculating the color contrast between the super-pixels; in addition, in consideration of the concave characteristic of human eyes, 5 areas with high color contrast are selected to establish a concave modulation model based on the areas; for texture features, further refining the selection of window sizes on the basis of Tamura texture roughness according to the masking effect of the texture roughness on human eyes, and meanwhile, considering the influence of average gray difference between windows to establish a texture roughness model based on regions;
step 3, establishing a region weight model:
according to the visual characteristics of human eyes, the attention degrees of the human eyes to different regions in the same image are different, in step 2, a region-based contrast model and a concave modulation model are established through the color components and the space coordinates of the regions, the attention degree of the regions is estimated, and a region weight model is established; the higher the contrast, the more easily it is focused on; the closer to the foveal region, the more susceptible to attention;
step 4, contrast masking model with joint texture roughness:
in step 2, a region-based texture roughness model is established according to the visual masking effect of texture on human eyes; considering that the contrast masking model still needs further improvement, texture roughness is used to improve the accuracy of the model, and a contrast masking model with joint texture roughness is established;
step 5, a perception model based on the super pixels:
and on the basis of steps 3 and 4, the contrast masking model with joint texture roughness is fused with the brightness masking model, and the final perceptual model is obtained under the weighting of the region weight model.
2. The method of claim 1, wherein in step 1, the image is divided into K regions using a superpixel segmentation algorithm SLIC;
K=[w·h/n] (1)
wherein K is a positive integer, w and h are the width and height of the image, respectively, and n is the expected number of pixel points per superpixel; a superpixel is composed of adjacent pixel points with the same or similar color, texture and other characteristics, and can be regarded as a visual input unit.
3. The method of claim 1, wherein in step 2, considering the relationship between region features and human visual characteristics, three components are involved: region-based color contrast, the region-based concave modulation effect, and region-based texture roughness:
(1) region-based color contrast:
for color features, the average of the spatial coordinates X, Y and the average of the color components L, A, B of all pixels are calculated for each superpixel region in turn as the spatial coordinates (x_k, y_k) and color component values (l_k, a_k, b_k) of that superpixel, and the color difference and spatial distance between superpixel regions are calculated:

d_color(k, i) = √((l_k - l_i)² + (a_k - a_i)² + (b_k - b_i)²) (2)

d_space(k, i) = √((x_k - x_i)² + (y_k - y_i)²) (3)

where k and i are superpixel indices, (l_k, a_k, b_k, x_k, y_k) and (l_i, a_i, b_i, x_i, y_i) represent superpixels k and i respectively, d_color(k, i) is the color difference between superpixel k and superpixel i, and d_space(k, i) is the spatial distance between superpixel k and superpixel i;
according to HVS characteristics, if the color of a region differs greatly from the color of its surrounding regions, the color contrast of that region is large, and if regions with high color contrast are adjacent or concentrated, that region is considered to attract human attention; the color contrast is therefore proportional to the Euclidean distance of the colors and inversely proportional to the Euclidean distance of the locations, and is calculated as follows:

c(k, i) = d_color(k, i) / (1 + c1·d_space(k, i)) (4)

c_k = Σ_{i≠k} c(k, i) (5)

where c_k is the total color contrast of superpixel k, c(k, i) is the color contrast between regions k and i, d_color(k, i) and d_space(k, i) have been normalized to [0, 1], and c1 is set to 3;
(2) region-based concave modulation model
According to the human eye concave feature, the human visual system is most sensitive to the area mapped to the human eye foveal area, and the sensitivity of the human eye decreases rapidly with increasing distance from the foveal area; selecting 5 areas with highest color contrast as central concave areas; the eccentricity is defined as follows:
e_i(x, y) = arctan(√((x - x_i)² + (y - y_i)²) / d) (6)

where i = 1, 2, …, 5, x_i and y_i are the spatial coordinates of the selected region, x and y are the spatial coordinates of other positions, and d is the observation distance;
the closer to the fovea, the smaller the eccentricity, and the more easily the human eye pays attention to, defining a concave modulation model:
f_fov = (1/m)·Σ_{i=1}^{m} f_i (7)

f_i = ln(λ·e_i) (8)

where m is the total number of selected foveal regions, f_i is the modulation model of a single foveal region, e_i is the eccentricity, and λ is an adjustment parameter, here set to 1;
(3) region-based roughness model
According to the relationship between the texture and the visual characteristic, the more complex the texture is, the stronger the masking effect on the distortion is; texture roughness is an important texture feature, and the masking capability of regional textures on noise is estimated by the texture roughness;
calculating roughness using the optimal size at each pixel point of the target image according to Tamura texture roughness, wherein the optimal size is the window size corresponding to the maximum average gray difference between adjacent sliding windows; the texture roughness of the regions is calculated on the basis of Tamura texture roughness by the following procedure:
first, the average gray value within a multi-size window centered at each pixel is calculated, with window size 2n × 2n; here the selection of the window size is refined:

A_n(x, y) = (1/(2n)²)·Σ_{i=x-n}^{x+n-1} Σ_{j=y-n}^{y+n-1} I(i, j) (9)

where (x, y) is the pixel coordinate and I(x, y) is the pixel value at pixel (x, y); n = 1, 2, …, 5, and this finer window division allows the texture variation near a pixel point to be estimated more accurately;
then, the average gray difference between two windows in the horizontal direction and the vertical direction of each pixel is calculated respectively,
E_n,h(x, y) = |A_n(x + n, y) - A_n(x - n, y)| (10)

E_n,v(x, y) = |A_n(x, y + n) - A_n(x, y - n)| (11)

where E_n,h and E_n,v represent the average gray differences in the horizontal and vertical directions respectively, and A_n represents the average gray value within the window;
then, selecting the window size corresponding to the maximum average gray scale difference value to calculate the roughness,
S(x, y) = 2n (12)

E_n = E_max = max(E_1, E_2, …, E_N) (13)

where n is the value corresponding to the maximum average gray difference, taken over both the horizontal and vertical directions;
finally, the roughness of the superpixel is calculated:
F_crs = (1/n_k)·Σ_{(x,y)∈k} S(x, y) (14)

where n_k is the total number of pixel points in the superpixel, and S(x, y) represents the optimal size corresponding to pixel point (x, y);
the texture structure of the region with larger roughness value is sparse, and the visual masking effect is weaker; since the JND threshold reflects the maximum allowable distortion level, the final modulation model is inversely proportional to the roughness value; considering the influence of the average gray difference, a roughness modulation model is defined as follows:
F_k = σ·E_max / F_crs (15)

where E_max is the maximum average gray difference, F_crs is the roughness value, and σ is an adjustment parameter less than 1, related to the window size.
4. The method of claim 1, wherein in step 3, the color contrast and the concave modulation model obtained in step 2 are used to establish a region weight model:
W_k = (c2 - c_k)·f_fov (16)

where c_k is the color contrast of the region, f_fov is the concave modulation model, and c2 is a constant parameter, here set to 1.75.
5. The method of claim 1, wherein in step 4, texture roughness is used to improve model accuracy, and is combined with the contrast masking model to build a more accurate visual masking model:
CM′ = CM + F_k (17)

where CM′ is the contrast masking model with joint texture roughness, CM is the original contrast masking model, and F_k is the region-based roughness model:
CM(x,y)=β·G(x,y)·W(x,y) (18)
where β is a control parameter, and takes a value of 0.117, G (x, y) is a maximum weighted average gradient value at the pixel coordinate (x, y), and W (x, y) represents an edge weighting factor at the pixel coordinate (x, y):
G(x, y) = max_{k=1,…,4} |grad_k(x, y)| (19)

grad_k(x, y) = (1/16)·Σ_{i=1}^{5} Σ_{j=1}^{5} I(x - 3 + i, y - 3 + j)·g_k(i, j) (20)
W(x,y)=L·h (21)
where g_k(i, j) are the four directional high-pass filters; L is the image after the Canny operator edge detector, and h is a k × k Gaussian low-pass filter.
6. The method of claim 1, wherein in steps 3 and 4, a region weight model and a contrast masking model based on texture roughness are obtained respectively, for estimating the visual importance of regions and enhancing the visual masking effect of the model, and the two are merged into the perceptual model to obtain a final region-based perceptual model:
R_JND = JND_NAMM·W_k (22)

where W_k is the region weight model and JND_NAMM is the JND model based on texture roughness:

JND_NAMM = LA + CM′ - α·min{LA, CM′} (23)

LA(x, y) = 17·(1 - √(bg(x, y)/127)) + 3, if bg(x, y) ≤ 127
LA(x, y) = (3/128)·(bg(x, y) - 127) + 3, otherwise (24)

where LA is the background brightness masking model, CM′ is the contrast masking model with joint texture roughness, α is an adjustment parameter, taken as 0.3, and bg(x, y) is the average background luminance at pixel (x, y).
CN202011188873.1A 2020-10-30 2020-10-30 Super-pixel-based just noticeable distortion method Active CN112634278B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011188873.1A CN112634278B (en) 2020-10-30 2020-10-30 Super-pixel-based just noticeable distortion method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011188873.1A CN112634278B (en) 2020-10-30 2020-10-30 Super-pixel-based just noticeable distortion method

Publications (2)

Publication Number Publication Date
CN112634278A CN112634278A (en) 2021-04-09
CN112634278B 2022-06-14

Family

ID=75303188

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011188873.1A Active CN112634278B (en) 2020-10-30 2020-10-30 Super-pixel-based just noticeable distortion method

Country Status (1)

Country Link
CN (1) CN112634278B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114298922A (en) * 2021-12-10 2022-04-08 华为技术有限公司 Image processing method and device and electronic equipment

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102724525A (en) * 2012-06-01 2012-10-10 宁波大学 Depth video coding method on basis of foveal JND (just noticeable distortion) model
CN103514580A (en) * 2013-09-26 2014-01-15 香港应用科技研究院有限公司 Method and system used for obtaining super-resolution images with optimized visual experience
CN103607589A (en) * 2013-11-14 2014-02-26 同济大学 Level selection visual attention mechanism-based image JND threshold calculating method in pixel domain
CN104992419A (en) * 2015-07-08 2015-10-21 北京大学深圳研究生院 Super pixel Gaussian filtering pre-processing method based on JND factor
US10356404B1 (en) * 2017-09-28 2019-07-16 Amazon Technologies, Inc. Image processing using just-noticeable-difference thresholds
CN108521572A (en) * 2018-03-22 2018-09-11 四川大学 A kind of residual filtering method based on pixel domain JND model
CN109525847A (en) * 2018-11-13 2019-03-26 华侨大学 A kind of just discernable distortion model threshold value calculation method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Incorporating Texture into SLIC Super-pixels Method for High Spatial Resolution Remote Sensing Image Segmentation; Lizhen Lu, Chuang Wang; IEEE; 2019-12-30; full text *
Binocular just noticeable coding distortion model based on the fovea (基于中心凹的双目恰可察觉编码失真模型); Yang Jiahui, Yu Mei, Xu Shengyang, Jiang Gangyi; Journal of Optoelectronics·Laser (《光电子·激光》); 2017-09-30; full text *

Also Published As

Publication number Publication date
CN112634278A (en) 2021-04-09


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant