CN103002280B - Distributed decoding method based on HVS&ROI and system - Google Patents

Distributed decoding method based on HVS&ROI and system

Info

Publication number
CN103002280B
CN103002280B (application CN201210377970.4A / CN201210377970A)
Authority
CN
China
Prior art keywords
block
roi
frame
value
macro block
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201210377970.4A
Other languages
Chinese (zh)
Other versions
CN103002280A (en)
Inventor
丁恩杰
黄河
袁莎莎
仲亚丽
徐卫东
向洁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China University of Mining and Technology CUMT
Original Assignee
China University of Mining and Technology CUMT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China University of Mining and Technology CUMT filed Critical China University of Mining and Technology CUMT
Priority to CN201210377970.4A priority Critical patent/CN103002280B/en
Publication of CN103002280A publication Critical patent/CN103002280A/en
Application granted granted Critical
Publication of CN103002280B publication Critical patent/CN103002280B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical


Abstract

A distributed coding method and system based on HVS&ROI, belonging to the field of distributed coding methods and systems. The system consists of an encoder and a decoder. The encoder uses the JND model of HVS to divide each Wyner-Ziv frame of the input video into C macro blocks and AC macro blocks, extracts the ROI from the AC macro blocks, and independently codes the K frames and each macro block of the WZ frames; the decoder performs joint decoding on the received bit streams. Because the invention introduces the JND model of the human visual system HVS, the processing and transmission of data in regions below the JND threshold are reduced, which lowers the transmission rate and saves energy and bandwidth; because the invention extracts an ROI region of interest from glare regions with intense motion and high luminance contrast and entropy-codes it, the subjective quality is improved as far as possible without increasing encoder complexity.

Description

Distributed decoding method based on HVS&ROI and system
Technical field
The present invention relates to a distributed coding/decoding method and system, and in particular to a distributed decoding method and system based on HVS&ROI.
Background technology
The development of sensor technology, underground communication technology, and Internet of Things technology provides a technical basis for improving the performance, safety, and reliability of underground coal mine monitoring systems. Supported by the National Natural Science Foundation project "Basic theoretical research on mine multimedia disaster communication systems based on long-distance WMSN", the present work intends to use a wireless multimedia sensor network (Wireless Multimedia Sensor Network, WMSN), which is self-organizing, flexible to deploy, and highly mobile, to collect and transmit monitoring images after a mine disaster.
In a WMSN, the video sensing nodes are constrained in computing capability and energy and place two basic demands on the encoder: (1) limited by node performance and energy, the encoder should have low complexity and low power consumption; (2) limited by rate and bandwidth, the encoder should achieve a high compression ratio. In traditional multimedia video compression and coding standards, the complexity of the encoder is usually 5-10 times that of the decoder, which is not suitable for WMSN. Distributed video coding (Distributed Video Coding, DVC), in contrast, has the following features: (1) encoder complexity is relatively low while decoder complexity is relatively high; (2) it is robust to wireless channels prone to bit errors and packet loss; (3) the compression ratio is high and multilevel coding is easy to realize. By combining intraframe coding with interframe decoding, DVC shifts complexity from the encoder to the decoder. It is therefore well suited to wireless video applications with strict requirements on encoder complexity and power, such as mobile videophones and wireless camera surveillance.
There is no natural light underground in a coal mine; only artificial lighting is available. The underground environment is complex and lighting cannot be installed everywhere, so most underground regions are dim and some tunnels have no light at all. However, direct irradiation by miners' lamps, or their reflection from reflective objects, produces extreme spatial or temporal brightness contrast within the field of view, causing glare that is visually uncomfortable and reduces object visibility. Most existing DVC system models perform Wyner-Ziv and intraframe coding independently at the encoder and joint decoding with side information over a virtual correlation channel at the decoder, and they compress only by exploiting the spatial and temporal redundancy of the video and its transform-domain characteristics, without considering the specific shooting scene and visual effects of the video.
Summary of the invention
The purpose of the present invention is to address the shortcomings of the above techniques and to provide a distributed decoding method and system based on HVS&ROI for the dim, low-illumination environment of an underground coal mine with poor lighting, which encodes only the macro blocks above the JND threshold, effectively improves the subjective quality of the video as far as possible without increasing encoder complexity, reduces the transmission rate, and saves energy and bandwidth.
To achieve the above object, the distributed decoding method based on HVS&ROI of the present invention includes coding and decoding. The steps of the coding method are as follows:
A. The input video frames are divided into key frames and Wyner-Ziv frames; the key frames, referred to as K frames, are intra-coded by the H.264 intra encoder, and the Wyner-Ziv frames undergo HVS&ROI Wyner-Ziv coding;
B. The HVS&ROI Wyner-Ziv coding method comprises: dividing the Wyner-Ziv frame into C macro blocks and AC macro blocks by means of the JND model of HVS;
The JND model method includes:
B1. The Wyner-Ziv frame is divided into 8 × 8 blocks, each of which is a B block;
B2. The JND threshold of each pixel in a B block is calculated according to the following formula:
$JND(x, y, t) = f\big(\mathrm{idl}(x, y, t)\big) \cdot JND_s(x, y)$
where idl(x, y, t) is the average inter-frame luminance error between the adjacent instants t and t−1, and $JND_s(x, y)$ is the JND threshold in the spatial domain;
$\mathrm{idl}(x, y, t) = \tfrac{1}{2}\big( I(x, y, t) - I(x, y, t-1) + \bar{I}(x, y, t) - \bar{I}(x, y, t-1) \big)$
$JND_s(x, y) = T_l(x, y) + T_t^Y(x, y) - C^Y \times \min\{\, T_l(x, y),\ T_t^Y(x, y) \,\}$
where $T_l(x, y)$ is the background-luminance adaptation function, $T_t^Y(x, y)$ is the texture-masking function, reflecting that HVS is more sensitive in smooth regions than in regions of dense texture, (x, y) is the coordinate of the pixel in the image, and $C^Y$ is the correlation coefficient between the two influencing factors, background luminance and texture masking;
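As an illustration of steps B1-B2, the sketch below computes a per-pixel JND threshold with numpy. The patent specifies the spatial term $JND_s$ and the temporal weighting $f(\cdot)$ only qualitatively, so the weighting curve and the use of the frame mean for the $\bar{I}$ terms are assumptions made for this sketch, not the patent's exact model.

```python
import numpy as np

def temporal_weight(idl):
    # Placeholder for f(idl): a larger average inter-frame luminance error raises
    # the visibility threshold. The exact curve is an assumption for illustration.
    return 1.0 + 0.8 * (1.0 - np.exp(-np.abs(idl) / 32.0))

def jnd_threshold(cur, prev, jnd_spatial):
    """Per-pixel JND(x, y, t) = f(idl(x, y, t)) * JND_s(x, y) (steps B1-B2)."""
    cur = cur.astype(np.float64)
    prev = prev.astype(np.float64)
    # idl = 1/2 * ((I_t - I_{t-1}) + (mean(I_t) - mean(I_{t-1})));
    # the bar terms are taken here as frame-average luminance (an assumption).
    idl = 0.5 * ((cur - prev) + (cur.mean() - prev.mean()))
    return temporal_weight(idl) * np.asarray(jnd_spatial, dtype=np.float64)
```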
B3. The forward, backward, and average pixel SADs of the current block B are calculated according to the following formula:
$SAD = \sum_{x=1}^{M} \sum_{y=1}^{N} \big| w(x, y) - r(x, y) \big|$
where w(x, y) and r(x, y) are the pixel values at coordinate (x, y) of the block to be encoded in the WZ frame and of the reference block, respectively;
B4. The encoder-side side information ESI is obtained by the following formula:
$ESI = \mathrm{MinSAD}\{\, FB,\ BB,\ AB \,\}$
where FB is the corresponding block of the forward adjacent frame (FB is short for Forward Block), BB is the corresponding block of the backward adjacent frame (BB is short for Backward Block), and AB is the corresponding average block of the forward and backward adjacent frames (AB is short for Average Block);
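A minimal sketch of the encoder-side side information selection in steps B3-B4: block by block, whichever of the forward block (FB), backward block (BB), or their average (AB) has the smallest SAD against the block being encoded is taken as the ESI. The 8 × 8 block shape and the candidate set come from the text above; everything else is illustrative.

```python
import numpy as np

def block_sad(a, b):
    """Sum of absolute differences between two equally sized blocks (step B3)."""
    return np.abs(a.astype(np.float64) - b.astype(np.float64)).sum()

def esi_block(wz_block, fwd_block, bwd_block):
    """Step B4: ESI = the candidate among {FB, BB, AB} with minimum SAD to the WZ block."""
    avg_block = 0.5 * (fwd_block.astype(np.float64) + bwd_block.astype(np.float64))
    candidates = {"FB": fwd_block, "BB": bwd_block, "AB": avg_block}
    best = min(candidates, key=lambda name: block_sad(wz_block, candidates[name]))
    return np.asarray(candidates[best], dtype=np.float64), best
```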
B5. The $SAD_B$ of each block with respect to the adjacent frames is calculated, and its maximum $SAD_{\max}$ and minimum $SAD_{\min}$ are found; taking a candidate threshold $t \in [SAD_{\min}, SAD_{\max}]$, the pixel ratios $\rho_0$, $\rho_1$ and the average gradient values $m_0$, $m_1$ of the blocks satisfying $t < SAD_B$ and $t > SAD_B$ are obtained respectively, so that the overall mean gradient value is $m = \rho_0 m_0 + \rho_1 m_1$;
B6. The variance of foreground and background is calculated as:
$v = \rho_0 \rho_1 (m_0 - m_1)^2$
B7. $v$ is traversed over the interval $[SAD_{\min}, SAD_{\max}]$ and its maximum $v_{\max}$ is found; the value of $t$ corresponding to $v_{\max}$ is the value at which the foreground-background variance is largest, and this value of $t$ is the threshold $T_{ROI}$;
B8. A foreground image is preliminarily extracted according to the obtained $T_{ROI}$: by comparing $SAD_B$ with $T_{ROI}$, the parts whose value exceeds $T_{ROI}$ are taken out as the foreground image, and steps B5 to B8 are then performed once more on this foreground to obtain a more accurate second-level $T_{ROI}$;
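Steps B5-B8 amount to an Otsu-style threshold search: a candidate threshold t is swept over the range of block SAD values and the between-class variance $v = \rho_0 \rho_1 (m_0 - m_1)^2$ is maximized, then the sweep is repeated on the retained foreground blocks. A sketch under the assumption that the per-block $SAD_B$ values have already been computed:

```python
import numpy as np

def roi_threshold(sad_values, steps=256):
    """Steps B5-B7: sweep t over [min(SAD_B), max(SAD_B)] and maximize
    v = rho0 * rho1 * (m0 - m1)^2 between the classes SAD_B > t and SAD_B <= t."""
    sad = np.asarray(sad_values, dtype=np.float64)
    best_t, best_v = float(sad.min()), -1.0
    for t in np.linspace(sad.min(), sad.max(), steps):
        hi, lo = sad[sad > t], sad[sad <= t]   # the "t < SAD_B" class and its complement
        if hi.size == 0 or lo.size == 0:
            continue
        rho0, rho1 = hi.size / sad.size, lo.size / sad.size
        v = rho0 * rho1 * (hi.mean() - lo.mean()) ** 2
        if v > best_v:
            best_v, best_t = v, float(t)
    return best_t

def refined_roi_threshold(sad_values):
    """Step B8: repeat the sweep on the blocks kept as foreground by the first threshold."""
    t1 = roi_threshold(sad_values)
    foreground = [s for s in sad_values if s > t1]
    return roi_threshold(foreground) if len(foreground) > 1 else t1
```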
B9. The predicted distortion of each point is calculated from the distortion function D(x, y) of the ESI; the predicted distortion value of each point is obtained by the following formula:
$D(x, y) = \big| ESI(x, y) - I(x, y) \big|$
where ESI(x, y) and I(x, y) are the ESI value and the pixel value at coordinate (x, y) of the WZ frame, respectively;
B10. Block B is subjected to JND discrimination and divided into AC blocks and C blocks: a macro block whose distortion values are less than or equal to the JND threshold is classified as a C block, and a macro block whose distortion values exceed the JND threshold is classified as an AC block; the binary mask Block_mask of the block division of this WZ frame image is thus obtained;
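The block-level decision of steps B9-B10 compares the predicted distortion against the JND threshold pixel by pixel. The patent does not spell out how per-pixel comparisons are aggregated into a single label per 8 × 8 block, so the sketch below assumes a block becomes an AC block when more than a fraction ε of its pixels exceed their JND threshold (the claims mention ε = 0.1 and the pixel count Num, which motivates this assumption):

```python
import numpy as np

def classify_block(wz_block, esi_pred, jnd_block, eps=0.1):
    """Steps B9-B10: label one 8x8 block as a Copy ('C') or Actual Coding ('AC') block.

    Assumption: the block is an AC block when more than eps of its pixels have
    predicted distortion D(x, y) = |ESI - I| above their JND threshold.
    """
    distortion = np.abs(esi_pred.astype(np.float64) - wz_block.astype(np.float64))
    over = np.count_nonzero(distortion > jnd_block)
    return "AC" if over / distortion.size > eps else "C"
```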
B11. The AC blocks undergo DCT and $2^M$-level uniform quantization; the coefficient bands are extracted and Zigzag-scanned, and the transform coefficients are uniformly quantized according to the selected quantization matrix; the gradient $SAD_B$ between the K frame and the AC block at the same position in the WZ frame is also calculated according to the following formula:
$SAD_B = \sum_{(j, k) \in B} \big| G_K[F(j, k)] - G_W[F(j, k)] \big|$
where K denotes the key frame, W the Wyner-Ziv frame, B a single macro block, F(j, k) the pixel value at coordinate (j, k), and $G_K$, $G_W$ the gradients computed on the key frame and the Wyner-Ziv frame, respectively;
B12. The following ROI macro-block decision criterion is used to obtain the binary mask ROI_mask of the region of interest:
$ROI_{mask} = \begin{cases} 1 & (SAD_B \ge T_{ROI}) \\ 0 & (SAD_B < T_{ROI}) \end{cases}$
where ROI_mask is the binary mask of the region of interest: when the macro-block gradient $SAD_B$ is less than the threshold $T_{ROI}$ the mask is set to 0, and otherwise to 1; the binary mask ROI_mask is then given a simple morphological treatment, first erosion and then dilation, to smooth the object contours, after which the ROI is extracted by the following formula:
$ROI = \begin{cases} W_B & (W_B \subseteq ROI_{mask}) \\ 0 & (W_B \not\subseteq ROI_{mask}) \end{cases}$
where $W_B$ denotes the corresponding AC macro block in the current WZ frame: if the ROI_mask corresponding to $W_B$ is 1, $W_B$ is an ROI block; otherwise it is set to 0 as a non-ROI block; the ROI region of interest of the current frame is finally obtained;
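A sketch of the ROI decision and clean-up of step B12: the per-block gradient SAD is thresholded against $T_{ROI}$, the binary mask is smoothed with an erosion followed by a dilation, and the AC macro blocks are then split into ROI and non-ROI blocks. The scipy morphology calls and the dict layout of the AC blocks are illustrative choices, not part of the patent.

```python
import numpy as np
from scipy import ndimage

def roi_mask(grad_sad_grid, t_roi):
    """Step B12: binary ROI mask on the macro-block grid, smoothed by erosion then dilation."""
    mask = np.asarray(grad_sad_grid, dtype=np.float64) >= t_roi
    mask = ndimage.binary_erosion(mask)
    mask = ndimage.binary_dilation(mask)
    return mask

def split_ac_blocks(ac_blocks, mask):
    """Split AC macro blocks into ROI blocks (entropy coded) and non-ROI blocks (Wyner-Ziv coded).

    ac_blocks is assumed to be a dict {(row, col): 8x8 array} on the same grid as mask.
    """
    roi, non_roi = [], []
    for (r, c), block in ac_blocks.items():
        (roi if mask[r, c] else non_roi).append(((r, c), block))
    return roi, non_roi
```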
B13. Block_mask, ROI_mask, and the quantized ROI blocks are Huffman compression coded; bit planes are extracted from the non-ROI blocks of the AC macro blocks and LDPC coded; if the image group currently being coded is the last image group, the HVS&ROI distributed video coding is exited;
C. The ROI is extracted from the AC macro blocks; the ROI macro blocks and the label information of each macro block are entropy coded, and the non-ROI macro blocks undergo LDPC-based Wyner-Ziv coding;
D. The concrete steps of the decoding method are:
D1. H.264 intraframe decoding is performed to obtain the key frames;
D2. The ROI macro blocks and the label information of each macro block undergo Huffman entropy decoding to obtain Block_mask, ROI_mask, and the quantized coefficients of the ROI blocks;
D3. The channel estimation parameters are set, and the channel parameters are estimated using the minimum cross-entropy rule; motion-compensated interpolation is performed from the key frames adjacent to the already decoded WZ frame to obtain the initial side information SI;
D4. The side information, combined with the label information of each macro block, is used in LDPC decoding to obtain the bit-stream coefficients of the non-ROI blocks;
D5. The C blocks obtained according to Block_mask, the entropy-decoded ROI blocks, and the decoded non-ROI blocks are used for reconstruction.
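Step D5 assembles the decoded WZ frame from three kinds of blocks: C blocks taken directly from the side information (they were never coded), ROI blocks recovered by entropy decoding, and non-ROI blocks recovered by LDPC decoding. A sketch of that assembly, assuming Block_mask marks C blocks with 0 and AC blocks with 1 on the block grid:

```python
import numpy as np

def reconstruct_wz_frame(side_info, block_mask, roi_mask, roi_blocks, non_roi_blocks, bs=8):
    """Step D5: paste C, ROI and non-ROI blocks together into the decoded WZ frame."""
    frame = np.array(side_info, dtype=np.float64, copy=True)
    block_mask = np.asarray(block_mask)
    roi_mask = np.asarray(roi_mask)
    rows, cols = block_mask.shape
    for r in range(rows):
        for c in range(cols):
            win = (slice(r * bs, (r + 1) * bs), slice(c * bs, (c + 1) * bs))
            if block_mask[r, c] == 0:
                continue                                 # C block: keep the side information
            elif roi_mask[r, c]:
                frame[win] = roi_blocks[(r, c)]          # entropy-decoded ROI block
            else:
                frame[win] = non_roi_blocks[(r, c)]      # LDPC-decoded non-ROI block
    return frame
```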
The LDPC decoding method is:
2a. The variable nodes are initialized; the likelihood information $L(c_j)$ is given by:
$L(c_j) = 2 y_j / \sigma^2$
where $y_j$ is the received codeword value, $\sigma^2$ is the noise variance, and j = 1, 2, ..., n;
2b. Vertical-direction iteration: the variable-node likelihood value $L(g_{ij})$ is calculated, where the variable nodes correspond on the bipartite graph to the columns of the parity-check matrix H; if $h_{ij} = 0$ in the check matrix H, then $L(g_{ij}) = 0$; otherwise $L(g_{ij})$ is calculated as:
$L(g_{ij}) = L(c_j) + \sum_{i' \ne i} L(h_{i'j})$
where $L(h_{ij})$ is the check-node likelihood information, i = 1, 2, ..., m, and $L(h_{ij}) = 0$ in the first iteration;
2c. Horizontal-direction iteration: the check-node likelihood information $L(h_{ij})$ is calculated, where the check nodes correspond on the bipartite graph to the rows of the parity-check matrix H; if $h_{ij} = 0$ in the check matrix, then $L(h_{ij}) = 0$; otherwise $L(h_{ij})$ is calculated as:
$L(h_{ij}) = 2 \tanh^{-1}\!\left[ \prod_{j' \ne j} \tanh\!\left( \tfrac{1}{2} L(g_{ij'}) \right) \right]$
2d. Decoding decision: the decision information of the variable nodes is obtained from the quantities computed above; the variable-node decision information $L(Q_j)$ is calculated as:
$L(Q_j) = L(c_j) + \sum_{i} L(h_{ij})$
If $L(Q_j) > 0$, the codeword bit is decided as $\hat{c}_j = 0$; otherwise $\hat{c}_j = 1$; if $\hat{c} H^T = 0$ or the maximum number of iterations is reached, decoding terminates; otherwise step 2b is repeated.
The quantities calculated in the above steps are used for the decoding.
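The iteration in steps 2a-2d is the standard log-domain sum-product (belief-propagation) algorithm on the bipartite graph of the parity-check matrix H. A compact sketch for a binary code over an AWGN channel, with the stopping rule of step 2d (parity satisfied or maximum iteration count reached); the vectorized message updates are an implementation convenience, not part of the patent:

```python
import numpy as np

def ldpc_decode(H, y, sigma2, max_iter=50):
    """Log-domain sum-product decoding (steps 2a-2d).

    H: (m, n) parity-check matrix with 0/1 entries; y: received soft values;
    sigma2: noise variance. Returns the hard-decision codeword estimate.
    """
    H = np.asarray(H, dtype=int)
    Lc = 2.0 * np.asarray(y, dtype=np.float64) / sigma2      # step 2a: channel LLRs L(c_j)
    Lh = np.zeros(H.shape)                                   # check -> variable messages L(h_ij)
    c_hat = (Lc <= 0).astype(int)
    for _ in range(max_iter):
        # step 2b: variable -> check messages, excluding the target check node
        Lg = np.where(H == 1, Lc + Lh.sum(axis=0) - Lh, 0.0)
        # step 2c: check -> variable messages, excluding the target variable node
        T = np.where(H == 1, np.tanh(0.5 * Lg), 1.0)
        T = np.where(np.abs(T) < 1e-12, 1e-12, T)            # guard against division by zero
        extrinsic = T.prod(axis=1, keepdims=True) / T
        Lh = np.where(H == 1, 2.0 * np.arctanh(np.clip(extrinsic, -0.999999, 0.999999)), 0.0)
        # step 2d: tentative hard decision and parity check
        LQ = Lc + Lh.sum(axis=0)
        c_hat = (LQ <= 0).astype(int)                        # L(Q_j) > 0 -> bit 0, else bit 1
        if not np.any(H.dot(c_hat) % 2):
            break
    return c_hat
```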
A system for implementing the distributed video coding method based on HVS&ROI, the system comprising an encoder and a decoder;
The encoder comprises:
The H.264 intra encoder: for performing H.264 intraframe coding on the input key frames and sending the resulting compressed bit stream to the H.264 intraframe decoder;
The ESI generating unit: for building coarse encoder-side side information (Encoder-side Side Information, abbreviated ESI) at the encoder; among the corresponding block FB of the forward adjacent frame, the corresponding block BB of the backward adjacent frame, and the corresponding average block AB of the forward and backward adjacent frames, the reference block with the greatest correlation to the corresponding block of the WZ frame is selected as the ESI, and the ESI together with each block of the WZ frame is input to the JND decision unit; FB is short for Forward Block, BB is short for Backward Block, and AB is short for Average Block;
The JND threshold generating unit: for generating, for the JND decision unit, the visibility JND threshold of the human visual system determined by the two factors of background luminance and texture masking;
The ROI threshold generating unit: for automatically obtaining the threshold of the region of interest ROI for the ROI extraction unit;
The JND decision unit: for comparing the pixel differences of the current 8 × 8 block with the JND threshold according to the ESI, and thereby dividing the blocks into C macro blocks and AC macro blocks; C macro block is short for Copy Block, and AC macro block is short for Actual Coding Block;
The gradient SAD computing unit: for computing the macro-block gradient SAD for the ROI extraction unit, in preparation for ROI extraction;
The DCT transform unit: for performing the discrete cosine transform on the Wyner-Ziv frame to obtain transform coefficients, grouping the coefficients at the same frequency position of all transform blocks into coefficient bands, Zigzag-scanning and ordering the coefficient bands, and passing them to the quantifying unit;
The quantifying unit: for quantizing the transform coefficients passed by the transform unit to obtain quantized coefficients, applying 2^M-level uniform quantization; for a given coefficient band, quantized symbols at the same bit position are grouped together to form the corresponding bit planes, which are then input to the LDPC encoder;
The ROI extraction unit: for extracting the ROI region according to the gradient SAD and the ROI threshold, and dividing the AC macro blocks into ROI blocks and non-ROI blocks;
The LDPC encoder: for performing independent LDPC encoding on each bit plane obtained by the quantifying unit for the non-ROI blocks;
The entropy coding unit: for Huffman coding of the ROI blocks, ROI_mask, and Block_mask;
The decoder comprises:
The H.264 intraframe decoder: for decoding the received key-frame compressed bit stream and sending the recovered key-frame images to the side information generating unit;
The side information generating unit: for producing side information by motion-compensated interpolation from the decoded preceding and following key frames, and obtaining the C blocks using the decoded Block_mask; the correlation between the original Wyner-Ziv frame and the side information is estimated from the motion-compensated interpolation residual to build the channel estimation model; the side information then undergoes the same DCT, quantization, and bit-plane extraction as at the encoder, and the resulting bit planes are fed to the LDPC decoding unit;
The entropy decoding unit: for decoding the received Huffman compressed bit stream of the ROI blocks, ROI_mask, and Block_mask, sending the decoded ROI_mask to the LDPC decoder, and sending Block_mask to the side information generating unit;
The LDPC decoder: for decoding the non-ROI blocks according to the received non-ROI Wyner-Ziv compressed bit stream and the side information, and sending the decoded bit planes to the inverse quantization unit;
The inverse quantization unit: for merging the bit planes obtained by LDPC decoding into the quantized coefficients of the non-ROI blocks and ROI blocks, inverse-quantizing the quantized coefficients according to the correlation to obtain transform coefficients, and sending these transform coefficients to the inverse transform unit;
The inverse transform unit: for applying the inverse discrete cosine transform to the transform coefficients passed by the inverse quantization unit to restore the AC image blocks, which are fed to the reconstruction unit;
The reconstruction unit: for restoring the WZ frame image from the decoded C macro blocks and AC macro blocks.
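As a rough illustration of the side information generating unit, the sketch below interpolates a WZ frame estimate from the two decoded key frames by symmetric block matching. The patent does not specify the motion search, so the full search over a small symmetric window and the simple averaging along the motion trajectory are assumptions for this sketch.

```python
import numpy as np

def mc_interpolate(prev_key, next_key, bs=8, search=4):
    """Block-wise motion-compensated interpolation between two decoded key frames."""
    prev_key = prev_key.astype(np.float64)
    next_key = next_key.astype(np.float64)
    h, w = prev_key.shape
    si = np.zeros_like(prev_key)
    for r in range(0, h - bs + 1, bs):
        for c in range(0, w - bs + 1, bs):
            best_sad, best_vec = None, (0, 0)
            for dr in range(-search, search + 1):
                for dc in range(-search, search + 1):
                    r0, c0 = r - dr, c - dc        # displaced block in the previous key frame
                    r1, c1 = r + dr, c + dc        # opposite displacement in the next key frame
                    if min(r0, c0, r1, c1) < 0 or max(r0, r1) + bs > h or max(c0, c1) + bs > w:
                        continue
                    sad = np.abs(prev_key[r0:r0 + bs, c0:c0 + bs]
                                 - next_key[r1:r1 + bs, c1:c1 + bs]).sum()
                    if best_sad is None or sad < best_sad:
                        best_sad, best_vec = sad, (dr, dc)
            dr, dc = best_vec
            # side-information block: average along the estimated motion trajectory
            si[r:r + bs, c:c + bs] = 0.5 * (prev_key[r - dr:r - dr + bs, c - dc:c - dc + bs]
                                            + next_key[r + dr:r + dr + bs, c + dc:c + dc + bs])
    return si
```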
Beneficial effects:
1) The JND model of the human visual system HVS is adopted, so that the processing and transmission of data in regions below the JND threshold are reduced, which lowers the transmission rate and saves energy and bandwidth;
2) The present invention extracts an ROI region of interest from glare regions with intense motion and high luminance contrast and entropy-codes it, improving subjective quality as far as possible without increasing encoder complexity.
While trying not to increase the coding complexity and the amount of computation, the subjective quality of downhole video is improved, the compression ratio of the video sequence is increased, the transmission rate is reduced, and energy consumption and bandwidth are saved.
Brief description of the drawings
Fig. 1 is the architecture diagram of the HVS&ROI-based distributed coding system of the present invention;
Fig. 2 is the flow chart of the HVS&ROI-based distributed video coding of the present invention;
Fig. 3 is the flow chart of the HVS&ROI-based distributed video decoding of the present invention;
Figs. 4 and 5 are experimental performance plots of the HVS&ROI-based distributed coding system of the present invention.
Detailed description of the invention
An embodiment of the present invention is further described below with reference to the accompanying drawings:
As shown in Fig. 1, the HVS&ROI-based distributed coding/decoding apparatus of the present invention comprises an encoder and a decoder, wherein:
The encoder comprises:
The H.264 intra encoder: for performing H.264 intraframe coding on the input key frames and sending the resulting compressed bit stream to the H.264 intraframe decoder;
The ESI generating unit: for building coarse encoder-side side information (Encoder-side Side Information, abbreviated ESI) at the encoder; among the corresponding block FB of the forward adjacent frame, the corresponding block BB of the backward adjacent frame, and the corresponding average block AB of the forward and backward adjacent frames, the reference block with the greatest correlation to the corresponding block of the WZ frame is selected as the ESI, and the ESI together with each block of the WZ frame is input to the JND decision unit; FB is short for Forward Block, BB is short for Backward Block, and AB is short for Average Block;
The JND threshold generating unit: for generating, for the JND decision unit, the visibility JND threshold of the human visual system determined by the two factors of background luminance and texture masking;
The ROI threshold generating unit: for automatically obtaining the threshold of the region of interest ROI for the ROI extraction unit;
The JND decision unit: for comparing the pixel differences of the current 8 × 8 block with the JND threshold according to the ESI, and thereby dividing the blocks into C macro blocks and AC macro blocks; C macro block is short for Copy Block, and AC macro block is short for Actual Coding Block;
The gradient SAD computing unit: for computing the macro-block gradient SAD for the ROI extraction unit, in preparation for ROI extraction;
The DCT transform unit: for performing the discrete cosine transform on the Wyner-Ziv frame to obtain transform coefficients, grouping the coefficients at the same frequency position of all transform blocks into coefficient bands, Zigzag-scanning and ordering the coefficient bands, and passing them to the quantifying unit;
The quantifying unit: for quantizing the transform coefficients passed by the transform unit to obtain quantized coefficients, applying 2^M-level uniform quantization; for a given coefficient band, quantized symbols at the same bit position are grouped together to form the corresponding bit planes, which are then input to the LDPC encoder;
The ROI extraction unit: for extracting the ROI region according to the gradient SAD and the ROI threshold, and dividing the AC macro blocks into ROI blocks and non-ROI blocks;
The LDPC encoder: for performing independent LDPC encoding on each bit plane obtained by the quantifying unit for the non-ROI blocks;
The entropy coding unit: for Huffman coding of the ROI blocks, ROI_mask, and Block_mask;
The decoder comprises:
The H.264 intraframe decoder: for decoding the received key-frame compressed bit stream and sending the recovered key-frame images to the side information generating unit;
The side information generating unit: for producing side information by motion-compensated interpolation from the decoded preceding and following key frames, and obtaining the C blocks using the decoded Block_mask; the correlation between the original Wyner-Ziv frame and the side information is estimated from the motion-compensated interpolation residual to build the channel estimation model; the side information then undergoes the same DCT, quantization, and bit-plane extraction as at the encoder, and the resulting bit planes are fed to the LDPC decoding unit;
The entropy decoding unit: for decoding the received Huffman compressed bit stream of the ROI blocks, ROI_mask, and Block_mask, sending the decoded ROI_mask to the LDPC decoder, and sending Block_mask to the side information generating unit;
The LDPC decoder: for decoding the non-ROI blocks according to the received non-ROI Wyner-Ziv compressed bit stream and the side information, and sending the decoded bit planes to the inverse quantization unit;
The inverse quantization unit: for merging the bit planes obtained by LDPC decoding into the quantized coefficients of the non-ROI blocks and ROI blocks, inverse-quantizing the quantized coefficients according to the correlation to obtain transform coefficients, and sending these transform coefficients to the inverse transform unit;
The inverse transform unit: for applying the inverse discrete cosine transform to the transform coefficients passed by the inverse quantization unit to restore the AC image blocks, which are fed to the reconstruction unit;
The reconstruction unit: for restoring the WZ frame image from the decoded C macro blocks and AC macro blocks.
As shown in Fig. 1 and Fig. 2, the encoder used by the distributed decoding method based on HVS&ROI consists mainly of eleven units: the H.264 intra encoder, the ESI generating unit, the JND threshold generating unit, the ROI threshold generating unit, the JND decision unit, the gradient SAD computing unit, the DCT transform unit, the quantifying unit, the ROI extraction unit, the LDPC coding unit, and the entropy coding unit. The H.264 intra encoder intra-codes the key frames; the ESI generating unit builds coarse encoder-side side information at the encoder; the JND threshold generating unit and the JND decision unit together separate out the C macro blocks to which the human eye is insensitive according to the HVS characteristics; the ROI threshold generating unit, the gradient SAD computing unit, and the ROI extraction unit extract the ROI blocks; the DCT transform unit, the quantifying unit, the LDPC coding unit, and the entropy coding unit take the outputs of the above units and perform, respectively, entropy coding of the ROI blocks and of the label information of each macro block, and LDPC Wyner-Ziv coding of the non-ROI blocks.
The method consists of a coding method and a decoding method. The encoder uses the JND model of HVS to divide each Wyner-Ziv (WZ) frame of the input video into C macro blocks and AC macro blocks, extracts the ROI from the AC macro blocks, and independently codes the K frames and each macro block of the WZ frames; the decoder performs joint decoding on the received bit streams.
The concrete steps of the coding method are:
A. The input video frames are divided into key frames and Wyner-Ziv frames; the key frames, referred to as K frames, are intra-coded by the H.264 intra encoder, and the Wyner-Ziv frames undergo HVS&ROI Wyner-Ziv coding;
B. The HVS&ROI Wyner-Ziv coding method comprises: dividing the Wyner-Ziv frame into C macro blocks and AC macro blocks by means of the JND model of HVS;
The JND model method includes:
B1. The Wyner-Ziv frame is divided into 8 × 8 blocks, each of which is a B block;
B2. The JND threshold of each pixel in a B block is calculated according to the following formula:
$JND(x, y, t) = f\big(\mathrm{idl}(x, y, t)\big) \cdot JND_s(x, y)$
where idl(x, y, t) is the average inter-frame luminance error between the adjacent instants t and t−1, and $JND_s(x, y)$ is the JND threshold in the spatial domain;
$\mathrm{idl}(x, y, t) = \tfrac{1}{2}\big( I(x, y, t) - I(x, y, t-1) + \bar{I}(x, y, t) - \bar{I}(x, y, t-1) \big)$
$JND_s(x, y) = T_l(x, y) + T_t^Y(x, y) - C^Y \times \min\{\, T_l(x, y),\ T_t^Y(x, y) \,\}$
where $T_l(x, y)$ is the background-luminance adaptation function, $T_t^Y(x, y)$ is the texture-masking function, reflecting that HVS is more sensitive in smooth regions than in regions of dense texture, (x, y) is the coordinate of the pixel in the image, and $C^Y$ is the correlation coefficient between the two influencing factors, background luminance and texture masking;
B3. The forward, backward, and average pixel SADs of the current block B are calculated according to the following formula:
$SAD = \sum_{x=1}^{M} \sum_{y=1}^{N} \big| w(x, y) - r(x, y) \big|$
where w(x, y) and r(x, y) are the pixel values at coordinate (x, y) of the block to be encoded in the WZ frame and of the reference block, respectively;
B4. The encoder-side side information ESI is obtained by the following formula:
$ESI = \mathrm{MinSAD}\{\, FB,\ BB,\ AB \,\}$
where FB is the corresponding block of the forward adjacent frame (FB is short for Forward Block), BB is the corresponding block of the backward adjacent frame (BB is short for Backward Block), and AB is the corresponding average block of the forward and backward adjacent frames (AB is short for Average Block);
B5. The $SAD_B$ of each block with respect to the adjacent frames is calculated, and its maximum $SAD_{\max}$ and minimum $SAD_{\min}$ are found; taking a candidate threshold $t \in [SAD_{\min}, SAD_{\max}]$, the pixel ratios $\rho_0$, $\rho_1$ and the average gradient values $m_0$, $m_1$ of the blocks satisfying $t < SAD_B$ and $t > SAD_B$ are obtained respectively, so that the overall mean gradient value is $m = \rho_0 m_0 + \rho_1 m_1$;
B6. The variance of foreground and background is calculated as:
$v = \rho_0 \rho_1 (m_0 - m_1)^2$
B7. $v$ is traversed over the interval $[SAD_{\min}, SAD_{\max}]$ and its maximum $v_{\max}$ is found; the value of $t$ corresponding to $v_{\max}$ is the value at which the foreground-background variance is largest, and this value of $t$ is the threshold $T_{ROI}$;
B8. A foreground image is preliminarily extracted according to the obtained $T_{ROI}$: by comparing $SAD_B$ with $T_{ROI}$, the parts whose value exceeds $T_{ROI}$ are taken out as the foreground image, and steps B5 to B8 are then performed once more on this foreground to obtain a more accurate second-level $T_{ROI}$;
B9. The predicted distortion of each point is calculated from the distortion function D(x, y) of the ESI; the predicted distortion value of each point is obtained by the following formula:
$D(x, y) = \big| ESI(x, y) - I(x, y) \big|$
where ESI(x, y) and I(x, y) are the ESI value and the pixel value at coordinate (x, y) of the WZ frame, respectively;
B10. Block B is subjected to JND discrimination and divided into AC blocks and C blocks: a macro block whose distortion values are less than or equal to the JND threshold is classified as a C block, and a macro block whose distortion values exceed the JND threshold is classified as an AC block; the binary mask Block_mask of the block division of this WZ frame image is thus obtained;
B11. The AC blocks undergo DCT and $2^M$-level uniform quantization; the coefficient bands are extracted and Zigzag-scanned, and the transform coefficients are uniformly quantized according to the selected quantization matrix; the gradient $SAD_B$ between the K frame and the AC block at the same position in the WZ frame is also calculated according to the following formula:
$SAD_B = \sum_{(j, k) \in B} \big| G_K[F(j, k)] - G_W[F(j, k)] \big|$
where K denotes the key frame, W the Wyner-Ziv frame, B a single macro block, F(j, k) the pixel value at coordinate (j, k), and $G_K$, $G_W$ the gradients computed on the key frame and the Wyner-Ziv frame, respectively;
B12. The following ROI macro-block decision criterion is used to obtain the binary mask ROI_mask of the region of interest:
$ROI_{mask} = \begin{cases} 1 & (SAD_B \ge T_{ROI}) \\ 0 & (SAD_B < T_{ROI}) \end{cases}$
where ROI_mask is the binary mask of the region of interest: when the macro-block gradient $SAD_B$ is less than the threshold $T_{ROI}$ the mask is set to 0, and otherwise to 1; the binary mask ROI_mask is then given a simple morphological treatment, first erosion and then dilation, to smooth the object contours, after which the ROI is extracted by the following formula:
$ROI = \begin{cases} W_B & (W_B \subseteq ROI_{mask}) \\ 0 & (W_B \not\subseteq ROI_{mask}) \end{cases}$
where $W_B$ denotes the corresponding AC macro block in the current WZ frame: if the ROI_mask corresponding to $W_B$ is 1, $W_B$ is an ROI block; otherwise it is set to 0 as a non-ROI block; the ROI region of interest of the current frame is finally obtained;
B13. Block_mask, ROI_mask, and the quantized ROI blocks are Huffman compression coded; bit planes are extracted from the non-ROI blocks of the AC macro blocks and LDPC coded; if the image group currently being coded is the last image group, the HVS&ROI distributed video coding is exited;
C. The ROI is extracted from the AC macro blocks; the ROI macro blocks and the label information of each macro block are entropy coded, and the non-ROI macro blocks undergo LDPC-based Wyner-Ziv coding, where the LDPC is as defined above;
D. The concrete steps of the decoding method are:
D1. H.264 intraframe decoding is performed to obtain the key frames;
D2. The ROI macro blocks and the label information of each macro block undergo Huffman entropy decoding to obtain Block_mask, ROI_mask, and the quantized coefficients of the ROI blocks;
D3. The channel estimation parameters are set, and the channel parameters are estimated using the minimum cross-entropy rule; motion-compensated interpolation is performed from the key frames adjacent to the already decoded WZ frame to obtain the initial side information SI;
D4. The side information, combined with the label information of each macro block, is used in LDPC decoding to obtain the bit-stream coefficients of the non-ROI blocks;
D5. The C blocks obtained according to Block_mask, the entropy-decoded ROI blocks, and the decoded non-ROI blocks are used for reconstruction; the C blocks obtained at the encoder need not be encoded, and at the decoder the corresponding parts of the side information blocks are taken directly as the decoded C blocks, after which the C blocks are obtained and this step ends.
The LDPC decoding method is:
2a. The variable nodes are initialized; the likelihood information $L(c_j)$ is given by:
$L(c_j) = 2 y_j / \sigma^2$
where $y_j$ is the received codeword value, $\sigma^2$ is the noise variance, and j = 1, 2, ..., n;
2b. Vertical-direction iteration: the variable-node likelihood value $L(g_{ij})$ is calculated, where the variable nodes correspond on the bipartite graph to the columns of the parity-check matrix H; if $h_{ij} = 0$ in the check matrix H, then $L(g_{ij}) = 0$; otherwise $L(g_{ij})$ is calculated as:
$L(g_{ij}) = L(c_j) + \sum_{i' \ne i} L(h_{i'j})$
where $L(h_{ij})$ is the check-node likelihood information, i = 1, 2, ..., m, and $L(h_{ij}) = 0$ in the first iteration;
2c. Horizontal-direction iteration: the check-node likelihood information $L(h_{ij})$ is calculated, where the check nodes correspond on the bipartite graph to the rows of the parity-check matrix H; if $h_{ij} = 0$ in the check matrix, then $L(h_{ij}) = 0$; otherwise $L(h_{ij})$ is calculated as:
$L(h_{ij}) = 2 \tanh^{-1}\!\left[ \prod_{j' \ne j} \tanh\!\left( \tfrac{1}{2} L(g_{ij'}) \right) \right]$
2d. Decoding decision: the decision information of the variable nodes is obtained from the quantities computed above; the variable-node decision information $L(Q_j)$ is calculated as:
$L(Q_j) = L(c_j) + \sum_{i} L(h_{ij})$
If $L(Q_j) > 0$, the codeword bit is decided as $\hat{c}_j = 0$; otherwise $\hat{c}_j = 1$; if $\hat{c} H^T = 0$ or the maximum number of iterations is reached, decoding terminates; otherwise step 2b is repeated.
The quantities calculated in the above steps are used for the decoding.
The effect of the present invention is further illustrated by the following experiments:
1) Experimental conditions
Hardware environment: Intel(R) Core(TM)2 Duo CPU E7500, 2.93 GHz, 2.00 GB memory;
GOP structure: IWWWWWWW mode, in which frames whose number is divisible by 8 are K frames and the rest are WZ frames; IWIWIWIW mode, in which odd-numbered frames are K frames and even-numbered frames are WZ frames; SINGLE_I mode, in which the first frame of each GOP is the K frame and the rest are WZ frames (a small frame-classification sketch for these patterns is given after Table 1);
Reference sequences: Foreman, Carphone;
Resolution: 176 × 144;
The test sequence conditions are given in Table 1.
Table 1  Test sequence conditions
Video sequence       Foreman    Carphone
Test frames          300        100
Frame rate (Hz)      30         30
Picture format       QCIF       QCIF
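For reference, a tiny sketch of how the three GOP patterns above assign frames to K and WZ roles; the 0-based index convention (index 0, even indices, or indices divisible by 8 as key frames) is an assumption, since the text does not state whether numbering starts at 0 or 1.

```python
def frame_type(idx, mode):
    """Classify frame idx (0-based) as 'K' or 'WZ' under the three GOP patterns."""
    if mode == "IWWWWWWW":
        return "K" if idx % 8 == 0 else "WZ"
    if mode == "IWIWIWIW":
        return "K" if idx % 2 == 0 else "WZ"
    if mode == "SINGLE_I":
        return "K" if idx == 0 else "WZ"
    raise ValueError("unknown GOP mode")
```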
2) Experimental content
Under the above experimental conditions, each reference sequence was coded with optimal predictive coding, coding without motion estimation, JPEG coding, and the HVS&ROI distributed video coding method proposed by the present invention, and curves of the peak signal-to-noise ratio (PSNR) versus the bit rate were obtained for each mode. The experimental results for each sequence are shown in Fig. 4 and Fig. 5.
As can be seen from Fig. 4, for the "Foreman" sequence the objective recovery quality (PSNR) of the HVS&ROI distributed coding method is 0.3-0.7 dB higher than that of the motion-estimation coding method and 1-3 dB higher than that of JPEG coding; for the "Carphone" sequence in Fig. 5, the objective recovery quality (PSNR) of the HVS&ROI distributed coding method is 1-2 dB higher than that of the motion-estimation coding method and 3-4 dB higher than that of JPEG coding. Compared with optimal predictive coding, coding without motion estimation, and JPEG coding, the HVS&ROI distributed coding system proposed by the present invention improves the rate-distortion performance, and its encoder is simple and easy to implement.

Claims (1)

1. A distributed video coding method based on HVS&ROI, characterized in that the method consists of a coding method and a decoding method, where HVS stands for Human Visual System and ROI for Region of Interest;
The concrete steps of the coding method are:
A. The input video frames are divided into key frames and Wyner-Ziv frames; the key frames (K frames) are intra-coded by the H.264 intra encoder, and the Wyner-Ziv frames undergo HVS&ROI Wyner-Ziv coding;
B. The HVS&ROI Wyner-Ziv coding method comprises: dividing the Wyner-Ziv frame into C macro blocks and AC macro blocks by means of the JND (Just Noticeable Difference) model of HVS;
The JND model method includes:
B1. The Wyner-Ziv frame is divided into 8 × 8 blocks, each of which is a B block;
B2. The JND threshold of each pixel in a B block is calculated according to the following formula:
$JND(x, y, t) = f\big(\mathrm{idl}(x, y, t)\big) \cdot JND_s(x, y)$
where idl(x, y, t) is the average inter-frame luminance error between the adjacent instants t and t−1, given by $\mathrm{idl}(x, y, t) = \tfrac{1}{2}\big( I(x, y, t) - I(x, y, t-1) + \bar{I}(x, y, t) - \bar{I}(x, y, t-1) \big)$, and $JND_s(x, y)$ is the JND threshold in the spatial domain;
$JND_s(x, y) = T_l(x, y) + T_t^Y(x, y) - C^Y \times \min\{\, T_l(x, y),\ T_t^Y(x, y) \,\}$
where $T_l(x, y)$ is the background-luminance adaptation function, $T_t^Y(x, y)$ is the texture-masking function, reflecting that HVS is more sensitive in smooth regions than in regions of dense texture, (x, y) is the coordinate of the pixel in the image, and $C^Y$ is the correlation coefficient between the two influencing factors, background luminance and texture masking;
B3. The forward, backward, and average pixel SADs of the current block B are calculated according to the following formula:
$SAD = \sum_{x=1}^{M} \sum_{y=1}^{N} \big| w(x, y) - r(x, y) \big|$
where w(x, y) and r(x, y) are the pixel values at coordinate (x, y) of the block to be encoded in the WZ frame and of the reference block, respectively;
B4. The encoder-side side information ESI is obtained by the following formula:
$ESI = \mathrm{MinSAD}\{\, FB,\ BB,\ AB \,\}$
where FB is the corresponding block of the forward adjacent frame (FB is short for Forward Block), BB is the corresponding block of the backward adjacent frame (BB is short for Backward Block), and AB is the corresponding average block of the forward and backward adjacent frames (AB is short for Average Block);
B5. Taking each block in turn as the current block B, the mean pixel $SAD_B$ of the current block B with respect to the adjacent frames is calculated for all blocks; the maximum $SAD_{\max}$ and minimum $SAD_{\min}$ of $SAD_B$ are found; taking a candidate threshold $t \in [SAD_{\min}, SAD_{\max}]$, the pixel ratios $\rho_0$, $\rho_1$ and the average gradient values $m_0$, $m_1$ of the blocks satisfying $t < SAD_B$ and $t > SAD_B$ are obtained respectively, so that the overall mean gradient value is $m = \rho_0 m_0 + \rho_1 m_1$;
B6. The variance of foreground and background is calculated as:
$v = \rho_0 \rho_1 (m_0 - m_1)^2$
B7. $v$ is traversed over the interval $[SAD_{\min}, SAD_{\max}]$ and its maximum $v_{\max}$ is found; the value of $t$ corresponding to $v_{\max}$ is the value at which the foreground-background variance is largest, and this value of $t$ is the threshold $T_{ROI}$;
B8. A foreground image is preliminarily extracted according to the obtained $T_{ROI}$: by comparing $SAD_B$ with $T_{ROI}$, the parts whose value exceeds $T_{ROI}$ are taken out as the foreground image, and steps B5 to B8 are then performed once more on this foreground to obtain a more accurate second-level $T_{ROI}$;
B9. The predicted distortion of each point is calculated from the distortion function D(x, y) of the ESI; the predicted distortion value of each point is obtained by the following formula:
D(x, y) = |ESI(x, y) − I(x, y)|, where ESI(x, y) and I(x, y) are the ESI value and the pixel value at coordinate (x, y) of the WZ frame, respectively;
B10. Block B is subjected to JND discrimination and divided into AC blocks and C blocks: a macro block whose distortion values are less than or equal to the JND threshold is classified as a C block, and a macro block whose distortion values exceed the JND threshold is classified as an AC block; the binary mask Block_mask of the block division of this WZ frame image is thus obtained;
where Num is the total number of pixels in each macro block and ε is taken as 0.1;
B11. The AC blocks undergo DCT and $2^M$-level uniform quantization; the coefficient bands are extracted and Zigzag-scanned, and the transform coefficients are uniformly quantized according to the selected quantization matrix; the gradient $SAD_B$ between the K frame and the AC block at the same position in the WZ frame is also calculated according to the following formula:
$SAD_B = \sum_{(j, k) \in B} \big| G_K[F(j, k)] - G_W[F(j, k)] \big|$
where K denotes the key frame, W the Wyner-Ziv frame, B a single macro block, F(j, k) the pixel value at coordinate (j, k), and $G_K$, $G_W$ the gradients computed on the key frame and the Wyner-Ziv frame, respectively;
B12. The following ROI macro-block decision criterion is used to obtain the binary mask ROI_mask of the region of interest:
$ROI_{mask} = \begin{cases} 1 & (SAD_B \ge T_{ROI}) \\ 0 & (SAD_B < T_{ROI}) \end{cases}$
where ROI_mask is the binary mask of the region of interest: when the macro-block gradient $SAD_B$ is less than the threshold $T_{ROI}$ the mask is set to 0, and otherwise to 1; the binary mask ROI_mask is then given a simple morphological treatment, first erosion and then dilation, to smooth the object contours, after which the ROI is extracted by the following formula:
$ROI = \begin{cases} W_B & (W_B \subseteq ROI_{mask}) \\ 0 & (W_B \not\subseteq ROI_{mask}) \end{cases}$
where $W_B$ denotes the corresponding AC macro block in the current WZ frame: if the ROI_mask corresponding to $W_B$ is 1, $W_B$ is an ROI block; otherwise it is set to 0 as a non-ROI block; the ROI region of interest of the current frame is finally obtained;
B13. Block_mask, ROI_mask, and the quantized ROI blocks are Huffman compression coded; bit planes are extracted from the non-ROI blocks of the AC macro blocks and LDPC (Low Density Parity Check Code) coded; if the image group currently being coded is the last image group, the HVS&ROI distributed video coding is exited;
The LDPC decoding method is:
2a. The variable nodes are initialized; the likelihood information $L(c_j)$ is given by:
$L(c_j) = 2 y_j / \sigma^2$
where $y_j$ is the received codeword value, $\sigma^2$ is the noise variance, and j = 1, 2, ..., n;
2b. Vertical-direction iteration: the variable-node likelihood value $L(g_{ij})$ is calculated, where the variable nodes correspond on the bipartite graph to the columns of the parity-check matrix H; if $h_{ij} = 0$ in the check matrix H, then $L(g_{ij}) = 0$; otherwise $L(g_{ij})$ is calculated as:
$L(g_{ij}) = L(c_j) + \sum_{i' \ne i} L(h_{i'j})$
where $L(h_{ij})$ is the check-node likelihood information, i = 1, 2, ..., m, and $L(h_{ij}) = 0$ in the first iteration;
2c. Horizontal-direction iteration: the check-node likelihood information $L(h_{ij})$ is calculated, where the check nodes correspond on the bipartite graph to the rows of the parity-check matrix H; if $h_{ij} = 0$ in the check matrix, then $L(h_{ij}) = 0$; otherwise $L(h_{ij})$ is calculated as:
$L(h_{ij}) = 2 \tanh^{-1}\!\left[ \prod_{j' \ne j} \tanh\!\left( \tfrac{1}{2} L(g_{ij'}) \right) \right]$
2d. Decoding decision: the decision information of the variable nodes is obtained from the quantities computed above; the variable-node decision information $L(Q_j)$ is calculated as:
$L(Q_j) = L(c_j) + \sum_{i} L(h_{ij})$
If $L(Q_j) > 0$, the codeword bit is decided as $\hat{c}_j = 0$; otherwise $\hat{c}_j = 1$; if $\hat{c} H^T = 0$ or the maximum number of iterations is reached, decoding terminates; otherwise step 2b is repeated;
The quantities calculated in the above steps are used for the decoding;
C. The AC macro blocks and the label information of each macro block are entropy coded, and the non-ROI macro blocks undergo LDPC-based Wyner-Ziv coding;
D. The concrete steps of the decoding method are:
D1. H.264 intraframe decoding is performed to obtain the key frames;
D2. The ROI macro blocks and the label information of each macro block undergo Huffman entropy decoding to obtain Block_mask, ROI_mask, and the quantized coefficients of the ROI blocks;
D3. The channel estimation parameters are set, and the channel parameters are estimated using the minimum cross-entropy rule; motion-compensated interpolation is performed from the key frames adjacent to the already decoded WZ frame to obtain the initial side information SI;
D4. The side information, combined with the label information of each macro block, is used in LDPC decoding to obtain the bit-stream coefficients of the non-ROI blocks;
D5. The C blocks obtained according to Block_mask, the entropy-decoded ROI blocks, and the decoded non-ROI blocks are used for reconstruction.
CN201210377970.4A 2012-10-08 2012-10-08 Distributed decoding method based on HVS&ROI and system Expired - Fee Related CN103002280B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210377970.4A CN103002280B (en) 2012-10-08 2012-10-08 Distributed decoding method based on HVS&ROI and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210377970.4A CN103002280B (en) 2012-10-08 2012-10-08 Distributed decoding method based on HVS&ROI and system

Publications (2)

Publication Number Publication Date
CN103002280A CN103002280A (en) 2013-03-27
CN103002280B true CN103002280B (en) 2016-09-28

Family

ID=47930347

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210377970.4A Expired - Fee Related CN103002280B (en) 2012-10-08 2012-10-08 Distributed decoding method based on HVS&ROI and system

Country Status (1)

Country Link
CN (1) CN103002280B (en)

Families Citing this family (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107547895B (en) * 2016-06-29 2020-02-18 腾讯科技(深圳)有限公司 Image processing method and device
CN108600751A (en) * 2018-05-03 2018-09-28 山东师范大学 Polygon information-distribution type Video coding based on JND, decoded method and system
CN108632613B (en) * 2018-05-21 2020-10-16 南京邮电大学 Hierarchical distributed video coding method and system based on DISCOVER framework
CN109474824B (en) * 2018-12-04 2020-04-10 深圳市华星光电半导体显示技术有限公司 Image compression method
US20220051385A1 (en) * 2018-12-12 2022-02-17 Shenzhen Institutes Of Advanced Technology Chinese Academy Of Sciences Method, device and apparatus for predicting picture-wise jnd threshold, and storage medium
US11649719B2 (en) 2019-06-12 2023-05-16 Baker Hughes Oilfield Operations Llc Compressing data collected downhole in a wellbore
CN110505480B (en) * 2019-08-02 2021-07-27 浙江大学宁波理工学院 Monitoring scene-oriented fast perception video coding method
CN111491167B (en) * 2019-10-28 2022-08-26 华为技术有限公司 Image encoding method, transcoding method, device, equipment and storage medium
CN111464834B (en) * 2020-04-07 2023-04-07 腾讯科技(深圳)有限公司 Video frame processing method and device, computing equipment and storage medium
CN112261407B (en) * 2020-09-21 2022-06-17 苏州唐古光电科技有限公司 Image compression method, device and equipment and computer storage medium
CN112104869B (en) * 2020-11-10 2021-02-02 光谷技术有限公司 Video big data storage and transcoding optimization system
CN113596451B (en) * 2021-06-28 2024-01-26 无锡唐古半导体有限公司 Video encoding method, video decoding method and related devices
CN116248895B (en) * 2023-05-06 2023-07-21 上海扬谷网络科技有限公司 Video cloud transcoding method and system for virtual reality panorama roaming

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040091158A1 (en) * 2002-11-12 2004-05-13 Nokia Corporation Region-of-interest tracking method and device for wavelet-based video coding
CN101102492A (en) * 2007-07-26 2008-01-09 上海交通大学 Conversion method from compression domain MPEG-2 based on interest area to H.264 video
CN101882316A (en) * 2010-06-07 2010-11-10 深圳市融创天下科技发展有限公司 Method, device and system for regional division/coding of image


Also Published As

Publication number Publication date
CN103002280A (en) 2013-03-27

Similar Documents

Publication Publication Date Title
CN103002280B (en) Distributed decoding method based on HVS&ROI and system
CN101742319B (en) Background modeling-based static camera video compression method and background modeling-based static camera video compression system
CN101835044B (en) Grouping method in frequency domain distributed video coding
CN101835042B (en) Wyner-Ziv video coding system controlled on the basis of non feedback speed rate and method
CN102281446B (en) Visual-perception-characteristic-based quantification method in distributed video coding
CN105049850A (en) HEVC (High Efficiency Video Coding) code rate control method based on region-of-interest
CN103002283A (en) Multi-view distributed video compression side information generation method
CN103581647A (en) Depth map sequence fractal coding method based on motion vectors of color video
CN101977323B (en) Method for reconstructing distributed video coding based on constraints on temporal-spatial correlation of video
CN107277537B (en) A kind of distributed video compressed sensing method of sampling based on temporal correlation
CN102271256B (en) Mode decision based adaptive GOP (group of pictures) distributed video coding and decoding method
CN103546758A (en) Rapid depth map sequence interframe mode selection fractal coding method
CN102625102A (en) H.264/scalable video coding medius-grain scalability (SVC MGS) coding-oriented rate distortion mode selection method
Chen et al. Improving video coding quality by perceptual rate-distortion optimization
CN102572428B (en) Side information estimating method oriented to distributed coding and decoding of multimedia sensor network
CN103442228A (en) Quick frame inner transcoding method from H.264/AVC standard to HEVC standard and transcoder thereof
CN106101714A (en) One and the tightly coupled H.264 video information hiding method of compression encoding process
CN102595132A (en) Distributed video encoding and decoding method applied to wireless sensor network
CN104853215A (en) Video steganography method based on motion vector local optimality preservation
CN102833536A (en) Distributed video encoding and decoding method facing to wireless sensor network
CN101827268A (en) Object-based fractal video compression and decompression method
CN101980536B (en) Object and fractal-based multi-ocular three-dimensional video compression encoding and decoding method
CN106060567A (en) Wavelet domain distributed multi-view video coding based on layered WZ frame
CN105611301A (en) Distributed video coding and decoding method based on wavelet domain residual errors
CN103546747B (en) A kind of depth map sequence fractal coding based on color video encoding pattern

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CP02 Change in the address of a patent holder

Address after: 221116 No. 1 University Road, copper mountain, Jiangsu, Xuzhou

Patentee after: China University of Mining & Technology

Address before: 221116 Department of science and technology, China University of Mining and Technology, Xuzhou University Road, No. 1, Jiangsu

Patentee before: China University of Mining & Technology

CP02 Change in the address of a patent holder
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20160928

Termination date: 20191008

CF01 Termination of patent right due to non-payment of annual fee