CN102098515B - Realizing method of loop filtering parallel - Google Patents

Publication number
CN102098515B
Authority
CN
China
Prior art keywords
filtering
macro block
parallel processing
processing district
macro
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN 201110042746
Other languages
Chinese (zh)
Other versions
CN102098515A (en)
Inventor
戚红命
俞海
贾永年
胡杨忠
邬伟琪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Hikvision Digital Technology Co Ltd
Original Assignee
Hangzhou Hikvision Digital Technology Co Ltd

Abstract

The invention provides a parallel implementation method for loop filtering. The method comprises: dividing the macroblocks to be filtered, in order, into M ordered parallel processing regions according to a preset non-filtering direction and the number N of cores in a multi-core processor; selecting b macroblocks to be filtered from the a-th parallel processing region and taking them as the macroblock group for the current filtering pass; and, when the number b of macroblocks in that group is smaller than N, selecting c macroblocks from the (a+1)-th parallel processing region, adding them to the group, and filtering the group with the N cores. M is the integer part of the number of macroblocks to be filtered divided by the core count N, or that integer part plus 1; a is smaller than M; b and c are each the maximum number of macroblocks that can be filtered in the current pass. The method lowers the idle processor ratio (IPR), improves the efficiency of filtering, and shortens the time spent on filtering.

Description

A parallel implementation method for loop filtering
Technical field
The present invention relates to the field of video encoding and decoding technology, and in particular to a parallel implementation method for loop filtering.
Background
At present, the most important codec standards for video stream transmission are H.261 and H.263 of the International Telecommunication Union, M-JPEG of the motion still-image expert group, and the MPEG series of standards of the Moving Picture Experts Group of the International Organization for Standardization.
H.264 is a high-compression digital video codec standard proposed by the Joint Video Team (JVT), which was formed jointly by the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG). Compared with earlier video coding standards, H.264 saves more than 50% of the bit rate at the same picture quality; it allows video programs to be transmitted over lower bandwidth, saving a large amount of bandwidth resources; and its coded picture quality is high, providing continuous, smooth, high-quality images with strong error resilience.
In the existing macroblock-based loop filtering process, filtering a macroblock requires the filtered results of its left neighboring macroblock and of its upper neighboring macroblock as the input to the current filtering operation. For example, the H.264 standard adopts a loop filtering method that takes the macroblock (MB) as its unit and uses it to remove the blocking artifacts produced during encoding and decoding. A macroblock is the basic unit of an image in video coding and is typically 16*16 pixels in size.
It follows that the loop filtering of adjacent macroblocks has a dependency. Although the loop filtering of adjacent macroblocks is dependent, under certain conditions the loop filtering of macroblocks at non-adjacent positions can be mutually independent. Here, an adjacent position relationship means that there is no gap between the rows containing the macroblocks, or no gap between the columns containing them; a non-adjacent position relationship means that the rows containing the macroblocks are separated by at least one row, or the columns containing them are separated by at least one column.
To shorten the time spent on loop filtering, the MB_wavefront parallel filtering algorithm is commonly used. Fig. 1(a) is a schematic diagram of existing parallel loop filtering of an image of 8 rows and 11 columns. Fig. 1(b) is a schematic diagram of the cycles consumed by parallel loop filtering of the image shown in Fig. 1(a). The method of loop filtering with the MB_wavefront parallel filtering algorithm is described below with reference to Fig. 1(a) and Fig. 1(b):
When parallel loop filtering is applied to the image of 8 rows and 11 columns shown in Fig. 1(a), i denotes the row of a macroblock, j denotes its column, and the number marked on each macroblock is the macroblock sequence number. Following the MB_wavefront parallel filtering algorithm and the dependency of loop filtering between adjacent macroblocks, a quad-core processor filters the image. Assuming that each filtering operation of each core of the quad-core processor consumes a cycle T, the passes are as follows:
Pass 1: MB[0,0]; consumes 1T.
Pass 2: MB[0,1]; consumes 1T.
Pass 3: MB[0,2], MB[1,0]; consumes 1T.
Pass 4: MB[0,3], MB[1,1]; consumes 1T.
Pass 5: MB[0,4], MB[1,2], MB[2,0]; consumes 1T.
Pass 6: MB[0,5], MB[1,3], MB[2,1]; consumes 1T.
Pass 7: MB[0,6], MB[1,4], MB[2,2], MB[3,0]; consumes 1T.
Pass 8: MB[0,7], MB[1,5], MB[2,3], MB[3,1]; consumes 1T.
Pass 9: MB[0,8], MB[1,6], MB[2,4], MB[3,2], MB[4,0]; consumes 2T.
Pass 10: MB[0,9], MB[1,7], MB[2,5], MB[3,3], MB[4,1]; consumes 2T.
Pass 11: MB[0,10], MB[1,8], MB[2,6], MB[3,4], MB[4,2], MB[5,0]; consumes 2T.
Pass 12: MB[1,9], MB[2,7], MB[3,5], MB[4,3], MB[5,1]; consumes 2T.
Pass 13: MB[1,10], MB[2,8], MB[3,6], MB[4,4], MB[5,2], MB[6,0]; consumes 2T.
Pass 14: MB[2,9], MB[3,7], MB[4,5], MB[5,3], MB[6,1]; consumes 2T.
Pass 15: MB[2,10], MB[3,8], MB[4,6], MB[5,4], MB[6,2], MB[7,0]; consumes 2T.
Pass 16: MB[3,9], MB[4,7], MB[5,5], MB[6,3], MB[7,1]; consumes 2T.
Pass 17: MB[3,10], MB[4,8], MB[5,6], MB[6,4], MB[7,2]; consumes 2T.
Pass 18: MB[4,9], MB[5,7], MB[6,5], MB[7,3]; consumes 1T.
Pass 19: MB[4,10], MB[5,8], MB[6,6], MB[7,4]; consumes 1T.
Pass 20: MB[5,9], MB[6,7], MB[7,5]; consumes 1T.
Pass 21: MB[5,10], MB[6,8], MB[7,6]; consumes 1T.
Pass 22: MB[6,9], MB[7,7]; consumes 1T.
Pass 23: MB[6,10], MB[7,8]; consumes 1T.
Pass 24: MB[7,9]; consumes 1T.
Pass 25: MB[7,10]; consumes 1T.
As can be seen from the above, the loop filtering method based on the MB_wavefront parallel filtering algorithm uses the N cores of a system with a multi-core processor to filter macroblocks at non-adjacent positions simultaneously, which shortens the time spent on loop filtering. However, the idle processor ratio (IPR) rises and the efficiency of parallel filtering falls. As shown in Fig. 1(b), all 4 cores of the quad-core processor are busy only in the 7th, 8th, 18th, and 19th passes, where the IPR is at its minimum; in each of the remaining 21 passes at least one of the 4 cores is idle, so the IPR is higher. Moreover, each of the 9th to 17th passes consumes 2T, so filtering the above image consumes 34T in total.
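The cycle count above can be checked with a short simulation. The following is a minimal sketch and not part of the patent text. It assumes that pass k of MB_wavefront (counting from 0) contains every MB[i,j] with j + 2*i == k, which is the pattern of the pass listing above; that a pass holding more macroblocks than cores takes ceil(count / N) cycles; and that IPR is the fraction of idle core-cycles, since the text does not define IPR numerically. The function name `wavefront_cycles` is hypothetical.

```python
from math import ceil

def wavefront_cycles(rows, cols, n_cores):
    """Return (pass sizes, total cycles in units of T, idle fraction)."""
    sizes = []
    for k in range(cols + 2 * (rows - 1)):
        # Assumption: pass k holds every MB[i, j] with j + 2*i == k.
        count = sum(1 for i in range(rows) if 0 <= k - 2 * i < cols)
        sizes.append(count)
    # A pass with more macroblocks than cores needs several cycles.
    cycles = sum(ceil(s / n_cores) for s in sizes)
    # Assumed IPR definition: idle core-cycles over all core-cycles.
    idle = 1 - sum(sizes) / (cycles * n_cores)
    return sizes, cycles, idle

sizes, cycles, idle = wavefront_cycles(8, 11, 4)
print(len(sizes), cycles, round(idle, 3))
```

For the image of 8 rows and 11 columns with 4 cores, the sketch yields 25 passes and 34 cycles, in agreement with the listing; under the assumed definition, roughly a third of the core-cycles are idle.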
In summary, in the existing method of parallel loop filtering with multiple cores of a multi-core processor, the system IPR is high, the efficiency of parallel filtering is low, and the time consumed by parallel filtering can still be shortened further.
Summary of the invention
In view of this, the object of the present invention is to provide a parallel implementation method for loop filtering that reduces the IPR, improves the efficiency of parallel filtering, and shortens the time consumed by parallel filtering.
To achieve this object, the technical solution of the present invention is implemented as follows:
A parallel implementation method for loop filtering, the method comprising:
A. According to a preset non-filtering direction and the number N of cores in the multi-core processor, dividing the macroblocks to be filtered, in order, into M ordered parallel processing regions; N is an integer greater than 1; M is the integer part of the quotient of the number of macroblocks to be filtered and the core count N, or that integer part plus 1;
B. Selecting b macroblocks to be filtered from the a-th parallel processing region and taking these b macroblocks as the macroblock group for the current filtering pass; a is less than M; b is the maximum number of macroblocks to be filtered that can be filtered in the current pass;
C. When the number b of macroblocks in the macroblock group for the current pass is less than the core count N, selecting c macroblocks from the (a+1)-th parallel processing region and adding them to the macroblock group for the current pass, then filtering that group with the N cores; c is the maximum number of macroblocks that can be filtered in the current pass.
Preferably, before step A the method further comprises:
Comparing, for each of the two dimensions, the number of macroblocks that the image to be filtered contains with the core count; the dimension whose macroblock count is an integer multiple of the core count is set as the non-filtering direction, and the other dimension is set as the filtering direction.
In the method, the preset non-filtering direction of step A is the row direction or the column direction.
In the method, dividing the macroblocks to be filtered, in order, into M ordered parallel processing regions in step A comprises:
A1. Setting the number of macroblocks that each parallel processing region contains in the non-filtering direction to the core count N;
A2. According to that macroblock count in the non-filtering direction, dividing the macroblocks to be filtered successively along the non-filtering direction to obtain M ordered parallel processing regions.
In the method, selecting b macroblocks to be filtered from the a-th parallel processing region in step B comprises:
B1. According to the positions of the macroblocks already filtered, determining the macroblocks to be filtered that are adjacent to the filtered macroblocks; the filtered macroblocks are located in the a-th parallel processing region and/or the (a-1)-th parallel processing region;
B2. According to the filtering relationship between adjacent macroblocks in the loop filtering scheme, selecting, from the macroblocks to be filtered determined in step B1, the macroblocks that can be filtered in the current pass;
B3. From the macroblocks selected in step B2, selecting the b macroblocks to be filtered that lie in the a-th parallel processing region and are adjacent in the non-filtering direction and non-adjacent in the filtering direction.
Preferably, before step B1 the method further comprises:
B0. When the a-th parallel processing region is determined to be the first parallel processing region, filtering the first macroblock and the second macroblock of the first parallel processing region;
The first macroblock is the macroblock at the starting position of the image to be filtered; the second macroblock is the macroblock adjacent to the first macroblock in the filtering direction.
In the method, selecting c macroblocks from the (a+1)-th parallel processing region and adding them to the macroblock group for the current pass in step C comprises:
C1. According to the positions of the macroblocks already filtered, determining, in the (a+1)-th parallel processing region, the macroblocks to be filtered that are adjacent to the filtered macroblocks;
C2. According to the filtering relationship between adjacent macroblocks in the loop filtering scheme, selecting, from the macroblocks to be filtered determined in step C1, the macroblocks that can be filtered in the current pass;
C3. From the macroblocks selected in step C2, selecting the c macroblocks that lie in the (a+1)-th parallel processing region and are adjacent in the non-filtering direction and non-adjacent in the filtering direction, and adding these c macroblocks to the macroblock group for the current pass.
In the method, the c macroblocks include at least one macroblock that is adjacent, in the non-filtering direction, to the a-th parallel processing region.
Preferably, after step B the method further comprises:
When the number b of macroblocks in the macroblock group for the current pass equals the core count N, filtering that macroblock group with the N cores of the multi-core processor.
As can be seen from the above technical solution, the invention provides a parallel implementation method for loop filtering. In this method, the image to be filtered is divided into M parallel processing regions according to the preset non-filtering direction and the number N of cores in the multi-core processor; b macroblocks to be filtered are selected from the a-th parallel processing region and taken as the macroblock group for the current pass; when the number b of macroblocks in that group is less than the core count N, c macroblocks are selected from the (a+1)-th parallel processing region and added to the group, and the N cores filter the group. The method of the present invention reduces the IPR, improves the efficiency of parallel filtering, and shortens the time consumed by parallel filtering.
Description of drawings
Fig. 1(a) is a schematic diagram of existing parallel loop filtering of an image of 8 rows and 11 columns.
Fig. 1(b) is a schematic diagram of the cycles consumed by parallel loop filtering of the image shown in Fig. 1(a).
Fig. 2 is a flowchart of the method of the present invention for parallel loop filtering using multiple processors.
Fig. 3(a) is a schematic diagram of Embodiment 1, in which the method of the present invention is used to filter an image in parallel.
Fig. 3(b) is a schematic diagram of the cycles consumed by parallel loop filtering of the image shown in Fig. 3(a).
Fig. 4 is a schematic diagram of Embodiment 2, in which the method of the present invention is used to filter an image in parallel.
Embodiments
To make the object, technical solution, and advantages of the present invention clearer, the invention is described in further detail below with reference to the accompanying drawings and embodiments.
The invention provides a multi-core-based parallel implementation method for loop filtering. In this method, first, the image is divided into multiple parallel processing regions according to the core count of the multi-core processor, so that the number of macroblocks in the macroblock group of each parallel processing region is less than or equal to the core count, which shortens the time consumed in filtering each macroblock group. Second, when the number of macroblocks in the macroblock group of the a-th parallel processing region is less than the core count, macroblocks to be filtered that can be filtered in the current pass are selected from the (a+1)-th parallel processing region and added to the group, so that the group size is as close as possible to the core count, which reduces the IPR.
The core count referred to in the present invention is the number of cores in an existing multi-core processor; for example, the core count of a quad-core processor is 4 and that of a dual-core processor is 2.
For clarity, the position of a macroblock in the method of the present invention is defined here: it refers to the row and the column in which the macroblock is located.
Fig. 2 is a flowchart of the method of the present invention for parallel loop filtering using multiple processors. The method is described below with reference to Fig. 2:
Step 201: Determine the non-filtering direction according to the number of macroblocks that the image to be filtered contains in each of the two dimensions;
In this step, the number of macroblocks that the image to be filtered contains in each of the two dimensions is compared with the core count of the multi-core processor, and the dimension whose macroblock count is an integer multiple of the core count is taken as the non-filtering direction. For example, if the macroblock count in the row direction is an integer multiple of the core count, the row direction is taken as the non-filtering direction and the column direction as the filtering direction; if the macroblock count in the column direction is an integer multiple of the core count, the column direction is taken as the non-filtering direction and the row direction as the filtering direction.
Preferably, when the macroblock counts in both dimensions are integer multiples of the core count, either dimension may be chosen as the non-filtering direction; whichever is chosen, the cycles consumed in filtering the image and the IPR are the same in both cases.
If neither of the macroblock counts in the two dimensions is an integer multiple of the core count of the multi-core processor, either dimension may be taken as the non-filtering direction according to the user's needs.
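The selection rule of step 201 can be sketched as follows. This is an illustrative sketch under stated assumptions, not the patent's implementation: the function name `choose_directions` and the `prefer` parameter are hypothetical, and "the macroblock count in the row direction" is read as the number of rows (the axis indexed by i), which is how Embodiment 1 treats an image of 8 rows with 4 cores.

```python
def choose_directions(rows, cols, n_cores, prefer="row"):
    """Return (non_filtering, filtering), each 'row' or 'column'."""
    row_ok = rows % n_cores == 0   # row count is a multiple of N
    col_ok = cols % n_cores == 0   # column count is a multiple of N
    if row_ok and not col_ok:
        non_filtering = "row"
    elif col_ok and not row_ok:
        non_filtering = "column"
    else:
        # Both or neither are multiples of N: the text allows either
        # choice, so fall back to the caller's preference (assumption).
        non_filtering = prefer
    filtering = "column" if non_filtering == "row" else "row"
    return non_filtering, filtering

print(choose_directions(8, 11, 4))
```

With 8 rows, 11 columns, and 4 cores, only the row count is a multiple of 4, so the row direction becomes the non-filtering direction.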
The macroblocks of the present invention can be obtained by processing the image to be filtered with an existing pixel-based partitioning method; the macroblock size can be set as required, for example 16*16 pixels or 8*8 pixels.
Step 202: According to the preset non-filtering direction and the core count, divide the macroblocks to be filtered, in order, into M ordered parallel processing regions;
This step comprises: step 2021, along the preset non-filtering direction, setting the number of macroblocks that each parallel processing region contains in the non-filtering direction to N, the number of cores of the multi-core processor in the video codec system; step 2022, according to the macroblock count set in step 2021, dividing the macroblocks of the image to be filtered successively along the non-filtering direction to obtain M ordered parallel processing regions.
Here, the M ordered parallel processing regions are ordered by the sequence in which they are divided; N is an integer greater than 1. If the number of macroblocks to be filtered along the non-filtering direction is an integer multiple of the core count N, M is the integer part of the quotient of that number and N; if it is not an integer multiple of N, M is that integer part plus 1.
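The division of step 202 amounts to cutting the non-filtering axis into consecutive strips of N macroblocks. The following is a minimal sketch under the assumption that the count divided by N is the macroblock count along the non-filtering direction (for example, 8 rows with 4 cores gives 2 regions); the function name `partition_regions` is hypothetical.

```python
def partition_regions(count, n_cores):
    """Split indices 0..count-1 along the non-filtering direction into
    M ordered regions of at most n_cores indices each."""
    # M is count // N when N divides count, and count // N + 1 otherwise.
    m = count // n_cores + (1 if count % n_cores else 0)
    return [list(range(a * n_cores, min((a + 1) * n_cores, count)))
            for a in range(m)]

print(partition_regions(8, 4))
```

When the count is not a multiple of N, the last region simply holds the remaining indices.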
Step 203: Determine whether the a-th parallel processing region to be filtered is the first parallel processing region; if so, execute step 204; otherwise, execute step 205;
Because the first parallel processing region lies at the edge of the image to be filtered, no macroblock has been filtered yet when this region is filtered for the first time; the first filtering therefore uses a method different from that of the other parallel processing regions, i.e., step 204 is executed.
If the a-th parallel processing region to be filtered is not the first parallel processing region, some macroblocks in the (a-1)-th and/or the a-th parallel processing region have already been filtered before the current pass, so step 205 is executed.
Step 204: Filter the first macroblock and the second macroblock of the first parallel processing region;
The first macroblock of the first parallel processing region is the macroblock at the starting position of the image to be filtered, i.e., at the starting row and starting column; in Fig. 3 and Fig. 4 it is macroblock MB[0,0], at row 0 and column 0.
The second macroblock is the macroblock adjacent to the first macroblock in the filtering direction; in Fig. 3 it is macroblock MB[0,1], at row 0 and column 1, and in Fig. 4 it is macroblock MB[1,0], at row 1 and column 0.
Step 205: According to the positions of the macroblocks already filtered, select b macroblocks to be filtered from the a-th parallel processing region as the macroblock group for the current pass;
When the a-th parallel processing region to be filtered is the first parallel processing region, the filtered macroblocks are the first macroblock and the second macroblock of that region; when it is not the first parallel processing region, the filtered macroblocks are located in the (a-1)-th and/or the a-th parallel processing region.
This step comprises: step 2051, according to the positions of the macroblocks already filtered, determining the macroblocks to be filtered that are adjacent to the filtered macroblocks; step 2052, according to the filtering relationship between adjacent macroblocks in the loop filtering scheme, selecting, from the macroblocks determined in step 2051, those that can be filtered in the current pass; step 2053, from the macroblocks selected in step 2052, selecting the b macroblocks to be filtered that lie in the a-th parallel processing region and are adjacent in the non-filtering direction and non-adjacent in the filtering direction; step 2054, taking these b macroblocks as the macroblock group for the current pass. Here, b is a positive integer.
Because the macroblocks to be filtered determined in step 2051 may include macroblocks that cannot be filtered in the current pass, a further screening is needed, i.e., step 2052 is executed.
The filtering relationship between adjacent macroblocks in the loop filtering scheme mentioned in step 2052 is as follows: filtering a macroblock requires the filtered results of its left neighboring macroblock and of its upper neighboring macroblock as the input to the current filtering operation; when the macroblock to be filtered lies at the edge of the image to be filtered, the filtered result of its left neighboring macroblock or of its upper neighboring macroblock is used as the input.
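The screening of step 2052 can be expressed as a readiness test on one macroblock. This is a minimal sketch of the relationship exactly as stated in the paragraph above (left and upper neighbors, with edge macroblocks needing only the neighbor that exists); the function name `inputs_ready` and the representation of filtered macroblocks as a set of (row, column) pairs are assumptions for illustration.

```python
def inputs_ready(i, j, done):
    """True if MB[i, j] has the filtered inputs it needs.

    done: set of (row, col) pairs of macroblocks already filtered.
    """
    # Left neighbor is required unless MB[i, j] is in the first column.
    left_ok = j == 0 or (i, j - 1) in done
    # Upper neighbor is required unless MB[i, j] is in the first row.
    up_ok = i == 0 or (i - 1, j) in done
    return left_ok and up_ok

done = {(0, 0), (0, 1)}
print(inputs_ready(0, 2, done), inputs_ready(1, 1, done))
```

In the usage above, MB[0,2] has its left input available, while MB[1,1] still lacks the filtered result of MB[1,0].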
The value b mentioned in step 2053 is the maximum number of macroblocks to be filtered that can be filtered in the current pass. For example, if the current pass could filter either 2 macroblocks or 1 macroblock, the 2 macroblocks are taken as the macroblock group for the current pass.
Step 206: Determine whether the number of macroblocks in the macroblock group for the current pass is less than N; if so, execute step 207; otherwise, execute step 208;
That is, determine whether the number b of macroblocks in the macroblock group for the current pass is less than the core count N of the multi-core processor. If so, macroblocks to fill the group must be selected from the next parallel processing region to be filtered, i.e., step 207 is executed; otherwise, the group already contains enough macroblocks to occupy the N cores of the multi-core processor, and step 208 is executed.
Step 207: Select c macroblocks from the (a+1)-th parallel processing region and add them to the macroblock group for the current pass;
In the non-filtering direction, the (a+1)-th parallel processing region is adjacent to the a-th parallel processing region.
This step comprises: step 2071, according to the positions of the macroblocks already filtered, determining, in the (a+1)-th parallel processing region, the macroblocks to be filtered that are adjacent to the filtered macroblocks; step 2072, according to the filtering relationship between adjacent macroblocks in the loop filtering scheme, selecting, from the macroblocks determined in step 2071, those that can be filtered in the current pass; step 2073, from the macroblocks selected in step 2072, selecting the c macroblocks that lie in the (a+1)-th parallel processing region and are adjacent in the non-filtering direction and non-adjacent in the filtering direction; step 2074, adding these c macroblocks to the macroblock group for the current pass.
Here, c is an integer. When the (a+1)-th parallel processing region contains no macroblock to be filtered that can be filtered in the current pass, c is 0. When it contains at least one such macroblock, c is the maximum number of macroblocks that can be filtered in the current pass; for example, if the current pass could filter either 2 macroblocks or 1 macroblock, the 2 macroblocks are taken into the macroblock group for the current pass.
When the (a+1)-th parallel processing region contains at least one macroblock to be filtered that can be filtered in the current pass, the c macroblocks include at least one macroblock that is adjacent, in the non-filtering direction, to the a-th parallel processing region.
Step 208: Filter the macroblock group for the current pass with the N cores;
Using the filtered results of the macroblocks already filtered as the input to the current pass, the N cores of the multi-core processor filter the macroblocks in the macroblock group for the current pass.
Step 209: End.
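Steps 203 to 208, repeated until every macroblock is filtered, can be sketched as a small scheduler. This is a hedged illustration, not the patent's implementation, and it rests on several assumptions: the row direction is the non-filtering direction, so regions are strips of N rows; region a is taken to be the first region that still has unfiltered macroblocks; and a macroblock MB[i,j] is treated as ready when its left neighbor and the macroblock one column to the right in the row above are filtered, a lag of two columns inferred from the MB_wavefront pass listing rather than stated in the flow. The function name `schedule` is hypothetical. The special case of step 204 is not coded separately, because under this readiness rule the first two passes contain only MB[0,0] and MB[0,1].

```python
def schedule(rows, cols, n_cores):
    """Return a list of passes; each pass is a list of (row, col) macroblocks."""
    nxt = [0] * rows  # next unfiltered column in each row
    passes = []
    while any(c < cols for c in nxt):
        # Rows whose next macroblock is ready at the start of this pass
        # (assumed two-column lag behind the row above).
        ready = [i for i in range(rows)
                 if nxt[i] < cols
                 and (i == 0 or nxt[i - 1] >= min(nxt[i] + 2, cols))]
        # Region a: first region that still has unfiltered macroblocks.
        a = min(i for i in range(rows) if nxt[i] < cols) // n_cores
        # Step 205: take the b ready macroblocks of region a.
        group = [i for i in ready if i // n_cores == a]
        # Steps 206 and 207: if b < N, top up with c macroblocks from region a+1.
        if len(group) < n_cores:
            extra = [i for i in ready if i // n_cores == a + 1]
            group += extra[:n_cores - len(group)]
        # Step 208: the N cores filter the group in one pass.
        passes.append([(i, nxt[i]) for i in group])
        for i in group:
            nxt[i] += 1
    return passes

passes = schedule(8, 11, 4)
print(len(passes), passes[11])
```

Because no group exceeds N macroblocks, every pass fits in a single cycle T; for 8 rows, 11 columns, and 4 cores the sketch needs 28 passes under these assumptions.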
Fig. 3 (a) carries out the sketch map of the embodiment one of parallel filtering for adopting method of the present invention to image.Fig. 3 (b) is for carrying out the sketch map of loap-paralled track filtering institute cycles consumed to image shown in Fig. 3 (a).Combine Fig. 3 (a) and Fig. 3 (b) at present, the embodiment one of inventive method is described, specific as follows:
Comprise in system under the situation of 4 processors; When the image that adopts method of the present invention that 8 row 11 are listed as carried out loap-paralled track filtering, the i direction was a line direction, and the j direction is a column direction; Each square is represented the macro block of treating that filtering image comprises, the number of times of the numeral filtering that marks on each macro block.
In the present embodiment, with column direction as the filtering direction, with line direction as non-filtering direction; The core number that comprises according to polycaryon processor; I.e. 4 cores comprising of four core processors; With the image division of treating filtering is 2 parallel processing districts; Be first parallel processing district and second parallel processing district representing with the shadow region, each parallel processing district comprises 4 row, 11 row, totally 44 macro blocks.
First parallel processing district is when treating a parallel processing district of filtering, and the 1st time to MB [0,0] filtering; The 2nd time to MB [0,1] filtering.
When carrying out the 3rd filtering, can be to MB [0,2] and MB [1,0] parallel filtering, or to MB [2,0] filtering, according to the method for the invention, select MB [0,2] and MB [1,0] parallel filtering.In like manner, the 4th is to MB [0,3] and MB [1,1] parallel filtering; The 5th is to MB [0,4], MB [1,2] and MB [2,0] parallel filtering; The 6th time to MB [0,5], MB [1,3] and MB [2,1] parallel filtering.When carrying out above-mentioned the 3-6 time filtering, because therefore the macro block of treating filtering that does not have to carry out this filtering in second parallel processing district, can't be selected to be in the macro block group that second macro block in the parallel processing district adds this filtering, the c value is 0.
The 7th pass filters MB[0,6], MB[1,4], MB[2,2] and MB[3,0] in parallel; the 8th pass filters MB[0,7], MB[1,5], MB[2,3] and MB[3,1] in parallel; the 9th pass filters MB[0,8], MB[1,6], MB[2,4] and MB[3,2] in parallel; the 10th pass filters MB[0,9], MB[1,7], MB[2,5] and MB[3,3] in parallel; the 11th pass filters MB[0,10], MB[1,8], MB[2,6] and MB[3,4] in parallel. During the 7th to 11th passes, each macro block group can take 4 eligible macro blocks from the first parallel processing region, all 4 cores of the four-core processor take part in the filtering, and IPR reaches its minimum; c is 0.
In the 12th pass, the macro block group of the current pass comprises 3 macro blocks from the first parallel processing region, namely MB[1,9], MB[2,7] and MB[3,5]. The second parallel processing region contains a macro block that is eligible for the current pass, namely MB[4,0], so MB[4,0] is added to the macro block group of the current pass; that is, the 12th pass filters MB[4,0], MB[1,9], MB[2,7] and MB[3,5] in parallel. Likewise, the 13th pass filters MB[4,1], MB[1,10], MB[2,8] and MB[3,6] in parallel; the 14th pass filters MB[4,2], MB[5,0], MB[2,9] and MB[3,7] in parallel; the 15th pass filters MB[4,3], MB[5,1], MB[2,10] and MB[3,8] in parallel; the 16th pass filters MB[4,4], MB[5,2], MB[6,0] and MB[3,9] in parallel; the 17th pass filters MB[4,5], MB[5,3], MB[6,1] and MB[3,10] in parallel. During the 12th to 17th passes, the number of macro blocks that each macro block group obtains from the first parallel processing region is smaller than the number of cores. By selecting eligible macro blocks from the second parallel processing region, the number of macro blocks in the group is made equal to the number of cores, which reduces IPR and improves the efficiency of parallel filtering on the four-core processor; c is non-zero.
The 18th pass filters MB[4,6], MB[5,4], MB[6,2] and MB[7,0] in parallel; the 19th pass filters MB[4,7], MB[5,5], MB[6,3] and MB[7,1] in parallel; the 20th pass filters MB[4,8], MB[5,6], MB[6,4] and MB[7,2] in parallel; the 21st pass filters MB[4,9], MB[5,7], MB[6,5] and MB[7,3] in parallel; the 22nd pass filters MB[4,10], MB[5,8], MB[6,6] and MB[7,4] in parallel. During the 18th to 22nd passes, each macro block group can take 4 eligible macro blocks from the second parallel processing region, all 4 cores of the four-core processor take part in the filtering, and IPR reaches its minimum; c is 0.
The 23rd pass filters MB[5,9], MB[6,7] and MB[7,5] in parallel; the 24th pass filters MB[5,10], MB[6,8] and MB[7,6] in parallel; the 25th pass filters MB[6,9] and MB[7,7] in parallel; the 26th pass filters MB[6,10] and MB[7,8] in parallel; the 27th pass filters MB[7,9]; the 28th pass filters MB[7,10]. During the 23rd to 28th passes, because the image to be filtered has no third parallel processing region, the macro block group taken from the second parallel processing region contains fewer macro blocks than the number of cores, and c is 0.
As can be seen from the above, when the method of the present invention is used for parallel loop filtering of an image of 8 rows and 11 columns, all 4 cores of the four-core processor are working during the 7th to 22nd passes, so IPR reaches its minimum in 16 passes of the loop filtering. With each pass consuming a cycle T, the parallel loop filtering of the image consumes 28T in total, which further shortens the time consumed compared with existing parallel loop filtering methods.
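The pass-by-pass schedule of embodiment one can be reproduced with a small simulation. The sketch below is illustrative rather than the patented implementation: the function name `schedule` is hypothetical, and the eligibility rule (a macro block may be filtered once its left neighbour and its upper-right neighbour have been filtered in an earlier pass) is an assumption inferred from the pass numbers listed above, not a rule quoted from the text.

```python
def schedule(rows: int, cols: int, n_cores: int) -> list[list[tuple[int, int]]]:
    """Return, for each pass, the macro blocks (i, j) filtered in that pass."""
    done: set[tuple[int, int]] = set()
    passes: list[list[tuple[int, int]]] = []

    def eligible(i: int, j: int) -> bool:
        # Assumed dependency: left neighbour and upper-right neighbour
        # (clamped to the last column) must already be filtered.
        if (i, j) in done:
            return False
        if j > 0 and (i, j - 1) not in done:
            return False
        if i > 0 and (i - 1, min(j + 1, cols - 1)) not in done:
            return False
        return True

    while len(done) < rows * cols:
        # Region a is the first region that still holds unfiltered macro blocks.
        first_row = min(
            i for i in range(rows) for j in range(cols) if (i, j) not in done
        )
        a = first_row // n_cores
        group: list[tuple[int, int]] = []
        # Take b macro blocks from region a, then top up with c from region a+1.
        for region in (a, a + 1):
            lo, hi = region * n_cores, min((region + 1) * n_cores, rows)
            for i in range(lo, hi):
                for j in range(cols):
                    if len(group) < n_cores and eligible(i, j):
                        group.append((i, j))
        done.update(group)  # mark only after the whole group is chosen
        passes.append(group)
    return passes


passes = schedule(rows=8, cols=11, n_cores=4)
full_passes = [k + 1 for k, g in enumerate(passes) if len(g) == 4]
```

Under these assumptions the simulation needs 28 passes, `passes[11]` holds MB[1,9], MB[2,7], MB[3,5] and MB[4,0], and `full_passes` covers passes 7 to 22, which matches the 16 fully occupied passes and the 28T total stated above.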
Fig. 4 is a schematic diagram of embodiment two, in which the method of the present invention is used to filter an image in parallel. In this embodiment, the row direction is the filtering direction and the column direction is the non-filtering direction. According to the number of cores of the multi-core processor, i.e. the 4 cores of the four-core processor, the image to be filtered, which has 11 rows and 8 columns, is divided into 2 parallel processing regions, namely the first parallel processing region and the second parallel processing region, shown as shaded areas. The number marked on each macro block indicates the filtering pass in which that macro block is filtered.
In this embodiment, the x-th pass filters the macro blocks labelled x in the figure, where x is an integer greater than or equal to 1 and less than or equal to 28. The cycles consumed in this embodiment are the same as those consumed in embodiment one shown in Fig. 3; only the filtering direction differs, so the individual filtering passes are not described again here.
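The pass labels of both embodiments can also be written in closed form. The formula below is an observation fitted to the passes listed for embodiment one (8 rows, 11 columns, 4 cores) and is not stated in the patent; the function names are hypothetical. For embodiment two, whose figure is not reproduced here, the sketch assumes that the labels are simply those of embodiment one with row and column indices swapped.

```python
from collections import Counter

N_CORES = 4
ROWS, COLS = 8, 11  # embodiment one


def label_embodiment_one(i: int, j: int) -> int:
    """Pass number of MB[i, j] in embodiment one (fitted to the listed passes)."""
    region, row_in_region = divmod(i, N_CORES)
    # Each region starts COLS passes after the previous one; each row inside
    # a region starts two passes after the row above it.
    return region * COLS + 2 * row_in_region + j + 1


def label_embodiment_two(r: int, c: int) -> int:
    """Assumed label in embodiment two: the transposed layout (11 rows, 8 columns)."""
    return label_embodiment_one(c, r)


# Number of macro blocks filtered in each pass of embodiment one.
group_sizes = Counter(
    label_embodiment_one(i, j) for i in range(ROWS) for j in range(COLS)
)
```

With this formula the highest label is 28 in both layouts and no pass holds more than 4 macro blocks, consistent with the statement that both embodiments consume the same number of cycles.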
The method provided by the invention can be applied to the H.264 standard, and also to other video coding and decoding standards that perform loop filtering with the macro block as the basic unit.
In the above preferred embodiments of the present invention, the macro blocks of the image to be filtered are first divided into a plurality of parallel processing regions according to the number of cores of the multi-core processor, so that the number of macro blocks in each macro block group is less than or equal to the number of cores. In this way, when the multi-core processor filters each macro block group, a group never contains more macro blocks than there are cores, which shortens the cycle T consumed in filtering each macro block group and reduces the value of IPR. When a macro block group contains fewer macro blocks than the number of cores, the present invention selects macro blocks that are eligible for the current pass from the next parallel processing region and adds them to the macro block group of the current pass, so that the number of macro blocks in the group equals the number of cores, IPR reaches its minimum, and the efficiency of parallel filtering is improved.
The above are merely preferred embodiments of the present invention and are not intended to limit the present invention. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention shall fall within the scope of protection of the present invention.

Claims (9)

1. A method for the parallel implementation of loop filtering, characterized in that the method comprises:
A. according to a preset non-filtering direction and the number N of cores of a multi-core processor, dividing a plurality of macro blocks to be filtered successively into M ordered parallel processing regions; N is an integer greater than 1; M is the integer value of the quotient of the number of macro blocks to be filtered and the number N of cores, or the sum of 1 and that integer value;
B. selecting b macro blocks to be filtered from the a-th parallel processing region, and taking the b macro blocks to be filtered as the macro block group of the current filtering pass; a is less than M; b is the maximum number of macro blocks to be filtered that can undergo the current filtering pass;
C. when the number b of macro blocks in the macro block group of the current filtering pass is less than the number N of cores, selecting c macro blocks from the (a+1)-th parallel processing region and adding them to the macro block group of the current filtering pass, and filtering the macro block group of the current filtering pass with the N cores; c is the maximum number of macro blocks that can undergo the current filtering pass.
2. The method according to claim 1, characterized in that, before step A, the method further comprises:
comparing the number of macro blocks that the image to be filtered contains in each of its two dimensions with the number of cores, setting the dimension whose number of macro blocks is an integer multiple of the number of cores as the non-filtering direction, and setting the other dimension as the filtering direction.
3. The method according to claim 1 or 2, characterized in that the preset non-filtering direction of step A is the row direction or the column direction.
4. The method according to claim 3, characterized in that dividing the plurality of macro blocks to be filtered successively into M ordered parallel processing regions in step A comprises:
A1. setting the number of macro blocks that each parallel processing region contains in the non-filtering direction to the number N of cores;
A2. dividing the plurality of macro blocks to be filtered successively along the non-filtering direction according to the number of macro blocks contained in the non-filtering direction, to obtain M ordered parallel processing regions.
5. The method according to claim 3, characterized in that selecting b macro blocks to be filtered from the a-th parallel processing region in step B comprises:
B1. according to the positions of the macro blocks that have already been filtered, determining a plurality of macro blocks to be filtered that are adjacent to the filtered macro blocks; the filtered macro blocks are located in the a-th parallel processing region and/or the (a-1)-th parallel processing region;
B2. according to the filtering relationship between adjacent macro blocks in the loop filtering mode, selecting, from the plurality of macro blocks to be filtered of step B1, the macro blocks that can undergo the current filtering pass;
B3. from the macro blocks of step B2 that can undergo the current filtering pass, selecting b macro blocks to be filtered that are located in the a-th parallel processing region, adjacent in the non-filtering direction and non-adjacent in the filtering direction.
6. The method according to claim 5, characterized in that, before step B1, the method further comprises:
B0. when it is determined that the a-th parallel processing region is the first parallel processing region, filtering the first macro block and the second macro block in the first parallel processing region;
the first macro block is the macro block at the starting position of the image to be filtered; the second macro block is the macro block adjacent to the first macro block in the filtering direction.
7. The method according to claim 3, characterized in that selecting c macro blocks from the (a+1)-th parallel processing region and adding them to the macro block group of the current filtering pass in step C comprises:
C1. according to the positions of the macro blocks that have already been filtered, determining, in the (a+1)-th parallel processing region, a plurality of macro blocks to be filtered that are adjacent to the filtered macro blocks;
C2. according to the filtering relationship between adjacent macro blocks in the loop filtering mode, selecting, from the plurality of macro blocks to be filtered of step C1, the macro blocks that can undergo the current filtering pass;
C3. from the macro blocks of step C2 that can undergo the current filtering pass, selecting c macro blocks that are located in the (a+1)-th parallel processing region, adjacent in the non-filtering direction and non-adjacent in the filtering direction, and adding the c macro blocks to the macro block group of the current filtering pass.
8. The method according to claim 7, characterized in that the c macro blocks comprise at least one macro block that is adjacent, in the non-filtering direction, to the a-th parallel processing region.
9. method according to claim 1 and 2 is characterized in that, further comprises after the said step B:
When the macro block number b that comprises in the macro block group of said this filtering equaled core number N, N core utilizing polycaryon processor to comprise carried out filtering to the macro block group of said this filtering.
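The direction-selection rule of claim 2 can be illustrated with a short sketch. The function name, the string return values, the preference for the row direction when both dimensions qualify, and the `None` result when neither dimension is a multiple of the core count are all assumptions made for illustration; the claims do not specify these cases.

```python
from typing import Optional


def choose_non_filtering_direction(
    n_rows: int, n_cols: int, n_cores: int
) -> Optional[str]:
    """Pick the dimension whose macro block count is a multiple of n_cores.

    Hypothetical helper: n_rows and n_cols are the macro block counts of the
    image in its two dimensions. Returns "row" or "column" for the
    non-filtering direction, or None when neither count is a multiple.
    """
    if n_rows % n_cores == 0:
        return "row"  # assumed tie-break: prefer rows when both qualify
    if n_cols % n_cores == 0:
        return "column"
    return None  # case not covered by the claim


embodiment_one = choose_non_filtering_direction(8, 11, 4)
embodiment_two = choose_non_filtering_direction(11, 8, 4)
```

For the two embodiments this gives the row direction for the image of 8 rows and 11 columns and the column direction for the image of 11 rows and 8 columns, in line with the directions used in the description.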
CN 201110042746 2011-02-18 2011-02-18 Realizing method of loop filtering parallel Active CN102098515B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 201110042746 CN102098515B (en) 2011-02-18 2011-02-18 Realizing method of loop filtering parallel

Publications (2)

Publication Number Publication Date
CN102098515A CN102098515A (en) 2011-06-15
CN102098515B true CN102098515B (en) 2012-12-12

Family

ID=44131352

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 201110042746 Active CN102098515B (en) 2011-02-18 2011-02-18 Realizing method of loop filtering parallel

Country Status (1)

Country Link
CN (1) CN102098515B (en)

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1233171C (en) * 2004-01-16 2005-12-21 北京工业大学 A simplified loop filtering method for video coding
CN1306826C (en) * 2004-07-30 2007-03-21 联合信源数字音视频技术(北京)有限公司 Loop filter based on multistage parallel pipeline mode
US8111760B2 (en) * 2006-11-16 2012-02-07 Texas Instruments Incorporated Deblocking filters
CN101459839A (en) * 2007-12-10 2009-06-17 三星电子株式会社 Deblocking effect filtering method and apparatus for implementing the method
CN101472173B (en) * 2007-12-29 2012-07-25 安凯(广州)微电子技术有限公司 Method, system and filter for filtering de-block

Also Published As

Publication number Publication date
CN102098515A (en) 2011-06-15

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
ASS Succession or assignment of patent right

Owner name: HANGZHOU HIKVISION DIGITAL TECHNOLOGY CO., LTD.

Free format text: FORMER OWNER: HANGZHOU HAIKANG WEISHI SOFTWARE CO., LTD.

Effective date: 20120904

C41 Transfer of patent application or patent right or utility model
COR Change of bibliographic data

Free format text: CORRECT: ADDRESS; FROM: 310012 HANGZHOU, ZHEJIANG PROVINCE TO: 310051 HANGZHOU, ZHEJIANG PROVINCE

TA01 Transfer of patent application right

Effective date of registration: 20120904

Address after: Hangzhou City, Zhejiang province 310051 Binjiang District East Road Haikang Science Park No. 700, No. 1

Applicant after: Hangzhou Hikvision Digital Technology Co., Ltd.

Address before: Ma Cheng Road Hangzhou City, Zhejiang province 310012 No. 36

Applicant before: Hangzhou Haikang Weishi Software Co., Ltd.

C14 Grant of patent or utility model
GR01 Patent grant
EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20110615

Assignee: Hangzhou Hikvision Technology Co.,Ltd.

Assignor: Hangzhou Hikvision Digital Technology Co.,Ltd.

Contract record no.: X2021330000212

Denomination of invention: A parallel implementation method of loop filtering

Granted publication date: 20121212

License type: Common License

Record date: 20210901
