CN102857758B - Reusable pixel processing method and reusable video processing chip - Google Patents

Info

Publication number
CN102857758B
CN102857758B CN201210321972.1A CN201210321972A CN102857758B CN 102857758 B CN102857758 B CN 102857758B CN 201210321972 A CN201210321972 A CN 201210321972A CN 102857758 B CN102857758 B CN 102857758B
Authority
CN
China
Prior art keywords
macroblock
execution cycle
instruction
encoded
rank
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201210321972.1A
Other languages
Chinese (zh)
Other versions
CN102857758A (en)
Inventor
朱磊
兰军强
沙力
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Galaxycore Shanghai Ltd Corp
Original Assignee
SHANGHAI SUANXIN MICROELECTRONICS CO Ltd
Application filed by SHANGHAI SUANXIN MICROELECTRONICS CO Ltd filed Critical SHANGHAI SUANXIN MICROELECTRONICS CO Ltd
Priority to CN201210321972.1A priority Critical patent/CN102857758B/en
Publication of CN102857758A publication Critical patent/CN102857758A/en
Publication of CN102857758B publication Critical patent/CN102857758B/en

Landscapes

  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The invention discloses a reusable pixel processing method and a reusable video processing chip. In the method, the same pixel processing device performs at least two types of pixel processing. The method includes: determining the process type to be applied to the current frame, corresponding instructions being configured in the pixel processing device for each process type; and, using the instructions configured for the determined process type, performing the processing of each execution cycle in turn on the macroblocks to be encoded of the current frame. The video processing chip comprises a determining unit, an acquiring unit, a register array, a clock control unit, a controller, and an arithmetic unit array. By means of a reusable design, the registers and arithmetic units are configured in different ways so that a single hardware unit can be time-multiplexed across several operations, which lowers the complexity of the chip's computation and storage logic and reduces chip area and power consumption.

Description

Reusable pixel processing method and video processing chip
Technical field
The present invention relates to the field of multimedia processing, and in particular to a reusable pixel processing method and a corresponding video processing chip.
Background art
In video coding and image processing, motion estimation, video scaling, and spatial/temporal filtering are all important components.
Motion estimation is a technique widely used in video coding and video processing. Its basic idea is as follows. In inter-frame predictive coding, the scenes in adjacent frames of a moving image are correlated, so each frame of the image sequence is divided into many non-overlapping macroblocks, and all pixels within a macroblock are assumed to have the same displacement. For each macroblock, the block most similar to the current block (the matching block) is then found within a given search range of the reference frame according to some matching criterion; the relative displacement between the matching block and the current block is the motion vector. For video compression, only the motion vector and the residual data need to be stored in order to fully recover the current block. However, because the motion of natural objects is continuous, the motion vector of a block between two adjacent frames is not necessarily a whole number of pixels; the true displacement may be in sub-pixel units such as 1/2 pixel or even 1/4 pixel. Motion estimation therefore needs the residual at sub-pixel positions, which is usually obtained by interpolation.
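The block-matching search described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the text only speaks of "some matching criterion", so the sum of absolute differences (SAD) used here is an assumption, as are the function names `sad` and `best_match` and the integer-pixel, full-search strategy.

```python
def sad(cur_block, ref, ry, rx, size):
    """Sum of absolute differences between cur_block and the ref block at (ry, rx)."""
    return sum(
        abs(cur_block[y][x] - ref[ry + y][rx + x])
        for y in range(size)
        for x in range(size)
    )


def best_match(cur_block, ref, top, left, size, radius):
    """Full search around (top, left); returns (cost, (dy, dx)) of the best match.

    SAD is an assumed matching criterion; (dy, dx) is the motion vector.
    """
    best = None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            ry, rx = top + dy, left + dx
            # Skip candidates that fall outside the reference frame.
            if ry < 0 or rx < 0 or ry + size > len(ref) or rx + size > len(ref[0]):
                continue
            cost = sad(cur_block, ref, ry, rx, size)
            if best is None or cost < best[0]:
                best = (cost, (dy, dx))
    return best


# A 6x6 reference frame with distinct values and a 2x2 block copied from (2, 3).
reference = [[y * 10 + x for x in range(6)] for y in range(6)]
block = [row[3:5] for row in reference[2:4]]
print(best_match(block, reference, top=1, left=1, size=2, radius=2))
```

Sub-pixel motion vectors would repeat the same cost computation against interpolated reference samples, which is where the interpolation step comes in.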
Video scaling is the process of adjusting the size of a digital image. Its basic algorithm is to filter the image pixels: each pixel of the scaled image is generated from several surrounding original pixels by two filtering passes, one vertical and one horizontal.
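A two-pass scaler of this kind might look like the sketch below. It assumes the simplest possible filter, linear interpolation between the two nearest original pixels; the patent does not fix the filter, and the names `resize_1d` and `resize` are hypothetical.

```python
def resize_1d(line, new_len):
    """Resample one row or column to new_len samples with a 2-tap linear filter."""
    if new_len == 1 or len(line) == 1:
        return [line[0]] * new_len
    scale = (len(line) - 1) / (new_len - 1)
    out = []
    for i in range(new_len):
        pos = i * scale                      # position in source coordinates
        lo = min(int(pos), len(line) - 2)    # left neighbour index
        frac = pos - lo                      # weight of the right neighbour
        out.append(round(line[lo] * (1 - frac) + line[lo + 1] * frac))
    return out


def resize(img, new_h, new_w):
    """Vertical pass over the columns, then horizontal pass over the rows."""
    cols = [resize_1d(list(c), new_h) for c in zip(*img)]   # vertical pass
    rows = [list(r) for r in zip(*cols)]                    # back to row-major
    return [resize_1d(r, new_w) for r in rows]              # horizontal pass


print(resize([[0, 10], [20, 30]], 3, 3))
```

Because the two passes are independent one-dimensional filters, the same filter hardware can serve both directions, which is the property the reuse scheme relies on.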
Spatial filtering and temporal filtering can be used to improve the visual quality of an image.
The basic method of spatial filtering is to filter each pixel of the image twice, vertically and then horizontally. Each pass first obtains the differences between the pixel and several pixels adjacent to it vertically and horizontally, then processes these differences by computation and table lookup and adds the result to the original pixel to obtain the new pixel. The horizontal pass takes the result of the vertical pass as its input.
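As a rough sketch of this difference, lookup, and add structure: the code below assumes that each pass uses only the immediate neighbour on each side along its own direction, and it uses a made-up lookup function `small_difference_lut`. The patent specifies neither the neighbourhood nor the table contents, so both are placeholders, as are the names `filter_1d` and `spatial_filter`.

```python
def small_difference_lut(d):
    """Hypothetical table: smooth small differences, leave strong edges alone."""
    return int(d / 4) if abs(d) < 16 else 0


def filter_1d(line, lut):
    """Add a looked-up correction from each neighbour difference to every pixel."""
    out = []
    for i, p in enumerate(line):
        correction = 0
        for j in (i - 1, i + 1):            # assumed neighbourhood: one pixel each side
            if 0 <= j < len(line):
                correction += lut(line[j] - p)
        out.append(p + correction)
    return out


def spatial_filter(img, lut):
    """Vertical pass first; the horizontal pass takes the vertical result as input."""
    cols = [filter_1d(list(c), lut) for c in zip(*img)]
    rows = [list(r) for r in zip(*cols)]
    return [filter_1d(r, lut) for r in rows]


print(filter_1d([10, 14, 10], small_difference_lut))
```

A flat region passes through unchanged because every difference is zero, while a small bump is pulled toward its neighbours.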
The basic method of temporal filtering is to obtain the similarity between the current macroblock and the predicted macroblock, process the similarity by computation and table lookup, and add the result to the original pixel to obtain the new pixel.
In existing video chips, motion estimation, video scaling, and spatial/temporal filtering are performed by different hardware units. Each hardware unit has to process multiple pixels in parallel at once, which requires a large number of arithmetic units and increases chip area and power consumption.
Those skilled in the art therefore urgently need a solution that can handle all of the above operations while reducing chip complexity and saving chip area and power consumption.
Chinese patent application publication No. CN102547294A discloses a CABAC hardware decoder architecture suitable for the H.264 and HEVC video standards, in which storage units and combinational logic units are partially reused. However, that reuse is based on the CABAC algorithm common to the H.264 and HEVC video coding standards; it does not solve the technical problem of the complex and bulky hardware units needed for motion estimation, video scaling, and spatial/temporal filtering.
Summary of the invention
The technical problem to be solved by the present invention is to time-multiplex a common architecture across motion estimation, video scaling, and spatial/temporal filtering, while reducing the complexity of the chip's computation and storage logic and saving chip area and power consumption.
To solve the above problem, the present invention provides a reusable pixel processing method in which the same image processor performs pixel processing of at least two process types. The method comprises the following steps:
determining the process type to be applied to the current frame, corresponding instructions being configured in the image processor for each process type;
obtaining the current frame by the image processor;
using the instructions configured for the determined process type of the current frame, performing the processing of each execution cycle in turn on each macroblock to be encoded of the current frame.
Optionally, the process types include one or more of motion estimation, video scaling, spatial filtering, and temporal filtering.
Optionally, the execution cycles are divided into a first execution cycle and a second execution cycle, and the corresponding instructions configured for the process types comprise:
when the process type is motion estimation, the instruction of the first execution cycle is to interpolate the reference macroblock in the reference frame at order n or lower, n being a positive integer, and the instruction of the second execution cycle is to compute the difference between the macroblock to be encoded and the interpolated reference macroblock;
when the process type is video scaling, the instruction of the first execution cycle is to interpolate the macroblock to be encoded at order n or lower, n being a positive integer, and the instruction of the second execution cycle is idle;
when the process type is spatial filtering, the instruction of the first execution cycle is to interpolate the macroblock to be encoded at order n or lower, n being a positive integer, and the instruction of the second execution cycle is idle;
when the process type is temporal filtering, the instruction of the first execution cycle is idle, and the instruction of the second execution cycle is to compute the difference between the macroblock to be encoded and the reference macroblock in the reference frame.
Optionally, the corresponding instructions configured for the process types further comprise the coefficients of the interpolation formula of order n or lower used in the first-cycle instruction;
the interpolation formula is $P = \mathrm{round}\left[\left(\alpha_1 A_1 + \alpha_2 A_2 + \cdots + \alpha_{n-1} A_{n-1} + \alpha_n A_n\right) / 2^n\right]$, where $P$ is the pixel obtained by interpolation, $\mathrm{round}()$ returns the value rounded to the nearest integer, $A_1, \ldots, A_n$ are n known pixels adjacent in the same row or the same column, and $\alpha_1, \ldots, \alpha_n$ are the coefficients of the interpolation formula.
Optionally, n is less than or equal to 6.
Optionally, when the instruction of the first execution cycle is sixth-order interpolation of the reference macroblock, or sixth-order interpolation of the macroblock to be encoded in the current frame, the coefficients of the interpolation formula are [1, -5, 20, 20, -5, 1].
Optionally, when the instruction of the first execution cycle is idle, the coefficient of the interpolation formula is [1].
Optionally, the execution cycles are divided into a first execution cycle and a second execution cycle, and performing the processing of each execution cycle in turn on each macroblock to be encoded of the current frame comprises processing the macroblocks to be encoded of the current frame one after another in a pipelined manner.
Optionally, processing the macroblocks to be encoded in a pipelined manner comprises: while the second-execution-cycle instruction corresponding to the process type is applied to the current macroblock to be encoded, the first-execution-cycle instruction corresponding to the process type is applied to the next macroblock to be encoded.
Optionally, the reference macroblock and the macroblock to be encoded are based on the H.264 standard.
Optionally, the reference macroblock and the macroblock to be encoded are at least 4 × 4 pixels in size.
Optionally, the execution cycles are divided into a first execution cycle and a second execution cycle, and performing the processing of each execution cycle on each macroblock to be encoded of the current frame comprises:
in the first execution cycle, interpolating the reference macroblock in the reference frame at order n or lower, n being a positive integer, and in the second execution cycle, computing the difference between the macroblock to be encoded and the interpolated reference macroblock;
or, in the first execution cycle, interpolating the macroblock to be encoded at order n or lower, n being a positive integer, and idling in the second execution cycle;
or idling in the first execution cycle and, in the second execution cycle, computing the difference between the macroblock to be encoded and the reference macroblock in the reference frame.
The present invention also provides a reusable video processing chip, comprising:
a determining unit configured to determine the process type to be applied to the current frame;
an acquiring unit configured to obtain the current frame;
a register array configured to temporarily store the reference macroblock in the reference frame and the macroblock to be encoded in the current frame that are needed for the current processing;
a clock control unit configured to provide the execution cycles;
an arithmetic unit array configured to perform the corresponding operations on the reference macroblock or the macroblock to be encoded held in the register array;
a controller connected to the determining unit, the register array, the arithmetic unit array, and the clock control unit, and configured to read the reference macroblock and the macroblock to be encoded needed for the current processing and, according to the instructions corresponding to the process type of the current frame, to control the arithmetic unit array to perform the corresponding operations.
Optionally, the chip further comprises an instruction configuration unit connected to the controller and adapted to pre-configure the corresponding instructions for each process type.
Optionally, the process types include one or more of motion estimation, video scaling, spatial filtering, and temporal filtering.
Optionally, the instruction configuration unit further comprises a coefficient configuration unit adapted to configure the coefficients of the interpolation formula of order n or lower used in the first-execution-cycle instruction.
Optionally, the arithmetic unit array comprises a first-execution-cycle array and a second-execution-cycle array.
Optionally, the first-execution-cycle array comprises a filter array, and the second-execution-cycle array comprises a subtracter array.
Optionally, the controller further comprises a timing control unit for processing the macroblocks to be encoded of the current frame one after another in a pipelined manner according to the execution cycles, so that while the arithmetic unit array applies the corresponding second-execution-cycle instruction to the current macroblock to be encoded, it applies the corresponding first-execution-cycle instruction to the next macroblock to be encoded.
Optionally, the reference macroblock and the macroblock to be encoded in the register array are based on the H.264 standard.
Optionally, when the reference macroblock or the macroblock to be encoded is a × a pixels in size, a being a positive integer, the register array is at least (a+6) × (a+6) pixels in size.
Optionally, the reference macroblock or the macroblock to be encoded is at least 4 × 4 pixels in size, and the register array is at least 10 × 10 pixels in size.
Compared with the prior art, the technical solution of the present invention has the following advantages:
1. By analysing and summarising the similarity and compatibility of the motion estimation, video scaling, and spatial/temporal filtering algorithms, the present invention adopts a reusable design technique: registers and arithmetic units are configured in different ways so that operations that would otherwise be performed by different hardware units can be performed by a single functional unit through time-division multiplexing. This effectively reduces the number of hardware units, lowers chip complexity, and saves chip area and power consumption.
2. In an optional scheme, a solution based on the H.264 standard is provided, to meet the requirements of an international video coding standard that is currently widely used.
3. In an optional scheme, by configuring the coefficients of the interpolation formula, an idle first execution cycle is made equivalent to a first-order interpolation of the current pixel itself, which unifies the operations and simplifies the hardware logic.
4. In an optional scheme, the macroblocks to be encoded of the current frame are processed in a pipelined manner, which improves processing efficiency.
Brief description of the drawings
Fig. 1 is a flow chart of an embodiment of the reusable pixel processing method of the present invention;
Fig. 2 is a working timing diagram of an embodiment of the reusable pixel processing method of the present invention;
Fig. 3 is a structural diagram of an embodiment of the reusable video processing chip of the present invention.
Detailed description of the embodiments
Many specific details are set forth in the following description to provide a thorough understanding of the present invention. The invention can, however, be implemented in many ways other than those described here, and those skilled in the art can make similar generalisations without departing from its spirit; the invention is therefore not limited by the specific embodiments disclosed below.
Furthermore, the present invention is described in detail with reference to schematic diagrams. In describing the embodiments, the diagrams are examples provided for ease of illustration and should not limit the scope of protection of the invention.
To solve the technical problem described in the background, the inventors analysed the video coding/decoding and image processing techniques discussed there.
The core algorithm of motion estimation has two parts: sub-pixel interpolation, and computing the residual between the macroblock of the current frame and the reference frame. The core algorithm of video scaling is to filter each pixel twice, vertically and horizontally. The core algorithm of spatial filtering is to filter each pixel of the image twice, vertically and horizontally, with the horizontal pass taking the vertical result as input. The core algorithm of temporal filtering is to compute the difference between the current macroblock and the predicted macroblock.
From this analysis, both the residual computation in motion estimation and temporal filtering involve computing pixel differences, while sub-pixel interpolation in motion estimation, video scaling, and spatial filtering all involve interpolation. The core algorithms of these techniques can therefore be summarised uniformly as two steps: interpolation and difference computation.
Based on the above analysis, the present invention provides a reusable pixel processing method. Fig. 1 is a flow chart of an embodiment of the reusable pixel processing method of the present invention, which comprises at least the following steps.
Step S10 is performed: the process type to be applied to the current frame is determined, the process type corresponding to the instructions used to process the current frame. Specifically, the process type is one of motion estimation, video scaling, spatial filtering, and temporal filtering, and each process type corresponds to different instructions for processing the current frame.
It should be noted that the instructions may be preset and fixed in the controller, or may be configured according to the different process types. When configuration is allowed, step S00 may be performed before step S10 to configure the corresponding instructions for the different process types.
Specifically, to fully support the four operations, each operation is split into two execution cycles: interpolation is performed in the first cycle and difference computation in the second. If an operation needs only interpolation or only difference computation, it either interpolates in the first cycle and idles in the second, or idles in the first cycle and computes the difference in the second. Configuring the instructions this way unifies the operations and effectively reduces the arithmetic logic.
This embodiment can handle four process types (motion estimation, video scaling, spatial filtering, and temporal filtering), so the correspondingly configured instructions comprise:
when the process type is motion estimation, the instruction of the first execution cycle is to interpolate the reference macroblock at order n or lower, and the instruction of the second execution cycle is to compute the difference between the macroblock to be encoded and the interpolated reference macroblock;
when the process type is video scaling, the instruction of the first execution cycle is to interpolate the macroblock to be encoded of the current frame at order n or lower, and the instruction of the second execution cycle is idle;
when the process type is spatial filtering, the instruction of the first execution cycle is to interpolate the macroblock to be encoded of the current frame at order n or lower, and the instruction of the second execution cycle is idle;
when the process type is temporal filtering, the instruction of the first execution cycle is idle, and the instruction of the second execution cycle is to compute the difference between the macroblock to be encoded and the reference macroblock in the reference frame.
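The four configurations above amount to a small lookup table that drives one shared two-stage datapath. The sketch below models that idea in software on one-dimensional sample rows; it is an illustration under assumptions, not the chip's logic. The names `INSTRUCTIONS`, `fir`, and `process` are hypothetical, and the normalising exponent is passed in as a `shift` parameter rather than fixed.

```python
# process type -> (first-cycle instruction, second-cycle instruction)
INSTRUCTIONS = {
    "motion_estimation": ("interpolate_reference", "difference"),
    "video_scaling": ("interpolate_current", "idle"),
    "spatial_filtering": ("interpolate_current", "idle"),
    "temporal_filtering": ("idle", "difference"),
}


def fir(samples, coeffs, shift):
    """Slide the coefficients over the samples; round half up, then divide by 2**shift."""
    half = (1 << shift) >> 1
    taps = len(coeffs)
    return [
        (sum(c * samples[i + k] for k, c in enumerate(coeffs)) + half) >> shift
        for i in range(len(samples) - taps + 1)
    ]


def process(process_type, current, reference=None, coeffs=(1,), shift=0):
    """Run both execution cycles for one row of samples."""
    first, second = INSTRUCTIONS[process_type]
    # First execution cycle: interpolation, or idle (operands pass through).
    if first == "interpolate_reference":
        reference = fir(reference, coeffs, shift)
    elif first == "interpolate_current":
        current = fir(current, coeffs, shift)
    # Second execution cycle: difference, or idle (first-cycle result is the output).
    if second == "difference":
        return [c - r for c, r in zip(current, reference)]
    return current


print(process("temporal_filtering", [5, 7, 9], [4, 7, 10]))
print(process("video_scaling", [0, 10, 20], coeffs=(1, 1), shift=1))
```

Only the table entry changes between process types; the filter and subtracter code paths are shared, which mirrors the time-multiplexing argument made in the text.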
The corresponding instructions configured for the process types further comprise the coefficients of the interpolation formula of order n or lower used in the first-cycle interpolation instruction. The interpolation formula is $P = \mathrm{round}\left[\left(\alpha_1 A_1 + \alpha_2 A_2 + \cdots + \alpha_{n-1} A_{n-1} + \alpha_n A_n\right) / 2^n\right]$, where $P$ is the pixel obtained by interpolation, $\mathrm{round}()$ returns the value rounded to the nearest integer, $A_1, \ldots, A_n$ are n known pixels adjacent in the same row or the same column, and $\alpha_1, \ldots, \alpha_n$ are the coefficients of the interpolation formula.
For example, for sixth-order interpolation according to the H.264 standard, the coefficients of the sixth-order interpolation formula are configured as [1, -5, 20, 20, -5, 1]. As another example, fourth-order interpolation has only four coefficients: the coefficients of the 1st and 6th pixels can be set to 0 so that only four coefficients are configured. As a further example, when the first execution cycle is idle, the operation can be regarded as first-order interpolation of the macroblock to be encoded itself: the coefficient of the known pixel itself is set to 1 and the coefficients of the remaining pixels are set to 0.
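The three coefficient configurations can be tried out with a single routine. This is a sketch under assumptions: the function name `interpolate` is hypothetical, the four-tap coefficients are invented purely for illustration, and the position of "the known pixel itself" in the idle configuration is assumed to be the third of six. The exponent of the divisor is left as a parameter `shift` rather than tied to n, because the H.264 coefficients sum to 32, so dividing by 2^5 is what leaves a flat region unchanged.

```python
def interpolate(pixels, coeffs, shift):
    """P = round(sum(alpha_i * A_i) / 2**shift), rounding half up, integer-only."""
    acc = sum(a * p for a, p in zip(coeffs, pixels))
    half = (1 << shift) >> 1          # 0 when shift == 0
    return (acc + half) >> shift      # >> floors, so this rounds half up


SIX_TAP = [1, -5, 20, 20, -5, 1]      # from the text (H.264 sixth-order case)
FOUR_TAP = [0, -1, 5, 5, -1, 0]       # hypothetical values; 1st and 6th set to 0
IDLE = [0, 0, 1, 0, 0, 0]             # pass-through of the pixel itself (assumed position)

row = [10, 12, 16, 20, 22, 24]
print(interpolate(row, SIX_TAP, 5))   # half-sample between the 3rd and 4th pixels
print(interpolate(row, FOUR_TAP, 3))  # four-tap result on the same datapath
print(interpolate(row, IDLE, 0))      # idle cycle: returns the 3rd pixel unchanged
```

All three cases run through the same six-input multiply-accumulate, which is why an idle cycle needs no separate hardware path.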
Referring to Fig. 1, step S20 is then performed: the current frame is obtained.
Specifically, for motion estimation, the result is the residual between the reference macroblock in the reference frame and the macroblock to be encoded in the current frame. Likewise, temporal filtering computes the difference between two macroblocks, one in the reference frame and one in the current frame. For video scaling and spatial filtering, interpolating a single macroblock in the current frame is enough to obtain the result, so the difference execution cycle is idle. The two differ in that video scaling comprises two interpolations, one horizontal and one vertical, whereas in spatial filtering each pass first obtains the differences between the pixel and several pixels adjacent to it vertically and horizontally, processes these differences by computation and table lookup, and adds the result to the original pixel to obtain the new pixel; the horizontal pass takes the result of the vertical pass as its input.
It should be noted that this embodiment is based on the H.264 standard, so the reference frame, the current frame, and the macroblocks they contain are likewise all based on H.264. Those skilled in the art will also understand that, compared with other existing video coding standards, H.264 has a higher data compression ratio and can deliver better image quality at the same bandwidth; correspondingly, the computational complexity of H.264 encoding is higher than that of other existing video coding standards. Therefore, being able to satisfy H.264, this embodiment is naturally also compatible with other existing digital video coding standards such as H.263, MPEG-4, and AVS.
Referring to Fig. 1, step S30 is then performed: using the instructions configured for the determined process type of the current frame, the processing of each execution cycle is performed in turn on each macroblock to be encoded of the current frame.
Specifically, in the first execution cycle, the instruction corresponding to the process type is used to interpolate the macroblock to be encoded in the current frame, or to interpolate the reference macroblock in the reference frame, or the cycle is idle. In the second execution cycle, the instruction corresponding to the process type is used to compute the difference between the macroblock to be encoded and the interpolated reference macroblock, or between the macroblock to be encoded and the reference macroblock in the reference frame, or the cycle is idle. These two steps are repeated until every macroblock to be encoded in the current frame has been processed.
Further, for motion estimation, the reference macroblock in the reference frame is interpolated at order n or lower in the first execution cycle, and the difference between the macroblock to be encoded and the interpolated reference macroblock is computed in the second execution cycle. For video scaling or spatial filtering, the macroblock to be encoded is interpolated at order n or lower in the first execution cycle, and the second execution cycle is idle. For temporal filtering, the first execution cycle is idle, and the difference between the macroblock to be encoded and the reference macroblock in the reference frame is computed in the second execution cycle.
It should be noted that this embodiment uses pipelined operation: while the second-execution-cycle instruction is applied to the current macroblock to be encoded, the first-execution-cycle instruction is applied to the next macroblock to be encoded, which ensures that from the second cycle onward every cycle outputs one result.
In the H.264 standard, macroblocks are defined in seven modes: 16 × 16, 16 × 8, 8 × 16, 8 × 8, 8 × 4, 4 × 8, and 4 × 4. The reference macroblock or macroblock to be encoded of this embodiment is therefore at least 4 × 4 pixels in size, to meet the minimum requirement of the H.264 standard.
Specifically, the interpolation comprises interpolation of order n or lower, or idle, n being a positive integer. The H.264 standard currently supports at most sixth-order interpolation: its 1/2-pixel interpolation calculates the intermediate 1/2 pixel from 6 adjacent pixels in the same row or the same column. Even interpolating a single reference macroblock of 4 × 4 pixels therefore requires at least 10 × 10 pixels centred on those 4 × 4 pixels.
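The window arithmetic can be illustrated by cutting the surrounding pixels out of a frame. The sketch below follows the sizes stated in the text (a 4 × 4 block inside a 10 × 10 window, so a margin of 3 pixels on every side); the function name `reference_window` is hypothetical, and clamping coordinates at the frame border is an assumption, since the patent does not say how edges are handled.

```python
def reference_window(frame, top, left, a, margin=3):
    """Return the (a + 2*margin) x (a + 2*margin) window centred on an a x a block.

    Coordinates outside the frame are clamped to the nearest edge pixel
    (an assumed border policy).
    """
    height, width = len(frame), len(frame[0])

    def clamp(value, upper):
        return max(0, min(value, upper))

    return [
        [
            frame[clamp(y, height - 1)][clamp(x, width - 1)]
            for x in range(left - margin, left + a + margin)
        ]
        for y in range(top - margin, top + a + margin)
    ]


frame = [[y * 16 + x for x in range(16)] for y in range(16)]
window = reference_window(frame, top=6, left=6, a=4)
print(len(window), len(window[0]))
```

With a = 4 and a margin of 3 this yields the 10 × 10 register array size given in the text; the block's own top-left pixel sits at row 3, column 3 of the window.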
It should be noted that, as those skilled in the art will understand, although the H.264 standard currently supports at most sixth-order interpolation, the present invention should not be construed as applicable only to interpolation of sixth order or lower. The invention aims to reduce the number of hardware units and the computational complexity through time-division multiplexing, and places no specific limit on how the interpolation itself is implemented.
The technical solution of the present invention is further described below with reference to the drawings and a specific embodiment. Fig. 2 is a working timing diagram of an embodiment of the reusable pixel processing method of the present invention. Referring to Fig. 2, the present embodiment contains seven processes in total: three consecutive motion estimations, followed by one video scaling, one spatial filtering, and two consecutive temporal filterings.
It should be noted that the present embodiment is intended to illustrate the invention clearly, so its changes of process type are rather extreme. In practice, the same processing usually runs for a period of time. In motion estimation, for example, the process type is only likely to change after the difference has been computed at least between every macroblock to be encoded in the current frame and every reference macroblock in the reference frame, and the motion vector and residual data have been obtained according to the optimal matching function. Instructions in practice are therefore usually not as complicated as in the present embodiment.
In the present embodiment, the correspondingly configured instructions are listed in Table 1:
Table 1
No. | Process type | First execution cycle instruction | Second execution cycle instruction
1 | Motion estimation | Interpolation | Difference
2 | Motion estimation | Interpolation | Difference
3 | Motion estimation | Interpolation | Difference
4 | Video scaling | Interpolation | Idle
5 | Spatial filtering | Interpolation | Idle
6 | Temporal filtering | Idle | Difference
7 | Temporal filtering | Idle | Difference
The working process is described below with reference to Fig. 2.
The first process is motion estimation, so according to the corresponding instruction, interpolation is performed in the first cycle on the macroblock to be encoded of the current frame.
The second cycle corresponds to the second execution cycle of the first process and the first execution cycle of the second process. The instruction for the second execution cycle of the first process is therefore executed (the difference between the macroblock to be encoded and the interpolated reference macroblock is computed) and the difference result is output. At the same time, the instruction for the first execution cycle of the second process, interpolation, is executed.
The third cycle corresponds to the second execution cycle of the second process and the first execution cycle of the third process. The instruction for the second execution cycle of the second process, difference computation, is executed and the result is output; at the same time, the instruction for the first execution cycle of the third process, interpolation, is executed.
The fourth cycle corresponds to the second execution cycle of the third process and the first execution cycle of the fourth process. The instruction for the second execution cycle of the third process, difference computation, is executed and the result is output. At the same time, the instruction for the first execution cycle of the fourth process is executed. Because the process type changes to video scaling for the fourth process, and the first-execution-cycle instruction for video scaling is interpolation, interpolation is performed here.
The fifth cycle corresponds to the second execution cycle of the fourth process and the first execution cycle of the fifth process. The instruction for the second execution cycle of the fourth process, idle, is executed and the result (the interpolation result) is output; at the same time, the instruction for the first execution cycle of the fifth process is executed. Because the process type changes to spatial filtering for the fifth process, and the first-execution-cycle instruction for spatial filtering is interpolation, interpolation is performed here.
The sixth cycle corresponds to the second execution cycle of the fifth process and the first execution cycle of the sixth process. The instruction for the second execution cycle of the fifth process, idle, is executed and the result (the interpolation result) is output; at the same time, the instruction for the first execution cycle of the sixth process is executed. Because the process type changes to temporal filtering for the sixth process, and the first-execution-cycle instruction for temporal filtering is idle, the cycle is idle here.
The seventh cycle corresponds to the second execution cycle of the sixth process and the first execution cycle of the seventh process. The instruction for the second execution cycle of the sixth process, difference computation, is executed and the result (the difference result) is output; at the same time, the instruction for the first execution cycle of the seventh process, idle, is executed.
The eighth cycle corresponds to the second execution cycle of the seventh process. The instruction for the second execution cycle of the seventh process, difference computation, is executed and the difference result is output.
This completes the whole procedure; from the second cycle onward, each cycle outputs one result.
Realize time-sharing multiplex by the configuration of above-mentioned instruction, make to realize 4 kinds of dissimilar calculation functions by unified arithmetic logic, simplify storage and arithmetic logic.
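The cycle-by-cycle walkthrough above can be replayed with a small scheduling sketch. It is only an illustration: the dictionary keys and the function name `schedule` are hypothetical, and the sequence of processing types is an assumption (processes 1 to 3 as motion estimation, process 4 as video scaling, process 5 as spatial-domain filtering, processes 6 and 7 as time-domain filtering) chosen to match the instructions quoted for cycles 3 to 8.

```python
# Two-cycle instruction table from the description: (first cycle, second cycle).
# The key names are illustrative, not taken from the patent.
INSTRUCTIONS = {
    "motion_estimation": ("interpolate", "difference"),
    "video_scaling": ("interpolate", "bypass"),
    "spatial_filtering": ("interpolate", "bypass"),
    "time_domain_filtering": ("bypass", "difference"),
}


def schedule(process_types):
    """List, per clock cycle, the (stage-2 op, stage-1 op) executed together."""
    cycles = []
    count = len(process_types)
    for i in range(count + 1):
        # Stage 2 finishes the process that entered the pipeline one cycle ago.
        second = INSTRUCTIONS[process_types[i - 1]][1] if i >= 1 else None
        # Stage 1 starts the next process, if there is one left.
        first = INSTRUCTIONS[process_types[i]][0] if i < count else None
        cycles.append((second, first))
    return cycles


# Assumed sequence: processes 1-3 motion estimation, 4 video scaling,
# 5 spatial-domain filtering, 6-7 time-domain filtering.
SEQUENCE = (["motion_estimation"] * 3 + ["video_scaling", "spatial_filtering"]
            + ["time_domain_filtering"] * 2)

for number, (second, first) in enumerate(schedule(SEQUENCE), start=1):
    print(f"cycle {number}: stage 2 = {second}, stage 1 = {first}")
```

Seven processes occupy eight cycles, and every cycle from the second onward has a stage-2 operation, which is the "one result per cycle" property described above.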
It should be noted that, from the above description of the embodiments, those skilled in the art will clearly understand that the present application can be implemented partly or entirely in software combined with the necessary general-purpose hardware platform. Based on this understanding, the technical solution of the application, in essence or in the part that contributes over the prior art, can be embodied in the form of a software product. This computer software product can be stored in a storage medium such as ROM/RAM, a magnetic disk or an optical disc, and includes instructions that cause a computer device (which may be a personal computer, a server, a network device or the like) to perform the methods described in the embodiments of the application or in certain parts of the embodiments.
The application can be used in numerous general-purpose or special-purpose computing system environments or configurations, for example: personal computers, server computers, handheld or portable devices, laptop devices, multiprocessor systems, microprocessor-based systems, set-top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, and distributed computing environments that include any of the above systems or devices.
The application can be described in the general context of computer-executable instructions, such as program modules. Generally, program modules include routines, programs, objects, components, data structures and the like that perform particular tasks or implement particular abstract data types. The application can also be practiced in distributed computing environments, in which tasks are performed by remote processing devices connected through a communication network. In a distributed computing environment, program modules can be located in both local and remote computer storage media, including storage devices.
The present invention also provides a reusable video processing chip that unifies the interpolation operation and the difference operation by sharing registers and arithmetic units. First, a unified register array is defined to hold the pixels that are processed in parallel. Then a unified arithmetic unit array is defined, which can perform both the difference operation and n-th order interpolation. Finally, for the different processing types (motion estimation, video scaling, spatial-domain/time-domain filtering), the register array and the arithmetic unit array are configured in different ways to produce the corresponding results.
Fig. 3 is a schematic structural diagram of an embodiment of the reusable video processing chip of the present invention. Referring to Fig. 3, this embodiment comprises:
a determining unit 10, an acquiring unit 20, a register array 30, a clock control unit 40, a controller 50 and an arithmetic unit array 60.
Specifically, the determining unit 10 determines the processing type to be applied to the current frame, the processing type corresponding to the instructions used to process the current frame; the acquiring unit 20 acquires the current frame; the register array 30 temporarily stores the reference macroblock in the reference frame and the macroblock to be encoded in the current frame that are needed for the current processing; the clock control unit 40 provides the execution cycles; the controller 50 reads the reference macroblock and the macroblock to be encoded needed for the current processing, obtains the instructions corresponding to the processing type of the current frame, and controls the arithmetic unit array 60 to perform the corresponding operations; and the arithmetic unit array 60 performs the corresponding operations on the reference macroblock or the macroblock to be encoded in the register array 30 according to the instructions of the controller 50 and the execution cycle.
This embodiment may further comprise an instruction configuration unit 00, which configures the instructions corresponding to each processing type.
This embodiment supports one or more of the following processing types: motion estimation, video scaling, spatial-domain filtering and time-domain filtering.
Specifically, the instructions corresponding to the processing types comprise:
When the processing type is motion estimation, the instruction of the first execution cycle is to perform interpolation of order n or lower on the reference macroblock, n being a positive integer, and the instruction of the second execution cycle is to compute the difference between the macroblock to be encoded and the interpolated reference macroblock;
When the processing type is video scaling, the instruction of the first execution cycle is to perform interpolation of order n or lower on the macroblock to be encoded, n being a positive integer, and the instruction of the second execution cycle is bypass;
When the processing type is spatial-domain filtering, the instruction of the first execution cycle is to perform interpolation of order n or lower on the macroblock to be encoded, n being a positive integer, and the instruction of the second execution cycle is bypass;
When the processing type is time-domain filtering, the instruction of the first execution cycle is bypass, and the instruction of the second execution cycle is to compute the difference between the macroblock to be encoded and the reference macroblock in the reference frame.
The above instructions can be preset and fixed in the controller 50, or configured according to the processing types. When configuration is allowed, the instruction configuration unit 00 configures the instructions corresponding to each processing type. For example, in a computing environment that only requires motion estimation and video scaling, the instruction configuration unit 00 only needs to configure the instructions corresponding to motion estimation and video scaling.
The instruction configuration unit 00 may further comprise a coefficient configuration unit 01, which configures the coefficients of the interpolation formula of order n or lower used in the interpolation instruction of the first execution cycle.
The H.264 standard currently supports up to sixth-order interpolation, in which a half-pixel sample is computed from 6 adjacent pixels in the same row or the same column. The sixth-order interpolation formula based on H.264 can therefore be $P = \mathrm{round}[(A - 5B + 20C + 20D - 5E + F)/32]$, and the coefficient configuration unit 01 configures the coefficients of the interpolation formula as [1, -5, 20, 20, -5, 1]. As another example, fourth-order interpolation uses only 4 coefficients: the coefficient configuration unit 01 sets the coefficients of the 1st and 6th pixels to 0 and configures only the remaining 4. As a further example, when the first execution cycle is a bypass, it can be regarded as first-order interpolation of the macroblock to be encoded itself: the coefficient configuration unit 01 only needs to set the coefficient of the known pixel itself to 1 and the coefficients of all other pixels to 0.
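A minimal sketch of how one configurable filter could cover both the sixth-order case and the bypass case is given below. The function name `interpolate` and the constant names are hypothetical. Dividing by the sum of the coefficients is an assumption made for this sketch: it yields the divisor 32 for [1, -5, 20, 20, -5, 1] as in the formula above, and 1 for the one-hot bypass coefficients. Rounding half up is likewise an assumed reading of round().

```python
SIX_TAP = [1, -5, 20, 20, -5, 1]   # sixth-order coefficients from the text
BYPASS = [0, 0, 1, 0, 0, 0]        # own pixel weighted 1, all others 0


def interpolate(pixels, coeffs):
    """Weighted sum of adjacent pixels, normalised and rounded half up."""
    if len(pixels) != len(coeffs):
        raise ValueError("need exactly one coefficient per pixel")
    acc = sum(c * p for c, p in zip(coeffs, pixels))
    div = sum(coeffs)  # assumption: normalise by the coefficient sum (32 or 1)
    # floor(acc / div + 1/2), computed in integer arithmetic
    return (2 * acc + div) // (2 * div)


row = [10, 20, 30, 40, 50, 60]      # pixels A..F in one row
half_pixel = interpolate(row, SIX_TAP)
passed_through = interpolate(row, BYPASS)
```

With the same hardware path, only the coefficient set changes: the six-tap set produces the half-pixel sample between C and D, while the one-hot set returns pixel C unchanged.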
Specifically, the controller 50 may further comprise a rhythm control unit 51, which processes the macroblocks to be encoded of the current frame one after another in a pipelined manner according to the execution cycles. That is, while the arithmetic unit array 60 applies the corresponding second-execution-cycle instruction to the current macroblock to be encoded, it applies the corresponding first-execution-cycle instruction to the next macroblock to be encoded, which ensures that from the second cycle onward one result is output in every cycle.
The configuration of the register array 30 depends at least on whether the operation is interpolation or difference, on the highest interpolation order n, and on how many pixels are processed simultaneously. Specifically, if only one pixel is processed at a time: when the operation is a difference, the register array 30 only needs to hold the reference macroblock and the macroblock to be encoded; when the operation is interpolation, the register array 30 needs to hold, according to the highest interpolation order n, at least (a+n) × (a+n) pixels centered on the macroblock (for example, a macroblock of a × a pixels). If m pixels are processed simultaneously to speed up processing, this size must additionally be multiplied by the number m of pixels processed in parallel.
The H.264 standard defines 7 macroblock partition modes: 16 × 16, 16 × 8, 8 × 16, 8 × 8, 8 × 4, 4 × 8 and 4 × 4. The reference macroblock or macroblock to be encoded in this embodiment is therefore at least 4 × 4 pixels, which satisfies the minimum requirement of the H.264 standard. At the same time, the H.264 standard currently supports up to sixth-order interpolation, in which a half-pixel sample is computed from 6 adjacent pixels in the same row or the same column. So when the reference macroblock or macroblock to be encoded is a × a pixels, the register array 30 is at least (a+6) × (a+6) pixels. On this basis, if one reference macroblock or macroblock to be encoded of 4 × 4 pixels is interpolated, the register array 30 is at least 10 × 10 pixels, centered on those 4 × 4 pixels.
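The sizing rule can be illustrated with a short sketch that cuts the (a+n) × (a+n) window out of a frame. The function name `register_window`, the representation of the frame as a list of rows, the even split of the n extra pixels into a margin of n/2 on each side, and the refusal to handle windows that cross the frame boundary are all assumptions made for illustration.

```python
def register_window(frame, top, left, a=4, n=6):
    """Return the (a+n) x (a+n) pixel window centered on an a x a macroblock."""
    margin = n // 2          # assumed: extra pixels split evenly on both sides
    side = a + n             # 4 + 6 = 10 for a 4x4 block and sixth-order taps
    r0, c0 = top - margin, left - margin
    if r0 < 0 or c0 < 0 or r0 + side > len(frame) or c0 + side > len(frame[0]):
        raise ValueError("window would cross the frame boundary")
    return [row[c0:c0 + side] for row in frame[r0:r0 + side]]


# A 16x16 test frame whose pixel value encodes its own position.
frame = [[r * 16 + c for c in range(16)] for r in range(16)]
window = register_window(frame, top=4, left=4)
```

For a 4 × 4 macroblock and sixth-order interpolation the window is 10 × 10, and the macroblock's top-left pixel sits at row 3, column 3 of the window.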
The configuration of the arithmetic unit array 60 depends at least on whether the operation is interpolation or difference, on the highest interpolation order n, and on how many pixels are processed simultaneously. First, to meet the computing requirements, the arithmetic unit array 60 comprises at least a first execution cycle array 61 and a second execution cycle array 62. Specifically, the first execution cycle array 61 comprises at least a filter array, and the second execution cycle array 62 comprises at least a subtractor array. The first execution cycle array 61 depends on the highest interpolation order n and on how many pixels are processed simultaneously. The more pixels are processed in parallel, the faster the execution and the higher the hardware cost. The second execution cycle array 62 depends on how many pixels are processed simultaneously.
Taking as an example motion estimation of 4 × 4 pixels carried out in two execution cycles, with support for up to sixth-order interpolation: a register array 30 of at least 10 × 10 pixels must be configured, which can hold the data needed for sixth-order interpolation of 4 × 4 pixels as well as the data needed to compute the difference of 4 × 4 pixels.
In addition, an array of 16 sixth-order filters and an array of 16 subtractors must be configured.
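As a rough model of how 16 filters and 16 subtractors might be shared across the processing types, the sketch below runs both execution cycles on 10 × 10 register windows. All function names are hypothetical, and several simplifications are assumed: interpolation is horizontal only, each filter reads the six pixels from two to the left to three to the right of its own pixel, the margin is 3 pixels, and results are normalised by the coefficient sum with round-half-up.

```python
SIX_TAP = [1, -5, 20, 20, -5, 1]
BYPASS = [0, 0, 1, 0, 0, 0]   # one-hot: the filter passes its own pixel through


def filter_array(window, coeffs, a=4, margin=3):
    """First-cycle array: a*a filters, each reading 6 pixels of one row."""
    div = sum(coeffs)
    out = []
    for r in range(a):
        row = window[r + margin]
        line = []
        for c in range(a):
            taps = row[c + margin - 2:c + margin + 4]
            acc = sum(k * p for k, p in zip(coeffs, taps))
            line.append((2 * acc + div) // (2 * div))  # round half up
        out.append(line)
    return out


def subtractor_array(block_a, block_b):
    """Second-cycle array: a*a subtractors computing block_a - block_b."""
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(block_a, block_b)]


def centre(window, a=4, margin=3):
    """The a x a macroblock in the middle of a register window."""
    return [row[margin:margin + a] for row in window[margin:margin + a]]


def run(process_type, cur_window, ref_window=None):
    """Execute both cycles for one macroblock of the given processing type."""
    if process_type == "motion_estimation":
        interpolated = filter_array(ref_window, SIX_TAP)
        return subtractor_array(centre(cur_window), interpolated)
    if process_type in ("video_scaling", "spatial_filtering"):
        return filter_array(cur_window, SIX_TAP)      # second cycle: bypass
    if process_type == "time_domain_filtering":
        passed = filter_array(cur_window, BYPASS)     # first cycle: bypass
        return subtractor_array(passed, centre(ref_window))
    raise ValueError(f"unknown processing type: {process_type}")


cur = [[100] * 10 for _ in range(10)]
ref = [[90] * 10 for _ in range(10)]
residual = run("motion_estimation", cur, ref)
```

The same two arrays serve every processing type; only the coefficient set and the choice of operands change, which is the reuse the description argues for.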
The register array 30 and the arithmetic unit array 60 described above can be shared by different operations such as motion estimation, video scaling and spatial-domain/time-domain filtering, and they support the core algorithms of these operations.
The present invention is based on design reuse of arithmetic units and structures. By analyzing and extracting the similarity and compatibility of the arithmetic units and structures in the module and system algorithms, a single arithmetic unit can be time-division multiplexed across multiple algorithms, which reduces the complexity of the chip's computation and storage logic. Because fewer arithmetic units are needed, the chip can be implemented in a smaller area, saving both chip area and power consumption.
Further, the present invention provides a solution based on the H.264 standard, so as to meet the requirements of the international video coding standard that is currently most widely used.
Further, by configuring the coefficients of the interpolation formula, the present invention makes a bypass in the first execution cycle equivalent to first-order interpolation of the current pixel itself, which unifies the operations and simplifies the hardware logic.
Further, the present invention can also process the macroblocks to be encoded of the current frame in a pipelined manner, which improves processing efficiency.
Although the present invention has been disclosed above by way of preferred embodiments, they are not intended to limit the invention. Any person skilled in the art may, without departing from the spirit and scope of the present invention, use the methods and technical content disclosed above to make possible changes and modifications to the technical solution of the present invention. Therefore, any simple modification, equivalent change or refinement made to the above embodiments according to the technical essence of the present invention, without departing from the content of its technical solution, falls within the scope of protection of the technical solution of the present invention.

Claims (18)

1. A reusable pixel processing method, characterized in that a single image processor is used to perform pixel processing of at least two processing types, the method comprising the following steps:
Determining the processing type to be applied to the current frame, the processing type comprising one or more of motion estimation, video scaling, spatial-domain filtering and time-domain filtering;
Configuring corresponding instructions in the image processor for each processing type;
Acquiring the current frame by the image processor;
Using the instructions configured for the determined processing type of the current frame, processing each macroblock to be encoded of the current frame through each execution cycle in turn;
The execution cycles being divided into a first execution cycle and a second execution cycle, and the corresponding instructions configured for the processing types comprising:
When the processing type is motion estimation, the instruction of the first execution cycle is to perform interpolation of order n or lower on the reference macroblock in the reference frame, n being a positive integer, and the instruction of the second execution cycle is to compute the difference between the macroblock to be encoded and the interpolated reference macroblock;
When the processing type is video scaling, the instruction of the first execution cycle is to perform interpolation of order n or lower on the macroblock to be encoded, n being a positive integer, and the instruction of the second execution cycle is bypass;
When the processing type is spatial-domain filtering, the instruction of the first execution cycle is to perform interpolation of order n or lower on the macroblock to be encoded, n being a positive integer, and the instruction of the second execution cycle is bypass;
When the processing type is time-domain filtering, the instruction of the first execution cycle is bypass, and the instruction of the second execution cycle is to compute the difference between the macroblock to be encoded and the reference macroblock in the reference frame.
2. The reusable pixel processing method as claimed in claim 1, characterized in that the corresponding instructions configured for the processing types further comprise: the coefficients of the interpolation formula of order n or lower in the first-execution-cycle instruction;
The interpolation formula is: $P = \mathrm{round}[(\alpha_1 A_1 + \alpha_2 A_2 + \cdots + \alpha_{n-1} A_{n-1} + \alpha_n A_n)/2^n]$, where P is the pixel obtained by interpolation, round() is the function that returns the value rounded to the nearest integer, $A_1$ to $A_n$ are n adjacent known pixels in the same row or the same column, and $\alpha_1$ to $\alpha_n$ are the coefficients of the interpolation formula.
3. The reusable pixel processing method as claimed in claim 2, characterized in that n is less than or equal to 6.
4. The reusable pixel processing method as claimed in claim 2, characterized in that, when the first-execution-cycle instruction is to perform sixth-order interpolation on the reference macroblock, or to perform sixth-order interpolation on the macroblock to be encoded in the current frame, the coefficients of the interpolation formula are [1, -5, 20, 20, -5, 1].
5. The reusable pixel processing method as claimed in claim 2, characterized in that, when the first-execution-cycle instruction is bypass, the coefficient of the interpolation formula is [1].
6. The reusable pixel processing method as claimed in claim 1, characterized in that processing each macroblock to be encoded of the current frame through each execution cycle in turn comprises: processing the macroblocks to be encoded of the current frame one after another in a pipelined manner.
7. The reusable pixel processing method as claimed in claim 6, characterized in that processing the macroblocks to be encoded in a pipelined manner comprises: while the second-execution-cycle instruction corresponding to the processing type is applied to the current macroblock to be encoded, applying the first-execution-cycle instruction corresponding to the processing type to the next macroblock to be encoded.
8. The reusable pixel processing method as claimed in claim 1, characterized in that the reference macroblock and the macroblock to be encoded are based on the H.264 standard.
9. The reusable pixel processing method as claimed in claim 1, characterized in that the reference macroblock and the macroblock to be encoded are at least 4 × 4 pixels in size.
10. The reusable pixel processing method as claimed in claim 1, characterized in that processing each macroblock to be encoded of the current frame through each execution cycle comprises:
Performing interpolation of order n or lower on the reference macroblock in the reference frame in the first execution cycle, n being a positive integer, and computing the difference between the macroblock to be encoded and the interpolated reference macroblock in the second execution cycle;
Or performing interpolation of order n or lower on the macroblock to be encoded in the first execution cycle, n being a positive integer, and bypassing in the second execution cycle;
Or bypassing in the first execution cycle, and computing the difference between the macroblock to be encoded and the reference macroblock in the reference frame in the second execution cycle.
11. A reusable video processing chip, characterized by comprising:
A determining unit, configured to determine the processing type to be applied to the current frame, the processing type comprising one or more of motion estimation, video scaling, spatial-domain filtering and time-domain filtering;
An acquiring unit, configured to acquire the current frame;
A register array, configured to temporarily store the reference macroblock in the reference frame and the macroblock to be encoded in the current frame that are needed for the current processing;
A clock control unit, configured to provide the execution cycles;
An arithmetic unit array, configured to perform the corresponding operations on the reference macroblock or the macroblock to be encoded in the register array;
A controller, connected to the determining unit, the register array, the arithmetic unit array and the clock control unit, and configured to read the reference macroblock and the macroblock to be encoded needed for the current processing and, according to the instructions corresponding to the processing type of the current frame, to control the arithmetic unit array to perform the corresponding operations;
An instruction configuration unit, connected to the controller and adapted to pre-configure corresponding instructions for each processing type; the execution cycles being divided into a first execution cycle and a second execution cycle, and the instructions pre-configured by the instruction configuration unit for each processing type comprising:
When the processing type is motion estimation, the instruction of the first execution cycle is to perform interpolation of order n or lower on the reference macroblock in the reference frame, n being a positive integer, and the instruction of the second execution cycle is to compute the difference between the macroblock to be encoded and the interpolated reference macroblock;
When the processing type is video scaling, the instruction of the first execution cycle is to perform interpolation of order n or lower on the macroblock to be encoded, n being a positive integer, and the instruction of the second execution cycle is bypass;
When the processing type is spatial-domain filtering, the instruction of the first execution cycle is to perform interpolation of order n or lower on the macroblock to be encoded, n being a positive integer, and the instruction of the second execution cycle is bypass;
When the processing type is time-domain filtering, the instruction of the first execution cycle is bypass, and the instruction of the second execution cycle is to compute the difference between the macroblock to be encoded and the reference macroblock in the reference frame.
12. The reusable video processing chip as claimed in claim 11, characterized in that the instruction configuration unit further comprises: a coefficient configuration unit, adapted to configure the coefficients of the interpolation formula of order n or lower in the first-execution-cycle instruction.
13. The reusable video processing chip as claimed in claim 11, characterized in that the arithmetic unit array comprises a first execution cycle array and a second execution cycle array.
14. The reusable video processing chip as claimed in claim 13, characterized in that the first execution cycle array comprises a filter array, and the second execution cycle array comprises a subtractor array.
15. The reusable video processing chip as claimed in claim 11, characterized in that the controller further comprises: a rhythm control unit, adapted to process the macroblocks to be encoded of the current frame one after another in a pipelined manner according to the execution cycles provided by the clock control unit, so that while the arithmetic unit array applies the corresponding second-execution-cycle instruction to the current macroblock to be encoded, it applies the corresponding first-execution-cycle instruction to the next macroblock to be encoded.
16. The reusable video processing chip as claimed in claim 11, characterized in that the reference macroblock and the macroblock to be encoded in the register array are based on the H.264 standard.
17. The reusable video processing chip as claimed in claim 16, characterized in that, when the reference macroblock or the macroblock to be encoded is a × a pixels in size, a being a positive integer, the register array is at least (a+6) × (a+6) pixels in size.
18. The reusable video processing chip as claimed in claim 17, characterized in that the reference macroblock or the macroblock to be encoded is at least 4 × 4 pixels in size, and the register array is at least 10 × 10 pixels in size.
CN201210321972.1A 2012-09-03 2012-09-03 Reusable pixel processing method and reusable video processing chip Active CN102857758B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210321972.1A CN102857758B (en) 2012-09-03 2012-09-03 Reusable pixel processing method and reusable video processing chip

Publications (2)

Publication Number Publication Date
CN102857758A CN102857758A (en) 2013-01-02
CN102857758B true CN102857758B (en) 2015-02-04

Family

ID=47403899

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210321972.1A Active CN102857758B (en) 2012-09-03 2012-09-03 Reusable pixel processing method and reusable video processing chip

Country Status (1)

Country Link
CN (1) CN102857758B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103237211B (en) * 2013-04-07 2016-04-20 福建星网视易信息系统有限公司 A kind of method of the video sub-pix brightness interpolating based on FPGA and device thereof
CN103780942B (en) * 2013-12-23 2017-07-11 青岛海信电器股份有限公司 A kind of information configuring methods and device
JP6819096B2 (en) * 2016-06-30 2021-01-27 オムロン株式会社 Image processing equipment, image processing methods, and image processing programs
CN106713928B (en) * 2016-12-29 2020-04-10 深圳Tcl数字技术有限公司 Video encoding and decoding method and device
CN111163317B (en) * 2019-12-31 2022-06-17 上海富瀚微电子股份有限公司 Pixel searching method
CN116453455B (en) * 2023-06-14 2023-11-14 长春希达电子技术有限公司 Pixel multiplexing method, data transmission system and display screen control system and method

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1588638A (en) * 2004-08-09 2005-03-02 中芯联合(北京)微电子有限公司 Multiple mold multiple scale movement evaluation super large scale integrated circuit system structure and method
CN101383970A (en) * 2007-09-06 2009-03-11 北京中电华大电子设计有限责任公司 Intra-frame predictor implementing method based on AVS parallel flow
CN101938644A (en) * 2009-07-03 2011-01-05 哈尔滨工业大学深圳研究生院 Hardware implementation method for video compression 1/2 pixel accuracy motion estimation
CN101951519A (en) * 2010-10-12 2011-01-19 西安电子科技大学 High speed de-blocking effect filtering method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20130102

Assignee: GALAXYCORE SHANGHAI Ltd.,Corp.

Assignor: SHANGHAI SUANXIN MICROELECTRONICS Co.,Ltd.

Contract record no.: 2015990000743

Denomination of invention: Reusable pixel processing method and reusable video processing chip

Granted publication date: 20150204

License type: Common License

Record date: 20150821

LICC Enforcement, change and cancellation of record of contracts on the licence for exploitation of a patent or utility model
TR01 Transfer of patent right

Effective date of registration: 20220727

Address after: 200120 room 11F, building 2, Lane 560, shengxia Road, Pudong New Area, Shanghai

Patentee after: GALAXYCORE SHANGHAI Ltd.,Corp.

Address before: Room 1004 and room 1005, building 2, No. 560, shengxia Road, Pudong New Area, Shanghai 201203

Patentee before: SHANGHAI SUANXIN MICROELECTRONICS Co.,Ltd.