CN103618900A - Video region-of-interest extraction method based on encoding information - Google Patents

Video region-of-interest extraction method based on encoding information

Info

Publication number: CN103618900A
Application number: CN201310591430.0A
Authority: CN (China)
Prior art keywords: frame, video, mode, current
Legal status: Granted, currently Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Other languages: Chinese (zh)
Other versions: CN103618900B (en)
Inventors: 刘鹏宇 (Liu Pengyu), 贾克斌 (Jia Kebin)
Current assignee: Hebei Hongyi Environmental Protection Technology Co., Ltd. (the listed assignees may be inaccurate; Google has not performed a legal analysis)
Original assignee: Beijing University of Technology
Application filed by Beijing University of Technology
Priority to CN201310591430.0A (CN103618900B/en)
Publication of CN103618900A; application granted; publication of CN103618900B


Abstract

The invention discloses a video region-of-interest extraction method based on visual perception characteristics and encoding information, and relates to the field of video encoding. The video region-of-interest extraction method comprises the following steps of (1) extracting luminance information of a current encoding macro-block from a primary video stream, (2) identifying a space domain visual characteristic saliency region through an inter-frame prediction mode type of the current encoding macro-block, (3) using a mean motion vector, in the horizontal direction, of a previous encoding macro-block and a mean motion vector, in the perpendicular direction, of the previous encoding macro-block as dual dynamic thresholds, identifying a time domain visual characteristic saliency region according to the result of comparison among a motion vector, in the horizontal direction, of the current encoding macro-block, a motion vector, in the perpendicular direction, of the current encoding macro-block and the dual dynamic thresholds, and (4) defining a video interest priority through combination of the identification result of the space domain visual characteristic saliency region and the identification result of the time domain visual characteristic saliency region, and achieving automatic extraction of a region of interest of a video. According to the video region-of-interest extraction method, the important encoding basis can be provided for the video encoding technology based on the ROI.

Description

Video region-of-interest extraction method based on encoding information
Technical field
The invention belongs to the field of video information processing. It uses video coding techniques and the perceptual principles of human vision to realize a fast extraction method for video regions of interest. The method automatically analyzes an input video stream and uses encoding information to mark and output the video region of interest.
Background technology
The latest video coding standard, H.264/AVC, adopts a number of advanced coding techniques. While these raise coding efficiency, they also sharply increase encoder complexity, which limits the standard's wide use in multimedia signal processing and real-time communication services. Much research has gone into improving H.264/AVC coding speed, and a large number of fast coding optimization algorithms have been proposed. Most of these algorithms, however, do not distinguish the visual importance of the different regions of a video image: they apply the same coding scheme to all content, ignoring the differences in how the human visual system (HVS, Human Visual System) perceives a video scene.
Research in visual neuroscience shows that the HVS perceives a video scene selectively, assigning different visual importance to different regions. Using existing encoding information to analyze visual perception characteristics, and then allocating computational resources preferentially to regions of interest according to those characteristics, therefore has important theoretical significance and practical value for improving the real-time performance of video coding algorithms and reducing their computational complexity. Fast and effective visual feature analysis, in particular effective detection of visually salient regions of interest, is an important foundation for optimizing coding resources and designing efficient video coding schemes.
Summary of the invention
Unlike existing methods for extracting moving objects from video, such as optical flow, frame differencing, kinetic energy detection, and background subtraction, the present invention takes as its basis encoding information already present in the video bitstream, such as prediction modes and motion vectors. Exploiting the correlation between this encoding information and the regions the human eye attends to, it identifies the spatially salient and temporally salient visual feature regions in the coded content, and thereby marks and extracts video regions of interest automatically.
According to the characteristics of the HVS, the human eye is more sensitive to luminance than to chrominance, so the method operates on the encoding information of the luminance component of the video sequence to mark and extract regions of interest automatically.
The method specifically comprises the following steps:
Step 1: input a video sequence in YUV format with GOP (Group of Pictures) structure IPPP, read the luminance component Y of each coded macroblock, and configure the coding parameters and initialization parameters;
Step 2: perform intra-frame predictive coding on the first frame (the I frame) of the video sequence;
In video coding standards the I frame serves as the random-access reference point and carries a large amount of information. Because it cannot exploit the temporal correlation between adjacent frames, it is coded with intra prediction, which predicts the current macroblock from already-coded and reconstructed macroblocks in the current frame to eliminate spatial redundancy. Intra-coding the first (I) frame of a video sequence is the customary practice in video coding.
Step 3: perform inter-frame predictive coding on the current P frame, using the correlation between the content of adjacent frames to eliminate temporal redundancy. Record the inter prediction mode type of every coded macroblock in the current frame, denoted Mode_pn;
where p = 1, 2, 3, …, L−1 indexes the p-th inter-coded video frame, L is the total number of frames coded in the whole video sequence, and n is the index of the n-th coded macroblock in the current frame.
Step 4: identify the spatially salient visual feature region of the current P frame, specifically: if the inter prediction mode Mode_pn of the current coded macroblock belongs to the sub-partition mode set or the intra prediction mode set, i.e. Mode_pn ∈ {8×8, 8×4, 4×8, 4×4} or {Intra16×16, Intra4×4}, mark the macroblock S_Yp(x, y, Mode_pn) = 1, meaning it belongs to the spatially salient region; otherwise mark S_Yp(x, y, Mode_pn) = 0. Here Y denotes the luminance component of the coded macroblock, (x, y) is the macroblock's position, and p and Mode_pn are defined as above; all coded macroblocks in the current P frame are traversed.
Fig. 1 shows the flow of inter prediction mode selection in the H.264 standard.
Experiments reveal a strong correlation between the inter prediction results of H.264/AVC coding and human regions of interest: in moving or richly textured regions that draw attention, Mode_pn mostly selects the sub-partition mode set {8×8, 8×4, 4×8, 4×4}; at shot cuts, abrupt content changes, or large-amplitude object motion, where attention is highest, Mode_pn selects the intra prediction mode set {Intra16×16, Intra4×4}; in smooth background regions of low attention, Mode_pn mostly selects the macroblock partition mode set {Skip, 16×16, 16×8, 8×16}. Fig. 2, taking the Claire sequence as an example, shows the inter prediction mode distribution of its 50th frame; in the regions of high visual attention the coded macroblocks mostly select the inter sub-partition prediction modes.
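The mode test of step 4 can be sketched as a small function (Python, with illustrative names; the mode strings stand in for the encoder's actual mode enumeration):

```python
# Mode sets from step 4: sub-partition and intra modes mark a macroblock
# as spatially salient; large partitions and Skip do not.
SUB_SPLIT_MODES = {"8x8", "8x4", "4x8", "4x4"}
INTRA_MODES = {"Intra16x16", "Intra4x4"}

def spatial_saliency(mode_pn):
    """Return S_Yp = 1 if Mode_pn falls in the sub-partition or intra
    prediction mode sets, else 0."""
    return 1 if (mode_pn in SUB_SPLIT_MODES or mode_pn in INTRA_MODES) else 0
```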
Step 5: record the horizontal motion vector V_xpn and vertical motion vector V_ypn of every coded macroblock in the P frame, and compute the mean horizontal motion vector V̄_x(p-1) and the mean vertical motion vector V̄_y(p-1) of all coded macroblocks in the previous coded frame:
V̄_x(p-1) = (1/Num) Σ_{n=1..Num} V_x(p-1)n,  V̄_y(p-1) = (1/Num) Σ_{n=1..Num} V_y(p-1)n
where V_x(p-1)n and V_y(p-1)n are the horizontal and vertical motion vectors of each coded macroblock in the previous coded frame, p and n are defined as in step 3, and Num is the number of macroblocks contained in a coded frame, i.e. the number of accumulated terms. Taking QCIF video (176×144) as an example, Fig. 3 shows the position and index n of every 16×16 coded macroblock in one coded frame; in this case Num = (176/16) × (144/16) = 11 × 9 = 99.
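The dual dynamic thresholds of step 5 are just per-direction means over the previous frame's macroblocks; a minimal sketch (Python, illustrative names):

```python
def mean_motion_vectors(prev_mvs):
    """Given the (Vx, Vy) motion vectors of all Num macroblocks in the
    previous coded frame, return the mean horizontal and mean vertical
    motion vectors used as the dual dynamic thresholds."""
    num = len(prev_mvs)  # Num = number of macroblocks per coded frame
    vx_mean = sum(vx for vx, _ in prev_mvs) / num
    vy_mean = sum(vy for _, vy in prev_mvs) / num
    return vx_mean, vy_mean

# Macroblock count for a QCIF (176x144) frame with 16x16 macroblocks,
# as in Fig. 3:
NUM_QCIF = (176 // 16) * (144 // 16)  # 11 * 9 = 99
```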
Step 6: identify the temporally salient visual feature region of the current P frame, specifically: if the horizontal motion vector V_xpn of the current coded macroblock is greater than the previous frame's mean horizontal motion vector V̄_x(p-1), or its vertical motion vector V_ypn is greater than the previous frame's mean vertical motion vector V̄_y(p-1), the macroblock belongs to the temporally salient region and is marked T_Yp(x, y, V_xpn, V_ypn) = 1; otherwise it is marked T_Yp(x, y, V_xpn, V_ypn) = 0; all coded macroblocks in the current P frame are traversed.
Here Y denotes the luminance component of the coded macroblock, (x, y) is the macroblock's position, and p is defined as in step 3.
Motion perception is one of the most important visual processing mechanisms of the human visual system. Experiments show that coded content with larger motion vectors corresponds exactly to the moving regions the human eye finds interesting (heads, arms, people, and so on), while content whose motion vectors are small or zero corresponds to the static background regions of low attention. Fig. 4, taking the Akiyo sequence as an example, shows the motion vector distribution of its 50th frame; the coded macroblocks in the face and head-and-shoulders regions, where attention is highest, generally have larger motion vectors.
Whether the motion of the current coded macroblock is judged strong or weak depends heavily on the decision threshold. To reduce the false detection rate, the present invention takes as the motion decision thresholds for the horizontal and vertical directions the mean horizontal motion vector V̄_x(p-1) and the mean vertical motion vector V̄_y(p-1) of all coded macroblocks in the previous frame. This dynamic threshold setting takes full account of the temporal correlation of the video sequence: the thresholds change as the previous frame's mean motion vectors change, which effectively reduces misjudgments and yields the temporally salient visual feature region quickly and accurately.
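Under these dynamic thresholds, step 6's dual-threshold test can be sketched as follows (Python, illustrative names; the comparison is written exactly as the patent states it, component against previous-frame mean):

```python
def temporal_saliency(vx_pn, vy_pn, vx_mean_prev, vy_mean_prev):
    """Return T_Yp = 1 if the macroblock's horizontal motion vector
    exceeds the previous frame's horizontal mean, or its vertical motion
    vector exceeds the previous frame's vertical mean; else 0."""
    return 1 if (vx_pn > vx_mean_prev or vy_pn > vy_mean_prev) else 0
```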
Step 7: mark the video region of interest of the current P frame, specifically: traverse all coded macroblocks in the current P frame and mark each according to its spatial and temporal visual feature saliency, using the following rule:
ROI_Yp(x, y) = 3 if S_Yp(x, y, Mode_pn) = 1 and T_Yp(x, y, V_xpn, V_ypn) = 1;
ROI_Yp(x, y) = 2 if S_Yp(x, y, Mode_pn) = 0 and T_Yp(x, y, V_xpn, V_ypn) = 1;
ROI_Yp(x, y) = 1 if S_Yp(x, y, Mode_pn) = 1 and T_Yp(x, y, V_xpn, V_ypn) = 0;
ROI_Yp(x, y) = 0 if S_Yp(x, y, Mode_pn) = 0 and T_Yp(x, y, V_xpn, V_ypn) = 0.
The marked regions of interest fall into the following cases:
If the current coded macroblock is both spatially and temporally salient, i.e. S_Yp(x, y, Mode_pn) = 1 and T_Yp(x, y, V_xpn, V_ypn) = 1, it has both rich texture detail and a large motion vector; human interest is highest, and it is marked ROI_Yp(x, y) = 3;
If it is only temporally salient and not spatially salient, i.e. T_Yp(x, y, V_xpn, V_ypn) = 1 and S_Yp(x, y, Mode_pn) = 0, the macroblock has produced a large motion vector; since, by the perceptual characteristics of the HVS, the human eye is highly sensitive to object motion, interest is second highest, and it is marked ROI_Yp(x, y) = 2;
If its motion is weak, so it lacks temporal saliency, but its texture is rich, i.e. only spatially salient, S_Yp(x, y, Mode_pn) = 1 and T_Yp(x, y, V_xpn, V_ypn) = 0, interest is third, and it is marked ROI_Yp(x, y) = 1;
If it is neither spatially nor temporally salient, i.e. S_Yp(x, y, Mode_pn) = 0 and T_Yp(x, y, V_xpn, V_ypn) = 0, the macroblock is smooth in texture and gently moving or static, normally static background; it is a non-region-of-interest, interest is lowest, and it is marked ROI_Yp(x, y) = 0;
where ROI_Yp(x, y) is the visual interest priority of the current coded macroblock; T_Yp(x, y, V_xpn, V_ypn) its temporal visual feature saliency; S_Yp(x, y, Mode_pn) its spatial visual feature saliency; (x, y) its position; Y its luminance component; p the index of the inter-coded frame; and n the index of the n-th coded macroblock in the current frame.
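The four-level combination of step 7 reduces to a lookup on the two saliency flags; a sketch (Python, illustrative names):

```python
def roi_priority(s_yp, t_yp):
    """Map the spatial flag S_Yp and temporal flag T_Yp of a macroblock
    to the interest priority ROI_Yp of step 7."""
    if s_yp == 1 and t_yp == 1:
        return 3  # textured and moving: highest interest
    if s_yp == 0 and t_yp == 1:
        return 2  # moving only: second
    if s_yp == 1 and t_yp == 0:
        return 1  # textured only: third
    return 0      # smooth, static background: no interest
```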
Step 8: output the coded video stream, specifically: according to the marked interest priority ROI_Yp(x, y), process the luminance component Y of every macroblock in the current P frame as follows and output the marked video stream:
Y_p(x, y) = 255 if ROI_Yp(x, y) = 3; 150 if ROI_Yp(x, y) = 2; 100 if ROI_Yp(x, y) = 1; 0 if ROI_Yp(x, y) = 0.
The luminance component of a coded macroblock takes values Y ∈ [0, 255]; 0 to 255 spans the 256 gray levels from pure black to pure white. The higher the marked interest priority ROI_Yp(x, y), the brighter the luminance of the macroblock in the output video stream.
If ROI_Yp(x, y) = 3, interest and visual attention are highest, and the macroblock's luminance component is set to the highest output value, Y_p(x, y) = 255;
If ROI_Yp(x, y) = 2, interest is second highest and attention high, and the luminance component is set to 150, Y_p(x, y) = 150;
If ROI_Yp(x, y) = 1, interest is third and attention low, and the luminance component is set to 100, Y_p(x, y) = 100;
If ROI_Yp(x, y) = 0, the macroblock is a non-region-of-interest with the lowest attention, and the luminance component is set to the lowest value, Y_p(x, y) = 0.
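The luminance marking of step 8 is a fixed priority-to-gray-level map; a sketch (Python, illustrative names):

```python
# Step 8's mapping from interest priority to output luma (Y in [0, 255]).
PRIORITY_TO_LUMA = {3: 255, 2: 150, 1: 100, 0: 0}

def mark_luma(priority):
    """Replace a macroblock's luminance component with the gray level
    that encodes its interest priority."""
    return PRIORITY_TO_LUMA[priority]
```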
Step 9: return to step 3 and process the next frame, until the whole video sequence has been traversed.
Fig. 5 shows the flow chart of the method for marking and extracting video regions of interest.
Fig. 6 shows the marked region-of-interest output for typical video sequences.
Beneficial effects
The method achieves fast extraction of video regions of interest from basic encoding information alone. Exploiting the correlation between this information and the regions the human eye attends to, it identifies the spatially salient and temporally salient visual feature regions in the coded content, combines the two identification results to define a video interest priority, and finally extracts the video region of interest automatically. The method can provide an important coding basis for ROI-based (Region of Interest, ROI) video coding techniques.
Brief description of the drawings
Fig. 1. Flow of inter prediction mode selection in the H.264 standard;
Fig. 2. Inter prediction mode distribution of the 50th frame of the Claire sequence;
Fig. 3. Position and index of each coded macroblock in a video frame;
Fig. 4. Motion vector distribution of the 50th frame of the Akiyo sequence;
Fig. 5. Flow chart of the method;
Fig. 6. Output of marking video regions of interest with the method.
Embodiment
Since the human eye is more sensitive to luminance than to chrominance, the method codes the luminance component of each video frame. A video sequence is first read in, its luminance component extracted, and the region-of-interest extraction module of the invention called to mark and extract regions of interest automatically.
In one implementation, a video capture device (such as a digital camera) acquires the video images and transmits them to a computer, where the regions of interest are marked automatically from the encoding information in the video bitstream: the spatially salient region is identified from the predictive coding mode of the current coded macroblock; the temporally salient region is identified from its horizontal and vertical motion vectors, with dynamic motion vector decision thresholds reducing the impact of different motion types on extraction accuracy; finally, the spatial and temporal saliency results are combined into an interest classification, realizing automatic extraction of the video region of interest.
In a concrete implementation, the computer executes the following program:
The first step: read in the video sequence according to the coding configuration file encoder.cfg and configure the encoder from its parameters, for example: bitstream GOP structure GOP = IPPP; number of frames to encode FramesToBeEncoded = 100; frame rate FrameRate = 30 f/s; video width SourceWidth = 176 and height SourceHeight = 144; output file name OutputFile = ROI.264; quantization steps QPISlice = 28, QPPSlice = 28; motion estimation search range SearchRange = ±16; number of reference frames NumberReferenceFrames = 5; rate-distortion optimization RDOptimization = on; entropy coding type SymbolMode = CAVLC; then initialize L = number of frames to encode and p = 1;
The second step: read the luminance component values Y of the coded macroblocks, frame by frame and in order, from the input video sequence;
The third step: perform intra-frame predictive coding on the first frame (the I frame) of the video sequence;
The fourth step: perform inter-frame predictive coding on the current P frame and record the inter prediction mode type Mode_pn of the current coded macroblock; p = 1, 2, 3, …, L−1 indexes the p-th inter-coded frame, L is the total number of coded frames, and n is the index of the n-th coded macroblock in the current frame.
The fifth step: identify the spatially salient visual feature region. If the inter prediction mode Mode_pn of the current coded macroblock belongs to the sub-partition mode set or the intra prediction mode set, Mode_pn ∈ {8×8, 8×4, 4×8, 4×4} or {Intra16×16, Intra4×4}, mark the macroblock S_Yp(x, y, Mode_pn) = 1, meaning spatially salient; otherwise mark S_Yp(x, y, Mode_pn) = 0:
S_Yp(x, y, Mode_pn) = 1 if Mode_pn ∈ {8×8, 8×4, 4×8, 4×4} or {Intra16×16, Intra4×4}; else 0.
The sixth step: if p ≠ 1, record the horizontal motion vector V_xpn and vertical motion vector V_ypn of every coded macroblock in the P frame, and compute the mean horizontal motion vector V̄_x(p-1) and the mean vertical motion vector V̄_y(p-1) of all coded macroblocks in the previous coded frame; otherwise, jump to the tenth step;
The seventh step: identify the temporally salient visual feature region. If the horizontal motion vector V_xpn of the current coded macroblock is greater than the previous frame's mean horizontal motion vector V̄_x(p-1), or its vertical motion vector V_ypn is greater than the previous frame's mean vertical motion vector V̄_y(p-1), i.e. either criterion is met, the macroblock belongs to the temporally salient region and is marked T_Yp(x, y, V_xpn, V_ypn) = 1; otherwise it is marked T_Yp(x, y, V_xpn, V_ypn) = 0:
T_Yp(x, y, V_xpn, V_ypn) = 1 if V_xpn > V̄_x(p-1) or V_ypn > V̄_y(p-1); else 0.
The eighth step: mark the video region of interest:
ROI_Yp(x, y) = 3 if S_Yp(x, y, Mode_pn) = 1 and T_Yp(x, y, V_xpn, V_ypn) = 1;
ROI_Yp(x, y) = 2 if S_Yp(x, y, Mode_pn) = 0 and T_Yp(x, y, V_xpn, V_ypn) = 1;
ROI_Yp(x, y) = 1 if S_Yp(x, y, Mode_pn) = 1 and T_Yp(x, y, V_xpn, V_ypn) = 0;
ROI_Yp(x, y) = 0 if S_Yp(x, y, Mode_pn) = 0 and T_Yp(x, y, V_xpn, V_ypn) = 0.
If the current coded macroblock is both spatially and temporally salient, i.e. S_Yp(x, y, Mode_pn) = 1 and T_Yp(x, y, V_xpn, V_ypn) = 1, human interest is highest and it is marked ROI_Yp(x, y) = 3;
If it is only temporally salient, i.e. T_Yp(x, y, V_xpn, V_ypn) = 1 and S_Yp(x, y, Mode_pn) = 0, interest is second highest and it is marked ROI_Yp(x, y) = 2;
If it is only spatially salient, i.e. S_Yp(x, y, Mode_pn) = 1 and T_Yp(x, y, V_xpn, V_ypn) = 0, interest is third and it is marked ROI_Yp(x, y) = 1;
If it is neither spatially nor temporally salient, i.e. S_Yp(x, y, Mode_pn) = 0 and T_Yp(x, y, V_xpn, V_ypn) = 0, it is a non-region-of-interest and is marked ROI_Yp(x, y) = 0.
The ninth step: output the coded video stream:
Y_p(x, y) = 255 if ROI_Yp(x, y) = 3; 150 if ROI_Yp(x, y) = 2; 100 if ROI_Yp(x, y) = 1; 0 if ROI_Yp(x, y) = 0.
If ROI_Yp(x, y) = 3, interest and visual attention are highest, and the macroblock's luminance component is set to the highest output value, Y_p(x, y) = 255;
If ROI_Yp(x, y) = 2, interest is second highest and attention high, and the luminance component is set to 150, Y_p(x, y) = 150;
If ROI_Yp(x, y) = 1, interest is third and attention low, and the luminance component is set to 100, Y_p(x, y) = 100;
If ROI_Yp(x, y) = 0, the macroblock is a non-region-of-interest with the lowest attention, and the luminance component is set to the lowest value, Y_p(x, y) = 0.
The tenth step: if p ≠ L−1, set p = p + 1 and jump to the fourth step; otherwise, end coding.
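Putting the per-frame logic of the fourth through ninth steps together, a condensed sketch (pure Python with illustrative names; a real implementation reads modes and motion vectors from inside an H.264 encoder):

```python
# Mode strings that mark a macroblock as spatially salient (fifth step).
SUB_SPLIT_OR_INTRA = {"8x8", "8x4", "4x8", "4x4", "Intra16x16", "Intra4x4"}

def mark_frame(modes, mvs, prev_mvs):
    """Label one P frame. `modes` holds each macroblock's inter prediction
    mode, `mvs` its (Vx, Vy) motion vector, and `prev_mvs` the previous
    frame's motion vectors (the source of the dual dynamic thresholds).
    Returns the marked output luma of every macroblock."""
    num = len(prev_mvs)
    vx_t = sum(v[0] for v in prev_mvs) / num  # horizontal threshold
    vy_t = sum(v[1] for v in prev_mvs) / num  # vertical threshold
    luma = {3: 255, 2: 150, 1: 100, 0: 0}     # ninth-step gray levels
    out = []
    for mode, (vx, vy) in zip(modes, mvs):
        s = 1 if mode in SUB_SPLIT_OR_INTRA else 0  # spatial flag S_Yp
        t = 1 if (vx > vx_t or vy > vy_t) else 0    # temporal flag T_Yp
        priority = 2 * t + s  # encodes the four-level eighth-step rule
        out.append(luma[priority])
    return out
```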
Fig. 6 shows example output of marking video regions of interest with the method. Taking a typical video surveillance sequence (Hall) and an indoor activity sequence (Salesman) as examples, the motion vector distribution and inter prediction mode selection results are used to mark the regions of interest: the higher the interest level of a macroblock, the brighter that position appears in the output video. The marked results in the rightmost column of Fig. 6 show that the region of interest obtained by the method has an irregular shape; compared with regions of interest obtained by traditional moving object detection with fixed-shape templates, the method's results follow the shape of the targets the human eye attends to more closely and mark the region of interest more accurately.
The method can also be combined with other fast coding techniques to reduce encoder complexity in background regions of no interest while preserving quality in the regions the human eye attends to, further cutting encoding time. It can likewise be used in H.264-based scalable coding to realize selective enhancement coding of regions of interest.

Claims (1)

1. A video region-of-interest extraction method based on encoding information, characterized by comprising the steps of:
Step 1: input a video sequence in YUV format with GOP (Group of Pictures) structure IPPP, read the luminance component Y of each coded macroblock, and configure the coding parameters;
Step 2: perform intra-frame predictive coding on the first frame (the I frame) of the video sequence;
Step 3: perform inter-frame predictive coding on the current P frame, and record the inter prediction mode type of every coded macroblock in the current P frame, denoted Mode_pn; p = 1, 2, 3, …, L−1 indexes the p-th inter-coded video frame, L is the total number of frames coded in the whole video sequence, and n is the index of the n-th coded macroblock in the current frame;
Step 4: identify the spatially salient visual feature region of the current P frame, specifically: if the inter prediction mode Mode_pn of the current coded macroblock belongs to the sub-partition mode set or the intra prediction mode set, i.e. Mode_pn ∈ {8×8, 8×4, 4×8, 4×4} or {Intra16×16, Intra4×4}, mark the macroblock S_Yp(x, y, Mode_pn) = 1, meaning it belongs to the spatially salient region; otherwise mark S_Yp(x, y, Mode_pn) = 0; Y denotes the luminance component of the coded macroblock and (x, y) its position; all coded macroblocks in the current P frame are traversed;
Step 5: record the horizontal motion vector V_xpn and vertical motion vector V_ypn of every coded macroblock in the P frame, and compute the mean horizontal motion vector V̄_x(p-1) and the mean vertical motion vector V̄_y(p-1) of all coded macroblocks in the previous coded frame; Num denotes the number of macroblocks contained in a coded frame, i.e. the number of accumulated terms;
Step 6: identify the temporally salient visual feature region of the current P frame, specifically: if the horizontal motion vector V_xpn of the current coded macroblock is greater than the previous frame's mean horizontal motion vector V̄_x(p-1), or its vertical motion vector V_ypn is greater than the previous frame's mean vertical motion vector V̄_y(p-1), the macroblock belongs to the temporally salient region and is marked T_Yp(x, y, V_xpn, V_ypn) = 1; otherwise it is marked T_Yp(x, y, V_xpn, V_ypn) = 0; all coded macroblocks in the current P frame are traversed;
Step 7: mark the video region of interest of the current P frame, specifically: traverse all coded macroblocks in the current P frame and mark each according to its spatial and temporal visual feature saliency, using the following rule:
ROI_Yp(x, y) = 3 if S_Yp(x, y, Mode_pn) = 1 and T_Yp(x, y, V_xpn, V_ypn) = 1;
ROI_Yp(x, y) = 2 if S_Yp(x, y, Mode_pn) = 0 and T_Yp(x, y, V_xpn, V_ypn) = 1;
ROI_Yp(x, y) = 1 if S_Yp(x, y, Mode_pn) = 1 and T_Yp(x, y, V_xpn, V_ypn) = 0;
ROI_Yp(x, y) = 0 if S_Yp(x, y, Mode_pn) = 0 and T_Yp(x, y, V_xpn, V_ypn) = 0;
that is, if the current coded macroblock is both spatially and temporally salient, mark ROI_Yp(x, y) = 3; if it is only temporally salient, mark ROI_Yp(x, y) = 2; if it is only spatially salient, mark ROI_Yp(x, y) = 1; if it is neither, mark ROI_Yp(x, y) = 0;
Step 8: output the coded video stream, specifically: according to the marked interest priority ROI_Yp(x, y), set the luminance component Y of every macroblock in the current P frame as follows and output the marked video stream:
Y_p(x, y) = 255 if ROI_Yp(x, y) = 3; 150 if ROI_Yp(x, y) = 2; 100 if ROI_Yp(x, y) = 1; 0 if ROI_Yp(x, y) = 0;
Step 9: return to step 3 and process the next frame, until the whole video sequence has been traversed.
CN201310591430.0A 2013-11-21 2013-11-21 Video region-of-interest extraction method based on coding information Active CN103618900B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310591430.0A CN103618900B (en) 2013-11-21 2013-11-21 Video region-of-interest extraction method based on encoding information

Publications (2)

Publication Number Publication Date
CN103618900A true CN103618900A (en) 2014-03-05
CN103618900B CN103618900B (en) 2016-08-17

Family

ID=50169604

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310591430.0A Active CN103618900B (en) Video region-of-interest extraction method based on encoding information

Country Status (1)

Country Link
CN (1) CN103618900B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH11112973A (en) * 1997-10-01 1999-04-23 Matsushita Electric Ind Co Ltd Device and method for converting video signal
CN101640802A (en) * 2009-08-28 2010-02-03 北京工业大学 Video inter-frame compression coding method based on macroblock features and statistical properties
US20120020407A1 (en) * 2010-07-20 2012-01-26 Vixs Systems, Inc. Resource adaptive video encoding system with region detection and method for use therewith
CN102510496A (en) * 2011-10-14 2012-06-20 北京工业大学 Quick size reduction transcoding method based on region of interest
CN102740073A (en) * 2012-05-30 2012-10-17 华为技术有限公司 Coding method and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LIU PENGYU; JIA KEBIN: "Fast Extraction and Coding Algorithm for Video Regions of Interest", Journal of Circuits and Systems (《电路与系统学报》) *

Cited By (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104079934A (en) * 2014-07-14 2014-10-01 武汉大学 Method for extracting regions of interest in real-time video communication
US10313692B2 (en) 2015-01-20 2019-06-04 Beijing University Of Technology Visual perception characteristics-combining hierarchical video coding method
CN104539962A (en) * 2015-01-20 2015-04-22 北京工业大学 Layered video coding method fused with visual perception features
WO2016115968A1 (en) * 2015-01-20 2016-07-28 北京工业大学 Visual perception feature-fused scaled video coding method
CN104539962B (en) * 2015-01-20 2017-12-01 北京工业大学 It is a kind of merge visually-perceptible feature can scalable video coding method
CN106331711A (en) * 2016-08-26 2017-01-11 北京工业大学 Dynamic bit rate control method based on network feature and video feature
CN106331711B (en) * 2016-08-26 2019-07-05 北京工业大学 A kind of dynamic code rate control method based on network characterization and video features
CN107371029A (en) * 2017-06-28 2017-11-21 上海大学 Video packet priority distribution method based on content
CN107371029B (en) * 2017-06-28 2020-10-30 上海大学 Video packet priority distribution method based on content
CN107563371A (en) * 2017-07-17 2018-01-09 大连理工大学 Method for dynamically searching a region of interest based on a line laser stripe
CN107563371B (en) * 2017-07-17 2020-04-07 大连理工大学 Method for dynamically searching a region of interest based on a line laser stripe
CN107483934A (en) * 2017-08-17 2017-12-15 西安万像电子科技有限公司 Decoding method, device and system
CN107623848A (en) * 2017-09-04 2018-01-23 浙江大华技术股份有限公司 A kind of method for video coding and device
CN107623848B (en) * 2017-09-04 2019-11-19 浙江大华技术股份有限公司 A kind of method for video coding and device
CN109151479A (en) * 2018-08-29 2019-01-04 南京邮电大学 Saliency extraction method based on spatio-temporal features in the H.264 compressed domain
CN109379594A (en) * 2018-10-31 2019-02-22 北京佳讯飞鸿电气股份有限公司 Video coding compression method, device, equipment and medium
CN109862356A (en) * 2019-01-17 2019-06-07 中国科学院计算技术研究所 A kind of method for video coding and system based on area-of-interest
CN109862356B (en) * 2019-01-17 2020-11-10 中国科学院计算技术研究所 Video coding method and system based on region of interest
CN110784716A (en) * 2019-08-19 2020-02-11 腾讯科技(深圳)有限公司 Media data processing method, device and medium
CN110784716B (en) * 2019-08-19 2023-11-17 腾讯科技(深圳)有限公司 Media data processing method, device and medium
CN110572579A (en) * 2019-09-30 2019-12-13 联想(北京)有限公司 image processing method and device and electronic equipment
WO2021093059A1 (en) * 2019-11-15 2021-05-20 网宿科技股份有限公司 Method, system and device for recognizing region of interest
CN111079567A (en) * 2019-11-28 2020-04-28 中科驭数(北京)科技有限公司 Sampling method, model generation method, video behavior identification method and device
CN111079567B (en) * 2019-11-28 2020-11-13 中科驭数(北京)科技有限公司 Sampling method, model generation method, video behavior identification method and device
WO2022127865A1 (en) * 2020-12-18 2022-06-23 中兴通讯股份有限公司 Video processing method, apparatus, electronic device, and storage medium

Also Published As

Publication number Publication date
CN103618900B (en) 2016-08-17

Similar Documents

Publication Publication Date Title
CN103618900A (en) Video region-of-interest extraction method based on encoding information
Zhao et al. Real-time moving object segmentation and classification from HEVC compressed surveillance video
CN104378643B (en) 3D video depth image intra-frame prediction mode selection method and system
KR101823537B1 (en) Method of identifying relevant areas in digital images, method of encoding digital images, and encoder system
CN102186070B (en) Fast video coding method using hierarchical-structure prediction
CN101783957B (en) Method and device for predictive encoding of video
CN1265321C (en) Method of and system for detecting cartoon in video data stream
CN100499813C (en) Advanced video coding (AVC) intra-frame prediction system and method
WO2016173277A1 (en) Video coding and decoding methods and apparatus
CN111355956B (en) Deep learning-based rate distortion optimization rapid decision system and method in HEVC intra-frame coding
CN102724554B (en) Scene-segmentation-based semantic watermark embedding method for video resource
JP7213662B2 (en) Image processing device, image processing method
CN104065962A (en) Macroblock layer bit allocation optimization method based on visual attention
CN104093021A (en) Monitoring video compression method
US20150264357A1 (en) Method and system for encoding digital images, corresponding apparatus and computer program product
CN101478675A (en) Semantic events detection method and system in video
CN105872556B (en) Video encoding method and apparatus
CN106331730B (en) H.264 video same quantization factor double compression detection method
CN103051891A (en) Method and device for determining a saliency value of a block of a video frame block-wise predictive encoded in a data stream
CN101277447A (en) Method for rapidly predicting frame space of aerophotographic traffic video
CN102984524B (en) Video coding and decoding method based on block-layer decomposition
KR102090775B1 (en) method of providing extraction of moving object area out of compressed video based on syntax of the compressed video
US8644388B2 (en) Method and device for approximating a DC coefficient of a block of pixels of a frame
CN100499739C (en) Real-time scene change detection method in the compressed domain
Tong et al. Encoder combined video moving object detection

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20240130

Address after: 073099 Room 309, 3rd Floor, Commercial and Residential Building B, Xinhai Science and Technology Plaza, East Side of Beimen Street and South Side of Beimen Street Market, Dingzhou City, Baoding City, Hebei Province

Patentee after: HEBEI HONGYI ENVIRONMENTAL PROTECTION TECHNOLOGY Co.,Ltd.

Country or region after: China

Address before: 100124 No. 100 Chaoyang District Ping Tian Park, Beijing

Patentee before: Beijing University of Technology

Country or region before: China