CN103716640B - Method and device for detecting frame type - Google Patents

Info

Publication number
CN103716640B
Authority
CN
China
Prior art keywords
frame
type
present
data volume
packet loss
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201310664666.2A
Other languages
Chinese (zh)
Other versions
CN103716640A (en)
Inventor
沈秋
谢清鹏
张冬
李厚强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Science and Technology of China USTC
Huawei Technologies Co Ltd
Original Assignee
University of Science and Technology of China USTC
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Science and Technology of China USTC, Huawei Technologies Co Ltd
Priority to CN201310664666.2A
Priority claimed from CN201010594322.5A (CN102547300B)
Publication of CN103716640A
Application granted
Publication of CN103716640B
Legal status: Active

Landscapes

  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

An embodiment of the invention discloses a method and a device for detecting a frame type. The method comprises: obtaining the coding type of the code stream to which a received frame belongs, the coding type being either open-loop coding or closed-loop coding; if the data volume of the current frame is larger than a first threshold, determining that the current frame is an obvious intra-coded frame (I frame); if the frame preceding the current frame is an I frame, the coding type is closed-loop coding and the current frame is not an obvious I frame, or if the frame preceding the current frame is an I frame, the coding type is open-loop coding and the data volume of the current frame is larger than a fourth threshold, determining that the current frame is a unidirectionally predicted frame (P frame); and if the current frame is neither an I frame nor a P frame, determining that the current frame is a B frame. The technical scheme combines the relationship between the data volumes of preceding and following frames of different types and judges the frame type without decoding the payload, thereby eliminating the influence of an attenuation factor and improving the accuracy of frame type detection.

Description

Method and device for detecting frame type
Technical field
The present invention relates to the technical field of video processing, and in particular to a method and a device for detecting a frame type.
Background
Decodable data frames in video coding standards can be divided into intra-coded frames (I frames), unidirectionally predicted frames (P frames) and bidirectionally predicted frames (B frames). In video applications, an I frame is a point at which decoding can start and is commonly called a random access point; it supports services such as random access and fast browsing. During transmission, errors in different frame types affect the subjective quality at the decoder differently. An I frame stops error propagation, so an error in an I frame has a large effect on the decoding quality of the whole video. A P frame often serves as a reference frame for other inter-coded frames, so its importance is second to that of the I frame. A B frame is usually not used as a reference frame, so its loss has less effect on decoding quality. Distinguishing the frame types of a data stream is therefore very important in video transmission applications. For example, frame type is an important parameter in video quality assessment, and the accuracy of frame type judgment directly affects the accuracy of the assessment result. Frames of different types can also be given unequal protection to achieve effective video transmission; in addition, to save transmission resources, frames that have little effect on subjective quality can be discarded when bandwidth is insufficient.
Commonly used streaming technologies are mainly the Internet Streaming Media Alliance (ISMA) mode and the Moving Picture Experts Group-2 transport stream over Internet Protocol (MPEG-2 TS over IP) mode. When encapsulating a compressed video data stream, both protocols provide indication bits that can indicate the video data type. In the ISMA mode, the compressed video data stream is encapsulated directly with the Real-time Transport Protocol (RTP): MPEG-4 Part 2 follows Request for Comments 3016 (RFC 3016), and H.264/Advanced Video Coding (AVC) follows RFC 3984. Taking RFC 3984 as an example, the sequence number and timestamp contained in the RTP header can be used to detect lost frames and to help detect the frame type. The MPEG-2 TS over IP mode has two variants: transport stream over User Datagram Protocol/IP (TS over UDP/IP) and transport stream over RTP/UDP/IP (TS over RTP/UDP/IP). The latter, hereinafter abbreviated as TS over RTP, is more common in video transmission: the compressed video data stream is packed into an elementary stream, the elementary stream is divided into TS packets, and the TS packets are finally encapsulated with RTP and transmitted.
RTP is a transport protocol for multimedia data streams that provides end-to-end real-time data transmission. An RTP message consists of four parts: the RTP header, the RTP extension header, the payload header and the payload data. The RTP header mainly contains the sequence number, the timestamp and the marker bit. Sequence numbers correspond one-to-one with RTP packets and increase by 1 with each packet sent, so they can be used to detect packet loss. The timestamp represents the sampling time of the video data; different frames have different timestamps, which indicate the playing order of the video data. The marker bit identifies the end of a frame. This information is an important basis for frame type judgment.
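As an illustration of how these header fields can be read, the following Python sketch extracts the marker bit, sequence number and timestamp from the 12-byte fixed RTP header and counts the packets missing between two received sequence numbers. The names RtpHeader, parse_rtp_header and lost_between are hypothetical and are not part of the described method; the sketch assumes RTP version 2 and ignores CSRC entries and extension headers.

```python
import struct
from typing import NamedTuple


class RtpHeader(NamedTuple):
    marker: bool      # marker bit, set on the packet that ends a frame
    sequence: int     # 16-bit sequence number, +1 for each packet sent
    timestamp: int    # 32-bit sampling time of the carried video data


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Read the frame-detection fields from the 12-byte fixed RTP header."""
    if len(packet) < 12:
        raise ValueError("packet shorter than the fixed RTP header")
    b0, b1, seq, ts, _ssrc = struct.unpack("!BBHII", packet[:12])
    if b0 >> 6 != 2:
        raise ValueError("not RTP version 2")
    return RtpHeader(marker=bool(b1 & 0x80), sequence=seq, timestamp=ts)


def lost_between(prev_seq: int, cur_seq: int) -> int:
    """Packets missing between two received sequence numbers (modulo 2**16)."""
    return (cur_seq - prev_seq - 1) % 65536
```

A gap reported by lost_between only shows that packets are missing; deciding which frame the missing data belongs to needs the marker bit and timestamps of the neighbouring packets.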
A TS packet has 188 bytes and consists of a packet header, a variable-length adaptation field and payload data. The payload unit start indicator (PUSI) in the packet header indicates whether the payload data contains a packetized elementary stream (PES) packet header or program specific information (PSI). For the H.264 media format, each PES packet header indicates the start of a NAL unit. Some flag bits in the adaptation field of a TS packet, such as the random access indicator and the elementary stream priority indicator, can be used to judge the importance of the transmitted content. For video, a random access indicator of 1 means that the next PES packet encountered contains sequence start information, and an elementary stream priority indicator of 1 means that the payload of this TS packet contains a relatively large amount of intra-block data.
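The fields named above sit at fixed positions in the TS packet, so they can be read without touching the payload. The following Python sketch shows one way to do so; TsHeader and parse_ts_header are hypothetical names, and the bit positions follow the standard MPEG-2 transport stream packet layout rather than anything specific to this document.

```python
from typing import NamedTuple, Optional

TS_PACKET_SIZE = 188
SYNC_BYTE = 0x47


class TsHeader(NamedTuple):
    pusi: bool                      # payload_unit_start_indicator
    pid: int                        # 13-bit packet identifier
    scrambled: bool                 # transport_scrambling_control is non-zero
    continuity_counter: int         # 4-bit counter, usable for loss detection
    random_access: Optional[bool]   # None when no adaptation flags are present
    es_priority: Optional[bool]     # None when no adaptation flags are present


def parse_ts_header(packet: bytes) -> TsHeader:
    """Read header and adaptation-field flags of one 188-byte TS packet."""
    if len(packet) != TS_PACKET_SIZE or packet[0] != SYNC_BYTE:
        raise ValueError("not a 188-byte TS packet")
    pusi = bool(packet[1] & 0x40)
    pid = ((packet[1] & 0x1F) << 8) | packet[2]
    scrambled = (packet[3] & 0xC0) != 0
    has_adaptation = bool(packet[3] & 0x20)
    cc = packet[3] & 0x0F
    rai = espi = None
    if has_adaptation and packet[4] > 0:    # packet[4] is adaptation_field_length
        flags = packet[5]
        rai = bool(flags & 0x40)            # random_access_indicator
        espi = bool(flags & 0x20)           # elementary_stream_priority_indicator
    return TsHeader(pusi, pid, scrambled, cc, rai, espi)
```

Because the header and adaptation field are not scrambled, these flags remain readable even when the payload is encrypted.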
If the PUSI shows that the payload of a TS packet contains a PES packet header, further information useful for transmission can be extracted. A PES packet consists of a PES packet header followed by packet data; the original stream data (video, audio and so on) is carried in the PES packet data. PES packets are inserted into transport stream packets, and the first byte of each PES packet header is the first byte of a transport stream packet payload. That is, a PES packet header must start in a new TS packet, and the PES packet data must fill the payload area of the TS packet; if the end of the PES packet data cannot be aligned with the end of a TS packet, a corresponding number of stuffing bytes is inserted into the adaptation field of the TS packet so that the two ends align. The PES priority indicates the importance of the payload in the PES packet data; for video, a value of 1 indicates intra data. In addition, the PTS represents the display time and the DTS represents the decoding time; they can be used to judge the dependency between preceding and following video payload content and thus the payload type.
In the TS over RTP mode, payloads are often encrypted during transmission to protect the copyright of the video content. Encryption of a TS packet applies to the payload part of the packet: once the scrambling flag in the TS header is set to 1, the payload is encrypted. In this case the payload data type can only be judged from the length of the data packets with the same PID between adjacent PUSIs (the length of one video frame). If the PES header in the TS packet is not encrypted, the PTS can be used to assist frame type judgment in addition to the video frame length.
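A minimal sketch of measuring frame length from encrypted TS packets is given below, assuming that each PUSI-marked packet of the video PID starts a new frame and that the caller supplies each packet as a hypothetical (pid, pusi, payload_len) tuple. The function name frame_sizes is likewise only illustrative.

```python
from typing import Iterable, List, Optional, Tuple


def frame_sizes(packets: Iterable[Tuple[int, bool, int]], video_pid: int) -> List[int]:
    """Sum payload bytes between consecutive PUSI-marked packets of one PID."""
    sizes: List[int] = []
    current: Optional[int] = None   # stays None until the first PUSI is seen
    for pid, pusi, payload_len in packets:
        if pid != video_pid:
            continue                # other programs and audio are ignored
        if pusi:
            if current is not None:
                sizes.append(current)   # the previous frame is complete
            current = 0
        if current is not None:
            current += payload_len
    return sizes                    # the trailing, still-open frame is not reported
```

Packets that arrive before the first PUSI are skipped because their frame start was never seen, and the last frame is withheld until the next PUSI confirms where it ends.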
As introduced above, the data volumes of different frame types differ. An I frame removes only intra-frame redundancy, so its data volume is generally larger than that of inter-coded frames, which also remove inter-frame redundancy, and the data volume of a P frame is generally larger than that of a B frame. Based on this characteristic, some existing frame type detection algorithms for encrypted TS packets judge the frame type from the frame data volume. Two of the more widely used methods are introduced below.
Method one: parse the TS packets to obtain the length of each video frame, and infer the frame type from the length information. This method has been proposed for determining the frame type when the payload part of the TS packets is encrypted.
The method judges packet loss by parsing the continuity counter field of the TS packets, estimates the state of the lost packets from the structure of the group of pictures (GOP) determined before the current judgment, and judges the type of the video frame in combination with the information available in the adaptation field of the TS packet header (the random access indicator, RAI, or the elementary stream priority indicator, ESPI).
An I frame can be identified in the following three ways:
1. Identify the I frame using the RAI or ESPI.
2. When the I frame cannot be identified using the RAI or ESPI, buffer the data of one GOP and take the largest frame in the currently buffered data as the I frame. The GOP length must be defined in advance; once the GOP length changes, the method fails.
3. Use a value representing the maximum GOP length as the determination period for I frames, and determine the frame with the largest data volume within the period to be the I frame; the determination period is the maximum of the I-frame periods already detected.
A P frame can be identified in the following three ways:
1. Among the frames from the start frame to the frame immediately before the I frame, determine each frame whose data volume is larger than that of the surrounding frames to be a P frame. For the determination frame patterns contained in the GOP structure of the target stream, select from the determination period the consecutive frames corresponding to the N determination frame patterns as determination target frames, compare the magnitude relationship between the data volumes of the determination target frames with the determination frame patterns, and determine P frames based on the match between them. In the GOP structure, the following pattern is used as a determination frame pattern: the pattern includes all consecutive B frames before a P frame and one B frame following the P frame. Some information about the GOP must be input in advance.
2. Compare a threshold, calculated from the average frame data volume of multiple frames at predetermined positions in the pattern, with the frame data volume of each frame in the pattern.
3. Adjust the threshold used to distinguish P frames from B frames by means of an adjustment coefficient based on the frame data volume. The adjustment coefficient is obtained as follows: provisional adjustment coefficients are selected in turn from a given range and the same processing as the frame type determination is executed, so that the frame type of each frame in a previously given period is estimated; the erroneous determination ratio between the estimated results and the actual frame types obtained from the unencrypted stream is calculated; and the provisional adjustment coefficient with the lowest erroneous determination ratio is taken as the real adjustment coefficient.
For B frames, the judgment method is: frames other than I frames and P frames are determined to be B frames.
When packets are lost, the above method can detect the loss from the RTP sequence number and the continuity counter (CC) in the TS header, and can estimate the state of the lost packets by pattern matching against the GOP structure, thereby achieving a certain degree of correction. However, the variant with a fixed threshold requires GOP information to be input in advance, and the variant with an adjustable threshold requires frame type information from an unencrypted code stream to train the coefficient, so too much manual intervention is needed. In addition, a whole GOP must be buffered before the frame type is estimated, which is unsuitable for real-time applications. Furthermore, the I-frame judgment is performed only once, the only adjustable coefficient is the period, and the largest frame in each period is directly taken as the I frame; only local characteristics are considered and global characteristics are ignored.
Method two: distinguish the frame types with thresholds, in four steps:
1. Update the thresholds.
Threshold for distinguishing I frames (ithresh):
scaled_max_iframe = scaled_max_iframe * 0.995, where scaled_max_iframe is the size of the previous I frame.
If nbytes > scaled_max_iframe,
then ithresh = (scaled_max_iframe/4 + av_nbytes*2)/2, where av_nbytes is the sliding average of the current 8 frames.
Threshold for distinguishing P frames (pthresh):
scaled_max_pframe = scaled_max_pframe * 0.995, where scaled_max_pframe is the size of the previous P frame.
If nbytes > scaled_max_pframe, then pthresh = av_nbytes*0.75.
2. Detect I frames: a video has an I frame at regular intervals, an I frame is larger than the average, and an I frame is larger than a P frame. If the data volume of the current frame is larger than ithresh, the frame is considered an I frame.
3. Detect P frames, using the fact that a B frame is smaller than the average: if the data volume of the current frame is larger than pthresh but smaller than ithresh, the frame is considered a P frame.
4. All other frames are B frames.
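The four steps of this prior-art method can be sketched in Python as follows. The formulas and the factor 0.995 are taken from the description; everything else is an assumption made for illustration: nbytes is read as the data volume of the current frame, the remembered sizes are reset when a frame is classified as I or P, the initial state values are arbitrary, and the names ThresholdState and classify are hypothetical.

```python
from collections import deque
from dataclasses import dataclass, field

DECAY = 0.995  # fixed attenuation factor given in the description


@dataclass
class ThresholdState:
    scaled_max_iframe: float = 0.0     # decayed size of the previous I frame
    scaled_max_pframe: float = 0.0     # decayed size of the previous P frame
    ithresh: float = float("inf")      # assumption: no threshold until first update
    pthresh: float = float("inf")
    window: deque = field(default_factory=lambda: deque(maxlen=8))


def classify(state: ThresholdState, nbytes: int) -> str:
    state.window.append(nbytes)
    av_nbytes = sum(state.window) / len(state.window)   # average of up to 8 frames

    # Step 1: decay the remembered sizes and refresh the thresholds.
    state.scaled_max_iframe *= DECAY
    if nbytes > state.scaled_max_iframe:
        state.ithresh = (state.scaled_max_iframe / 4 + av_nbytes * 2) / 2
    state.scaled_max_pframe *= DECAY
    if nbytes > state.scaled_max_pframe:
        state.pthresh = av_nbytes * 0.75

    # Steps 2 to 4: compare the frame size with the thresholds.
    if nbytes > state.ithresh:
        state.scaled_max_iframe = nbytes   # assumption: remember the last I size
        return "I"
    if nbytes > state.pthresh:
        state.scaled_max_pframe = nbytes   # assumption: remember the last P size
        return "P"
    return "B"
```

In this sketch a later I frame that is much smaller than scaled_max_iframe leaves ithresh unchanged until the decay catches up, which is the weakness the description goes on to discuss.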
The second method controls the thresholds with an attenuation factor, which directly affects the I-frame judgment. When a subsequent I frame is larger than the current I frame, the I frame is easily detected; but when a subsequent I frame is much smaller than the current I frame, the threshold must decay over many frames before an I frame can be detected again. The factor is fixed at 0.995 in the algorithm, which does not account for sharp changes between GOPs and is unsuitable in many cases. The smaller the attenuation factor, the lower the I-frame miss rate, but the higher the probability that a P frame is misjudged as an I frame; the larger the attenuation factor, the higher the I-frame miss rate (when I-frame sizes vary sharply within a sequence), with I frames being judged as P frames. The detection accuracy is therefore low. In addition, B and P frames are distinguished only by a threshold; for a frame structure of I/P/P/P..., the algorithm misjudges many P frames as B frames, so the misjudgment rate is high.
Summary of the invention
The technical problem to be solved by the embodiments of the present invention is to provide a method and a device for detecting a frame type that improve the accuracy of frame type detection.
To solve the above technical problem, the frame type detection method provided by the present invention can be implemented by the following technical scheme:
Detecting the play time of each frame;
If the play time of the current frame is less than the maximum play time of the frames already received, determining that the current frame is a bidirectionally predicted frame (B frame).
A method for detecting a frame type, comprising:
Obtaining the coding type of the code stream to which the received frames belong, the coding type including open-loop coding and closed-loop coding;
If the data volume of the current frame is greater than a first threshold, determining that the current frame is an obvious intra-coded frame (I frame), the first threshold being calculated from the average data volume of a set number of consecutive frames and the I-frame data volume;
If the frame preceding the current frame is an I frame, the coding type is closed-loop coding and the current frame is not an obvious I frame, or if the frame preceding the current frame is an I frame, the coding type is open-loop coding and the data volume of the current frame is greater than a fourth threshold, determining that the current frame is a unidirectionally predicted frame (P frame), the fourth threshold being the mean of the average P-frame data volume and the average B-frame data volume of a group of pictures;
If the current frame is neither an I frame nor a P frame, determining that the current frame is a B frame.
A device for detecting a frame type, comprising:
A time detecting unit, configured to detect the play time of each frame;
A frame type determining unit, configured to determine that the current frame is a bidirectionally predicted B frame if the play time of the current frame is less than the maximum play time of the frames already received.
A device for detecting a frame type, comprising:
A type obtaining unit, configured to obtain the coding type of the code stream to which the received frames belong, the coding type including open-loop coding and closed-loop coding;
A frame type determining unit, configured to determine that the current frame is an obvious I frame if the data volume of the current frame is greater than a first threshold, the first threshold being calculated from the average data volume of a set number of consecutive frames and the I-frame data volume;
To determine that the current frame is a P frame if the frame preceding the current frame is an I frame, the coding type is closed-loop coding and the current frame is not an obvious I frame, or if the frame preceding the current frame is an I frame, the coding type is open-loop coding and the data volume of the current frame is greater than a fourth threshold, the fourth threshold being the mean of the average P-frame data volume and the average B-frame data volume of a group of pictures;
And to determine that the current frame is a B frame if the current frame is neither an I frame nor a P frame.
The technical scheme provided by the embodiments of the present invention combines the coding order of the different frame types with the relationship between the data volumes of preceding and following frames of different types, and judges the frame type without decoding the payload, which eliminates the influence of the attenuation factor and improves the accuracy of frame type detection.
Brief description of the drawings
To illustrate the technical schemes of the embodiments of the present invention more clearly, the drawings required for describing the embodiments are briefly introduced below. Evidently, the drawings described below show only some embodiments of the present invention, and those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1a is a schematic flowchart of a method according to an embodiment of the present invention;
Fig. 1b is a schematic flowchart of a method according to an embodiment of the present invention;
Fig. 2a is a schematic diagram of a hierarchical B-frame coding structure according to an embodiment of the present invention;
Fig. 2b is a schematic diagram of the relationship between coding order and playing order, and of the coding levels, according to an embodiment of the present invention;
Fig. 3 is a schematic diagram of a frame structure with packet loss according to an embodiment of the present invention;
Fig. 4 is a schematic flowchart of a method according to an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of a device according to an embodiment of the present invention;
Fig. 6 is a schematic structural diagram of a device according to an embodiment of the present invention;
Fig. 7 is a schematic structural diagram of a device according to an embodiment of the present invention;
Fig. 8 is a schematic structural diagram of a device according to an embodiment of the present invention;
Fig. 9 is a schematic structural diagram of a device according to an embodiment of the present invention;
Fig. 10 is a schematic diagram of detection results according to an embodiment of the present invention;
Fig. 11 is a schematic diagram of detection results according to an embodiment of the present invention;
Fig. 12 is a schematic diagram of detection results according to an embodiment of the present invention;
Fig. 13 is a schematic diagram of detection results according to an embodiment of the present invention;
Fig. 14 is a schematic diagram of detection results according to an embodiment of the present invention;
Fig. 15 is a schematic diagram of detection results according to an embodiment of the present invention.
Detailed description of the embodiments
The technical schemes in the embodiments of the present invention are described clearly and completely below with reference to the drawings. Evidently, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on these embodiments without creative effort fall within the protection scope of the present invention.
A method for detecting a frame type, as shown in Fig. 1a, comprises:
101a: Detect the play time of each frame;
102a: If the play time of the current frame is less than the maximum play time of the frames already received, determine that the current frame is a bidirectionally predicted B frame;
Further, the embodiment may also determine, from the playing order and coding order of each frame, the level to which a B frame belongs in hierarchical coding. How the level is determined is described further below. Given the characteristics of B frames, knowing the level of a B frame is useful in many fields; for example, when compressing data frames, B frames at a high level can be discarded. The embodiments of the present invention place no limitation on how the level of a B frame is applied once it has been determined.
The above embodiment combines the coding order of the different frame types with the relationship between the data volumes of preceding and following frames of different types, and judges the frame type without decoding the payload, which eliminates the influence of the attenuation factor and improves the accuracy of frame type detection.
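Steps 101a and 102a can be illustrated with the short Python sketch below. It takes the play times (for example RTP timestamps or PTS values) of frames in the order they are received and marks as B every frame whose play time is smaller than the largest play time seen so far. The function name mark_b_frames is hypothetical, and timestamp wrap-around is deliberately ignored in this sketch.

```python
from typing import Iterable, List, Optional


def mark_b_frames(play_times: Iterable[int]) -> List[Optional[str]]:
    """Label frames given in arrival (coding) order.

    A frame is labelled "B" when its play time is below the maximum play time
    already received; other frames stay None because this rule alone cannot
    tell I frames from P frames.
    """
    labels: List[Optional[str]] = []
    max_seen: Optional[int] = None
    for t in play_times:
        if max_seen is not None and t < max_seen:
            labels.append("B")
        else:
            labels.append(None)
            max_seen = t
    return labels
```

For a stream coded as I0 P3 B1 B2 P6 B4 B5, the play times arrive as 0, 3, 1, 2, 6, 4, 5 and the four frames that play before an already received frame are labelled B.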
The embodiments of the present invention also provide another method for detecting a frame type, as shown in Fig. 1b, comprising:
101b: Obtain the coding type of the code stream to which the received frames belong, the coding type including open-loop coding and closed-loop coding;
102b: If the data volume of the current frame is greater than a first threshold, determine that the current frame is an obvious I frame; the first threshold is calculated from the average data volume of a set number of consecutive frames and the I-frame data volume;
An obvious I frame is an I frame. If a frame is judged to be an obvious I frame, the probability of a wrong judgment is very low, but I frames may be missed; the other ways of judging I frames described later may misjudge frames as I frames.
If the frame preceding the current frame is an I frame, the coding type is closed-loop coding and the current frame is not an obvious I frame (the frame type of the current frame is not yet known at this point, but whether it is an obvious I frame can already be determined), or if the frame preceding the current frame is an I frame, the coding type is open-loop coding and the data volume of the current frame is greater than a fourth threshold, determine that the current frame is a P frame; the fourth threshold is the mean of the average P-frame data volume and the average B-frame data volume of a group of pictures;
If the current frame is neither an I frame nor a P frame, determine that the current frame is a B frame.
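The decision order of the Fig. 1b method can be sketched as a single Python function. The first threshold is passed in as a parameter because its exact formula is not given at this point in the description; the function name classify_frame and the argument names are hypothetical.

```python
def classify_frame(size: float, prev_type: str, closed_loop: bool,
                   first_threshold: float, p_avg: float, b_avg: float) -> str:
    """Apply the I, P, B decision order of the Fig. 1b method to one frame."""
    # Fourth threshold: mean of the average P-frame and B-frame data volumes.
    fourth_threshold = (p_avg + b_avg) / 2

    if size > first_threshold:
        return "I"                     # obvious I frame
    if prev_type == "I":
        if closed_loop:
            return "P"                 # follows an I frame, not an obvious I frame
        if size > fourth_threshold:
            return "P"                 # open-loop coding needs the size check
    return "B"                         # neither I nor P
```

Read literally, these steps label a frame P only when it directly follows an I frame; the further rules described for the case without play time extend the P and B decisions to other positions.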
It should be noted that the method of Fig. 1b may be applied independently or in combination with the method of Fig. 1a; when used in combination, it may serve as the implementation for the case where the play time cannot be detected in Fig. 1a.
Obtaining the coding type of the code stream to which the received frames belong includes:
Counting the types of the frames that immediately follow obvious I frames; if the proportion of P frames reaches a set ratio, the coding type is determined to be closed-loop coding, otherwise it is determined to be open-loop coding.
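A minimal sketch of this statistic follows. It assumes the caller already has a list of detected frame types in which "I" stands for an obvious I frame; the default set_ratio of 0.5 is purely illustrative because the description does not state a value, and detect_coding_type is a hypothetical name.

```python
from typing import Optional, Sequence


def detect_coding_type(frame_types: Sequence[str], set_ratio: float = 0.5) -> Optional[str]:
    """Return "closed" or "open" from the frames that directly follow I frames."""
    followers = [nxt for cur, nxt in zip(frame_types, frame_types[1:]) if cur == "I"]
    if not followers:
        return None                    # no I frame with a successor seen yet
    p_ratio = sum(1 for t in followers if t == "P") / len(followers)
    return "closed" if p_ratio >= set_ratio else "open"
```

Returning None when no statistics exist leaves the choice of a start-up default to the caller.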
The following embodiments are described taking the combined use of the Fig. 1b scheme and the Fig. 1a scheme as an example; if the Fig. 1b scheme is used independently, there is no need to check whether the play time can be detected.
Further, if the play time cannot be detected in 101a, the method embodiment also includes:
If the data volume of the current frame is greater than a second threshold, determine that the current frame is an I frame; the second threshold is the maximum of the data volume of the I frame preceding the current frame, the average data volume of the P frames in the group of pictures containing the current frame, and the average data volume of a set number of consecutive frames.
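The second threshold is a plain maximum over three quantities, which the following Python sketch computes. The function and argument names are hypothetical, and skipping an empty list is an assumption for the start of a stream, when no P frames or recent frames are available yet.

```python
from typing import Sequence


def second_threshold(prev_i_size: float,
                     gop_p_sizes: Sequence[float],
                     recent_sizes: Sequence[float]) -> float:
    """Maximum of the previous I size, the GOP's P average and a recent-frame average."""
    candidates = [prev_i_size]
    if gop_p_sizes:
        candidates.append(sum(gop_p_sizes) / len(gop_p_sizes))
    if recent_sizes:
        candidates.append(sum(recent_sizes) / len(recent_sizes))
    return max(candidates)


def exceeds_second_threshold(size: float, prev_i_size: float,
                             gop_p_sizes: Sequence[float],
                             recent_sizes: Sequence[float]) -> bool:
    """True when the current frame would be judged an I frame by this rule."""
    return size > second_threshold(prev_i_size, gop_p_sizes, recent_sizes)
```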
Further, if the play time cannot be detected in 101a, the method embodiment also includes:
If the data volume of the current frame is greater than a third threshold and the interval between the current frame and the previous I frame exceeds the fixed interval, determine that the current frame is an I frame. The third threshold is calculated from the average data volume of the frames in the group of pictures containing the current frame, the degree to which the distance from the previous I frame to the current frame deviates from the expected fixed I-frame interval, the data volume of the P frame preceding the current frame, and the data volume of the I frame of the group of pictures containing the current frame; or, the third threshold is calculated from the average data volume of the frames in the group of pictures containing the current frame and the degree to which the distance from the previous I frame to the current frame deviates from the expected fixed I-frame interval.
Further, if the play time cannot be detected in 101a, the method embodiment also includes:
If the frame preceding the current frame is a P frame and the data volume of the current frame is greater than a fifth threshold, or the current group of pictures contains B frames and the data volume of the current frame is greater than a sixth threshold, determine that the current frame is a P frame. The fifth threshold is the product of a first adjustment factor and the average data volume of the P frames in the group of pictures containing the current frame, where the first adjustment factor is greater than 0.5 and less than 1. The sixth threshold is the mean of the average P-frame data volume and the average B-frame data volume;
If the frame preceding the current frame is a B frame and the data volume of the current frame is less than a seventh threshold, or the current group of pictures contains P frames and the data volume of the current frame is less than an eighth threshold, determine that the current frame is a B frame. The seventh threshold is the product of a second adjustment factor and the average data volume of the B frames in the group of pictures containing the current frame, where the second adjustment factor is greater than 1 and less than 1.5. The eighth threshold is the mean of the average P-frame data volume and the average B-frame data volume.
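The fifth to eighth thresholds can be sketched as follows. The sketch treats the second rule, in which the frame is smaller than the seventh or eighth threshold, as the B-frame case; that is an interpretation, since a frame below the midpoint of the P and B averages cannot sensibly be a P frame. The default factors k1 = 0.75 and k2 = 1.25 are hypothetical values chosen inside the stated ranges, and classify_p_or_b is an illustrative name.

```python
from typing import Optional, Sequence


def _mean(values: Sequence[float]) -> Optional[float]:
    return sum(values) / len(values) if values else None


def classify_p_or_b(size: float, prev_type: str,
                    gop_p_sizes: Sequence[float], gop_b_sizes: Sequence[float],
                    k1: float = 0.75, k2: float = 1.25) -> Optional[str]:
    """Return "P", "B", or None when none of the four threshold rules applies."""
    p_avg, b_avg = _mean(gop_p_sizes), _mean(gop_b_sizes)
    both = p_avg is not None and b_avg is not None
    midpoint = (p_avg + b_avg) / 2 if both else None    # sixth and eighth thresholds

    if prev_type == "P" and p_avg is not None and size > k1 * p_avg:    # fifth
        return "P"
    if gop_b_sizes and midpoint is not None and size > midpoint:        # sixth
        return "P"
    if prev_type == "B" and b_avg is not None and size < k2 * b_avg:    # seventh
        return "B"
    if gop_p_sizes and midpoint is not None and size < midpoint:        # eighth
        return "B"
    return None
```

The order in which the four rules are tried is also an assumption; the description lists them without stating a priority.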
Further, if the play time cannot be detected in 101a, the method embodiment also includes:
After the frame type judgment ends, determine the fixed interval of I frames. If no I frame has been found by the time the fixed interval is reached, determine the frame with the largest data volume within a set range around the fixed interval to be an I frame, and update the average data volumes of the various frame types in the group of pictures and the I-frame interval parameter.
Further, if the play time cannot be detected in 101a, the method embodiment also includes:
After the frame type judgment ends, count the consecutive B frames. If the number of consecutive B frames is greater than a predicted value, determine the frame with the largest data volume among these consecutive B frames to be a P frame, and update the average data volumes of the various frame types in the group of pictures. The predicted value is greater than or equal to 3 and less than or equal to 7.
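This correction step can be sketched as a pass over the already judged frame types. The sketch relabels only the largest frame of each over-long run of B frames and returns a corrected copy; updating the per-type averages is left to the caller. The function name correct_long_b_runs and the default predicted value of 5 are hypothetical.

```python
from typing import List, Optional, Sequence


def correct_long_b_runs(types: Sequence[str], sizes: Sequence[int],
                        predicted: int = 5) -> List[str]:
    """Relabel the largest frame of each over-long run of B frames as P."""
    if not 3 <= predicted <= 7:
        raise ValueError("predicted value must lie between 3 and 7")
    fixed = list(types)
    run_start: Optional[int] = None
    for i in range(len(fixed) + 1):            # one extra step closes a final run
        is_b = i < len(fixed) and fixed[i] == "B"
        if is_b:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            if i - run_start > predicted:      # run is longer than predicted
                largest = max(range(run_start, i), key=lambda j: sizes[j])
                fixed[largest] = "P"
            run_start = None
    return fixed
```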
Further, if the play time cannot be detected in 101a, the method embodiment also includes:
Determine whether packet loss has occurred in the frames already received and, if so, determine the packet loss type;
If the packet loss type is intra-frame packet loss, then when calculating the frame data volume, take the sum of the data volume of the received part of the frame and the lost data volume as the data volume of the frame;
If the packet loss type is inter-frame packet loss, determine whether the marker bit of the packet preceding the loss is 1; if so, count the lost data volume toward the following frame; otherwise, divide the lost data volume equally between the preceding and following frames.
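The two allocation rules can be written down directly, as in the sketch below. It assumes that an estimate of the lost data volume is already available (how that estimate is obtained is not covered here), and the function names are hypothetical.

```python
from typing import Tuple


def frame_size_with_intra_loss(received: float, lost: float) -> float:
    """Intra-frame loss: the frame's data volume is received plus lost data."""
    return received + lost


def assign_boundary_loss(prev_size: float, next_size: float, lost: float,
                         marker_before_loss: bool) -> Tuple[float, float]:
    """Inter-frame loss: decide which frame the lost data volume belongs to.

    If the packet just before the gap carried marker bit 1, the previous frame
    was complete, so all lost data is counted toward the following frame.
    Otherwise the boundary is unknown and the loss is split equally.
    """
    if marker_before_loss:
        return prev_size, next_size + lost
    half = lost / 2
    return prev_size + half, next_size + half
```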
Further, determining the packet loss type includes:
Predicting the coding structure from statistics of the frame types already detected;
If the packet loss type is inter-frame packet loss and the marker bit of the packet preceding the loss cannot be detected, dividing the current data length according to the predicted coding structure and the position of the loss.
The embodiments of the present invention make full use of the header information of RTP or TS over RTP and combine the coding order of the different frame types in the video with the relationship between the data volumes of preceding and following frames of different types. The frame type is judged quickly and in real time without decoding the video payload, and the accuracy of frame type detection is improved by packet loss handling, automatic parameter updating and later frame type correction.
A video stream contains header information that indicates the play time of the video data, such as the RTP timestamp in the ISMA mode and the PTS in the PES header in the TS over RTP mode. The embodiments use the relationship between play time information and coding order to judge the coding type of certain special structures, such as B frames. In the TS over RTP mode, however, the TS payload may be fully encrypted so that the PES header cannot be decoded, that is, the PTS is unavailable. The embodiments therefore also provide a scheme that judges the frame type using only information such as data volume, without the play time.
Observe the video code flow in practical application it is found that different types of frame typically has more in same gop Significantly distinguish, i frame data amount is maximum, secondly, b frame is minimum for p frame.If the i frame of each gop section start can correctly be identified, Then the p frame within this gop and b frame can be judged using the data volume of this frame.But non-stationary, the different positions due to video signal The i frame data amount difference at the place of putting has larger difference, or even can with the data volume of the p frame in gop before quite, to judgement I frame brings difficulty.The embodiment of the present invention devise a set of can Intelligent adjustment dynamic parameter, with improve frame type judgement Shandong Rod and accuracy.Particularly when judging i frame, take into full account that the suitable regulation of characteristic of i frame in different application scene is sentenced Disconnected criterion and relevant parameter, greatly reduce the False Rate of i frame.
In the application scenarios damaging transmission, the video flowing of input can occur packet loss, according to the shadow to judge process for the packet loss Ring, two classes: the one, packet loss of frame in can be classified as, now the information of frame boundaries is not lost, and can first get frame side Boundary, counts the bag number of a frame with corresponding serial number;2nd, frame boundaries packet loss is (such as: in rtp, flag bit is 1 bag, or ts In over rtp, pusi puts 1 bag), now possibly cannot judge the border of before and after two frame it is also possible to the data of two frames is spelled in front and back It is connected to a frame so that frame data amount is not statistical uncertainty really, the result that impact frame type judges.The embodiment of the present invention will be lost with regard to this Bag detection, frame boundaries are estimated and partial frame type is estimated.
, because statistical data is inadequate, can there is more erroneous judgement, not only have influence on defeated in the early stage judging in frame type The result going out, more can have influence on the accuracy of follow-up judgement by changing various parameters.The embodiment of the present invention is in judgment frame type Increased frame type after flow process to correct, if output result carries out internal correction when having apparent error after data increases, internal Although correcting and can not changing the frame type having exported, the accurate of follow-up judgement can be improved by way of adjusting parameter Property.
The three main points of the embodiments of the present invention are described in detail below:
One: Determining B frames and/or hierarchical B frames using playback time:
Because a B frame uses both forward and backward coded frames for prediction, it is coded after its backward reference frame, so its playback order often differs from its coding order. Playback time information can therefore be used to identify B frames. If the playback time of the current frame is less than the maximum playback time of the frames already received, the frame must be a B frame; otherwise it is an I frame or a P frame.
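The rule above can be illustrated with a minimal sketch. It assumes that playback times arrive as plain comparable numbers in coding (arrival) order and that timestamp wrap-around does not occur; the function name `flag_b_frames` is hypothetical and not part of the described method.

```python
def flag_b_frames(play_times):
    """Return one boolean per frame, in arrival order: True means B frame.

    A frame is flagged as a B frame when its playback time is smaller than
    the largest playback time seen so far (assumption: no timestamp wrap).
    """
    flags = []
    max_seen = None
    for t in play_times:
        is_b = max_seen is not None and t < max_seen
        flags.append(is_b)
        if max_seen is None or t > max_seen:
            max_seen = t
    return flags


# Playback numbers in coding order for a structure with 7 consecutive B frames.
flags = flag_b_frames([0, 8, 4, 2, 1, 3, 6, 5, 7])
```

Frames that are not flagged remain undecided between I and P and would still need the data-volume based determination.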
For B frames with hierarchical coding, playback time can also be used to determine the highest level and the level to which each B frame belongs. Taking 7 consecutive B frames as an example, Fig. 2a shows the coding structure of hierarchical B frames in this case: the subscript of each letter in the first row indicates the level of the frame, and the number in the second row is the playback sequence number of the frame. The actual coding order is (playback sequence numbers in parentheses): I0/P0 (0), I0/P0 (8), B1 (4), B2 (2), B3 (1), B3 (3), B2 (6), B3 (5), B3 (7). Fig. 2b shows the relationship between coding order and playback order, together with the coding levels; Arabic numerals denote playback sequence numbers and Chinese numerals denote coding sequence numbers.
The algorithm for determining the levels from playback time has two steps:
Step 1: determine the highest level (3 in this example). The level of frame 0 is set to 0, and playback times are then read in coding order. If the playback time of the current frame is less than the playback time of the previous frame, the level of the current frame is the level of the previous frame plus 1; otherwise it is the same as that of the previous frame. This continues until the frame whose playback time immediately follows that of frame 0, i.e., frame 1, has been read; the level of frame 1 is the highest level.
Step 2: determine the levels of the remaining frames from the symmetry of the playback times of adjacent B frames. After step 1 is complete, the levels of the frames in the solid boxes in Fig. 5(b) have all been determined, and the levels of the B frames in the dashed boxes now need to be detected. The method is to traverse the frames whose levels are already determined and find two frames whose mean playback time equals the playback time of the current frame; the level of the current frame is then the larger of the levels of those two frames plus 1. The ellipses in the figure illustrate this symmetry: within each ellipse, the mean of the playback times of the two upper frames equals the playback time of the lower frame, and the level of the lower frame is exactly the larger of the levels of the two upper frames plus 1.
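The two steps can be sketched as follows, using the example coding order given above. This is only an illustration under stated assumptions: playback sequence numbers are consecutive integers, and when several pairs of known frames satisfy the symmetry relation the closest pair is used (the text does not say which pair to prefer). The function name `assign_b_levels` is hypothetical.

```python
def assign_b_levels(play_numbers):
    """Map playback number -> level, given playback numbers in coding order.

    Returns (levels, highest_level).
    """
    first = play_numbers[0]
    levels = {first: 0}
    prev = first
    consumed = 1

    # Step 1: walk in coding order until the frame adjacent to frame 0 is read.
    for cur in play_numbers[1:]:
        levels[cur] = levels[prev] + 1 if cur < prev else levels[prev]
        prev = cur
        consumed += 1
        if cur == first + 1:
            break
    highest = levels[prev]

    # Step 2: use the symmetry of playback times for the remaining frames.
    for cur in play_numbers[consumed:]:
        known = sorted(levels)
        pairs = [(b - a, a, b)
                 for i, a in enumerate(known)
                 for b in known[i + 1:]
                 if a + b == 2 * cur]
        if not pairs:
            continue  # no symmetric pair found; leave this frame undecided
        _, a, b = min(pairs)  # assumption: prefer the closest pair
        levels[cur] = max(levels[a], levels[b]) + 1
    return levels, highest


levels, highest = assign_b_levels([0, 8, 4, 2, 1, 3, 6, 5, 7])
```

For the example sequence this reproduces the levels listed in the coding order above, with 3 as the highest level.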
Two: Determining the frame type using frame data volume:
Because playback time can only show whether a frame is a B frame, this embodiment provides a scheme that identifies I frames and P frames using only information such as data volume. When B frames can be identified from playback time, only the remaining frames need to be classified as I frames or P frames. When B frames cannot be identified from playback time (for example, when the header information is encrypted), all frames must be classified: I frames and P frames are determined first, and the remaining frames are judged to be B frames.
In this embodiment, the frame type is determined from the frame data volume with automatic parameter updating. The method is divided into the following modules (as shown in Fig. 6): an I-frame determination module, a P-frame determination module, a parameter update module, and a type correction module.
A: I-frame determination:
In general, I frames in video fall into two categories. Fixed-interval I frames are inserted at a fixed interval during compression to support random access (the interval is fixed over a certain period, but may change once the user switches channels). Adaptively inserted I frames are inserted at scene changes to improve compression efficiency.
For fixed-interval I frames, the fixed interval can be estimated during identification; when no I frame has been identified after this interval has been exceeded, the decision conditions are actively relaxed or local features are used for the decision (described in detail below).
For adaptively inserted I frames, consider first a scene change at which the spatial complexity of the sequence is similar before and after the change. If the frame is coded as an adaptively inserted I frame, its bit rate is usually higher than that of the preceding P frames because I frames compress less efficiently. If it is coded as a P frame, its bit rate is also relatively high because prediction is poor; such a frame is an important frame and is relatively likely to be judged an I frame (the data volumes of both P frames and I frames are large, so a P frame is easily mistaken for an I frame). At a scene change where the spatial complexity is low, a frame coded as an I frame may be even smaller than the preceding P frames. Such an I frame cannot be identified correctly, but the P frames or B frames that follow it also become correspondingly smaller, so type correction can be performed through subsequent updates to improve the recognition rate for later frame types.
Therefore, I frames can be determined in the following three steps, each of which compares the data volume of the current frame with a given threshold; as soon as the data volume of the current frame exceeds the given threshold in any step, the frame is determined to be an I frame:
Determine obvious I frames according to threshold 1;
Determine I frames at non-fixed intervals according to threshold 2;
Determine I frames that exceed the expected fixed interval according to threshold 3.
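The three checks can be sketched as a single function. This is a minimal illustration, assuming the three thresholds have already been computed elsewhere and that the third check is applied only once the distance from the previous I frame exceeds the expected fixed interval, as the apparatus description states. The function name `i_frame_reason` and the returned labels are hypothetical.

```python
def i_frame_reason(frame_bytes, threshold1, threshold2, threshold3,
                   curr_i_interval, expected_iframe_interval):
    """Return a label describing why the frame is judged an I frame, or None."""
    if frame_bytes > threshold1:
        return "obvious"        # step 1: obvious I frame
    if frame_bytes > threshold2:
        return "non-fixed"      # step 2: I frame at a non-fixed interval
    overdue = curr_i_interval > expected_iframe_interval
    if overdue and frame_bytes > threshold3:
        return "overdue"        # step 3: expected fixed interval exceeded
    return None                 # not judged an I frame


# Example: a 5000-byte frame arriving 30 frames after the previous I frame,
# with an expected I-frame interval of 25 frames.
reason = i_frame_reason(5000, 8000, 6000, 4000, 30, 25)
```

Returning a label rather than a boolean keeps the "obvious I frame" case distinguishable, which the coding-type statistics rely on.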
B: P-frame determination:
When the previous frame is an I frame and the current video stream uses closed-loop coding, a B frame never immediately follows an I frame. If the current frame is not judged to be an I frame, it is a P frame;
When the previous frame is an I frame and the current video stream uses open-loop coding, the current frame is a P frame if its data volume is greater than threshold 4; otherwise it is a B frame;
When the previous frame is a P frame, the current frame is a P frame if its data volume is greater than threshold 5, or if B frames exist in the current GOP and its data volume is greater than threshold 6;
When the previous frame is a B frame, which indicates that B frames exist in the current GOP, the current frame is a P frame if its data volume is less than threshold 7, or if P frames have already been determined in the current GOP and its data volume is less than threshold 8.
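The four rules can be sketched as follows for a frame that has already been judged not to be an I frame. The comparison directions follow the text as written, and a frame that does not meet the P-frame condition is treated as a B frame. The function name, the argument names, and the dictionary `th` holding thresholds 4 to 8 are hypothetical.

```python
def classify_non_i_frame(prev_type, frame_bytes, closed_loop,
                         gop_has_b, gop_has_p, th):
    """Return 'P' or 'B' for a frame already judged not to be an I frame.

    th maps the threshold numbers 4..8 to their current values.
    """
    if prev_type == "I":
        if closed_loop:
            return "P"  # a B frame never directly follows an I frame
        return "P" if frame_bytes > th[4] else "B"
    if prev_type == "P":
        if frame_bytes > th[5] or (gop_has_b and frame_bytes > th[6]):
            return "P"
        return "B"
    # prev_type == "B": comparisons are "less than", as stated in the text
    if frame_bytes < th[7] or (gop_has_p and frame_bytes < th[8]):
        return "P"
    return "B"


thresholds = {4: 3000, 5: 2250, 6: 2000, 7: 1250, 8: 2000}
kind = classify_non_i_frame("P", 2100, closed_loop=True,
                            gop_has_b=True, gop_has_p=True, th=thresholds)
```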
C: Parameter update:
Gather statistics on the coding type of the GOP (open-loop or closed-loop): during identification, for each obvious I frame, record whether the frame that follows it is a B frame or a P frame. If most I frames are followed by P frames, the encoder is considered to use closed-loop coding; otherwise it is considered to use open-loop coding.
Calculate the expected fixed I-frame interval: after an I frame is identified, gather the probability distribution of the I-frame intervals and obtain the expected fixed interval by weighted averaging.
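A minimal sketch of these two statistics follows. It rests on two assumptions that the text does not spell out: "most" is read as strictly more than half, and the weighted average is taken as the mean of the observed intervals weighted by their empirical probability. Both function names are hypothetical.

```python
from collections import Counter


def estimate_coding_type(types_after_obvious_i):
    """Guess 'closed' or 'open' from the frame types seen right after obvious I frames."""
    p_count = sum(1 for t in types_after_obvious_i if t == "P")
    # Assumption: "most" means strictly more than half.
    return "closed" if p_count * 2 > len(types_after_obvious_i) else "open"


def estimate_iframe_interval(intervals):
    """Probability-weighted average of the observed I-frame intervals."""
    counts = Counter(intervals)
    total = len(intervals)
    return sum(interval * n / total for interval, n in counts.items())


coding_type = estimate_coding_type(["P", "P", "B"])
interval = estimate_iframe_interval([25, 25, 25, 50])
```

A real implementation would probably weight recent intervals more heavily, since the interval can change when the user switches channels.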
Update the thresholds used in the above modules in real time according to the newly determined frame types:
a) Threshold 1: calculated according to formula (1) from the average data volume of the previous 50 frames (av_ibpnbytes) and the data volume of the previous I frame (iframe_size_gop):
Threshold 1 = delta1 * iframe_size_gop + av_ibpnbytes    Formula (1)
Here, delta1 is an adjustment factor with a value range of (0, 1); its empirical value obtained from experiments is 0.5.
b) Threshold 2: calculated according to formula (2) from the data volume of the previous I frame (iframe_size_gop), the average data volume of the largest P frames in the current GOP (max_pframes_size_gop), and the average data volume of the I frames and P frames among the previous 50 frames (av_ipnbytes):
Threshold 2 = max(delta2 * max_pframes_size_gop, delta2 * av_ipnbytes, delta3 * iframe_size_gop)    Formula (2)
Here, delta2 and delta3 are adjustment factors whose empirical values are 1.5 and 0.5 respectively.
c) Threshold 3: calculated according to formula (3) from the average data volume of each frame in the current GOP (av_frame_size_gop), the data volume of the previous P frame (prew_pframe_nbytes), and the data volume of the I frame of the current GOP (iframe_size_gop); or calculated according to formula (5) from the average data volume of the P frames in the current GOP (av_pframes_size_gop):
Threshold 3 = max(av_frame_size_gop, ip_thresh * prew_pframe_nbytes, iframe_size_gop / 3)    Formula (3)
Here, ip_thresh is calculated from the degree to which the distance from the previous I frame to the current frame (curr_i_interval) deviates from the expected fixed I-frame interval (expected_iframe_interval):
ip_thresh = max(2 - (curr_i_interval - expected_iframe_interval) * 0.1, 1.5)    Formula (4)
Threshold 3 = sthresh * av_pframes_size_gop + av_pframes_size_gop    Formula (5)
Here, sthresh is calculated from curr_i_interval and expected_iframe_interval:
sthresh = max(delta4, sthresh / (delta5 * curr_i_interval / expected_iframe_interval))    Formula (6)
Here, delta4 and delta5 are adjustment factors whose empirical values are 0.2 and 2.0 respectively.
d) Threshold 4: the mean of the P-frame average data volume (av_pframes_size_last_gop) and the B-frame average data volume (av_bframes_size_last_gop) of the previous GOP, as in formula (7):
Threshold 4 = (av_pframes_size_last_gop + av_bframes_size_last_gop) / 2    Formula (7)
e) Threshold 5: 0.75 times the P-frame average data volume in the current GOP (av_pframes_size_gop), as in formula (8):
Threshold 5 = delta6 * av_pframes_size_gop    Formula (8)
Here, delta6 is an adjustment factor whose empirical value is 0.75.
f) Threshold 6: the mean of the P-frame average data volume (av_pframes_size_gop) and the B-frame average data volume (max_bframes_size_gop), as in formula (9):
Threshold 6 = (av_pframes_size_gop + max_bframes_size_gop) / 2    Formula (9)
g) Threshold 7: 1.25 times the B-frame average data volume in the current GOP (av_bframes_size_gop), as in formula (10):
Threshold 7 = delta7 * av_bframes_size_gop    Formula (10)
Here, delta7 is an adjustment factor whose empirical value is 1.25.
h) Threshold 8: the mean of the P-frame average data volume (av_pframes_size_gop) and the B-frame average data volume (av_bframes_size_gop), as in formula (11):
Threshold 8 = (av_pframes_size_gop + av_bframes_size_gop) / 2    Formula (11)
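Formulas (1) to (11) translate directly into small functions. The sketch below reuses the variable names and empirical factor values from the text; the function names are hypothetical, and it assumes all data volumes are non-negative numbers and that curr_i_interval and expected_iframe_interval are greater than 0.

```python
def threshold1(iframe_size_gop, av_ibpnbytes, delta1=0.5):
    return delta1 * iframe_size_gop + av_ibpnbytes                    # formula (1)


def threshold2(max_pframes_size_gop, av_ipnbytes, iframe_size_gop,
               delta2=1.5, delta3=0.5):
    return max(delta2 * max_pframes_size_gop,
               delta2 * av_ipnbytes,
               delta3 * iframe_size_gop)                              # formula (2)


def ip_thresh(curr_i_interval, expected_iframe_interval):
    deviation = curr_i_interval - expected_iframe_interval
    return max(2 - deviation * 0.1, 1.5)                              # formula (4)


def threshold3(av_frame_size_gop, prew_pframe_nbytes, iframe_size_gop,
               curr_i_interval, expected_iframe_interval):
    factor = ip_thresh(curr_i_interval, expected_iframe_interval)
    return max(av_frame_size_gop,
               factor * prew_pframe_nbytes,
               iframe_size_gop / 3)                                   # formula (3)


def update_sthresh(sthresh, curr_i_interval, expected_iframe_interval,
                   delta4=0.2, delta5=2.0):
    ratio = curr_i_interval / expected_iframe_interval
    return max(delta4, sthresh / (delta5 * ratio))                    # formula (6)


def threshold3_alt(sthresh, av_pframes_size_gop):
    return sthresh * av_pframes_size_gop + av_pframes_size_gop        # formula (5)


def threshold4(av_pframes_size_last_gop, av_bframes_size_last_gop):
    return (av_pframes_size_last_gop + av_bframes_size_last_gop) / 2  # formula (7)


def threshold5(av_pframes_size_gop, delta6=0.75):
    return delta6 * av_pframes_size_gop                               # formula (8)


def threshold6(av_pframes_size_gop, max_bframes_size_gop):
    return (av_pframes_size_gop + max_bframes_size_gop) / 2           # formula (9)


def threshold7(av_bframes_size_gop, delta7=1.25):
    return delta7 * av_bframes_size_gop                               # formula (10)


def threshold8(av_pframes_size_gop, av_bframes_size_gop):
    return (av_pframes_size_gop + av_bframes_size_gop) / 2            # formula (11)
```

For example, with a previous I frame of 10000 bytes and a 50-frame average of 3000 bytes, `threshold1(10000, 3000)` evaluates 0.5 * 10000 + 3000.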
D: Type correction:
Correcting missed I frames:
After the above steps, it may happen that the expected fixed interval has been far exceeded but no I frame has yet been identified. In this case, although the frame types have already been output, local information can be used to correct the parameters so that later frame type determinations are more accurate. The frame with the largest data volume near the expected fixed interval is taken, its frame type is changed to I frame, and parameters such as the average data volume of each frame type in the GOP and the I-frame interval are updated.
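A minimal sketch of this correction follows. The text does not define how wide "near the expected fixed interval" is, so the `window` parameter is an assumption, as are the function name and the list-based representation of frame sizes and types.

```python
def correct_missed_i(frame_sizes, frame_types, last_i_index,
                     expected_iframe_interval, window=3):
    """Retype the largest frame near the expected I-frame position as 'I'.

    Returns the index that was changed, or None if the search range is empty.
    window is a hypothetical half-width of the search range, in frames.
    """
    center = last_i_index + expected_iframe_interval
    lo = max(last_i_index + 1, center - window)
    hi = min(len(frame_sizes) - 1, center + window)
    if lo > hi:
        return None
    best = max(range(lo, hi + 1), key=lambda i: frame_sizes[i])
    frame_types[best] = "I"  # internal correction only; output is unchanged
    return best


sizes = [9000, 1000, 2000, 1000, 2100, 1000, 5000, 1000, 2000, 1000, 2000]
types = ["I", "B", "P", "B", "P", "B", "P", "B", "P", "B", "P"]
changed = correct_missed_i(sizes, types, last_i_index=0,
                           expected_iframe_interval=5, window=2)
```

After the retyping, the per-type average data volumes and the I-frame interval statistics would be recomputed from the corrected types.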
Correcting misjudged B frames:
In practical applications, a video encoder that uses B frames to improve coding efficiency generally takes decoding delay and decoding memory overhead into account and does not encode more than 7 consecutive B frames; in more extreme cases, there are no more than 3 consecutive B frames. A predicted value for the maximum number of consecutive B frames in the bitstream is obtained from statistics on the frame types already determined. When a frame is determined to be a B frame, it must be ensured that the number of consecutive B frames does not exceed the predicted value. If it does, some of the frames currently judged to be consecutive B frames may have been misjudged; the frame with the largest data volume among them is re-judged as a P frame, and information such as the average data volume of each frame type in the GOP is updated.
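The correction can be sketched as a batch pass over frames that have already been typed. The text applies the check while frames are being determined, so scanning a finished list is a simplification; the function name and the list-based representation are hypothetical.

```python
def correct_b_runs(frame_sizes, frame_types, max_consecutive_b):
    """Retype the largest frame of every over-long B run as 'P'.

    Returns the list of indices that were changed.
    """
    corrected = []
    run = []
    # A sentinel entry closes a run that reaches the end of the list.
    for i, t in enumerate(frame_types + ["end"]):
        if t == "B":
            run.append(i)
            continue
        if len(run) > max_consecutive_b:
            largest = max(run, key=lambda k: frame_sizes[k])
            frame_types[largest] = "P"
            corrected.append(largest)
        run = []
    return corrected


types = ["I", "B", "B", "B", "B", "P", "B", "B"]
sizes = [9000, 900, 1000, 2400, 950, 3000, 900, 1000]
changed = correct_b_runs(sizes, types, max_consecutive_b=3)
```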
Three: Frame type detection when frame boundaries and frame data volume cannot be determined:
The first two examples both require that the frame boundaries and frame data volumes have already been obtained. When there is no packet loss, the frame boundaries and the data volume of each frame can be obtained accurately from the RTP sequence number, timestamp, and marker bit (ISMA mode), or from the RTP sequence number and the CC, PUSI, and PID in the TS (TS over RTP mode). When packet loss occurs, however, if a packet at a frame boundary is lost, the position of the frame boundary cannot be determined accurately; the number of packets in a frame may be estimated wrongly, or the data volumes of two frames may even be merged into one frame, which severely interferes with frame type detection. Therefore, if there is packet loss, packet loss handling must be performed before frame type determination to obtain information such as frame boundaries, frame data volume, and frame type.
In ISMA mode, a change in the RTP timestamp marks the arrival of a new frame, so handling packet loss is relatively simple:
1) If the timestamps before and after the loss are the same, the lost packets are inside a single frame, and the data of the lost packets only needs to be taken into account when the frame data volume is counted;
2) If the timestamps before and after the loss differ, the loss occurred at a frame boundary. If the marker bit of the packet immediately before the loss is 1, the lost packets are regarded as data of the following frame and added to the data volume of the following frame; otherwise, the data volume of the lost packets is divided equally between the preceding and following frames (it is assumed here that the length of a burst loss does not exceed the length of one frame).
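The two ISMA cases can be sketched as follows. The sketch assumes the caller already has an estimate of the lost data volume (how that estimate is obtained is outside this illustration) and the timestamps and marker bit of the packets on either side of the gap; the function name is hypothetical.

```python
def attribute_lost_bytes(lost_bytes, ts_before, ts_after, marker_before):
    """Split the lost data volume between the frames around a gap.

    Returns (bytes_for_frame_before_gap, bytes_for_frame_after_gap).
    When both timestamps are equal, both sides of the gap are the same frame,
    so everything is reported in the first slot.
    """
    if ts_before == ts_after:
        return (lost_bytes, 0)      # case 1: loss inside one frame
    if marker_before == 1:
        return (0, lost_bytes)      # previous frame was complete
    half = lost_bytes / 2
    return (half, half)             # boundary unknown: split equally


before, after = attribute_lost_bytes(3000, ts_before=100, ts_after=200,
                                     marker_before=0)
```

The equal split relies on the assumption stated in the text that a burst loss is no longer than one frame.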
The TS over RTP case is more complex. Because the start of a frame can only be determined from the presence of a PES header (i.e., PUSI is 1), when packet loss occurs it is hard to tell whether the data between two packets carrying PES headers belongs to one frame or to several frames. As shown in Fig. 3, 3 packet losses occurred in the data between two packets carrying PES headers, but because it cannot be known whether any of the lost packets also carried a PES header (marking the start of a frame), it cannot be determined whether the data all belongs to the same frame. Solutions are given below for two cases.
If the PES header can be decoded, the PTS in it can be used to determine whether the current data length (i.e., the data length between two packets carrying PES headers) contains frame header information:
1) Gather statistics on the PTS order of the correctly detected GOPs, using the distribution probability weighted by the distance from the current frame as the indicator, to obtain the expected coding structure;
2) Match the PTSs of the series of frames, in reception order from the I frame up to the current PTS and the next PTS, against the expected coding structure;
a) If they conform to the expected coding structure, the lost packets within this data length are considered not to contain frame header information, i.e., the current data length is one frame, the loss occurred inside that frame, and no splitting is needed;
b) If they do not conform to the expected coding structure, the lost packets probably contain frame header information; the current data length is split according to the expected coding structure and the position of the packet loss (continuous length, loss length, etc.), and a reasonable frame type, frame size, and PTS are assigned.
3) If a frame previously judged to have lost its frame header is found later, the earlier determination result is updated in the correction step.
In addition, whether the current data length is one frame, and which frame type it belongs to, can be determined from the packet loss length, the continuous length, the maximum continuous length, the maximum packet loss length, and so on:
1) If this data length is close to the length of the previous I frame, it is considered to belong to a single I frame; if this data length is close to the size of a P frame and the maximum continuous length is larger than the average B-frame data volume within 50 frames, the data length is considered to belong to a single P frame; otherwise go to 2);
2) If this data length is close to the size of two P frames, it is split into two P frames: the data length is divided into two segments so that the length of each segment is as close as possible to a P frame, and the second segment must begin with a lost packet; otherwise go to 3);
3) If this data length is close to the size of a P frame plus a B frame, it is split into a P frame and a B frame: the packets with the maximum continuous length are attributed to the P frame, and on this basis the data length is divided into two segments whose lengths are close to a P frame and a B frame respectively, and the second segment must begin with a lost packet; otherwise go to 4);
4) If the maximum continuous length is less than a B frame and this data length is close to the size of three B frames, it is split into three B frames: the data length is divided into three segments so that the length of each segment is close to a B frame, and the second and third segments must each begin with a lost packet; otherwise go to 5);
5) If the maximum continuous length is less than a B frame and this data length is close to the size of two B frames, it is split into two B frames: the data length is divided into two segments so that the length of each segment is close to a B frame, and the second segment must begin with a lost packet; otherwise go to 6);
6) In all other cases, the whole data length is considered to belong to one frame.
This embodiment combines the above examples to provide an optional frame type detection scheme. The flow is shown in Fig. 4 and is divided into the following stages: preliminary frame type determination using PTS, packet loss handling, further frame type determination using data volume, and type correction.
401: After data is input, determine whether the packet header can be decoded; if so, determine the frame type according to playback time; otherwise perform packet loss handling. After frame type determination ends, check whether any earlier frame was misjudged; if so, perform frame type correction; otherwise enter the frame type determination loop, i.e., return to 401. The details are as follows:
Determining the frame type according to playback time: first determine whether the input bitstream consists of TS over RTP packets; if so, it must be determined whether the PES headers of the TS packets are encrypted. For RTP packets, or for TS over RTP packets whose PES headers can be decoded, playback time information can be used for a preliminary determination of whether a frame is a B frame; for details, see main point one;
Packet loss handling: detect whether there is packet loss. If there is none, count the data volume directly and proceed to the frame type determination step below. If there is packet loss, packet loss handling must be performed separately for RTP packets or TS over RTP packets to estimate the frame boundaries, the frame data volume, or some frame types; for details, see main point three;
Determining the frame type according to data volume: this process determines frame types in real time and adjusts the related parameters dynamically and intelligently; for details, see main point two;
Type correction: if an earlier determination result is found to be wrong during the determination process, it can be corrected. This process does not affect the output results but can be used to update the related parameters and so improve the accuracy of later determinations; for details, see main point two.
The embodiments of the present invention also provide a frame type detection apparatus, as shown in Fig. 5, comprising:
A time detecting unit 501, configured to detect the playback time of each frame;
A frame type determining unit 502, configured to determine that the current frame is a bidirectionally predictive coded B frame if the playback time of the current frame is less than the maximum playback time of the frames already received;
Further, the apparatus of Fig. 5 may also include:
A level determining unit 503, configured to determine, according to the playback order and coding order of the frames, the level to which a B frame belongs in hierarchical coding. It should be noted that level determination is not an essential technical feature for determining B frames in the embodiments of the present invention; it is needed only when later processing requires level information.
The embodiments of the present invention also provide another frame type detection apparatus, as shown in Fig. 6, comprising:
A type obtaining unit 601, configured to obtain the coding type of the bitstream containing the frames already received, the coding type including open-loop coding and closed-loop coding;
A frame type determining unit 602, configured to determine that the current frame is an obvious I frame if the data volume of the current frame is greater than a first threshold, the first threshold being calculated from the average data volume of a set number of consecutive frames and the I-frame data volume;
If the frame preceding the current frame is an I frame, the coding type is closed-loop coding, and the current frame is not an obvious I frame, or if the frame preceding the current frame is an I frame, the coding type is open-loop coding, and the data volume of the current frame is greater than a fourth threshold, determine that the current frame is a P frame; the fourth threshold is the mean of the P-frame average data volume and the B-frame average data volume of a group of pictures;
If the current frame is neither an I frame nor a P frame, determine that the current frame is a B frame.
Further, the frame type determining unit 602 is also configured to determine that the current frame is an I frame if the data volume of the current frame is greater than a second threshold; the second threshold is the maximum of the data volume of the I frame preceding the current frame, the average data volume of the P frames in the group of pictures containing the current frame, and the average data volume of a set number of consecutive frames.
Further, the frame type determining unit 602 is also configured to determine that the current frame is an I frame if the interval between the current frame and the previous I frame exceeds the fixed interval and the data volume of the current frame is greater than a third threshold. The third threshold is calculated from the average data volume of each frame in the group of pictures containing the current frame, the data volume of the P frame preceding the current frame, the data volume of the I frame of the group of pictures containing the current frame, and the degree to which the distance from the previous I frame to the current frame deviates from the expected fixed I-frame interval; or, the third threshold is calculated from the average data volume of each frame in the group of pictures containing the current frame and the degree to which the distance from the previous I frame to the current frame deviates from the expected fixed I-frame interval.
Further, the frame type determining unit 602 is also configured to determine that the current frame is a P frame if the frame preceding the current frame is a P frame and the data volume of the current frame is greater than a fifth threshold, or if B frames exist in the current group of pictures and the data volume of the current frame is greater than a sixth threshold; the fifth threshold is the product of a first adjustment factor and the average data volume of the P frames in the group of pictures containing the current frame, the first adjustment factor being greater than 0.5 and less than 1; the sixth threshold is the mean of the P-frame average data volume and the B-frame average data volume;
If the frame preceding the current frame is a B frame and the data volume of the current frame is less than a seventh threshold, or if P frames exist in the current group of pictures and the data volume of the current frame is less than an eighth threshold, determine that the current frame is a P frame; the seventh threshold is the product of a second adjustment factor and the average data volume of the B frames in the group of pictures containing the current frame, the second adjustment factor being greater than 1 and less than 1.5; the eighth threshold is the mean of the P-frame average data volume and the B-frame average data volume.
Further, as shown in Fig. 7, the apparatus also includes:
An interval acquiring unit 701, configured to determine the fixed interval of I frames after frame type determination ends;
The frame type determining unit 602 is also configured to determine, if no I frame has been identified after the fixed interval has been reached, that the frame with the largest data volume within a set range around the fixed interval is an I frame;
A first updating unit 702, configured to update the average data volume of each frame type in the group of pictures and the I-frame interval parameter.
Further, as shown in Fig. 8, the apparatus also includes:
A statistics unit 801, configured to count consecutive B frames after frame type determination ends;
The frame type determining unit 602 is also configured to determine, if the number of consecutive B frames is greater than a predicted value, that the frame with the largest data volume among the consecutive B frames is a P frame; the predicted value is greater than or equal to 3 and less than or equal to 7;
A second updating unit 802, configured to update the average data volume of each frame type in the group of pictures.
Further, as shown in Fig. 9, the apparatus also includes:
A packet loss type determining unit 901, configured to determine whether packet loss has occurred in the frames already received and, if packet loss has occurred, to determine the packet loss type;
A data volume determining unit 902, configured to, if the packet loss type is intra-frame packet loss, take the sum of the data volume of the received part of the frame and the data volume of the lost packets as the data volume of the frame when the frame data volume is calculated;
If the packet loss type is inter-frame packet loss, determine whether the marker bit of the packet immediately before the loss is 1; if so, count the data volume of the lost packets toward the following frame; otherwise, divide the data volume of the lost packets equally between the preceding and following frames.
It should be noted that the apparatus of this embodiment may be combined with the apparatus of Fig. 4 or Fig. 5, and the frame type determining unit 502 and the frame type determining unit 602 may be implemented by the same functional unit.
The embodiment of the present invention makes full use of the header packet information of rtp or ts over rtp, in conjunction with frame dissimilar in video Coded sequence and the data volume magnitude relationship in front and back of dissimilar frame, quickly real-time in the case of the net load of not decoding video Judgment frame type, and processed by packet loss, automatically update the method raising frame type inspection that parameter and later stage frame type are corrected The accuracy surveyed.
The header packet information of the reproduction time of instruction video data is had, such as: the rtp time in isma mode in video flowing Stamp, and in ts over rtp mode pes head pts.The embodiment of the present invention is by using reproduction time information and coded sequence Mutual relation, to judge the type of coding of some special constructions, such as: b frame.But for ts over rtp mode, it is understood that there may be ts Net carry the situation that pes head cannot decode of encrypting completely, i.e. pts non-availability, therefore, the embodiment of the present invention additionally provides not using broadcasting The time that puts only carries out the scheme of frame type judgement using information such as data volumes.
Observe the video code flow in practical application it is found that different types of frame typically has more in same gop Significantly distinguish, i frame data amount is maximum, secondly, b frame is minimum for p frame.If the i frame of each gop section start can correctly be identified, Then the p frame within this gop and b frame can be judged using the data volume of this frame.But non-stationary, the different positions due to video signal The i frame data amount difference at the place of putting has larger difference, or even can with the data volume of the p frame in gop before quite, to judgement I frame brings difficulty.The embodiment of the present invention devise a set of can Intelligent adjustment dynamic parameter, with improve frame type judgement Shandong Rod and accuracy.Particularly when judging i frame, take into full account that the suitable regulation of characteristic of i frame in different application scene is sentenced Disconnected criterion and relevant parameter, greatly reduce the False Rate of i frame.
In the application scenarios damaging transmission, the video flowing of input can occur packet loss, according to the shadow to judge process for the packet loss Ring, two classes: the one, packet loss of frame in can be classified as, now the information of frame boundaries is not lost, and can first get frame side Boundary, counts the bag number of a frame with corresponding serial number;2nd, frame boundaries packet loss is (such as: in rtp, flag bit is 1 bag, or ts In over rtp, pusi puts 1 bag), now possibly cannot judge the border of before and after two frame it is also possible to the data of two frames is spelled in front and back It is connected to a frame so that frame data amount is not statistical uncertainty really, the result that impact frame type judges.The embodiment of the present invention will be lost with regard to this Bag detection, frame boundaries are estimated and partial frame type is estimated.
In the early stage of frame type determination, the statistics are insufficient and more misjudgments may occur. These not only affect the output results but also degrade the accuracy of subsequent decisions by altering various parameters. The embodiments of the present invention therefore add a frame type correction step after the frame type determination flow: once more data is available, if the output results contain obvious errors, an internal correction is performed. Although the internal correction cannot change frame types that have already been output, it improves the accuracy of subsequent decisions by adjusting the parameters.
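One concrete correction of this kind is the rule in claim 7: when a run of consecutive B frames is longer than a predicted value (between 3 and 7), the largest frame in the run is relabelled as a P frame. The Python sketch below shows a single pass of that rule; the function name and the default `max_b_run=3` are assumptions, and the subsequent update of the per-type averages is omitted.

```python
def correct_long_b_runs(types, sizes, max_b_run=3):
    """Relabel the largest frame of every over-long B run as 'P'.

    types: list of 'I' / 'P' / 'B' labels in stream order
    sizes: data volume of each frame, same length as types
    """
    types = list(types)  # work on a copy
    start = None         # index where the current B run began
    for i in range(len(types) + 1):
        is_b = i < len(types) and types[i] == "B"
        if is_b and start is None:
            start = i
        elif not is_b and start is not None:
            if i - start > max_b_run:
                largest = max(range(start, i), key=lambda k: sizes[k])
                types[largest] = "P"
            start = None
    return types


print(correct_long_b_runs(["I", "B", "B", "B", "B", "P"], [100, 5, 6, 20, 4, 30]))
```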
The following are several applications that use the result of frame type determination. These application examples should not be understood as exhaustive and do not limit the embodiments of the present invention.
1. Unequal loss protection according to the determined frame type: when bandwidth is limited, unequal loss protection can be applied according to the different impact of each frame type on video quality, so that the received video quality is optimized.
2. Fast video browsing, combining the average bit rate of the GOP with the expected period: for a stream stored locally, when the user does not want to browse the whole video, fast preprocessing can extract the positions of the I frames to enable fast browsing. For a stream stored on a server, when the user does not want to browse the whole video, the server can use fast preprocessing to extract the positions of the I frames and selectively transmit key frame information to the user.
3. Quality of service (QoS): when bandwidth is insufficient, an intermediate node can intelligently discard some B frames or P frames (P frames near the end of a GOP) according to the determined frame type, reducing the bit rate while affecting video quality as little as possible.
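The QoS application can be sketched as a drop-selection routine for one GOP. The ordering used here (all B frames first, then P frames starting from the end of the GOP, never I frames) is an assumption consistent with the description, not a rule stated in it; the function name and data layout are likewise illustrative.

```python
def choose_frames_to_drop(frames, bytes_to_save):
    """Pick frames of one GOP to discard until enough bytes are saved.

    frames: list of (frame_type, size) tuples in stream order
    Returns (sorted indices of dropped frames, bytes actually saved).
    """
    b_idx = [i for i, (t, _) in enumerate(frames) if t == "B"]
    p_idx = [i for i, (t, _) in enumerate(frames) if t == "P"]
    # B frames first; then P frames from the GOP end backwards, so that
    # no remaining frame depends on a dropped P frame.
    candidates = b_idx + p_idx[::-1]

    dropped, saved = [], 0
    for i in candidates:
        if saved >= bytes_to_save:
            break
        dropped.append(i)
        saved += frames[i][1]
    return sorted(dropped), saved


gop = [("I", 100), ("B", 10), ("B", 12), ("P", 40), ("B", 8), ("B", 9), ("P", 35)]
print(choose_frames_to_drop(gop, 25))
```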
In addition, the effect of the technical solutions of the embodiments of the present invention was tested experimentally. The test results are as follows.
The experiments in this section were carried out without packet loss. The two cases of using playback time and not using playback time were each compared with the second scheme described in the background art; the results are shown in Table 1.
Table 1 Test sequences
Test sequences: the tests used TS streams captured from the live network and streams encoded at a constant bit rate, as listed in Table 1. The first three live-network streams (iptv137, iptv138, iptv139) have an encrypted payload but an unencrypted PES header. The bit rates of the constant-bit-rate streams are (1500, 3000, 4000, 5000, 6000, 7000, 9000, 12000, 15000). All selected streams are H.264 encoded, with three frame types I, P and B and no hierarchical B frames. The frame type detection results for these sequences are given in Table 2.
Table 2 Comparison of detection results between the present method and the existing method
As shown in Table 2, the experiment compares the following measures. The I-frame miss rate is the ratio of the number of missed I frames to the total number of I frames in the sequence. The I-frame false detection rate is the ratio of the number of P or B frames misjudged as I frames to the total number of I frames (note that in the vast majority of cases only P frames are misjudged as I frames and B frames are misjudged as I frames only rarely, which is consistent with the fact that the B-frame bit rate is far lower than the I-frame bit rate). The P->I error rate is the ratio of the number of P frames misjudged as I frames to the actual total number of P frames. The P->B error rate is the ratio of the number of P frames misjudged as B frames to the actual total number of P frames. The B->P error rate is the ratio of the number of B frames misjudged as P frames to the actual total number of B frames. The total error rate is the ratio of the number of misjudged frames to the total number of frames (a frame counts as misjudged whenever the determined type differs from the actual type). The average of the I-frame miss rate and the I-frame false detection rate reflects the probability of correctly detecting I frames.
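These measures follow directly from a confusion matrix of actual versus detected types. The Python sketch below computes them for the iptv137 counts reported in the per-sequence results later in this section; the function name and dictionary layout are illustrative assumptions.

```python
def detection_metrics(confusion):
    """confusion[actual][detected] = number of frames; types are 'I', 'P', 'B'."""
    actual_i = sum(confusion["I"].values())
    actual_p = sum(confusion["P"].values())
    actual_b = sum(confusion["B"].values())
    total = actual_i + actual_p + actual_b
    correct = sum(confusion[t][t] for t in "IPB")
    return {
        "i_miss_rate": (actual_i - confusion["I"]["I"]) / actual_i,
        "i_false_rate": (confusion["P"]["I"] + confusion["B"]["I"]) / actual_i,
        "p_to_i": confusion["P"]["I"] / actual_p,
        "p_to_b": confusion["P"]["B"] / actual_p,
        "b_to_p": confusion["B"]["P"] / actual_b,
        "total_error": (total - correct) / total,
    }


# Counts for the iptv137 sequence (rows: actual type, columns: detected type).
iptv137 = {
    "P": {"P": 4909, "B": 0, "I": 61},
    "B": {"P": 1, "B": 10215, "I": 0},
    "I": {"P": 36, "B": 0, "I": 639},
}
metrics = detection_metrics(iptv137)
print(round(metrics["total_error"] * 100, 1))  # total error in percent
```

For iptv137 the counts sum to 15861 frames with 98 misjudged frames, which agrees with the reported total error of about 0.6%.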
Because the accuracy of determining B frames from the PTS is 100%, the results with and without playback time are not compared separately. Meanwhile, in order to fully demonstrate the superiority of Embodiment 2 of the present invention, in the case where playback time is used the existing method is also extended with the step of determining B frames from playback time; the difference in performance therefore comes mainly from the difference between the methods of determining frame type from frame data volume. The results show that, both when playback time can be used to determine the frame type and when it cannot, the present method outperforms the existing method on the streams captured from the live network and on the self-encoded streams. For the self-encoded streams in particular the advantage is even more pronounced: in some cases the present method is error-free, whereas the existing method is seldom error-free.
Figures 10 to 15 give detailed detection results for some of the sequences, in which the actual curves are marked with circles and the predicted curves with triangles. Each figure includes the I-frame distribution (the horizontal axis is the I-frame interval; an interval of 1 means that two adjacent frames are both I frames, and an interval of 0 means that the I-frame interval is greater than 49; the predicted I-frame period is the I-frame period predicted by the present method, and the actual I-frame period is the real I-frame period) and the distribution of frame types (in the tables, the matrix diagonal is the number of correctly determined frames and the other positions are misjudgments). Each figure title consists of the sequence name, the total number of frames, and the total error rate. It can be seen that the live-network sequences generally have a fixed I-frame interval (the maximum in the figure); in addition, some I frames are inserted adaptively at scene changes, which causes a perturbation around the maximum and forms the I-frame distribution shown in the figures. For the fifa sequence (Figure 14), two maxima exist in the actual period, and the algorithm described herein also distinguishes the two maxima fairly accurately. The expected I-frame interval estimated by the algorithm is very similar to the actual I-frame interval and can therefore be used to guide frame skipping during fast browsing.
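The I-frame interval distribution described above can be reproduced from a sequence of frame types. The Python sketch below follows the stated axis convention (intervals greater than 49 are counted under 0) and takes the histogram peak as the dominant interval; treating that peak as the expected fixed interval is an assumption of this sketch, and both function names are illustrative.

```python
from collections import Counter


def i_frame_interval_histogram(types, cap=49):
    """Count distances between consecutive I frames; gaps above cap go to bin 0."""
    positions = [i for i, t in enumerate(types) if t == "I"]
    hist = Counter()
    for a, b in zip(positions, positions[1:]):
        gap = b - a
        hist[gap if gap <= cap else 0] += 1
    return hist


def dominant_interval(hist):
    """Return the most frequent interval, or None if there are no intervals."""
    return max(hist, key=hist.get) if hist else None


# Three GOPs of length 12, two adjacent I frames, then a long gap.
types = (["I"] + ["P"] * 11) * 3 + ["I", "I"] + ["P"] * 60 + ["I"]
hist = i_frame_interval_histogram(types)
print(dict(hist), dominant_interval(hist))
```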
Figure 10: iptv137, 15861 frames (error 0.6%). The result is shown in Table 3:
Table 3
iptv137          Detected as P   Detected as B   Detected as I
Actual type P    4909            0               61
Actual type B    1               10215           0
Actual type I    36              0               639
Figure 11: iptv138, 17320 frames (error 0.1%). The result is shown in Table 4:
Table 4
iptv138          Detected as P   Detected as B   Detected as I
Actual type P    5676            0               8
Actual type B    0               10903           0
Actual type I    10              0               723
Figure 12: song, 38741 frames (error 0.9%). The result is shown in Table 5:
Table 5
song             Detected as P   Detected as B   Detected as I
Actual type P    16698           0               149
Actual type B    0               20217           0
Actual type I    210             0               1467
Figure 13: fifa, 9517 frames (error 1.3%). The result is shown in Table 6:
Table 6
fifa             Detected as P   Detected as B   Detected as I
Actual type P    4267            0               21
Actual type B    0               4693            0
Actual type I    106             0               430
Figure 14: travel, 1486 frames (error 0.8%). The result is shown in Table 7:
Table 7
travel           Detected as P   Detected as B   Detected as I
Actual type P    493             0               11
Actual type B    0               934             0
Actual type I    1               0               47
Figure 15: sport, 1156 frames (error 0.3%). The result is shown in Table 8:
Table 8
sport            Detected as P   Detected as B   Detected as I
Actual type P    396             0               4
Actual type B    0               719             0
Actual type I    0               0               37
A person of ordinary skill in the art will appreciate that all or part of the steps in the methods of the above embodiments may be completed by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium, and the storage medium may be a read-only memory, a magnetic disk, an optical disc, or the like.
The frame type detection method and device provided by the embodiments of the present invention have been described in detail above. Specific examples are used herein to explain the principles and implementations of the present invention, and the description of the above embodiments is intended only to help understand the method of the present invention and its core idea. Meanwhile, for a person of ordinary skill in the art, the specific implementations and the scope of application may vary according to the idea of the present invention. In summary, the content of this specification should not be construed as limiting the present invention.

Claims (16)

1. A method for detecting a frame type, characterized by comprising:
obtaining the coding type of the stream in which the received frames are located, the coding type including open-loop coding and closed-loop coding;
if the data volume of the current frame is greater than a first threshold, determining that the current frame is an obvious intra-coded frame (I frame), the first threshold being calculated from the average data volume of a set number of consecutive frames and the I-frame data volume;
if the frame preceding the current frame is an I frame, the coding type is closed-loop coding, and the current frame is not an obvious I frame, or, if the frame preceding the current frame is an I frame, the coding type is open-loop coding, and the data volume of the current frame is greater than a fourth threshold, determining that the current frame is a unidirectionally predicted frame (P frame); the fourth threshold being the mean of the P-frame average data volume and the B-frame average data volume of a group of pictures (GOP);
if the current frame is neither an I frame nor a P frame, determining that the current frame is a B frame.
2. The method for detecting a frame type according to claim 1, characterized in that obtaining the coding type of the stream in which the received frames are located comprises:
counting the types of the frames that immediately follow obvious I frames; if the proportion of P frames reaches a set ratio, determining that the coding type is closed-loop coding, and otherwise determining that it is open-loop coding.
3. The method for detecting a frame type according to claim 1, characterized by further comprising:
if the data volume of the current frame is greater than a second threshold, determining that the current frame is an I frame; the second threshold being the maximum among the data volume of the I frame preceding the current frame, the average data volume of the P frames in the GOP in which the current frame is located, and the average data volume of a set number of consecutive frames.
4. The method for detecting a frame type according to claim 1, characterized by further comprising:
if the interval between the current frame and the previous I frame exceeds a fixed interval, and the data volume of the current frame is greater than a third threshold, determining that the current frame is an I frame; the third threshold being calculated from the average data volume of the frames of the GOP in which the current frame is located, the data volume of the P frame preceding the current frame, the data volume of the I frame of the GOP in which the current frame is located, and the degree to which the distance from the previous I frame to the current frame deviates from the expected fixed I-frame interval; or, the third threshold being calculated from the average data volume of the frames of the GOP in which the current frame is located and the degree to which the distance from the previous I frame to the current frame deviates from the expected fixed I-frame interval.
5. The method for detecting a frame type according to claim 1, characterized by further comprising:
if the frame preceding the current frame is a P frame and the data volume of the current frame is greater than a fifth threshold, or a B frame exists in the current GOP and the data volume of the current frame is greater than a sixth threshold, determining that the current frame is a P frame; the fifth threshold being the product of a first adjustment factor and the average data volume of the P frames of the GOP in which the current frame is located, the first adjustment factor being greater than 0.5 and less than 1; the sixth threshold being the mean of the P-frame average data volume and the B-frame average data volume;
if the frame preceding the current frame is a B frame and the data volume of the current frame is less than a seventh threshold, or a P frame exists in the current GOP and the data volume of the current frame is less than an eighth threshold, determining that the current frame is a P frame; the seventh threshold being the product of a second adjustment factor and the average data volume of the B frames of the GOP in which the current frame is located, the second adjustment factor being greater than 1 and less than 1.5; the eighth threshold being the mean of the P-frame average data volume and the B-frame average data volume.
6. The method for detecting a frame type according to any one of claims 1 to 5, characterized by further comprising:
after frame type determination ends, determining the fixed interval of I frames; if no I frame has been determined after the fixed interval is reached, determining the frame with the largest data volume within a set range around the fixed interval as an I frame; and updating the average data volume of each frame type in the GOP and the I-frame interval parameter.
7. The method for detecting a frame type according to any one of claims 1 to 5, characterized by further comprising:
after frame type determination ends, counting consecutive B frames; if the number of consecutive B frames is greater than a predicted value, determining the frame with the largest data volume among the consecutive B frames as a P frame; and updating the average data volume of each frame type in the GOP; the predicted value being greater than or equal to 3 and less than or equal to 7.
8. The method for detecting a frame type according to any one of claims 1 to 5, characterized by further comprising:
determining whether packet loss has occurred in the received frames, and if packet loss has occurred, determining the packet loss type;
if the packet loss type is intra-frame packet loss, then when calculating the frame data volume, determining the sum of the data volume of the received frame and the lost data volume as the data volume of the frame;
if the packet loss type is inter-frame packet loss, determining whether the marker bit of the packet preceding the loss point is 1; if so, counting the lost data volume toward the following frame, and otherwise distributing the lost data volume equally between the preceding and following frames.
9. The method for detecting a frame type according to claim 8, characterized by further comprising:
predicting the coding structure from statistics of the frame types already detected;
if the packet loss type is inter-frame packet loss and the marker bit of the packet preceding the loss point cannot be detected, splitting the current data length according to the predicted coding structure and the position of the packet loss.
10. A device for detecting a frame type, characterized by comprising:
a type obtaining unit, configured to obtain the coding type of the stream in which the received frames are located, the coding type including open-loop coding and closed-loop coding;
a frame type determining unit, configured to: if the data volume of the current frame is greater than a first threshold, determine that the current frame is an obvious I frame, the first threshold being calculated from the average data volume of a set number of consecutive frames and the I-frame data volume;
if the frame preceding the current frame is an I frame, the coding type is closed-loop coding, and the current frame is not an obvious I frame, or, if the frame preceding the current frame is an I frame, the coding type is open-loop coding, and the data volume of the current frame is greater than a fourth threshold, determine that the current frame is a P frame; the fourth threshold being the mean of the P-frame average data volume and the B-frame average data volume of a GOP;
if the current frame is neither an I frame nor a P frame, determine that the current frame is a B frame.
11. The device for detecting a frame type according to claim 10, characterized in that
the frame type determining unit is further configured to: if the data volume of the current frame is greater than a second threshold, determine that the current frame is an I frame; the second threshold being the maximum among the data volume of the I frame preceding the current frame, the average data volume of the P frames in the GOP in which the current frame is located, and the average data volume of a set number of consecutive frames.
12. The device for detecting a frame type according to claim 10, characterized in that
the frame type determining unit is further configured to: if the interval between the current frame and the previous I frame exceeds a fixed interval, and the data volume of the current frame is greater than a third threshold, determine that the current frame is an I frame; the third threshold being calculated from the average data volume of the frames of the GOP in which the current frame is located, the data volume of the P frame preceding the current frame, the data volume of the I frame of the GOP in which the current frame is located, and the degree to which the distance from the previous I frame to the current frame deviates from the expected fixed I-frame interval; or, the third threshold being calculated from the average data volume of the frames of the GOP in which the current frame is located and the degree to which the distance from the previous I frame to the current frame deviates from the expected fixed I-frame interval.
13. The device for detecting a frame type according to claim 10, characterized in that
the frame type determining unit is further configured to: if the frame preceding the current frame is a P frame and the data volume of the current frame is greater than a fifth threshold, or a B frame exists in the current GOP and the data volume of the current frame is greater than a sixth threshold, determine that the current frame is a P frame; the fifth threshold being the product of a first adjustment factor and the average data volume of the P frames of the GOP in which the current frame is located, the first adjustment factor being greater than 0.5 and less than 1; the sixth threshold being the mean of the P-frame average data volume and the B-frame average data volume;
if the frame preceding the current frame is a B frame and the data volume of the current frame is less than a seventh threshold, or a P frame exists in the current GOP and the data volume of the current frame is less than an eighth threshold, determine that the current frame is a P frame; the seventh threshold being the product of a second adjustment factor and the average data volume of the B frames of the GOP in which the current frame is located, the second adjustment factor being greater than 1 and less than 1.5; the eighth threshold being the mean of the P-frame average data volume and the B-frame average data volume.
14. The device for detecting a frame type according to any one of claims 10 to 13, characterized by further comprising:
an interval obtaining unit, configured to determine the fixed interval of I frames after frame type determination ends;
the frame type determining unit being further configured to: if no I frame has been determined after the fixed interval is reached, determine the frame with the largest data volume within a set range around the fixed interval as an I frame;
a first updating unit, configured to update the average data volume of each frame type in the GOP and the I-frame interval parameter.
15. The device for detecting a frame type according to any one of claims 10 to 13, characterized by further comprising:
a statistics unit, configured to count consecutive B frames after frame type determination ends;
the frame type determining unit being further configured to: if the number of consecutive B frames is greater than a predicted value, determine the frame with the largest data volume among the consecutive B frames as a P frame; the predicted value being greater than or equal to 3 and less than or equal to 7;
a second updating unit, configured to update the average data volume of each frame type in the GOP.
16. The device for detecting a frame type according to any one of claims 10 to 13, characterized by further comprising:
a packet loss type determining unit, configured to determine whether packet loss has occurred in the received frames and, if packet loss has occurred, determine the packet loss type;
a data volume determining unit, configured to: if the packet loss type is intra-frame packet loss, then when calculating the frame data volume, determine the sum of the data volume of the received frame and the lost data volume as the data volume of the frame;
if the packet loss type is inter-frame packet loss, determine whether the marker bit of the packet preceding the loss point is 1; if so, count the lost data volume toward the following frame, and otherwise distribute the lost data volume equally between the preceding and following frames.
CN201310664666.2A 2010-12-17 2010-12-17 Method and device for detecting frame type Active CN103716640B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310664666.2A CN103716640B (en) 2010-12-17 2010-12-17 Method and device for detecting frame type

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201310664666.2A CN103716640B (en) 2010-12-17 2010-12-17 Method and device for detecting frame type
CN201010594322.5A CN102547300B (en) 2010-12-17 2010-12-17 Method for detecting frame types and device

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
CN201010594322.5A Division CN102547300B (en) 2010-12-17 2010-12-17 Method for detecting frame types and device

Publications (2)

Publication Number Publication Date
CN103716640A CN103716640A (en) 2014-04-09
CN103716640B true CN103716640B (en) 2017-02-01

Family

ID=50409145

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310664666.2A Active CN103716640B (en) 2010-12-17 2010-12-17 Method and device for detecting frame type

Country Status (1)

Country Link
CN (1) CN103716640B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1211877A (en) * 1998-07-15 1999-03-24 国家科学技术委员会高技术研究发展中心 MPEG-2 vedio-frequency decoder and its input buffer control method
EP2077672A1 (en) * 2007-08-22 2009-07-08 Nippon Telegraph and Telephone Corporation Video quality estimation device, video quality estimation method, frame type judgment method, and recording medium
CN101518657A (en) * 2008-12-31 2009-09-02 上海序参量科技发展有限公司 Sector device for eliminating environmental pollution
CN101651815A (en) * 2009-09-01 2010-02-17 中兴通讯股份有限公司 Visual telephone and method for enhancing video quality by utilizing same

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant