CN103716640A - Method and device for detecting frame type - Google Patents
- Publication number
- CN103716640A (application CN201310664666.2A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Landscapes
- Compression Or Coding Systems Of Tv Signals (AREA)
Abstract
An embodiment of the invention discloses a method and a device for detecting a frame type. The method comprises: obtaining the coding type of the code stream to which the received frames belong, the coding type being either open-loop coding or closed-loop coding; if the data amount of the current frame is greater than a first threshold, determining that the current frame is an obvious intra-coded frame (I frame); if the frame preceding the current frame is an I frame, the coding type is closed-loop coding and the current frame is not an obvious I frame, or if the frame preceding the current frame is an I frame, the coding type is open-loop coding and the data amount of the current frame is greater than a fourth threshold, determining that the current frame is a unidirectionally predicted frame (P frame); and if the current frame is neither an I frame nor a P frame, determining that the current frame is a B frame. The technical scheme combines the relationship between the data amounts of preceding and following frames of different types and determines the frame type without decoding the payload, thereby eliminating the influence of the attenuation factor and improving the accuracy of frame type detection.
Description
Technical field
The present invention relates to the technical field of video processing, and in particular to a method and a device for detecting a frame type.
Background art
Decodable data frames in video coding standards can be divided into intra-coded frames (I frames), unidirectionally predicted frames (P frames) and bidirectionally predicted frames (B frames). In video applications, an I frame is the point at which decoding can start; it is commonly called a random access point and supports services such as random access and fast browsing. Errors that occur in transmission affect the subjective quality at the decoder differently depending on the frame type. An I frame cuts off error propagation, so an error in an I frame strongly degrades the decoding quality of the whole video. A P frame often serves as the reference frame for other inter-coded frames, so it is second in importance to the I frame. A B frame is usually not used as a reference frame, so its loss has little effect on decoding quality. Distinguishing the frame types of a data stream is therefore very important in video transmission applications. For example, the frame type is an important parameter in video quality assessment, and the accuracy of the frame type decision directly affects the accuracy of the assessment result; unequal protection can be applied to frames of different types to transmit video efficiently; and, to save transmission resources, frames that have little effect on subjective quality can be dropped when bandwidth is insufficient.
Common streaming technologies are mainly the Internet Streaming Media Alliance (ISMA) mode and the MPEG-2 Transport Stream over Internet Protocol (Moving Picture Experts Group-2 Transport Stream over Internet Protocol, MPEG-2 TS over IP) mode. Both protocol modes define indicator bits that can indicate the type of video data when a compressed video stream is encapsulated. In the ISMA mode, the compressed video stream is encapsulated directly with the Real-time Transport Protocol (RTP): MPEG-4 Part 2 follows RFC 3016 (Request for Comments 3016), and H.264/Advanced Video Coding (AVC) follows RFC 3984. Taking RFC 3984 as an example, the sequence number and timestamp in the RTP header can be used to detect frame loss and to help detect the frame type. The MPEG-2 TS over IP mode has two variants: TS over UDP/IP (transport stream over User Datagram Protocol/IP) and TS over RTP/UDP/IP (transport stream over Real-time Transport Protocol/UDP/IP). TS over RTP/UDP/IP (hereinafter TS over RTP) is the more common in video transmission: the compressed video stream is encapsulated as an elementary stream, the elementary stream is divided into TS packets, and the TS packets are then encapsulated in RTP and transmitted.
RTP is a transport protocol for multimedia data streams and provides end-to-end real-time data transmission. An RTP packet consists mainly of four parts: the RTP header, the RTP extension header, the payload header and the payload data. The RTP header mainly contains the sequence number, the timestamp and the marker bit. Each RTP packet has its own sequence number, which increases by 1 for every packet sent and can be used to detect packet loss. The timestamp represents the sampling time of the video data; different frames have different timestamps, so the timestamp can indicate the playback order of the video data. The marker bit identifies the end of a frame. This information is an important basis for frame type decisions.
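As an illustration of the header fields just described, the following minimal Python sketch reads the sequence number, timestamp and marker bit from the 12-byte fixed RTP header and counts packets missing between two sequence numbers. The function names are hypothetical, and the sketch assumes packets arrive in order and ignores the CSRC list and the extension header.

```python
import struct


def parse_rtp_header(packet: bytes) -> dict:
    """Read the fields of the 12-byte fixed RTP header."""
    if len(packet) < 12:
        raise ValueError("RTP packet shorter than the fixed header")
    b0, b1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": b0 >> 6,
        "marker": b1 >> 7,            # 1 on the last packet of a frame
        "payload_type": b1 & 0x7F,
        "sequence_number": seq,       # increases by 1 per packet, wraps at 65536
        "timestamp": timestamp,       # sampling time of the video data
        "ssrc": ssrc,
    }


def lost_between(prev_seq: int, cur_seq: int) -> int:
    """Packets missing between two received packets (assumes in-order arrival)."""
    return (cur_seq - prev_seq - 1) % 65536
```

A gap reported by `lost_between` only says that packets are missing; deciding which frame they belonged to needs the marker bit and timestamp of the neighbouring packets.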
A TS packet has 188 bytes and consists of a packet header, a variable-length adaptation field and payload data. The payload unit start indicator (PUSI) in the packet header indicates whether the payload data contains the header of a Packetized Elementary Stream (PES) packet or Program Specific Information (PSI). For the H.264 media format, each PES packet header marks the beginning of a NAL unit. Some flags in the adaptation field of a TS packet, such as the random access indicator and the elementary stream priority indicator, can be used to judge the importance of the transmitted content. For video, a random access indicator of 1 means that the next PES packet encountered contains sequence start information, and an elementary stream priority indicator of 1 means that the payload of this TS packet contains a larger amount of intra-coded block data.
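The TS header fields mentioned above can be read without touching the payload. The following Python sketch is a minimal illustration based on the standard MPEG-2 TS packet layout; the function name and the returned dictionary keys are hypothetical.

```python
def parse_ts_packet(pkt: bytes) -> dict:
    """Read the header flags of one 188-byte MPEG-2 TS packet."""
    if len(pkt) != 188 or pkt[0] != 0x47:
        raise ValueError("not a 188-byte TS packet starting with sync byte 0x47")
    adaptation_field_control = (pkt[3] >> 4) & 0x3
    rai = espi = False
    # Bit 1 of adaptation_field_control signals that an adaptation field is present;
    # its flags byte exists only when the adaptation field length is non-zero.
    if adaptation_field_control & 0x2 and pkt[4] > 0:
        flags = pkt[5]
        rai = bool(flags & 0x40)    # random access indicator
        espi = bool(flags & 0x20)   # elementary stream priority indicator
    return {
        "pusi": bool(pkt[1] & 0x40),               # payload unit start indicator
        "pid": ((pkt[1] & 0x1F) << 8) | pkt[2],
        "scrambled": ((pkt[3] >> 6) & 0x3) != 0,   # transport scrambling control
        "continuity_counter": pkt[3] & 0x0F,
        "rai": rai,
        "espi": espi,
    }
```

All of these fields sit in the unencrypted part of the packet, which is why they remain usable when the payload is scrambled.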
If the PUSI shows that the payload of a TS packet contains a PES packet header, further information useful for transmission can be extracted. A PES packet consists of a PES packet header followed by packet data, and the elementary stream data (video, audio and so on) is carried in the PES packet data. PES packets are inserted into transport stream packets, and the first byte of each PES packet header is the first byte of the payload of a transport stream packet. In other words, a PES packet header must start in a new TS packet, and the PES packet data must fill the payload area of the TS packets. If the end of the PES packet data cannot be aligned with the end of a TS packet, a corresponding number of stuffing bytes must be inserted in the adaptation field of the TS packet so that the two ends align. The PES priority indicates the importance of the payload in the PES packet data; for video, a value of 1 indicates intra-coded data. In addition, the PTS indicates the presentation time and the DTS indicates the decoding time; they can be used to judge the dependency between successive video payloads and thus the payload type.
In the TS over RTP mode, the payload is often encrypted during transmission to protect copyrighted video content. TS packets are encrypted by encrypting the payload part of the packet. Once the scrambling flag in the TS header is set to 1, the payload is encrypted, and the payload data type can then be judged only from the amount of data between adjacent PUSIs in packets with the same PID (that is, the length of one video frame). If the PES header in the TS packet is not encrypted, the PTS can be used to assist the frame type decision in addition to the video frame length.
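The frame length between adjacent PUSIs can be sketched as follows. This Python example is an illustration only: it assumes each packet has already been reduced to a hypothetical `(pid, pusi)` pair, measures length in whole 188-byte TS packets, and reports a frame only once the next PUSI of the same PID has been seen.

```python
TS_PACKET_SIZE = 188


def frame_sizes(packets, video_pid):
    """Return the byte length of each completed frame on video_pid.

    packets: iterable of (pid, pusi) tuples in arrival order.
    """
    sizes = []
    count = None  # TS packets in the frame being collected; None before the first PUSI
    for pid, pusi in packets:
        if pid != video_pid:
            continue
        if pusi:
            if count is not None:
                sizes.append(count * TS_PACKET_SIZE)
            count = 1
        elif count is not None:
            count += 1
    return sizes
```

The last, still open frame is deliberately left out because its end has not been observed yet.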
As the description above shows, frames of different types differ in data amount. An I frame removes only intra-frame redundancy, so its data amount is generally larger than that of an inter-coded frame, which also removes inter-frame redundancy, and a P frame is generally larger than a B frame. Based on this property, some existing frame type detection algorithms use the data amount of a frame to judge the frame type when the TS packets are encrypted. Two commonly used methods are described below.
Method one: the TS packets are parsed to obtain the length of each video frame, and the frame type is inferred from the length information. This method was proposed for determining the frame type when the payload part of the TS packets is encrypted.
The method judges packet loss by parsing the Continuity Counter field of the TS packets, estimates the state of the lost packets from the group of pictures (GOP) structure information determined previously, and determines the video frame type in combination with the information available in the adaptation field of the TS packet header (the Random Access Indicator, RAI, or the Elementary Stream Priority Indicator, ESPI).
An I frame can be identified by the following three methods:
1. Identify the I frame using the RAI or ESPI.
2. When the RAI or ESPI cannot be used, buffer the data of one GOP and take the largest frame in the currently buffered data as the I frame. The GOP length must be defined in advance, and the method fails as soon as the GOP length changes.
3. Use a value representing the maximum GOP length as the I frame determination period, and determine the frame with the largest data amount within the period to be the I frame. The determination period is the maximum of the detected I frame periods.
A P frame can be identified by the following three methods:
1. Among the frames from the start frame to the frame immediately before the next I frame, determine each frame whose data amount is larger than that of its surrounding frames to be a P frame. For the determination frame patterns contained in the GOP structure of the target stream, consecutive frames corresponding to N determination frame patterns are selected from the determination period as determination target frames; the size relationship between the data amounts of the determination target frames is compared with each determination frame pattern, and a P frame is determined on the basis of a match between them. The following pattern is used as a determination frame pattern in the GOP structure: all consecutive B frames immediately before a P frame, together with the one B frame that follows the P frame. Some GOP information must be entered in advance in this case.
2. Determine the P frame by comparing the data amount of each frame in the pattern with a threshold calculated from the average data amount of several frames at predetermined positions in the pattern.
3. Use an adjustment coefficient to adjust, based on the frame data amount, the threshold that separates P frames from B frames. The adjustment coefficient is obtained as follows: provisional adjustment coefficients are selected in turn from a given range and the same processing as the frame type determination is performed, so that the frame type of each frame in a predefined learning period is estimated; the ratio of erroneous determinations is calculated by comparing the estimates with the actual frame types obtained from an unencrypted stream; and the provisional coefficient with the lowest error ratio is taken as the actual adjustment coefficient.
For B frames, the rule is: every frame that is neither an I frame nor a P frame is determined to be a B frame.
When packets are lost, the above method can detect the loss from the RTP sequence number and the continuity counter (CC) in the TS header, and can estimate the state of the lost packets by pattern matching against the GOP structure, which corrects the result to some extent. However, the variants with fixed thresholds require GOP information to be entered in advance, and the variants with adjustable thresholds require frame type information from an unencrypted code stream to train the coefficient, so too much manual intervention is needed. In addition, a whole GOP must be buffered before the frame types can be estimated, which is unsuitable for real-time applications. Furthermore, the I frame decision is made only once per period, the adjustable coefficient being the period, and the largest frame in each period is taken directly as the I frame; this considers only local characteristics and ignores global characteristics.
Method two: using thresholds to distinguish frame types, which can be carried out in four steps:
1. Update the thresholds:
Threshold for distinguishing I frames (Ithresh):
scaled_max_iframe = scaled_max_iframe * 0.995, where scaled_max_iframe is the size of the previous I frame.
If nbytes > scaled_max_iframe,
Ithresh = (scaled_max_iframe / 4 + av_nbytes * 2) / 2, where av_nbytes is the sliding average of the current 8 frames.
Threshold for distinguishing P frames (Pthresh):
scaled_max_pframe = scaled_max_pframe * 0.995, where scaled_max_pframe is the size of the previous P frame.
If nbytes > scaled_max_pframe, Pthresh = av_nbytes * 0.75.
2. Detect I frames: a video has an I frame at regular intervals, an I frame is larger than the average, and an I frame is larger than a P frame. If the data amount of the current frame is greater than Ithresh, the frame is considered an I frame.
3. Detect P frames: this step uses the fact that B frames are smaller than the average. If the data amount of the current frame is greater than Pthresh and less than Ithresh, the frame is considered a P frame.
4. All other frames are B frames.
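The four steps of this prior-art method can be sketched in Python as below. This is a minimal illustration of the formulas quoted above, not a reconstruction of the original tool: the function names are hypothetical, and the sketch deliberately leaves out when exactly scaled_max_iframe and scaled_max_pframe are reset, because the text does not state it.

```python
DECAY = 0.995  # attenuation factor quoted in the text


def decay(scaled_max):
    """Shrink the remembered I or P frame size by the attenuation factor."""
    return scaled_max * DECAY


def i_threshold(scaled_max_iframe, av_nbytes):
    """Ithresh = (scaled_max_iframe / 4 + av_nbytes * 2) / 2."""
    return (scaled_max_iframe / 4 + av_nbytes * 2) / 2


def p_threshold(av_nbytes):
    """Pthresh = av_nbytes * 0.75."""
    return av_nbytes * 0.75


def classify(nbytes, ithresh, pthresh):
    """Steps 2 to 4: I above Ithresh, P between Pthresh and Ithresh, otherwise B."""
    if nbytes > ithresh:
        return "I"
    if pthresh < nbytes < ithresh:
        return "P"
    return "B"
```

Because `decay` removes only 0.5% per frame, a remembered I frame size needs many frames to fall to the level of a much smaller later I frame.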
The second method uses an attenuation factor to control the thresholds, and this factor directly affects the I frame decision. When a later I frame is larger than the current I frame, the I frame is easily detected; but when a later I frame is much smaller than the current I frame, an I frame can be detected again only after the threshold has decayed over many frames. Moreover, the factor is fixed at 0.995 in the algorithm, which ignores cases where the GOP changes sharply, so the method is unsuitable in many situations. The smaller the attenuation factor, the lower the I frame miss rate, but the higher the probability that a P frame is misjudged as an I frame; the larger the attenuation factor, the higher the I frame miss rate (when the I frame size varies sharply within the sequence), with I frames being judged as P frames. The detection accuracy is therefore low. In addition, only thresholds are used to separate B and P frames, so for a frame structure such as I/P/P/P..., the algorithm misjudges many P frames as B frames and the misjudgment rate is high.
Summary of the invention
The technical problem to be solved by the embodiments of the present invention is to provide a method and a device for detecting a frame type that improve the accuracy of frame type detection.
To solve the above technical problem, the embodiments of the frame type detection method provided by the present invention can be implemented through the following technical solutions:
Detecting the playback time of each frame;
If the playback time of the current frame is less than the maximum playback time of the frames already received, determining that the current frame is a bidirectionally predicted frame (B frame).
A method for detecting a frame type, comprising:
Obtaining the coding type of the code stream to which the received frames belong, the coding type comprising open-loop coding and closed-loop coding;
If the data amount of the current frame is greater than a first threshold, determining that the current frame is an obvious intra-coded frame (I frame), the first threshold being calculated from the average data amount of a set number of consecutive frames and the I frame data amount;
If the frame preceding the current frame is an I frame, the coding type is closed-loop coding and the current frame is not an obvious I frame, or if the frame preceding the current frame is an I frame, the coding type is open-loop coding and the data amount of the current frame is greater than a fourth threshold, determining that the current frame is a unidirectionally predicted frame (P frame); the fourth threshold is the mean of the average P frame data amount and the average B frame data amount of one group of pictures;
If the current frame is neither an I frame nor a P frame, determining that the current frame is a B frame.
A device for detecting a frame type, comprising:
A time detecting unit, configured to detect the playback time of each frame;
A frame type determining unit, configured to determine that the current frame is a bidirectionally predicted frame (B frame) if the playback time of the current frame is less than the maximum playback time of the frames already received.
A device for detecting a frame type, comprising:
A type obtaining unit, configured to obtain the coding type of the code stream to which the received frames belong, the coding type comprising open-loop coding and closed-loop coding;
A frame type determining unit, configured to determine that the current frame is an obvious I frame if the data amount of the current frame is greater than a first threshold, the first threshold being calculated from the average data amount of a set number of consecutive frames and the I frame data amount;
If the frame preceding the current frame is an I frame, the coding type is closed-loop coding and the current frame is not an obvious I frame, or if the frame preceding the current frame is an I frame, the coding type is open-loop coding and the data amount of the current frame is greater than a fourth threshold, to determine that the current frame is a P frame; the fourth threshold is the mean of the average P frame data amount and the average B frame data amount of one group of pictures;
If the current frame is neither an I frame nor a P frame, to determine that the current frame is a B frame.
The technical solutions provided by the embodiments of the present invention combine the coding order of frames of different types with the relationship between the data amounts of preceding and following frames of different types, and determine the frame type without decoding the payload. This eliminates the influence of the attenuation factor and improves the accuracy of frame type detection.
Brief description of the drawings
To describe the technical solutions of the embodiments of the present invention more clearly, the accompanying drawings needed for describing the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
Figure 1A is a schematic flowchart of a method according to an embodiment of the present invention;
Figure 1B is a schematic flowchart of a method according to an embodiment of the present invention;
Fig. 2a is a schematic diagram of a hierarchical B frame coding structure according to an embodiment of the present invention;
Fig. 2b is a schematic diagram of the relationship between coding order and playback order, and of the coding levels, according to an embodiment of the present invention;
Fig. 3 is a schematic diagram of a frame structure with packet loss according to an embodiment of the present invention;
Fig. 4 is a schematic flowchart of a method according to an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of a device according to an embodiment of the present invention;
Fig. 6 is a schematic structural diagram of a device according to an embodiment of the present invention;
Fig. 7 is a schematic structural diagram of a device according to an embodiment of the present invention;
Fig. 8 is a schematic structural diagram of a device according to an embodiment of the present invention;
Fig. 9 is a schematic structural diagram of a device according to an embodiment of the present invention;
Figure 10 is a schematic diagram of detection results according to an embodiment of the present invention;
Figure 11 is a schematic diagram of detection results according to an embodiment of the present invention;
Figure 12 is a schematic diagram of detection results according to an embodiment of the present invention;
Figure 13 is a schematic diagram of detection results according to an embodiment of the present invention;
Figure 14 is a schematic diagram of detection results according to an embodiment of the present invention;
Figure 15 is a schematic diagram of detection results according to an embodiment of the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some of the embodiments of the present invention rather than all of them. All other embodiments obtained by a person of ordinary skill in the art on the basis of the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
A method for detecting a frame type, as shown in Figure 1A, comprises:
101A: Detect the playback time of each frame;
102A: If the playback time of the current frame is less than the maximum playback time of the frames already received, determine that the current frame is a bidirectionally predicted frame (B frame);
Further, the embodiment of the present invention may also determine, from the playback order and coding order of the frames, the level to which a B frame belongs in hierarchical coding. How the level is determined is described further below. Given the characteristics of B frames, once the level of a B frame is known it can be used in many fields; for example, when compressing data frames, B frames at high levels can be discarded. The embodiment of the present invention does not limit how the level of a B frame is used after it has been determined.
The above embodiment combines the coding order of frames of different types with the relationship between the data amounts of preceding and following frames of different types, and determines the frame type without decoding the payload. This eliminates the influence of the attenuation factor and improves the accuracy of frame type detection.
The embodiment of the present invention also provides another method for detecting a frame type, as shown in Figure 1B, comprising:
101B: Obtain the coding type of the code stream to which the received frames belong, the coding type comprising open-loop coding and closed-loop coding;
102B: If the data amount of the current frame is greater than a first threshold, determine that the current frame is an obvious I frame, the first threshold being calculated from the average data amount of a set number of consecutive frames and the I frame data amount;
An obvious I frame is an I frame. If a frame is judged to be an obvious I frame, the probability that the decision is wrong is very low, but I frames may be missed; the other ways of judging I frames described later may misjudge other frames as I frames.
If the frame preceding the current frame is an I frame, the coding type is closed-loop coding and the current frame is not an obvious I frame (at this point the frame type of the current frame is not yet known, but whether it is an obvious I frame can be determined), or if the frame preceding the current frame is an I frame, the coding type is open-loop coding and the data amount of the current frame is greater than a fourth threshold, determine that the current frame is a P frame; the fourth threshold is the mean of the average P frame data amount and the average B frame data amount of one group of pictures;
If the current frame is neither an I frame nor a P frame, determine that the current frame is a B frame.
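The decision order of Figure 1B can be sketched in Python as follows. This is a minimal illustration covering only the steps listed above: the function names and the string labels are hypothetical, and the first threshold is passed in as a ready-made value because its exact formula is not given in this passage.

```python
def fourth_threshold(avg_p_size, avg_b_size):
    """Mean of the average P frame and average B frame data amounts of one GOP."""
    return (avg_p_size + avg_b_size) / 2


def classify_frame(size, prev_type, coding_type, first_thr, fourth_thr):
    """Return "I", "P" or "B" for the current frame.

    prev_type: type already assigned to the preceding frame.
    coding_type: "closed" or "open" (labels chosen for this sketch).
    """
    if size > first_thr:
        return "I"  # obvious I frame
    if prev_type == "I":
        if coding_type == "closed":
            return "P"  # not an obvious I frame, directly after an I frame
        if coding_type == "open" and size > fourth_thr:
            return "P"
    return "B"
```

For example, with an average P frame size of 1000 bytes and an average B frame size of 400 bytes, the fourth threshold is 700 bytes, so in an open-loop stream a 600-byte frame right after an I frame is labelled B.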
It should be noted that the method of Figure 1B can be applied on its own or in combination with the method of Figure 1A. When the two are combined, the method of Figure 1B can be used when the playback time cannot be detected in Figure 1A.
Obtaining the coding type of the code stream to which the received frames belong comprises:
Counting the type of the frame immediately following each obvious I frame; if the proportion of P frames reaches a preset proportion, determining that the coding type is closed-loop coding, and otherwise open-loop coding.
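The coding type statistic can be sketched in Python as below. The function name, the string labels and the default preset proportion of 0.8 are assumptions made for this illustration; the text does not state a value for the preset proportion.

```python
def detect_coding_type(types_after_obvious_i, preset_ratio=0.8):
    """Decide closed-loop or open-loop coding.

    types_after_obvious_i: the frame type ("P" or "B") observed immediately
    after each obvious I frame seen so far.
    Returns "closed", "open", or None when there are no observations yet.
    """
    if not types_after_obvious_i:
        return None
    p_ratio = types_after_obvious_i.count("P") / len(types_after_obvious_i)
    return "closed" if p_ratio >= preset_ratio else "open"
```

The estimate can be refreshed every time a new obvious I frame has been followed by one more frame.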
The following embodiments are described using the combination of the scheme of Figure 1B with the scheme of Figure 1A as an example. If the scheme of Figure 1B is used on its own, whether the playback time can be detected need not be checked.
If the playback time cannot be detected in 101A, the method embodiment further comprises:
If the data amount of the current frame is greater than a second threshold, determine that the current frame is an I frame. The second threshold is the maximum of: the data amount of the I frame preceding the current frame, the average data amount of the P frames in the group of pictures containing the current frame, and the average data amount of a set number of consecutive frames.
If the playback time cannot be detected in 101A, the method embodiment further comprises:
If the data amount of the current frame is greater than a third threshold and the interval between the current frame and the previous I frame exceeds the fixed interval, determine that the current frame is an I frame. The third threshold is calculated from: the average data amount of the frames in the group of pictures containing the current frame, the data amount of the P frame preceding the current frame, the data amount of the I frame of the group of pictures containing the current frame, and the degree to which the distance from the previous I frame to the current frame departs from the expected fixed I frame interval. Alternatively, the third threshold is calculated from the average data amount of the frames in the group of pictures containing the current frame and the degree to which the distance from the previous I frame to the current frame departs from the expected fixed I frame interval.
If the playback time cannot be detected in 101A, the method embodiment further comprises:
If the frame preceding the current frame is a P frame and the data amount of the current frame is greater than a fifth threshold, or if the current group of pictures contains B frames and the data amount of the current frame is greater than a sixth threshold, determine that the current frame is a P frame. The fifth threshold is the product of a first adjustment factor and the average data amount of the P frames in the group of pictures containing the current frame, where the first adjustment factor is greater than 0.5 and less than 1. The sixth threshold is the mean of the average P frame data amount and the average B frame data amount;
If the frame preceding the current frame is a B frame and the data amount of the current frame is less than a seventh threshold, or if the current group of pictures contains P frames and the data amount of the current frame is less than an eighth threshold, determine that the current frame is a P frame. The seventh threshold is the product of a second adjustment factor and the average data amount of the B frames in the group of pictures containing the current frame, where the second adjustment factor is greater than 1 and less than 1.5. The eighth threshold is the mean of the average P frame data amount and the average B frame data amount.
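The first of the two rules above (the fifth and sixth thresholds) can be sketched in Python as follows. The function names are hypothetical, and the default adjustment factor of 0.75 is an assumption picked from inside the stated range of 0.5 to 1; the second rule is not coded here.

```python
def fifth_and_sixth_thresholds(avg_p_size, avg_b_size, k1=0.75):
    """Fifth threshold: k1 * average P size. Sixth: mean of average P and B sizes."""
    if not 0.5 < k1 < 1:
        raise ValueError("the first adjustment factor must lie between 0.5 and 1")
    return k1 * avg_p_size, (avg_p_size + avg_b_size) / 2


def is_p_frame(size, prev_type, gop_has_b, avg_p_size, avg_b_size, k1=0.75):
    """Apply the rule: P if (previous frame is P and size > fifth threshold)
    or (the current GOP contains B frames and size > sixth threshold)."""
    fifth, sixth = fifth_and_sixth_thresholds(avg_p_size, avg_b_size, k1)
    return (prev_type == "P" and size > fifth) or (gop_has_b and size > sixth)
```

With an average P frame size of 1000 bytes and an average B frame size of 400 bytes, the fifth threshold is 750 bytes and the sixth is 700 bytes.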
If the playback time cannot be detected in 101A, the method embodiment further comprises:
After the frame type decision ends, determine the fixed interval of I frames. If no I frame has been detected by the time the fixed interval is reached, determine the frame with the largest data amount within a set range around the fixed interval to be an I frame, and update the average data amounts of the frame types in the group of pictures and the I frame interval parameter.
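This correction step can be sketched in Python as below. It is an illustration under stated assumptions: the "set range" is modelled as a symmetric window of `radius` frames around the expected I frame position, the function name is hypothetical, and the update of the averages and interval parameter is left out.

```python
def correct_missing_i(sizes, types, last_i, fixed_interval, radius=1):
    """Relabel the largest frame near the expected position as I if none was found.

    sizes, types: per-frame data amounts and current labels in coding order.
    last_i: index of the most recent I frame.
    Returns the index relabelled as "I", or None if nothing was changed.
    """
    expected = last_i + fixed_interval
    if len(sizes) <= expected:
        return None  # the fixed interval has not been reached yet
    lo = max(expected - radius, last_i + 1)
    hi = min(expected + radius, len(sizes) - 1)
    if "I" in types[last_i + 1:hi + 1]:
        return None  # an I frame was already detected
    idx = max(range(lo, hi + 1), key=lambda i: sizes[i])
    types[idx] = "I"
    return idx
```

A wider `radius` tolerates more jitter in the I frame interval at the cost of delaying the correction.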
If the playback time cannot be detected in 101A, the method embodiment further comprises:
After the frame type decision ends, count consecutive B frames. If the number of consecutive B frames is greater than a predicted value, determine the frame with the largest data amount among these consecutive B frames to be a P frame, and update the average data amounts of the frame types in the group of pictures. The predicted value is greater than or equal to 3 and less than or equal to 7.
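The consecutive B frame correction can be sketched in Python as follows. The function name is hypothetical, the sketch relabels at most one frame per run of B frames, and the update of the per-type averages is omitted.

```python
def correct_b_runs(sizes, types, predicted=3):
    """Return a copy of types in which each over-long run of B frames
    has its largest frame relabelled as P."""
    if not 3 <= predicted <= 7:
        raise ValueError("the predicted value must be between 3 and 7")
    fixed = list(types)
    n = len(types)
    i = 0
    while i < n:
        if types[i] != "B":
            i += 1
            continue
        j = i
        while j < n and types[j] == "B":
            j += 1  # j ends one past the last B frame of this run
        if j - i > predicted:
            biggest = max(range(i, j), key=lambda k: sizes[k])
            fixed[biggest] = "P"
        i = j
    return fixed
```

The input list is left untouched so that the labels already output and the corrected internal labels can be kept side by side.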
If the playback time cannot be detected in 101A, the method embodiment further comprises:
Determine whether packet loss has occurred in the received frames and, if so, determine the packet loss type;
If the packet loss type is intra-frame packet loss, then when the frame data amount is calculated, the sum of the data amount received for the frame and the lost data amount is taken as the data amount of the frame;
If the packet loss type is inter-frame packet loss, determine whether the marker bit of the packet immediately before the loss is 1; if so, count the lost data amount toward the following frame; otherwise, divide the lost data amount equally between the preceding and following frames.
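The allocation of lost data can be sketched in Python as below. The function names are hypothetical, and the lost data amount is taken as an input because the passage does not say how it is estimated.

```python
def intra_frame_size(received_bytes, lost_bytes):
    """Intra-frame loss: the frame size is what was received plus what was lost."""
    return received_bytes + lost_bytes


def split_inter_frame_loss(prev_size, next_size, lost_bytes, marker_before_loss):
    """Inter-frame loss: return the corrected (previous, next) frame sizes.

    marker_before_loss: True if the marker bit of the packet just before the
    gap is 1, meaning the previous frame had already ended.
    """
    if marker_before_loss:
        return prev_size, next_size + lost_bytes
    half = lost_bytes / 2
    return prev_size + half, next_size + half
```

When the previous frame is known to be complete, everything that was lost must belong to the following frame, which is why no split is needed in that branch.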
Determining the packet loss type further comprises:
Predicting the coding structure from statistics of the detected frame types;
If the packet loss type is inter-frame packet loss and the marker bit of the packet immediately before the loss cannot be detected, splitting the current data length according to the predicted coding structure and the position of the packet loss.
The embodiment of the present invention makes full use of the header information of RTP or TS over RTP and combines the coding order of the different frame types in video with the relationship between the data amounts of preceding and following frames of different types. It determines the frame type quickly and in real time without decoding the video payload, and improves the accuracy of frame type detection through packet loss handling, automatic parameter updating and later frame type correction.
A video stream contains header information that indicates the playback time of the video data, such as the RTP timestamp in the ISMA mode and the PTS in the PES header in the TS over RTP mode. The embodiment uses the correlation between playback time information and coding order to determine the coding type of certain special structures, such as B frames. In the TS over RTP mode, however, the TS payload may be completely encrypted so that the PES header cannot be decoded, that is, the PTS is unavailable. The embodiment therefore also provides a scheme that determines the frame type from information such as the data amount alone, without using the playback time.
Observation of video code streams in practical applications shows that, within the same GOP, frames of different types generally differ markedly in size: I frames have the largest data amount, P frames come next and B frames are the smallest. If the I frame at the start of each GOP can be identified correctly, its data amount can be used to identify the P frames and B frames inside that GOP. However, because video signals are non-stationary, the data amounts of I frames at different positions differ considerably and may even be comparable to the data amount of P frames in a preceding GOP, which makes I frames hard to identify. The embodiment designs a set of dynamic parameters that can be adjusted intelligently to improve the robustness and accuracy of the frame type decision. In particular, when identifying I frames, the decision criteria and related parameters are adjusted appropriately to the characteristics of I frames in different application scenarios, which greatly reduces the I frame misjudgment rate.
In lossy transmission scenarios, the input video stream may suffer packet loss. According to its effect on the decision process, packet loss falls into two classes. First, packet loss within a frame: the frame boundary information is not lost, so the frame boundaries can be obtained first and the number of packets in a frame can be counted from the corresponding sequence numbers. Second, loss of a frame boundary packet (for example, the packet whose marker bit is 1 in RTP, or the packet whose PUSI is set to 1 in TS over RTP): the boundary between the preceding and following frames may not be determinable, or the data of the two frames may be spliced into one frame, so the frame data amount is counted inaccurately and the frame type decision is affected. For this case the embodiment performs packet loss detection, frame boundary estimation and partial frame type estimation.
, because statistics is inadequate, can there is more erroneous judgement in the early stage in frame type judgement, not only have influence on the result of having exported, more can be by changing various parameter influences to the accuracy of follow-up judgement.The embodiment of the present invention has increased frame type correction after judgment frame type flow process, if carrying out inside when Output rusults has apparent error after data increase corrects, although inner, correct and can not change the frame type of having exported, can improve the accuracy of follow-up judgement by adjusting the mode of parameter.
The three main points of the embodiment of the present invention are described in detail below:
One: using playback time to identify B frames and/or hierarchical B frames:
Because a B frame is predicted from both forward and backward coded frames, it is coded after its reference frames, so its playback time is often inconsistent with its coding order. B frames can therefore be identified from playback time information: if the playback time of the current frame is less than the maximum playback time of the frames already received, the frame is certainly a B frame; otherwise it is an I frame or a P frame.
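The playback-time rule above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name flag_b_frames is hypothetical, playback times are assumed to be plain comparable numbers given in arrival (coding) order, and timestamp wraparound is ignored.

```python
def flag_b_frames(playback_times):
    """Return one boolean per frame, in arrival order: True means B frame.

    Assumption: playback_times are comparable numbers (e.g. PTS values)
    with no wraparound; the function name is illustrative only.
    """
    flags = []
    max_seen = None  # largest playback time among the frames received so far
    for t in playback_times:
        # A frame that plays earlier than some already received frame is a B frame.
        flags.append(max_seen is not None and t < max_seen)
        if max_seen is None or t > max_seen:
            max_seen = t
    return flags


if __name__ == "__main__":
    # Playback sequence numbers in coding order, as in the 7-B-frame example.
    print(flag_b_frames([0, 8, 4, 2, 1, 3, 6, 5, 7]))
```

Frames flagged False are I or P frames and still need to be separated by other means, such as data volume.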
For hierarchically coded B frames, playback time can also be used to determine the highest level and the level of each B frame. Take the case of 7 consecutive B frames as an example. Fig. 2a shows the coding structure of the hierarchical B frames in this case: the subscript of each letter in the first row is the level of the frame, and the number in the second row is the playback sequence number of the frame. The actual coding order is (playback sequence numbers in parentheses) I0/P0 (0), I0/P0 (8), B1 (4), B2 (2), B3 (1), B3 (3), B2 (6), B3 (5), B3 (7). Fig. 2b shows the relationship between coding order and playback order together with the coding levels; Arabic numerals denote playback sequence numbers and Chinese numerals denote coding sequence numbers.
The algorithm for determining the levels from playback time can be divided into two steps:
Step 1: determine the highest level (3 in this example). Set the level of frame 0 to 0, then read the playback times in coding order. If the playback time of the current frame is less than that of the previous frame, the level of the current frame is the level of the previous frame plus 1; otherwise it is the same as that of the previous frame. Continue until the frame whose playback time immediately follows that of frame 0, i.e. frame 1, has been read; the level of frame 1 is the highest level.
Step 2: determine the levels of the remaining frames from the symmetry of the playback times of adjacent B frames. After step 1, the levels of the frames in the solid boxes in Figure 5(b) are all determined, and the levels of the B frames in the dashed boxes remain to be detected. The method is to search the frames whose levels are already determined for two frames whose mean playback time equals the playback time of the current frame; the level of the current frame is then the larger of the levels of these two frames plus 1. The ellipses in the figure show this symmetry: the mean playback time of the two upper frames in an ellipse equals the playback time of the lower frame, and the level of the lower frame is the maximum of the levels of the two upper frames plus 1.
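The two steps can be sketched in a few lines. This is only an illustration under stated assumptions: the function name b_frame_levels and the parameter frame_step (the playback-time spacing between adjacent frames, assumed known) are hypothetical, and when several pairs of frames are symmetric about the current frame the sketch picks the closest pair, a tie-break the text does not specify.

```python
from itertools import combinations


def b_frame_levels(playback_times, frame_step=1):
    """Return (highest_level, levels) for frames given in coding order.

    Assumptions: playback_times[0] is a reference (I/P) frame, frame_step is
    the known playback spacing of adjacent frames, and names are illustrative.
    """
    first = playback_times[0]
    levels = {first: 0}
    prev_level = 0
    i = 1
    # Step 1: descend until the frame that plays right after frame 0 is read.
    while i < len(playback_times):
        t, prev_t = playback_times[i], playback_times[i - 1]
        prev_level = prev_level + 1 if t < prev_t else prev_level
        levels[t] = prev_level
        i += 1
        if t == first + frame_step:
            break
    highest = prev_level
    # Step 2: a remaining frame sits midway between two frames of known level.
    for t in playback_times[i:]:
        candidates = [
            (abs(a - b), max(levels[a], levels[b]) + 1)
            for a, b in combinations(levels, 2)
            if a + b == 2 * t
        ]
        if candidates:
            levels[t] = min(candidates)[1]  # closest symmetric pair (assumed tie-break)
    return highest, [levels.get(t) for t in playback_times]


if __name__ == "__main__":
    print(b_frame_levels([0, 8, 4, 2, 1, 3, 6, 5, 7]))
```

For the coding order of the example, the sketch reproduces the levels listed in the text (0, 0, 1, 2, 3, 3, 2, 3, 3) with highest level 3.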
Two: using frame data volume to judge the frame type:
Because playback time can only tell whether a frame is a B frame, this embodiment provides a scheme that judges I frames and P frames using only information such as data volume. Where B frames have been identified from playback time, only the remaining frames need to be classified as I frames or P frames. Where B frames cannot be identified from playback time (for example, when the header information is encrypted), all frames must be judged: I frames and P frames are determined first, and the remaining frames are judged to be B frames.
This embodiment judges the frame type from frame data volume using automatically updated parameters. It mainly comprises the following modules (as shown in Figure 6): an I-frame judgment module, a P-frame judgment module, a parameter update module and a type correction module.
A: I-frame judgment:
In general, the I frames in a video fall into two classes: fixed-interval I frames, which are inserted at a fixed interval during compression to support random access (the interval is fixed for a period of time, but may change once the user switches channels); and adaptively inserted I frames, which are inserted at scene changes to improve compression efficiency.
For fixed-interval I frames, the fixed interval can be estimated during identification. When this interval has been exceeded and no I frame has yet been determined, the judgment conditions are actively relaxed or local features are used for the judgment (described in detail below).
For adaptively inserted I frames, consider first a scene change between sequences of similar spatial complexity. If the frame is encoded as an adaptively inserted I frame, its bit rate tends to be larger than that of the preceding P frames, because I frames compress less efficiently. If it is encoded as a P frame, its bit rate is also larger, because prediction deteriorates; such a frame is an important frame and is relatively likely to be judged an I frame (both the P frame and the I frame have large data volumes, so the P frame is easily mistaken for an I frame). At a change to a scene of low spatial complexity, a frame encoded as an I frame may even be smaller than the preceding P frames. Such I frames cannot be identified correctly, but the P frames or B frames that follow also become correspondingly smaller, so type correction can be performed through subsequent updates to improve the recognition rate for subsequent frame types.
An I frame can therefore be judged in the following three steps, each of which compares the data volume of the current frame with a given threshold; the frame is judged to be an I frame as soon as its data volume exceeds the threshold in any step:
Judge obvious I frames according to threshold 1;
Judge I frames at non-fixed intervals according to threshold 2;
Judge I frames that exceed the expected fixed interval according to threshold 3.
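The three-step cascade might be sketched as below. The thresholds are taken as inputs because their formulas are defined in the parameter update part; the function name judge_i_frame and the returned labels are hypothetical, and the interval condition in the third step follows the device description (the distance from the previous I frame must exceed the expected fixed interval).

```python
def judge_i_frame(frame_size, thr1, thr2, thr3,
                  curr_i_interval, expected_iframe_interval):
    """Return a label for the step that identified an I frame, or None.

    Assumption: thr1..thr3 are precomputed thresholds 1 to 3; the labels
    and the function name are illustrative only.
    """
    if frame_size > thr1:
        return "obvious"                 # step 1: obvious I frame
    if frame_size > thr2:
        return "non_fixed_interval"      # step 2: adaptively inserted I frame
    if curr_i_interval > expected_iframe_interval and frame_size > thr3:
        return "overdue_fixed_interval"  # step 3: fixed-interval I frame is overdue
    return None


if __name__ == "__main__":
    print(judge_i_frame(5200, 9000, 7000, 5000, 30, 25))
```

Ordering the checks from the strictest threshold to the most relaxed one keeps the label informative; any non-None result means the frame is treated as an I frame.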
B: P-frame judgment:
If the previous frame is an I frame and the current video stream is closed-loop coded, a B frame cannot immediately follow the I frame; if the current frame is not judged to be an I frame, it is a P frame;
If the previous frame is an I frame and the current video stream is open-loop coded, the current frame is a P frame if its data volume is greater than threshold 4; otherwise it is a B frame;
If the previous frame is a P frame, the current frame is a P frame if its data volume is greater than threshold 5, or is greater than threshold 6 when B frames exist in the current GOP;
If the previous frame is a B frame, which indicates that B frames exist in the current GOP, the current frame is a P frame if its data volume is less than threshold 7, or is less than threshold 8 when a P frame has already been judged in the current GOP.
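A minimal sketch of these four rules is given below, for a frame that has already been judged not to be an I frame. The function name is_p_frame and the dictionary of thresholds keyed 4 to 8 are hypothetical. The comparison directions, including "less than" for thresholds 7 and 8, are taken literally from the text as written rather than independently verified.

```python
def is_p_frame(prev_type, frame_size, closed_loop, thr,
               gop_has_b=False, gop_has_p=False):
    """Return True if a non-I frame is judged to be a P frame.

    Assumptions: thr maps the integers 4..8 to thresholds 4 to 8;
    prev_type is 'I', 'P' or 'B'; names are illustrative only.
    """
    if prev_type == "I":
        if closed_loop:
            return True                   # closed loop: frame after I is P
        return frame_size > thr[4]        # open loop: compare with threshold 4
    if prev_type == "P":
        return frame_size > thr[5] or (gop_has_b and frame_size > thr[6])
    if prev_type == "B":
        # Comparison direction follows the text literally.
        return frame_size < thr[7] or (gop_has_p and frame_size < thr[8])
    return False


if __name__ == "__main__":
    thresholds = {4: 3000, 5: 3000, 6: 3200, 7: 2500, 8: 3000}
    print(is_p_frame("I", 3500, False, thresholds))
```

A frame for which this returns False, and which is not an I frame, is judged to be a B frame.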
C: Parameter update:
Collect statistics on the coding type of the GOP (open loop or closed loop): during identification, for each obvious I frame, record whether the frame that follows it is a B frame or a P frame. If most I frames are followed by a P frame, the encoder is considered to use closed-loop coding; otherwise it is considered to use open-loop coding.
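As a rough sketch of this statistic, the snippet below counts which frame type follows each I frame. The function name estimate_coding_type is hypothetical, "most" is interpreted as a simple majority (an assumption), and for brevity every I frame in the input is counted rather than only the obvious ones.

```python
def estimate_coding_type(frame_types):
    """Guess 'closed' or 'open' loop coding from judged frame types.

    Assumptions: frame_types is a list of 'I'/'P'/'B' in coding order and
    'most' means more than half; returns None if no I frame has a successor.
    """
    followers = [nxt for cur, nxt in zip(frame_types, frame_types[1:])
                 if cur == "I"]
    if not followers:
        return None
    p_count = sum(1 for t in followers if t == "P")
    return "closed" if p_count > len(followers) / 2 else "open"


if __name__ == "__main__":
    print(estimate_coding_type(list("IPBBPBBIPBBPBB")))
```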
Calculate the expected fixed I-frame interval: after an I frame is judged, collect statistics on the probability distribution of the I-frame intervals and obtain the expected fixed interval by weighted averaging.
Update the thresholds used in the above modules in real time according to the newly judged frame types:
A) Threshold 1: calculated according to formula (1) from the average data volume of the preceding 50 frames (av_IBPnbytes) and the data volume of the previous I frame (iframe_size_GOP):
where delta1 is an adjustment factor with a value range of (0, 1); its empirical value obtained from experiments is 0.5.
B) Threshold 2: calculated according to formula (2) from the data volume of the previous I frame (iframe_size_GOP), the data volume of the largest P frame in the current GOP (max_pframes_size_GOP) and the average data volume of the I frames and P frames in the preceding 50 frames (av_IPnbytes):
where delta2 and delta3 are adjustment factors whose empirical values are 1.5 and 0.5 respectively.
C) Threshold 3: calculated according to formula (3) from the average data volume of the frames of the current GOP (av_frame_size_GOP), the data volume of the previous P frame (prew_pframe_nbytes) and the data volume of the I frame of the current GOP (iframe_size_GOP); or calculated according to formula (5) from the P-frame average data volume of the current GOP (av_pframes_size_GOP):
where ip_thresh is calculated from how far the distance from the previous I frame to the current frame (curr_i_interval) exceeds the expected fixed I-frame interval (expected_iframe_interval):
ip_thresh = max(2 - (curr_i_interval - expected_iframe_interval) * 0.1, 1.5)    formula (4)
where sThresh is calculated from curr_i_interval and expected_iframe_interval:
sThresh = max(delta4, sThresh / (delta5 * curr_i_interval / expected_iframe_interval))    formula (6)
where delta4 and delta5 are adjustment factors whose empirical values are 0.2 and 2.0 respectively.
D) Threshold 4: the mean of the P-frame average data volume (av_pframes_size_Last_GOP) and the B-frame average data volume (av_bframes_size_Last_GOP) of the previous GOP, as in formula (7):
Threshold 4 = (av_pframes_size_Last_GOP + av_bframes_size_Last_GOP) / 2    formula (7)
E) Threshold 5: 0.75 multiplied by the P-frame average data volume in the current GOP (av_pframes_size_GOP), as in formula (8):
Threshold 5 = delta6 * av_pframes_size_GOP    formula (8)
where delta6 is an adjustment factor whose empirical value is 0.75.
F) Threshold 6: the mean of the P-frame average data volume (av_pframes_size_GOP) and the B-frame average data volume (max_bframes_size_GOP), as in formula (9);
G) Threshold 7: 1.25 multiplied by the B-frame average data volume in the current GOP (av_bframes_size_GOP), as in formula (10):
Threshold 7 = delta7 * av_bframes_size_GOP    formula (10)
where delta7 is an adjustment factor whose empirical value is 1.25.
H) Threshold 8: the mean of the P-frame average data volume (av_pframes_size_GOP) and the B-frame average data volume (av_bframes_size_GOP), as in formula (11):
Threshold 8 = (av_pframes_size_GOP + av_bframes_size_GOP) / 2    formula (11)
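The quantities in this list that are fully specified in the text can be computed as in the sketch below: formulas (4) and (6), and thresholds 4 to 8 as described in words. Function names are hypothetical. Thresholds 1 to 3 are left out because their formulas are not available here. For threshold 6 the text calls the second operand a B-frame average but names it max_bframes_size_GOP, so the sketch simply passes that value in as a parameter.

```python
def ip_thresh(curr_i_interval, expected_iframe_interval):
    """Formula (4): relaxes from 2 towards 1.5 as the I frame becomes overdue."""
    return max(2 - (curr_i_interval - expected_iframe_interval) * 0.1, 1.5)


def update_s_thresh(s_thresh, curr_i_interval, expected_iframe_interval,
                    delta4=0.2, delta5=2.0):
    """Formula (6): recursive update of sThresh with floor delta4."""
    return max(delta4,
               s_thresh / (delta5 * curr_i_interval / expected_iframe_interval))


def p_frame_thresholds(av_p_last_gop, av_b_last_gop, av_p_gop, av_b_gop,
                       max_b_gop, delta6=0.75, delta7=1.25):
    """Thresholds 4 to 8 as described in the text, keyed by threshold number."""
    return {
        4: (av_p_last_gop + av_b_last_gop) / 2,  # means of the previous GOP
        5: delta6 * av_p_gop,
        6: (av_p_gop + max_b_gop) / 2,           # operand named max_bframes_size_GOP
        7: delta7 * av_b_gop,
        8: (av_p_gop + av_b_gop) / 2,
    }


if __name__ == "__main__":
    print(ip_thresh(27, 25))
    print(update_s_thresh(1.0, 50, 25))
    print(p_frame_thresholds(4000, 2000, 4000, 2000, 2400))
```

Keeping the adjustment factors as keyword arguments with the empirical defaults makes it easy to experiment with other values.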
D: Type correction:
Correcting missed I frames:
After the above steps, an I frame may still not have been identified even though the expected fixed interval has been far exceeded. In this case, although the frame type has already been output, local information can be used to correct the parameters so that subsequent frame type judgments are more accurate. Near the expected fixed interval, take the frame with the largest data volume, change its frame type to I frame, and update parameters such as the average data volume of each frame type in the GOP and the I-frame interval.
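A possible shape for this correction is sketched below. The function name correct_missed_i and the parameter search_radius (how many frames on each side of the expected position count as "near") are hypothetical, since the text gives no value for the search range. Updating the GOP statistics afterwards is left to the caller.

```python
def correct_missed_i(types, sizes, last_i_index, expected_interval,
                     search_radius=2):
    """Relabel the largest frame near the expected I-frame position as 'I'.

    Assumptions: types/sizes are parallel lists of internal (not yet final)
    judgments; search_radius is an illustrative choice. Returns the index
    that was relabelled, or None if nothing was changed.
    """
    if "I" in types[last_i_index + 1:]:
        return None  # an I frame was found after all; nothing to correct
    center = last_i_index + expected_interval
    lo = max(last_i_index + 1, center - search_radius)
    hi = min(len(sizes) - 1, center + search_radius)
    if lo > hi:
        return None  # expected position has not been received yet
    best = max(range(lo, hi + 1), key=lambda i: sizes[i])
    types[best] = "I"  # internal correction only; output already sent is unchanged
    return best


if __name__ == "__main__":
    t = ["I", "P", "B", "B", "P", "B", "B", "P", "B", "B"]
    s = [9000, 3000, 1000, 1000, 5000, 1000, 1000, 3000, 1000, 1000]
    print(correct_missed_i(t, s, last_i_index=0, expected_interval=4, search_radius=1), t)
```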
Correcting misjudged B frames:
When using B frames to improve coding efficiency, video encoders in practical applications generally also take decoding delay and decoder storage overhead into account. They do not encode more than 7 consecutive B frames and, in more extreme cases, no more than 3 consecutive B frames. From the statistics of the frame types judged so far, a predicted maximum number of consecutive B frames in the stream is obtained. When a frame is determined to be a B frame, it must be ensured that the current run of consecutive B frames does not exceed this predicted value. If it does, the frames currently judged to be consecutive B frames may include a misjudgment; the frame with the largest data volume among them is rejudged as a P frame, and information such as the average data volume of each frame type in the GOP is updated.
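The run-length check could look like the following sketch. The function name limit_b_run is hypothetical and max_b_run stands for the predicted maximum number of consecutive B frames (between 3 and 7 according to the text); recomputing the GOP averages afterwards is not shown.

```python
def limit_b_run(types, sizes, max_b_run=3):
    """Return a copy of types in which no run of 'B' exceeds max_b_run.

    Assumptions: types/sizes are parallel lists; whenever a run grows too
    long, the largest frame in that run is rejudged as 'P'.
    """
    types = list(types)
    run_start = None
    for i, t in enumerate(types):
        if t != "B":
            run_start = None
            continue
        if run_start is None:
            run_start = i
        if i - run_start + 1 > max_b_run:
            biggest = max(range(run_start, i + 1), key=lambda k: sizes[k])
            types[biggest] = "P"
            # The frames after the relabelled one (if any) start a new run.
            run_start = biggest + 1 if biggest < i else None
    return types


if __name__ == "__main__":
    print(limit_b_run(["P", "B", "B", "B", "B", "P"],
                      [3000, 900, 1000, 2500, 950, 3000]))
```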
Three: frame type detection when frame boundaries and frame data volume cannot be determined:
The first two examples both require that the frame boundaries and frame data volume can be obtained. When there is no packet loss, the frame boundaries and the data volume of each frame can be obtained accurately from the RTP sequence number, timestamp and marker bit (ISMA mode), or from the RTP sequence number and the CC, PUSI and PID fields of the TS (TS over RTP mode). When packets are lost, however, if a packet at a frame boundary is lost, the exact position of the frame boundary cannot be determined; the number of packets in a frame may be misjudged, or the data volumes of two frames may even be spliced into one frame, which severely interferes with frame type detection. Therefore, if packet loss occurs, packet loss processing must be performed before frame type judgment to obtain information such as frame boundaries, frame data volume and frame type.
Because a change in the RTP timestamp in ISMA mode marks the arrival of a new frame, packet loss is relatively simple to handle in this mode:
1) If the timestamp is unchanged across the loss, the lost packets lie inside one frame; the lost data only needs to be included when counting the data volume of that frame;
2) If the timestamp changes across the loss, the loss occurred at a frame boundary. If the marker bit of the packet immediately before the loss is 1, the lost data is treated as belonging to the following frame and is added to the data volume of the following frame; otherwise, the lost data volume is split evenly between the preceding and following frames (assuming here that one burst loss does not exceed the length of one frame).
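The two ISMA cases can be written as a small helper. This is a sketch with hypothetical names: attribute_lost_bytes returns how many of the lost bytes are credited to the frame before the gap and to the frame after it, and lost_bytes is assumed to have been estimated by the caller (for example from the gap in RTP sequence numbers).

```python
def attribute_lost_bytes(ts_before, ts_after, marker_before, lost_bytes):
    """Return (bytes for the frame before the gap, bytes for the frame after).

    Assumptions: ts_before/ts_after are the RTP timestamps of the packets on
    either side of the gap, marker_before is the marker bit of the packet
    just before the gap, and one burst loss is shorter than one frame.
    """
    if ts_before == ts_after:
        # Same frame on both sides: the loss is inside that frame.
        return (lost_bytes, 0)
    if marker_before:
        # The previous frame ended cleanly, so the loss belongs to the next frame.
        return (0, lost_bytes)
    # Boundary lost: split evenly between the two frames.
    half = lost_bytes / 2
    return (half, half)


if __name__ == "__main__":
    print(attribute_lost_bytes(9000, 12600, False, 2800))
```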
The TS over RTP case is more complex. Because the start of a frame can be identified only by the presence of a PES header (i.e. PUSI is 1), when packets are lost it is difficult to tell whether the data between two packets carrying PES headers belongs to one frame or to several. As shown in Fig. 3, three packet losses occurred in the data between two packets carrying PES headers, but because it is unknown whether the lost packets also contained a PES header (marking the start of a frame), it cannot be determined whether this data belongs to a single frame. This embodiment provides solutions from two aspects.
If the PES header can be decoded, the PTS in it can be used to judge whether the current data length (i.e. the data between two packets carrying PES headers) contains frame header information:
1) Collect statistics on the PTS order of correctly detected GOPs and, weighting by the distribution probability and by the distance from the current frame, derive the expected coding structure;
2) Taking the PTS values of the series of frames received in order from the I frame up to the current PTS and the next PTS, match them against the expected coding structure;
a) If they match the expected coding structure, the lost packets in this data length are considered not to contain frame header information; the current data length is one frame, the loss occurred inside that frame, and no splitting is needed;
b) If they do not match the expected coding structure, the lost packets probably contain frame header information; the current data length is split according to the expected coding structure and the positions at which the losses occurred (continuous length, packet loss length, etc.), and reasonable frame types, frame sizes and PTS values are assigned.
3) If a frame judged earlier is later found to have lost its frame header, the earlier judgment result is updated in the correction step.
In addition, the packet loss length, continuous length, maximum continuous length, maximum packet loss length and so on can be used to judge whether the current data length is one frame and which frame type it belongs to:
1) If this data length is close to the length of the previous I frame, it is considered to belong to one I frame; if this data length is about as large as a P frame and the maximum continuous length is larger than the average B-frame data volume within the last 50 frames, the whole data length is considered to belong to one P frame; otherwise go to 2);
2) If this data length is about as large as two P frames, it is split into two P frames: the data length is divided into two segments so that the length of each is as close as possible to a P frame, with the second segment starting at a lost packet; otherwise go to 3);
3) If this data length is close to a P frame plus a B frame, it is split into a P frame and a B frame: the packets of the maximum continuous length are assigned to the P frame, and on that basis the data length is divided into two segments whose lengths are close to a P frame and a B frame respectively, with the second segment starting at a lost packet; otherwise go to 4);
4) If the maximum continuous length is less than a B frame and this data length is close to three B frames, it is split into three B frames: the data length is divided into three segments, each close in length to a B frame, with the second and third segments each starting at a lost packet; otherwise go to 5);
5) If the maximum continuous length is less than a B frame and this data length is close to two B frames, it is split into two B frames: the data length is divided into two segments, each close in length to a B frame, with the second segment starting at a lost packet; otherwise go to 6);
6) Otherwise, the whole data length is considered to belong to one frame.
Combining the above examples, this embodiment provides an optional frame type detection scheme, whose flow is shown in Fig. 4. It is divided into the following stages: preliminary frame type judgment using the PTS, packet loss processing, further frame type judgment using data volume, and type correction.
401: After data is input, determine whether the packet header can be decoded. If so, judge the frame type according to playback time; otherwise, perform packet loss processing. After frame type judgment finishes, check whether any earlier frame was misjudged; if so, perform frame type correction; otherwise return to 401 for the next round of frame type judgment. The steps are carried out as follows:
Judging the frame type according to playback time: first determine whether the input stream consists of TS over RTP packets; if so, determine whether the PES header of the TS packets is encrypted. For RTP packets, or TS over RTP packets whose PES header can be decoded, whether a frame is a B frame can be preliminarily determined from the playback time information; see main point one for details;
Packet loss processing: detect whether packets have been lost. If not, count the data volume directly and proceed to the frame type judgment step below. If so, perform packet loss processing separately for RTP or TS over RTP packets to estimate the frame boundaries, frame data volume or some frame types; see main point three for details;
Judging the frame type according to data volume: this process judges the frame type in real time and adjusts the related parameters dynamically and automatically; see main point two for details;
Type correction: if an earlier judgment result is found to be wrong during the judgment process, it is corrected. This does not affect the results already output, but is used to update the related parameters so as to improve the accuracy of subsequent judgments; see main point two for details.
The embodiment of the present invention also provides a frame type detection device, as shown in Fig. 5, comprising:
A frame type determining unit 502, configured to determine that the current frame is a bidirectionally predicted (B) frame if the playback time of the current frame is less than the maximum playback time of the frames already received;
Further, the device of Fig. 5 may also comprise:
The embodiment of the present invention also provides another frame type detection device, as shown in Fig. 6, comprising:
A type obtaining unit 601, configured to obtain the coding type of the stream containing the received frames, the coding type comprising open-loop coding and closed-loop coding;
A frame type determining unit 602, configured to determine that the current frame is an obvious I frame if the data volume of the current frame is greater than a first threshold, the first threshold being calculated from the average data volume of a set number of consecutive frames and the I-frame data volume;
to determine that the current frame is a P frame if the frame preceding the current frame is an I frame, the coding type is closed-loop coding and the current frame is not an obvious I frame, or if the frame preceding the current frame is an I frame, the coding type is open-loop coding and the data volume of the current frame is greater than a fourth threshold, the fourth threshold being the mean of the P-frame average data volume and the B-frame average data volume of a group of pictures;
and to determine that the current frame is a B frame if the current frame is neither an I frame nor a P frame.
Further, the frame type determining unit 602 is also configured to determine that the current frame is an I frame if the data volume of the current frame is greater than a second threshold; the second threshold is the maximum of the data volume of the I frame preceding the current frame, the average data volume of the P frames in the group of pictures containing the current frame, and the average data volume of a set number of consecutive frames.
Further, the frame type determining unit 602 is also configured to determine that the current frame is an I frame if the interval between the current frame and the previous I frame exceeds the fixed interval and the data volume of the current frame is greater than a third threshold. The third threshold is calculated from the average data volume of the frames in the group of pictures containing the current frame, the data volume of the P frame preceding the current frame, the data volume of the I frame of the group of pictures containing the current frame, and the degree to which the distance from the previous I frame to the current frame exceeds the expected fixed I-frame interval; or the third threshold is calculated from the average data volume of the frames in the group of pictures containing the current frame and the degree to which the distance from the previous I frame to the current frame exceeds the expected fixed I-frame interval.
Further, the frame type determining unit 602 is also configured to determine that the current frame is a P frame if the frame preceding the current frame is a P frame and the data volume of the current frame is greater than a fifth threshold, or if the current group of pictures contains a B frame and the data volume of the current frame is greater than a sixth threshold. The fifth threshold is the product of a first adjustment factor and the average data volume of the P frames of the group of pictures containing the current frame, the first adjustment factor being greater than 0.5 and less than 1. The sixth threshold is the mean of the P-frame average data volume and the B-frame average data volume;
and to determine that the current frame is a P frame if the frame preceding the current frame is a B frame and the data volume of the current frame is less than a seventh threshold, or if the current group of pictures contains a P frame and the data volume of the current frame is less than an eighth threshold. The seventh threshold is the product of a second adjustment factor and the average data volume of the B frames of the group of pictures containing the current frame, the second adjustment factor being greater than 1 and less than 1.5. The eighth threshold is the mean of the P-frame average data volume and the B-frame average data volume.
Further, as shown in Fig. 7, the device also comprises:
The frame type determining unit 602 is further configured to, if no I frame has been identified after the fixed interval has been reached, determine the frame with the largest data volume within a set range around the fixed interval to be an I frame;
A first updating unit 702, configured to update the average data volume of each frame type in the group of pictures and the I-frame interval parameter.
Further, as shown in Fig. 8, the device also comprises:
The frame type determining unit 602 is further configured to, if the number of consecutive B frames is greater than a predicted value, determine the frame with the largest data volume among the consecutive B frames to be a P frame; the predicted value is greater than or equal to 3 and less than or equal to 7;
A second updating unit 802, configured to update the average data volume of each frame type in the group of pictures.
Further, as shown in Fig. 9, the device also comprises:
A packet loss type determining unit 901, configured to determine whether packet loss has occurred in the received frames and, if so, to determine the packet loss type;
A data volume determining unit 902, configured to, if the packet loss type is intra-frame packet loss, determine when calculating the frame data volume that the sum of the data volume of the received part of the frame and the data volume of the lost packets is the data volume of the frame;
and, if the packet loss type is inter-frame packet loss, to determine whether the marker bit of the packet preceding the loss is 1; if so, the data volume of the lost packets is counted into the following frame; otherwise the data volume of the lost packets is split evenly between the preceding and following frames.
It should be noted that the device of this embodiment may be used in combination with the device of Fig. 4 or Fig. 5, and that frame type determining unit 502 and frame type determining unit 602 may be implemented by the same functional unit.
The embodiment of the present invention makes full use of the header information of RTP or TS over RTP, combined with the coding order of frames of different types in the video and the relative data volumes of successive frames of different types, to judge the frame type quickly and in real time without decoding the video payload, and improves the accuracy of frame type detection through packet loss processing, automatic parameter updating and later frame type correction.
The following are several applications of frame type judgment. It should be understood that these examples of applications after the frame type has been determined are not exhaustive and do not limit the embodiment of the present invention.
1. Unequal loss protection according to the judged frame type: when bandwidth is limited, unequal loss protection can be applied according to the different impact that different frame types have on video quality, so that the received video quality is optimal.
2. Fast video browsing using the expected period together with the average bit rate of the GOP: for a stream stored locally, when the user does not want to browse the whole video, fast preprocessing can extract the positions of the I frames to enable fast browsing. For a stream stored on a server, when the user does not want to browse the whole video, the server can use fast preprocessing to extract the positions of the I frames and selectively transmit key frame information to the user.
3. Quality of Service (QoS): when bandwidth is insufficient, an intermediate node can, according to the judged frame types, intelligently discard some B frames or P frames (P frames near the end of a GOP), so that the bit rate is reduced with as little impact on video quality as possible.
In addition, the effect of the technical scheme of the embodiment of the present invention was tested experimentally; the test results follow.
The experiments in this section compare, in the absence of packet loss, the cases of using and not using playback time with scheme two of the background art; the results are shown in Table 1.
Table 1 Test sequences
Test sequences: the tests use TS streams captured from a live network and streams coded at constant bit rate, as listed in Table 1. Among the streams captured from the live network, the first three (iptv137, iptv138, iptv139) have an encrypted payload but an unencrypted PES header. The bit rates of the constant-bit-rate streams are (1500, 3000, 4000, 5000, 6000, 7000, 9000, 12000, 15000). All selected streams are H.264 coded; their frame types are I, P and B, without hierarchical B frames. The frame type detection results for these sequences are given below, as shown in Table 2.
Table 2 Comparison of detection results between the present method and the existing method
As shown in Table 2, the experiment compares the following quantities. The I-frame miss rate is the ratio of undetected I frames to the total number of I frames in the sequence. The I-frame false detection rate is the ratio of the number of P or B frames misjudged as I frames to the total number of I frames (note that in most cases only P frames are misjudged as I frames, and only rarely are B frames misjudged as I frames, which is consistent with the fact that the bit rate of B frames is far lower than that of I frames). The P->I error rate is the ratio of the number of P frames misjudged as I frames to the actual total number of P frames. The P->B error rate is the ratio of the number of P frames misjudged as B frames to the actual total number of P frames. The B->P error rate is the ratio of the number of B frames misjudged as P frames to the actual total number of B frames. The total error rate is the ratio of the number of misjudged frames to the total number of frames (a judgment is counted as correct whenever the judged frame type matches the actual type). The mean of the I-frame miss rate and the I-frame false detection rate reflects the probability of correctly detecting I frames.
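The definitions above can be checked against a confusion matrix. The following is a minimal sketch, not part of the patent: the function and key names are illustrative, and the counts are those reported for iptv137 in Table 3, on the assumption that the unlabelled middle row of that table is the actual-B row.

```python
TYPES = ("P", "B", "I")


def detection_metrics(confusion):
    """Compute the error rates defined above.

    confusion[actual][detected] is the number of frames of type `actual`
    that were judged to be of type `detected`.
    """
    total = sum(sum(row.values()) for row in confusion.values())
    actual = {t: sum(confusion[t].values()) for t in TYPES}
    correct = sum(confusion[t][t] for t in TYPES)
    return {
        # I frames that were not detected, relative to all I frames
        "i_miss_rate": (actual["I"] - confusion["I"]["I"]) / actual["I"],
        # P or B frames judged as I, relative to all I frames
        "i_false_rate": (confusion["P"]["I"] + confusion["B"]["I"]) / actual["I"],
        "p_to_i": confusion["P"]["I"] / actual["P"],
        "p_to_b": confusion["P"]["B"] / actual["P"],
        "b_to_p": confusion["B"]["P"] / actual["B"],
        "total_error": (total - correct) / total,
    }


# Counts for iptv137 as listed in Table 3 (rows: actual, columns: detected).
IPTV137 = {
    "P": {"P": 4909, "B": 0, "I": 61},
    "B": {"P": 1, "B": 10215, "I": 0},
    "I": {"P": 36, "B": 0, "I": 639},
}

metrics = detection_metrics(IPTV137)
```

With these counts the matrix sums to 15861 frames and the total error rate rounds to 0.6%, which agrees with the caption of Figure 10.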
Because B frames are determined from the PTS with 100% accuracy, the results with and without playback time are not compared separately. Meanwhile, to fully demonstrate the superiority of embodiment two of the present invention, in the case where playback time is used, the existing method was also extended with the step of using playback time to determine B frames; the performance difference therefore comes mainly from the difference between the methods that judge by frame data amount. The results show that, both when playback time is used to determine frame type and when it is not, the present method outperforms the existing method on the live-network streams and on the self-encoded streams. The improvement is especially pronounced on the self-encoded streams, where the present method is in some cases error-free, whereas the existing method is rarely error-free.
Figures 10 to 15 give the detailed detection results for some of the sequences; the actual curves are marked with circles and the predicted curves with triangles. The results include the I-frame distribution (the horizontal axis is the I-frame interval: an interval of 1 means that two adjacent frames are both I frames, and an interval of 0 means that the I-frame interval is greater than 49; the predicted I-frame period is the I-frame period predicted by the method herein, and the actual I-frame period is the true I-frame period) and the frame type distribution (in the tables, the matrix diagonal holds the number of correctly judged frames and the other positions hold misjudgments). Each figure is titled with the sequence name, the total number of frames, and the total error rate. It can be seen that the live-network sequences generally have a fixed I-frame interval (the maximum in the figures); when scenes change, some I frames are inserted adaptively, which causes perturbation around the maximum and produces the I-frame distributions shown in the figures. For the FIFA sequence (Figure 14), the actual period has two maxima, and the algorithm herein also resolves both maxima fairly accurately. The expected I-frame interval estimated by the algorithm herein is very close to the actual I-frame interval and can therefore be used to guide frame skipping during fast browsing.
Figure 10: iptv137, 15861 frames (error 0.6%). The result is shown in Table 3:
Table 3
iptv137 | Detected as P | Detected as B | Detected as I |
Actual type P | 4909 | 0 | 61 |
Actual type B | 1 | 10215 | 0 |
Actual type I | 36 | 0 | 639 |
Figure 11: iptv138, 17320 frames (error 0.1%). The result is shown in Table 4:
Table 4
iptv138 | Detected as P | Detected as B | Detected as I |
Actual type P | 5676 | 0 | 8 |
Actual type B | 0 | 10903 | 0 |
Actual type I | 10 | 0 | 723 |
Figure 12: song, 38741 frames (error 0.9%). The result is shown in Table 5:
Table 5
song | Detected as P | Detected as B | Detected as I |
Actual type P | 16698 | 0 | 149 |
Actual type B | 0 | 20217 | 0 |
Actual type I | 210 | 0 | 1467 |
Figure 13: FIFA, 9517 frames (error 1.3%). The result is shown in Table 6:
Table 6
FIFA | Detected as P | Detected as B | Detected as I |
Actual type P | 4267 | 0 | 21 |
Actual type B | 0 | 4693 | 0 |
Actual type I | 106 | 0 | 430 |
Figure 14: travel, 1486 frames (error 0.8%). The result is shown in Table 7:
Table 7
travel | Detected as P | Detected as B | Detected as I |
Actual type P | 493 | 0 | 11 |
Actual type B | 0 | 934 | 0 |
Actual type I | 1 | 0 | 47 |
Figure 15: sport, 1156 frames (error 0.3%). The result is shown in Table 8:
Table 8
sport | Detected as P | Detected as B | Detected as I |
Actual type P | 396 | 0 | 4 |
Actual type B | 0 | 719 | 0 |
Actual type I | 0 | 0 | 37 |
A person of ordinary skill in the art will understand that all or some of the steps of the methods in the above embodiments may be carried out by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium, which may be a read-only memory, a magnetic disk, an optical disc, or the like.
The frame type detection method and device provided by the embodiments of the present invention have been described in detail above. Specific examples are used herein to explain the principles and implementations of the present invention, and the description of the above embodiments is intended only to help in understanding the method of the present invention and its core idea. Meanwhile, a person of ordinary skill in the art may, following the idea of the present invention, make changes to the specific implementations and the scope of application. In summary, the content of this description should not be construed as limiting the present invention.
Claims (16)
1. A method for detecting a frame type, characterized by comprising:
Obtaining the coding type of the stream containing the received frames, the coding type comprising open-loop coding and closed-loop coding;
If the data amount of the current frame is greater than a first threshold, determining that the current frame is an obvious intra-coded frame (I frame), the first threshold being calculated from the average data amount of a set number of consecutive frames and the I-frame data amount;
If the previous frame of the current frame is an I frame, the coding type is closed-loop coding, and the current frame is not an obvious I frame, or if the previous frame of the current frame is an I frame, the coding type is open-loop coding, and the data amount of the current frame is greater than a fourth threshold, determining that the current frame is a unidirectionally predicted frame (P frame), the fourth threshold being the mean of the P-frame average data amount and the B-frame average data amount of one group of pictures;
If the current frame is neither an I frame nor a P frame, determining that the current frame is a B frame.
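The decision order in claim 1 can be sketched as follows. This is an illustration only: the function names and the string labels for coding types are hypothetical, and the first threshold is passed in as a ready-made value because this section does not give its formula.

```python
def fourth_threshold(p_avg, b_avg):
    """Mean of the P-frame and B-frame average data amounts of one GOP."""
    return (p_avg + b_avg) / 2


def classify_frame(size, prev_type, coding_type, first_thr, fourth_thr):
    """Apply the three rules of claim 1 in order.

    size: data amount of the current frame.
    prev_type: type already assigned to the previous frame.
    coding_type: "closed" or "open" (labels assumed for this sketch).
    """
    if size > first_thr:
        return "I"  # obvious I frame
    if prev_type == "I":
        if coding_type == "closed":
            return "P"  # frame after an I frame, not an obvious I frame
        if coding_type == "open" and size > fourth_thr:
            return "P"
    return "B"  # neither I nor P


thr4 = fourth_threshold(p_avg=2000, b_avg=1000)
examples = [
    classify_frame(5000, "P", "closed", 4000, thr4),
    classify_frame(1000, "I", "closed", 4000, thr4),
    classify_frame(1000, "I", "open", 4000, thr4),
    classify_frame(2000, "I", "open", 4000, thr4),
]
```

Claim 1 alone only assigns P to frames that follow an I frame; the dependent claims add further I and P rules that this sketch leaves out.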
2. The method according to claim 1, characterized in that obtaining the coding type of the stream containing the received frames comprises:
Collecting statistics on the type of the frame immediately following each obvious I frame; if the proportion of P frames reaches a preset proportion, determining that the coding type is closed-loop coding, and otherwise that it is open-loop coding.
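A minimal sketch of the statistic in claim 2 is shown below. It assumes that provisional type labels for the frames following obvious I frames are already available, and the default `preset_ratio` of 0.5 is a placeholder, since the claim does not state a value.

```python
def detect_coding_type(frame_types, obvious_i_indices, preset_ratio=0.5):
    """Guess closed-loop vs open-loop coding from frames that follow I frames.

    frame_types: provisional frame types in stream order.
    obvious_i_indices: positions of frames already judged obvious I frames.
    preset_ratio: assumed placeholder for the claim's preset proportion.
    """
    followers = [
        frame_types[i + 1]
        for i in obvious_i_indices
        if i + 1 < len(frame_types)
    ]
    if not followers:
        return "open"  # no evidence for closed-loop; fall through to "otherwise"
    p_ratio = followers.count("P") / len(followers)
    return "closed" if p_ratio >= preset_ratio else "open"


closed_like = detect_coding_type(list("IPBBPBBIPBB"), [0, 7])
open_like = detect_coding_type(list("IBBPBBIBBPB"), [0, 6])
```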
3. The method according to claim 1, characterized by further comprising:
If the data amount of the current frame is greater than a second threshold, determining that the current frame is an I frame; the second threshold being the maximum of the data amount of the I frame preceding the current frame, the average data amount of the P frames in the group of pictures containing the current frame, and the average data amount of a set number of consecutive frames.
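Read literally, the second threshold in claim 3 is a maximum over three quantities. The sketch below follows that wording exactly; the description may apply scaling factors to these quantities that are not visible in this section, so the unscaled form here is an assumption, as are the function and parameter names.

```python
def second_threshold(prev_i_size, gop_p_avg, recent_sizes):
    """Maximum of the three quantities named in claim 3.

    prev_i_size: data amount of the I frame preceding the current frame.
    gop_p_avg: average data amount of P frames in the current GOP.
    recent_sizes: data amounts of the set number of consecutive frames.
    """
    recent_avg = sum(recent_sizes) / len(recent_sizes)
    return max(prev_i_size, gop_p_avg, recent_avg)


def is_i_frame_by_second_threshold(size, prev_i_size, gop_p_avg, recent_sizes):
    return size > second_threshold(prev_i_size, gop_p_avg, recent_sizes)


thr2 = second_threshold(9000, 2000, [1000, 3000])
```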
4. The method according to claim 1, characterized by further comprising:
If the interval between the current frame and the previous I frame exceeds the fixed interval, and the data amount of the current frame is greater than a third threshold, determining that the current frame is an I frame; the third threshold being calculated from the average data amount of the frames in the group of pictures containing the current frame, the data amount of the P frame preceding the current frame, the data amount of the I frame of the group of pictures containing the current frame, and the degree to which the distance from the previous I frame to the current frame departs from the expected fixed I-frame interval; or the third threshold being calculated from the average data amount of the frames in the group of pictures containing the current frame and the degree to which the distance from the previous I frame to the current frame departs from the expected fixed I-frame interval.
5. The method according to claim 1, characterized by further comprising:
If the previous frame of the current frame is a P frame and the data amount of the current frame is greater than a fifth threshold, or if the current group of pictures contains B frames and the data amount of the current frame is greater than a sixth threshold, determining that the current frame is a P frame; the fifth threshold being the product of a first adjustment factor and the average data amount of the P frames of the group of pictures containing the current frame, the first adjustment factor being greater than 0.5 and less than 1; the sixth threshold being the mean of the P-frame average data amount and the B-frame average data amount;
If the previous frame of the current frame is a B frame and the data amount of the current frame is less than a seventh threshold, or if the current group of pictures contains P frames and the data amount of the current frame is less than an eighth threshold, determining that the current frame is a P frame; the seventh threshold being the product of a second adjustment factor and the average data amount of the B frames of the group of pictures containing the current frame, the second adjustment factor being greater than 1 and less than 1.5; the eighth threshold being the mean of the P-frame average data amount and the B-frame average data amount.
6. The method according to any one of claims 1 to 5, characterized by further comprising:
After frame type determination ends, determining the fixed interval of I frames; if no I frame has been determined once the fixed interval has been reached, determining the frame with the largest data amount within a set range around the fixed interval as an I frame; and updating the average data amount of each frame type in the group of pictures and the I-frame interval parameter.
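The fallback in claim 6 amounts to a windowed maximum search. The sketch below is one way to write it; the symmetric window of `search_range` frames on either side of the expected position is an assumption, because the claim only speaks of a set range around the fixed interval.

```python
def fallback_i_frame(sizes, last_i_index, fixed_interval, search_range):
    """Pick the largest frame near where the next I frame was expected.

    sizes: data amount of each frame in stream order.
    last_i_index: index of the most recent frame judged to be an I frame.
    Returns the index to relabel as an I frame, or None if the window
    lies outside the available frames.
    """
    expected = last_i_index + fixed_interval
    lo = max(last_i_index + 1, expected - search_range)
    hi = min(len(sizes) - 1, expected + search_range)
    if lo > hi:
        return None
    return max(range(lo, hi + 1), key=lambda i: sizes[i])


frame_sizes = [9000, 100, 200, 150, 800, 300, 120]
picked = fallback_i_frame(frame_sizes, last_i_index=0, fixed_interval=4, search_range=1)
```

After relabelling, the per-type average data amounts and the I-frame interval parameter would be recomputed, as the claim requires; that bookkeeping is omitted here.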
7. The method according to any one of claims 1 to 5, characterized by further comprising:
After frame type determination ends, counting consecutive B frames; if the number of consecutive B frames is greater than a predicted value, determining the frame with the largest data amount among the consecutive B frames as a P frame; and updating the average data amount of each frame type in the group of pictures; the predicted value being greater than or equal to 3 and less than or equal to 7.
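The correction in claim 7 can be illustrated with a single pass over the labelled frames. The function name is hypothetical, and the sketch promotes only one frame per run, so a very long run may still leave sub-runs above the limit; whether the method repeats the check is not stated in this section.

```python
def promote_long_b_runs(types, sizes, predicted_max_run=3):
    """Relabel the largest frame in each over-long run of B frames as P.

    predicted_max_run: the claim's predicted value, between 3 and 7.
    Returns a new list; the input list is left unchanged.
    """
    result = list(types)
    start = None
    # Iterate one step past the end so that a trailing run is also closed.
    for i in range(len(result) + 1):
        is_b = i < len(result) and result[i] == "B"
        if is_b and start is None:
            start = i
        elif not is_b and start is not None:
            if i - start > predicted_max_run:
                biggest = max(range(start, i), key=lambda j: sizes[j])
                result[biggest] = "P"
            start = None
    return result


labels = ["I", "B", "B", "B", "B", "B", "P"]
amounts = [9000, 100, 120, 700, 110, 90, 800]
fixed = promote_long_b_runs(labels, amounts, predicted_max_run=3)
```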
8. The method according to any one of claims 1 to 5, characterized by further comprising:
Determining whether packet loss has occurred in the received frames and, if packet loss has occurred, determining the packet loss type;
If the packet loss type is intra-frame packet loss, then when calculating the frame data amount, taking the sum of the data amount of the received frame and the data amount of the lost packets as the data amount of the frame;
If the packet loss type is inter-frame packet loss, determining whether the flag bit of the packet preceding the lost packets is 1; if so, counting the data amount of the lost packets toward the following frame; otherwise, dividing the data amount of the lost packets equally between the preceding and following frames.
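The two loss cases in claim 8 reduce to simple arithmetic on frame sizes, sketched below. The function names are illustrative, and which header field carries the flag bit is not specified in this section, so it is passed in as a plain integer.

```python
def frame_size_with_intra_loss(received_bytes, lost_bytes):
    """Intra-frame loss: the lost data belongs to the same frame."""
    return received_bytes + lost_bytes


def split_inter_frame_loss(lost_bytes, prev_bytes, next_bytes, prev_packet_flag):
    """Inter-frame loss: decide which neighbouring frame gets the lost data.

    prev_packet_flag: flag bit of the packet just before the loss.
    Returns the adjusted (previous frame, following frame) data amounts.
    """
    if prev_packet_flag == 1:
        # The previous frame was complete, so the loss counts toward the next one.
        return prev_bytes, next_bytes + lost_bytes
    half = lost_bytes / 2
    return prev_bytes + half, next_bytes + half


intra = frame_size_with_intra_loss(1000, 200)
flag_set = split_inter_frame_loss(300, 1000, 500, prev_packet_flag=1)
flag_clear = split_inter_frame_loss(300, 1000, 500, prev_packet_flag=0)
```

In both branches the total data amount is preserved, so GOP-level averages are unaffected by how the loss is attributed.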
9. The method according to claim 8, characterized by further comprising:
Predicting the coding structure from statistics of the detected frame types;
If the packet loss type is inter-frame packet loss and the flag bit of the packet preceding the lost packets cannot be detected, splitting the current data length according to the predicted coding structure and the position of the packet loss.
10. A device for detecting a frame type, characterized by comprising:
A type obtaining unit, configured to obtain the coding type of the stream containing the received frames, the coding type comprising open-loop coding and closed-loop coding;
A frame type determining unit, configured to: determine that the current frame is an obvious I frame if the data amount of the current frame is greater than a first threshold, the first threshold being calculated from the average data amount of a set number of consecutive frames and the I-frame data amount;
Determine that the current frame is a P frame if the previous frame of the current frame is an I frame, the coding type is closed-loop coding, and the current frame is not an obvious I frame, or if the previous frame of the current frame is an I frame, the coding type is open-loop coding, and the data amount of the current frame is greater than a fourth threshold, the fourth threshold being the mean of the P-frame average data amount and the B-frame average data amount of one group of pictures;
Determine that the current frame is a B frame if the current frame is neither an I frame nor a P frame.
11. The device according to claim 10, characterized in that:
The frame type determining unit is further configured to determine that the current frame is an I frame if the data amount of the current frame is greater than a second threshold; the second threshold being the maximum of the data amount of the I frame preceding the current frame, the average data amount of the P frames in the group of pictures containing the current frame, and the average data amount of a set number of consecutive frames.
12. The device according to claim 10, characterized in that:
The frame type determining unit is further configured to determine that the current frame is an I frame if the interval between the current frame and the previous I frame exceeds the fixed interval and the data amount of the current frame is greater than a third threshold; the third threshold being calculated from the average data amount of the frames in the group of pictures containing the current frame, the data amount of the P frame preceding the current frame, the data amount of the I frame of the group of pictures containing the current frame, and the degree to which the distance from the previous I frame to the current frame departs from the expected fixed I-frame interval; or the third threshold being calculated from the average data amount of the frames in the group of pictures containing the current frame and the degree to which the distance from the previous I frame to the current frame departs from the expected fixed I-frame interval.
13. The device according to claim 10, characterized in that:
The frame type determining unit is further configured to determine that the current frame is a P frame if the previous frame of the current frame is a P frame and the data amount of the current frame is greater than a fifth threshold, or if the current group of pictures contains B frames and the data amount of the current frame is greater than a sixth threshold; the fifth threshold being the product of a first adjustment factor and the average data amount of the P frames of the group of pictures containing the current frame, the first adjustment factor being greater than 0.5 and less than 1; the sixth threshold being the mean of the P-frame average data amount and the B-frame average data amount;
And to determine that the current frame is a P frame if the previous frame of the current frame is a B frame and the data amount of the current frame is less than a seventh threshold, or if the current group of pictures contains P frames and the data amount of the current frame is less than an eighth threshold; the seventh threshold being the product of a second adjustment factor and the average data amount of the B frames of the group of pictures containing the current frame, the second adjustment factor being greater than 1 and less than 1.5; the eighth threshold being the mean of the P-frame average data amount and the B-frame average data amount.
14. The device according to any one of claims 10 to 13, characterized by further comprising:
An interval obtaining unit, configured to determine the fixed interval of I frames after frame type determination ends;
The frame type determining unit being further configured to, if no I frame has been determined once the fixed interval has been reached, determine the frame with the largest data amount within a set range around the fixed interval as an I frame;
A first updating unit, configured to update the average data amount of each frame type in the group of pictures and the I-frame interval parameter.
15. The device according to any one of claims 10 to 13, characterized by further comprising:
A counting unit, configured to count consecutive B frames after frame type determination ends;
The frame type determining unit being further configured to, if the number of consecutive B frames is greater than a predicted value, determine the frame with the largest data amount among the consecutive B frames as a P frame; the predicted value being greater than or equal to 3 and less than or equal to 7;
A second updating unit, configured to update the average data amount of each frame type in the group of pictures.
16. The device according to any one of claims 10 to 13, characterized by further comprising:
A packet loss type determining unit, configured to determine whether packet loss has occurred in the received frames and, if packet loss has occurred, to determine the packet loss type;
A data amount determining unit, configured to, if the packet loss type is intra-frame packet loss, take the sum of the data amount of the received frame and the data amount of the lost packets as the data amount of the frame when calculating the frame data amount;
And, if the packet loss type is inter-frame packet loss, to determine whether the flag bit of the packet preceding the lost packets is 1; if so, to count the data amount of the lost packets toward the following frame; otherwise, to divide the data amount of the lost packets equally between the preceding and following frames.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310664666.2A CN103716640B (en) | 2010-12-17 | 2010-12-17 | Method and device for detecting frame type |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201010594322.5A CN102547300B (en) | 2010-12-17 | 2010-12-17 | Method for detecting frame types and device |
CN201310664666.2A CN103716640B (en) | 2010-12-17 | 2010-12-17 | Method and device for detecting frame type |
Related Parent Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201010594322.5A Division CN102547300B (en) | 2010-12-17 | 2010-12-17 | Method for detecting frame types and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103716640A true CN103716640A (en) | 2014-04-09 |
CN103716640B CN103716640B (en) | 2017-02-01 |
Family
ID=50409145
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201310664666.2A Active CN103716640B (en) | 2010-12-17 | 2010-12-17 | Method and device for detecting frame type |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103716640B (en) |
Also Published As
Publication number | Publication date |
---|---|
CN103716640B (en) | 2017-02-01 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant |