CN1529514A - Layering coding and decoding method for video signal - Google Patents

Publication number
CN1529514A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CNA031512488A
Other languages
Chinese (zh)
Inventor
赵海武
陈勇
宋利
诸维佳
王国中
李国平
徐建峰
何芸
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Central Academy of SVA Group Co Ltd
Original Assignee
Central Academy of SVA Group Co Ltd

Landscapes

  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The method includes the following steps: (1) dividing the video signal into a main video signal and a supplemental video signal; (2) encoding the main video signal; (3) on the basis of the encoded main video signal, editing the supplemental video signal, including inserting, deleting and modifying additional video objects; (4) transferring the coded video signal over a transmission channel, or storing it on a storage device; (5) obtaining the coded video signal at the decoder; (6) decoding the main video signal; (7) decoding the supplemental video signal; (8) superimposing the supplemental video signal on the main video signal; (9) displaying the output. With this method, inserting, deleting and modifying additional video information in an encoded video signal becomes very simple. The supplemental video signal can be edited without the editing equipment decoding the main video signal, so equipment that edits the coded video signal does not need to be able to encode or decode the main video signal.

Description

Hierarchical (layered) coding and decoding method for video signals
Technical field
The present invention relates to a video coding and decoding method in signal processing, and in particular to a processing method for inserting subtitles, TV station logos, copy and other additional video information into an encoded digital video signal.
Background art
Traditional video coding standards, such as H.261, H.263, H.263+ and H.264 formulated by the ITU, and MPEG-1 and MPEG-2 formulated by the MPEG group of the ISO, all encode the video signal as the progressive or interlaced signal captured by a camera. They do not consider that video information usually needs a great deal of editing and modification before it is presented to the audience. Editing video information coded with these standards is therefore inconvenient and usually requires the following steps: a) decompress the encoded video signal; b) edit the restored video signal; c) recompress the edited video signal. This has the following shortcomings: 1) the editing equipment must be able to encode and decode the compressed video signal, which makes it more expensive; 2) after editing, the original video signal is damaged and cannot be restored to its state before editing; 3) when an editing error is found, the correction must be redone on the original video signal, so editing is inefficient.
The MPEG-4 standard takes the structure of the scene into account: it decomposes the video signal into a background and several objects and encodes them separately, in order to improve both compression efficiency and the editability of the encoded video information. However, the original intent of MPEG-4 is to extract the background and the objects from the video signal by analysis and then encode them separately, which requires very complex scene analysis algorithms. Although the scene analysis algorithm is not part of the MPEG-4 standard, current scene analysis algorithms fall far short of practical requirements in real-time performance and accuracy, so the object-oriented video compression method advocated by MPEG-4 cannot be used in practice. In addition, MPEG-4 gives no particular attention or algorithmic support to the needs of video programme post-production and broadcasting, so the software and equipment for editing MPEG-4 compressed video signals are all very complex.
Summary of the invention
The technical problem to be solved by the present invention is to provide a video signal coding and decoding method in which the video signal is divided into a main video layer and several additional video layers for encoding, so that inserting and deleting supplemental video signals does not affect the content of the main video signal and does not require decoding the main video signal.
To solve the above technical problem, the present invention adopts the following technical scheme:
A hierarchical coding method for a video signal is provided, comprising the following steps:
A. Divide the video signal into a main video signal and a supplemental video signal.
In the present invention, the video signal refers to the information finally presented on the display screen, that is, all the video information that the viewer actually sees.
The encoded video signal refers to the video signal after encoding.
The main video signal refers to the main content of the video signal, for example film and television pictures shot with a camera, or pictures synthesized by other methods.
The supplemental video signal includes TV station logos shown on the screen, subtitles in film and television programmes (dialogue, lyrics, explanations, etc.), commentary in various programmes, credits of various programmes, time information, copy, banner advertisements, station and column idents that the TV station broadcasts at fixed times, and other video signals inserted into film and television programmes.
The division between main video signal and supplemental video signal is not absolute and can be specified according to actual needs.
An additional video object refers to one concrete supplemental video signal, such as a TV station logo, one line of dialogue subtitles, or a banner advertisement.
A main video picture refers to one frame or one field of the main video information.
The present invention regards the picture displayed on the screen as the result of superimposing the main video picture and several additional video objects in a certain order.
B. Encode the main video signal.
The main video signal can be encoded with any existing or future video coding method, such as MPEG-1, MPEG-2, MPEG-4, H.261, H.263, H.263+ or H.264. However, a flag bit must be provided in the data header of each coded picture (frame or field) to indicate whether that picture has a supplemental video signal to display. When there is no supplemental video signal, this flag is set to 0.
C. Encode the supplemental video signal, comprising the following steps:
C1. Classify the supplemental video signal.
According to the characteristics of the various supplemental video signals, the present invention defines 9 types of additional video object, see Table 1. New additional video object types can be defined later as needed, up to a maximum of 255 types.
Table 1: Names and numbers of the additional video object types
Type | Name | Description
0 | Forbidden | Forbidden.
1 | Logo | Station logos, venue marks, static advertisements and other static marks. Stationary; size consistent with the display window; irregular shape; multiple colors; constant transparency.
2 | Ordinary subtitles | Dialogue, ordinary lyrics, static commentary, static credits, static copy, etc. Stationary; size consistent with the display window; irregular shape; one color; constant transparency.
3 | Rolling subtitles | Rolling news text, rolling credits, rolling copy, etc. Can translate across the screen; size not consistent with the display window; irregular shape; one color; constant transparency.
4 | Fade-in/fade-out subtitles | Size consistent with the display window; irregular shape; one color; variable transparency; gradually appears or disappears in some manner.
5 | Move-in/move-out subtitles | Size consistent with the display window; irregular shape; one color; constant transparency; the display window can move on the screen.
6 | Fade-in/fade-out picture | Size consistent with the display window; irregular shape; multiple colors; variable transparency; gradually appears or disappears in some manner.
7 | Move-in/move-out picture | Size consistent with the display window; irregular shape; multiple colors; constant transparency; the display window can move on the screen.
8 | Karaoke subtitles | Size consistent with the display window; irregular shape; two colors; constant transparency.
9 | Video sequence | Size consistent with the display window; regular shape; multiple colors; constant transparency; the display window does not move; coded in the same way as the main video.
10~255 | Reserved |
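The type codes of Table 1 map naturally onto an enumeration. The following is a minimal sketch only; the Python names `AdditionalObjectType` and `describe_type` are illustrative assumptions and do not come from the patent text.

```python
from enum import IntEnum


class AdditionalObjectType(IntEnum):
    """Type codes from Table 1 (member names are illustrative assumptions)."""
    FORBIDDEN = 0
    LOGO = 1
    ORDINARY_SUBTITLE = 2
    ROLLING_SUBTITLE = 3
    FADE_SUBTITLE = 4
    MOVE_SUBTITLE = 5
    FADE_PICTURE = 6
    MOVE_PICTURE = 7
    KARAOKE_SUBTITLE = 8
    VIDEO_SEQUENCE = 9


def describe_type(code: int) -> str:
    """Return a label for an 8-bit type code; codes 10..255 are reserved."""
    if not 0 <= code <= 255:
        raise ValueError("type is an 8-bit field")
    if code > 9:
        return "reserved"
    return AdditionalObjectType(code).name.lower()
```

A decoder built along these lines would presumably skip or reject objects whose type is reported as reserved or forbidden; the text does not say which.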
C2. Describe the additional video objects.
According to their characteristics, the various additional video objects are described with different parameters, for example number, layer, transparency, shape, size, position, color and language. Tables 2 to 5 specify in detail how each additional video object is described. When an additional video object appears in the encoded data stream for the first time, or when its data needs to be refreshed, it is described with the structures of Tables 2 to 5. Table 6 defines how to describe the difference of an additional video object between the current picture and the previous picture. When an additional video object is not appearing for the first time in the encoded data stream and its data does not need to be refreshed, it is described with the structure of Table 6.
Table 2: Additional video object data structure
VideoObjectData(){ Descriptor
  level  u(8)
  target_object_id  u(8)
  superpose_order  u(8)
  type  u(8)
  display_window_x0  u(16)
  display_window_y0  u(16)
  display_window_x1  u(16)
  display_window_y1  u(16)
  if(type==1){
    transparency_quotiety  u(8)
    ShapeMatrix()
    color_number  u(8)
    ColorTable(color_number)
    if(color_number>1)
      ColorIdxMatrix()
  }
  if(type==2){
    transparency_quotiety  u(8)
    language_code  u(8)
    ShapeMatrix()
    ColorTable(1)
  }
  if(type==3){
    transparency_quotiety  u(8)
    language_code  u(8)
    ShapeMatrix()
    ColorTable(1)
    Width  u(16)
    Height  u(16)
    offset_x  s(16)
    offset_y  s(16)
    move_direct  u(2)
    move_pixels_per_pic  u(16)
  }
  if(type==4){
    language_code  u(8)
    ShapeMatrix()
    ColorTable(1)
    fade_in_mode  u(8)
    still_pic_num  u(8)
    fade_out_mode  u(8)
    start_at  u(8)
  }
  if(type==5){
    transparency_quotiety  u(8)
    language_code  u(8)
    ShapeMatrix()
    ColorTable(1)
    start_x_offset  s(16)
    start_y_offset  s(16)
  }
  if(type==6){
    ShapeMatrix()
    color_number  u(8)
    ColorTable(color_number)
    ColorIdxMatrix()
    fade_in_mode  u(8)
    still_pic_num  u(8)
    fade_out_mode  u(8)
    start_at  u(8)
  }
  if(type==7){
    transparency_quotiety  u(8)
    ShapeMatrix()
    color_number  u(8)
    ColorTable(color_number)
    ColorIdxMatrix()
    start_x_offset  s(16)
    start_y_offset  s(16)
  }
  if(type==8){
    transparency_quotiety  u(8)
    language_code  u(8)
    ShapeMatrix()
    ColorTable(2)
    color_change_position  u(16)
  }
  if(type==9){
    transparency_quotiety  u(8)
    CodedFrameData()
  }
}
Table 3: Shape matrix structure
ShapeMatrix(){ Descriptor
  for(i=0;i<Height;i++){
   for(j=0;j<Width;j++){
     marker_bit  u(1)
   }
  }
}
Table 4 color table structure
ColorTable(Num){ Descriptor
  for(i=0;i<Num;i++){
    value_y  u(8)
    value_u  u(8)
    value_v  u(8)
  }
}
Table 5 color index matrix structure
ColorIdxMatrix(){ Descriptor
  for(i=0;i<Height;i++){
   for(j=0;j<Width;j++){
    if(ShapeMatrix[i][j]==1)
      color_idx  u(v)
    }
  }
}
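Tables 3 and 5 can be read as two nested loops over the object's pixels. The sketch below decodes both matrices from sequences of bits. It assumes that the u(v) width of `color_idx` is the smallest number of bits able to index `color_number` colors (consistent with the statement in the text that 4 colors need 2 bits) and that bits arrive most significant first; the function names are hypothetical.

```python
def index_bit_width(color_number: int) -> int:
    """Bits per color_idx; an assumed rule matching '4 colors -> 2 bits'."""
    return (color_number - 1).bit_length()


def decode_shape_matrix(bits, width: int, height: int):
    """Read height*width marker_bit values in row-major order (Table 3)."""
    it = iter(bits)
    return [[next(it) for _ in range(width)] for _ in range(height)]


def decode_color_idx_matrix(bits, shape, color_number: int):
    """Read one color_idx for every pixel whose marker_bit is 1 (Table 5)."""
    it = iter(bits)
    n = index_bit_width(color_number)
    result = []
    for row in shape:
        out_row = []
        for marker in row:
            if marker == 1:
                value = 0
                for _ in range(n):
                    value = (value << 1) | next(it)
                out_row.append(value)
            else:
                out_row.append(None)  # pixel is outside the object
        result.append(out_row)
    return result
```

Pixels outside the shape carry no index, which is why the color index matrix is shorter than `Width * Height * v` bits for irregular objects.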
Table 6: Additional video object difference structure
VideoObjectDifference(){ Descriptor
  if(type==1||type==2||type==3||type==4||type==6){
  }
  if(type==5||type==7){
      current_x_offset s(16)
      current_y_offset s(16)
  }
  if(type==8){
      current_color_change_position u(16)
  }
  if(type==9){
      CodedFrameData()
  }
}
Here u(1), u(2), u(8), u(16) and so on indicate that the corresponding parameter is an unsigned integer represented with 1, 2, 8 or 16 binary digits. s(16) indicates that the corresponding parameter is a signed integer represented with 16 binary digits, the first of which is the sign bit. u(v) indicates that the corresponding parameter is an unsigned integer whose number of bits is determined by other parameters. For example, when an additional video object uses 4 colors, only 2 binary digits are needed to represent a color index.
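The descriptors above can be illustrated with a small bit reader. This is a sketch under assumptions: bits are consumed most significant first within each byte, and s(16) is interpreted as two's complement, which is only one of the encodings compatible with "the first bit is the sign bit". The class name `BitReader` is hypothetical.

```python
class BitReader:
    """Reads u(n) and s(16) fields from a byte string, MSB first (assumed)."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # current position in bits

    def u(self, n: int) -> int:
        """Read an n-bit unsigned integer."""
        value = 0
        for _ in range(n):
            if self.pos >= 8 * len(self.data):
                raise EOFError("read past end of data")
            byte = self.data[self.pos >> 3]
            bit = (byte >> (7 - (self.pos & 7))) & 1
            value = (value << 1) | bit
            self.pos += 1
        return value

    def s16(self) -> int:
        """Read a 16-bit signed integer (two's complement is an assumption)."""
        raw = self.u(16)
        return raw - 0x10000 if raw & 0x8000 else raw
```

If the intended encoding were sign-magnitude instead, only `s16` would change; the unsigned reader is unaffected.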
The parameters in Tables 2 to 6 have the following meanings:
level indicates the layer on which the additional video object lies. The present invention specifies that the main video signal is layer 0 and that a video signal contains at most 256 layers, so for an additional video object this parameter takes the values 1~255.
target_object_id indicates which object the additional video object is to be superimposed on, with values 0~255. An additional video object cannot be superimposed on an additional video object whose layer number is greater than or equal to its own. The value 0 means that this additional video object is superimposed directly on the main video picture.
superpose_order indicates the order in which additional video objects are superimposed on one another, with values 0~255. An additional video object with a larger value of this parameter is superimposed on its target object first; the result is then superimposed, as a single object, on other video objects.
type indicates the type of the additional video object, with values 1~255; the meaning of each value is given in Table 1.
display_window_x0, display_window_y0, display_window_x1 and display_window_y1 indicate the position of the window in which the additional video object is displayed. (display_window_x0, display_window_y0) are the coordinates of the upper left corner of the window relative to the upper left corner of the main video, and (display_window_x1, display_window_y1) are the coordinates of the lower right corner of the window relative to the upper left corner of the main video.
transparency_quotiety indicates the transparency of the additional video object, with values 0~255.
color_number indicates the number of colors used by the additional video object, with values 1~255.
language_code indicates the language to which the additional video object belongs. Language codes from an international standard or self-defined language codes may be used; the present invention does not specify this. When subtitles in several languages are present, users can select the subtitles they want to see. Programme makers can also use this parameter to provide multilingual subtitles in one programme.
Width and Height indicate the width and height of the additional video object in units of main video signal pixels; their range is determined by the horizontal and vertical resolution of the main video signal. When the additional video object has the same size as the display window, its width and height are obtained by subtracting the upper left corner coordinates of the display window from its lower right corner coordinates.
offset_x and offset_y indicate the initial offset of the coordinates of the top left pixel of the additional video object relative to the upper left corner of the display window.
move_direct indicates the rolling direction of an additional video object of the rolling subtitles type, with values 0~3: 0 means bottom to top, 1 means top to bottom, 2 means left to right, and 3 means right to left.
move_pixels_per_pic indicates the rolling speed of an additional video object of the rolling subtitles type, in pixels per picture, where a pixel is a pixel of the main video signal.
fade_in_mode and fade_out_mode indicate the fade-in and fade-out modes of an additional video object of a fade-in/fade-out type, with values 0~255. The present invention does not specify the meaning of each value; it only specifies that there can be at most 256 fade-in modes and 256 fade-out modes, and that the fade-in mode may differ from the fade-out mode.
still_pic_num indicates how long an additional video object of a fade-in/fade-out type is fully displayed on the screen, in units of main video signal pictures, with values 0~255.
start_at indicates the initial state of an additional video object of a fade-in/fade-out type. While such an object is being displayed, its data may need to be refreshed in order to support random access. As a result, after the complete data of the object arrives, display does not necessarily start from the beginning of the fade-in but may start from some intermediate state. The stage of the fade-in/fade-out process that the object is in must therefore be indicated.
start_x_offset and start_y_offset indicate the offset of the display window relative to the coordinates (display_window_x0, display_window_y0) when an additional video object of a move-in/move-out type starts to be displayed.
color_change_position indicates the position of the color change line of an additional video object of the karaoke subtitles type, as the number of pixels from the left edge of the display window.
marker_bit equal to 0 means that the point (j, i) does not belong to this additional video object, and marker_bit equal to 1 means that the point (j, i) belongs to it. (j, i) are coordinates relative to the top left pixel of this additional video object.
value_y, value_u and value_v indicate the values of the Y, U and V components of a color.
color_idx indicates the index of the color of this additional video object at (j, i).
current_x_offset and current_y_offset indicate, for an additional video object of a move-in/move-out type, the offset of the upper left corner of the display window in the current picture relative to the coordinates (display_window_x0, display_window_y0).
current_color_change_position indicates, for an additional video object of the karaoke subtitles type, the position of the color change line in the current picture, as the number of pixels from the left edge of the display window.
The CodedFrameData() structure encapsulates the coded data of one video picture; the coding method is the same as that of the main video. This kind of additional video object is defined in order to support complex additional video objects, for example a constantly changing advertisement icon or a short, small-sized video sequence inserted into a news programme.
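As an illustration of how `move_direct` and `move_pixels_per_pic` interact for a rolling subtitle, the sketch below computes the object's offset a given number of pictures after its initial position (`offset_x`, `offset_y`). It assumes linear motion and a coordinate system whose y axis grows downward, which matches the upper-left-corner origin used for the display window but is not stated explicitly. The function name `rolling_offset` is hypothetical.

```python
def rolling_offset(offset_x, offset_y, move_direct, move_pixels_per_pic,
                   pictures_elapsed):
    """Offset of a rolling subtitle after some pictures (assumed linear)."""
    step = move_pixels_per_pic * pictures_elapsed
    if move_direct == 0:      # bottom to top: y decreases
        return offset_x, offset_y - step
    if move_direct == 1:      # top to bottom: y increases
        return offset_x, offset_y + step
    if move_direct == 2:      # left to right: x increases
        return offset_x + step, offset_y
    if move_direct == 3:      # right to left: x decreases
        return offset_x - step, offset_y
    raise ValueError("move_direct is a 2-bit field with values 0..3")
```

Because the motion is fully determined by these parameters, Table 6 carries no per-picture data for rolling subtitles (type 3).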
C3. Select the layer, the mutual superposition relationships and the superposition order for each additional video object. If object A is superimposed on object B, it must be guaranteed that every pixel of A has a corresponding pixel in B.
C4. Select a start time and an end time for each additional video object; these can be expressed as picture sequence numbers of the main video signal.
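The layer and superposition constraints described above can be checked mechanically. The sketch below tests two conditions: the target must lie on a lower layer than the object, and the object's display window must fit inside the target's. Treating pixel coverage as rectangle containment is a simplifying assumption, since targets may be irregularly shaped; the names `Placement` and `can_superimpose` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Placement:
    """Layer and display window of an object (the main picture is level 0)."""
    level: int
    x0: int
    y0: int
    x1: int
    y1: int


def can_superimpose(obj: Placement, target: Placement) -> bool:
    """Necessary conditions for superimposing obj on target.

    Assumption: window containment stands in for the per-pixel requirement;
    an irregular target shape would need a check against its shape matrix.
    """
    level_ok = target.level < obj.level
    inside = (target.x0 <= obj.x0 and target.y0 <= obj.y0
              and obj.x1 <= target.x1 and obj.y1 <= target.y1)
    return level_ok and inside
```

An editing tool might run such a check before accepting a new object, but the text leaves the enforcement mechanism open.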
D. Insert the coded additional video objects into the encoded main video signal, comprising the following steps:
D1. Determine the random access points.
In the present invention, a random access point is a position in the code stream from which a decoder can start decoding correctly. Random access points are first of all constrained by the coding mode of each picture of the main video signal: only intra-coded pictures can serve as random access points, but not every intra-coded picture is required to be one. The present invention does not specify how random access points are set.
At each random access point picture, the complete description of every additional video object must be provided, that is, the VideoObjectData() (additional video object data) structure.
D2. According to the start and end time of each additional video object, determine its data structure in each main video picture. The complete description, that is, the VideoObjectData() structure, must also be provided at the first picture in which an additional video object starts to be displayed. Apart from that first picture and the random access points, every other picture in which an additional video object is displayed provides a VideoObjectDifference() (additional video object difference) structure to describe the parameter changes of the video object.
D3. Generate the additional video information, that is, the FuJiaVideoInfo() (additional video information) structure, see Table 7, for each picture of the main video signal that needs to display a supplemental video signal. Generating the additional video information comprises the following steps:
Table 7: Additional video information structure
FuJiaVideoInfo(){ Descriptor
  fujia_video_info_length_in_bit  u(24)
  video_object_number  u(8)
  for(i=0;i<video_object_number;i++)
    VideoObject()
}
Table 8: Additional video object structure
VideoObject(){ Descriptor
  video_object_id  u(8)
  if(RandomAccessPoint||NewVideoObject)
    VideoObjectData()
  else
    VideoObjectDifference()
}
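The branch in Table 8 decides, picture by picture, whether an object is sent in full or as a difference. The sketch below plans this choice over an object's display interval. It assumes that pictures are identified by integer sequence numbers and that the end picture is inclusive; the function names are hypothetical.

```python
def choose_structure(picture, start_picture, random_access_points):
    """Full data at the first displayed picture and at random access points."""
    if picture == start_picture or picture in random_access_points:
        return "VideoObjectData"
    return "VideoObjectDifference"


def plan_object(start_picture, end_picture, random_access_points):
    """Map each displayed picture to the structure that describes the object.

    Assumption: end_picture is the last picture in which the object is shown.
    """
    return {
        n: choose_structure(n, start_picture, random_access_points)
        for n in range(start_picture, end_picture + 1)
    }
```

For example, `plan_object(3, 6, {0, 5})` sends full data at pictures 3 and 5 and differences at pictures 4 and 6.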
D31. Count the additional video objects of each picture to obtain video_object_number (the number of additional video objects).
D32. Assign a number video_object_id (additional video object number) to each additional video object described with a VideoObjectData() structure. Between two data refreshes, each video_object_id is unique and identifies exactly one additional video object. At a data refresh, video_object_id values may be reassigned to the additional video objects. An additional video object described with a VideoObjectDifference() structure uses video_object_id to indicate which video object it describes. video_object_id takes the values 0~255, but 0 is reserved to denote the main video picture.
D33. Combine video_object_id with the VideoObjectData() or VideoObjectDifference() structure to form a VideoObject() (additional video object) structure.
D34. Arrange all VideoObject() structures of each picture in order, count the number of bits they occupy, and add 32; the result is the value of fujia_video_info_length_in_bit (additional video information length). Finally, form the FuJiaVideoInfo() (additional video information) structure shown in Tables 7 and 8.
D4. Insert the FuJiaVideoInfo() structure into the corresponding main video picture. Specifically, set the supplemental video signal flag in the header of the coded data of the corresponding main video picture to 1, insert the FuJiaVideoInfo() structure immediately after this flag, and shift the remaining coded data of the main video signal backward as a whole.
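The length rule in step D34 is simple arithmetic. The sketch below assumes that the added 32 bits account for the 24-bit length field plus the 8-bit `video_object_number` field of Table 7, which fits the field widths but is an inference rather than something the text states. The function and constant names are hypothetical.

```python
LENGTH_FIELD_BITS = 24   # fujia_video_info_length_in_bit, u(24)
COUNT_FIELD_BITS = 8     # video_object_number, u(8)
ID_FIELD_BITS = 8        # video_object_id, u(8), part of each VideoObject()


def fujia_video_info_length(payload_bits_per_object):
    """Total bits of FuJiaVideoInfo() given each object's payload size.

    payload_bits_per_object: sizes in bits of the VideoObjectData() or
    VideoObjectDifference() structures, one entry per object.
    """
    object_bits = sum(ID_FIELD_BITS + bits for bits in payload_bits_per_object)
    total = object_bits + LENGTH_FIELD_BITS + COUNT_FIELD_BITS
    if total >= 1 << LENGTH_FIELD_BITS:
        raise ValueError("length does not fit in the 24-bit field")
    return total
```

For a picture carrying one move-in/move-out object as a difference (two s(16) fields, 32 bits), the result is 8 + 32 + 32 = 72 bits.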
A hierarchical decoding method for a video signal comprises the following steps:
A. Decode each picture of the main video signal. The decoding method of the main video signal is outside the scope of the present invention.
B. If the supplemental video signal flag of a main video signal picture is set to 1, decode the supplemental video signal, comprising the following steps:
B1. Read the value of fujia_video_info_length_in_bit, then, according to this value, continue reading fujia_video_info_length_in_bit binary digits.
B2. Read the value of video_object_number.
B3. Decode video_object_number additional video objects, comprising the following steps:
B31. Obtain the parameters of each additional video object from its description.
B32. From the parameters of each additional video object, obtain the color of each pixel that belongs to it.
C. Superimpose each additional video object on its target object in turn, in the order indicated by superpose_order. Among additional video objects with equal superpose_order, the one with the lower layer number is superimposed on its target object first.
The superposition algorithm is as follows. For each pixel of the video object, let v1 be the value of one component of its color, v2 the value of the corresponding color component of the pixel at the corresponding position in the target object, and t the transparency coefficient (transparency_quotiety) of the video object. The value v of this color component of the pixel after superposition is then
v=(v2*t+v1*(256-t))/256
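The ordering rule and the blending formula can be sketched together as follows. Two points are assumptions: the division is taken as integer division with truncation, and objects are represented as dictionaries with `superpose_order` and `level` keys. The function names are hypothetical.

```python
def blend(v1: int, v2: int, t: int) -> int:
    """Blend one color component: v1 from the object, v2 from the target.

    t is transparency_quotiety (0..255); t = 0 leaves the object fully
    opaque. Integer truncation of the division is an assumption.
    """
    return (v2 * t + v1 * (256 - t)) // 256


def superposition_sequence(objects):
    """Order objects for compositing: larger superpose_order first,
    ties broken by lower level first."""
    return sorted(objects, key=lambda o: (-o["superpose_order"], o["level"]))
```

With t limited to 255, the target always contributes slightly less than 100%, so a fully invisible object cannot be expressed through the transparency coefficient alone under this formula.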
A method for inserting an additional video object comprises the following steps:
A. Determine its type, parameters and layer, and its superposition relationships and superposition order with respect to the additional video objects that already exist.
B. Determine the start time and end time of its display.
C. Prepare the VideoObjectData() and VideoObjectDifference() structures according to the random access points and the start time.
D. Insert the VideoObjectData() and VideoObjectDifference() structures into the FuJiaVideoInfo() structures of the corresponding pictures, and update the corresponding values of fujia_video_info_length_in_bit and video_object_number. If a picture did not previously have a FuJiaVideoInfo() structure, generate one according to step D of the hierarchical coding method above and insert it into the coded data of the corresponding picture.
A method for deleting an additional video object comprises the following steps:
A. Find the FuJiaVideoInfo() structures of the corresponding pictures in the data stream of the encoded video signal.
B. Delete the VideoObjectData() and VideoObjectDifference() structures that belong to this additional video object from the FuJiaVideoInfo() structure, and at the same time delete all objects that are superimposed, directly or indirectly, on this object.
C. Update the values of fujia_video_info_length_in_bit and video_object_number in the FuJiaVideoInfo() structure. If video_object_number becomes 0, delete the whole FuJiaVideoInfo() structure from the data of that picture and set the supplemental video signal presence flag to 0.
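The deletion steps above require removing every object that is superimposed, directly or indirectly, on the deleted one. The sketch below computes that set for one picture. It assumes that `target_object_id` refers to the `video_object_id` of the target (with 0 meaning the main picture), which matches the value ranges given but is not stated outright. The function names are hypothetical.

```python
def objects_to_delete(targets, victim_id):
    """Return the ids removed when victim_id is deleted.

    targets maps video_object_id -> target_object_id (assumed to be the id
    of the object it is superimposed on; 0 is the main video picture).
    """
    doomed = {victim_id}
    changed = True
    while changed:  # repeat until no further dependents are found
        changed = False
        for obj_id, target_id in targets.items():
            if target_id in doomed and obj_id not in doomed:
                doomed.add(obj_id)
                changed = True
    return doomed


def delete_object(targets, victim_id):
    """Remove victim and dependents; report whether the picture is now empty."""
    doomed = objects_to_delete(targets, victim_id)
    remaining = {k: v for k, v in targets.items() if k not in doomed}
    return remaining, len(remaining) == 0
```

When the second return value is true, the whole FuJiaVideoInfo() structure would be removed and the picture's flag cleared, as step C describes.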
A method for modifying an additional video object comprises the following steps:
A. Find the FuJiaVideoInfo() structures of the corresponding pictures in the data stream of the encoded video signal.
B. Find the VideoObjectData() and VideoObjectDifference() structures in the FuJiaVideoInfo() structure that correspond to this additional video object.
C. Modify the parameters in the VideoObjectData() and VideoObjectDifference() structures. If the superposition relationships between additional video objects are affected, the relevant parameters in the VideoObjectData() and VideoObjectDifference() structures of the related additional video objects must also be modified. If the length of the VideoObjectData() or VideoObjectDifference() structure of this video object changes, the value of fujia_video_info_length_in_bit in the FuJiaVideoInfo() structure must also be updated.
With the hierarchical coding and decoding method described above, inserting, deleting and modifying additional video information in an encoded video signal becomes very simple. The supplemental video signal can be edited without the editing equipment decoding the main video signal, so equipment that edits the encoded video signal does not need to be able to encode or decode the main video signal. Another advantage of the present invention is that inserting a supplemental video signal does not destroy the content of the main video signal.
Description of drawings
Fig. 1 is the flow chart of an embodiment of hierarchical video signal coding method of the present invention;
Fig. 2 is the flow chart of an embodiment of hierarchical video signal coding/decoding method of the present invention;
Fig. 3 is the flow chart of an embodiment of hierarchical video signal decoding method of the present invention;
Fig. 4 is the schematic diagram of the encoded video signal data structure of hierarchical video signal coding method of the present invention.
Detailed description of the embodiments
The technical scheme of the present invention is further described below with reference to the drawings and embodiments.
Fig. 1 is a flow chart of an embodiment of the hierarchical video signal coding method of the present invention. As shown in Fig. 1, this embodiment comprises the following steps:
A. Divide the video signal into a main video signal and a supplemental video signal. The main video signal refers to the main content of the video signal, for example film and television pictures shot with a camera, or pictures synthesized by other methods. The supplemental video signal includes TV station logos shown on the screen, subtitles in film and television programmes (dialogue, lyrics, explanations, etc.), commentary in various programmes, credits of various programmes, time information, copy, banner advertisements, station and column idents that the TV station broadcasts at fixed times, and other video signals inserted into film and television programmes. The division between main video signal and supplemental video signal is not absolute and can be specified according to actual needs.
B. Encode the main video signal.
The main video signal can be encoded with any existing or future video coding method, such as MPEG-1, MPEG-2, MPEG-4, H.261, H.263, H.263+ or H.264. However, a flag bit must be provided in the data header of each coded picture (frame or field) to indicate whether that picture has a supplemental video signal to display. When there is no supplemental video signal, this flag is set to 0.
C. Decompose the supplemental video signal into several additional video objects, for example TV station logos, subtitles in film and television programmes, commentary in various programmes, credits of various programmes, time information, copy, banner advertisements, station and column idents that the TV station broadcasts at fixed times, and other inserted video signals. According to its characteristics, assign each additional video object to one of the types described in Table 1.
D. Determine the start and end time of each additional video object; these can be expressed as picture sequence numbers of the main video signal.
E. Determine the random access points according to the coding mode of each picture of the main video signal and the random access requirements. Only intra-coded pictures can serve as random access points, but not every intra-coded picture is required to be one.
F. According to the start and end time of each additional video object and the random access points, determine its data structure in each main video picture. When an additional video object appears in the encoded data stream for the first time, or appears at a random access point, it is described with the structures of Tables 2 to 5. When an additional video object is not appearing for the first time in the encoded data stream and is not at a random access point, it is described with the structure of Table 6.
G. For each picture in which an additional video object is to be displayed, prepare the corresponding description structure, either a VideoObjectData() structure or a VideoObjectDifference() structure.
H. Count the additional video objects that each picture needs to display to obtain video_object_number.
I. Assign a number, video_object_id, to each additional video object described with a VideoObjectData() structure, and use this number to indicate which additional video object each VideoObjectDifference() structure describes. This forms the VideoObject() structure.
J. Determine the layer of each additional video object and the superposition relationships among the objects.
K. Arrange all VideoObject() structures of each picture in order, count the number of bits they occupy, and add 32; the result is the value of fujia_video_info_length_in_bit. Finally, form the FuJiaVideoInfo() structure shown in Tables 7 and 8.
L. Insert the FuJiaVideoInfo() structure into the corresponding main video picture. Specifically, set the supplemental video signal flag in the header of that picture's coded data to 1, then insert the FuJiaVideoInfo() structure immediately after this flag; the rest of the main video signal's coded data is shifted back as a whole.
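The choice of description structure in step F and the length rule in step K can be illustrated with a short sketch. It is not the bitstream syntax of the invention: Tables 2 to 8 are not reproduced here, so each coded object is reduced to a bit count, pictures are assumed to be numbered in coding order, and the names `PlannedObject`, `choose_structure` and `info_length_in_bits` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class PlannedObject:
    """Hypothetical record for one additional video object (not from the patent)."""
    video_object_id: int
    start_picture: int   # first main-video picture that shows the object
    end_picture: int     # last main-video picture that shows the object


def choose_structure(obj: PlannedObject, picture: int,
                     random_access_points: Set[int]) -> Optional[str]:
    """Return the structure that describes `obj` in `picture`, per step F."""
    if not (obj.start_picture <= picture <= obj.end_picture):
        return None  # the object is not displayed in this picture
    if picture == obj.start_picture or picture in random_access_points:
        return "VideoObjectData"        # first appearance or random access point
    return "VideoObjectDifference"      # every other picture showing the object


def info_length_in_bits(video_object_bits: List[int]) -> int:
    """Value of fujia_video_info_length_in_bit per step K: object bits plus 32."""
    return sum(video_object_bits) + 32


# A caption shown in pictures 10..14, with a random access point at picture 12.
caption = PlannedObject(video_object_id=1, start_picture=10, end_picture=14)
plan = [choose_structure(caption, p, {12}) for p in range(9, 16)]
```

In this example the caption gets a full VideoObjectData() description at pictures 10 and 12 and a VideoObjectDifference() description at pictures 11, 13 and 14, so a decoder that starts at the random access point never needs data from earlier pictures.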
Fig. 2 is a flow chart of an embodiment of the hierarchical video signal decoding method of the present invention. As shown in Fig. 2, this embodiment comprises:
A '. Decode each picture of the main video signal.
B '. If the supplemental video signal flag of a picture of the main video signal is set to 1, read a FuJiaVideoInfo() structure.
C '. Parse the values of fujia_video_info_length_in_bit and video_object_number and the descriptor of each additional video object to obtain their parameters.
D '. From the parameters of each additional video object, obtain the color of each pixel that belongs to it.
E '. Superimpose each additional video object onto its target object in the order given by superpose_order. Among additional video objects with equal superpose_order values, the one with the lower layer number is superimposed onto its target object first. The superposition algorithm is as follows: for each pixel of the video object, if the value of a given color component is v1, the value of the corresponding color component of the pixel at the corresponding position in the target object is v2, and the transparency coefficient of the video object (transparency_quotiety) is t, then the value of that color component of the pixel after superposition is
                 v = (v2*t + v1*(256-t)) / 256
F '. Output the superimposed picture to the display device.
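The superposition rule of step E' can be sketched as follows. The blend formula and the ordering rule (superpose_order first, then lower layer number) are taken from the text; everything else is an assumption made for illustration: the `Overlay` class is hypothetical, only one color component is handled, t is assumed to lie in 0..256, the division is assumed to be integer division, and every object is assumed to target the main picture directly.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Position = Tuple[int, int]


@dataclass
class Overlay:
    """Hypothetical decoded additional video object, one color component only."""
    superpose_order: int
    layer: int
    transparency_quotiety: int       # t in the formula, assumed to be 0..256
    pixels: Dict[Position, int]      # (x, y) -> component value v1


def blend(v1: int, v2: int, t: int) -> int:
    """v = (v2*t + v1*(256 - t)) / 256, with integer division assumed."""
    return (v2 * t + v1 * (256 - t)) // 256


def superpose(target: Dict[Position, int], overlays: List[Overlay]) -> Dict[Position, int]:
    """Blend the overlays onto a copy of `target` in the order required by step E'."""
    out = dict(target)
    # Sort by superpose_order; ties are broken by the lower layer number first.
    for ov in sorted(overlays, key=lambda o: (o.superpose_order, o.layer)):
        for pos, v1 in ov.pixels.items():
            if pos in out:
                out[pos] = blend(v1, out[pos], ov.transparency_quotiety)
    return out


main_picture = {(0, 0): 100}
logo = Overlay(superpose_order=1, layer=0, transparency_quotiety=128, pixels={(0, 0): 200})
caption = Overlay(superpose_order=0, layer=1, transparency_quotiety=0, pixels={(0, 0): 50})
result = superpose(main_picture, [logo, caption])
```

With t = 0 the object fully replaces the target pixel and with t = 256 it is fully transparent, which is why the opaque caption (applied first) sets the pixel to 50 before the half-transparent logo is blended over it.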
Fig. 3 is a flow chart of an embodiment of the hierarchical video signal coding and decoding method of the present invention. As shown in Fig. 3, this embodiment comprises the following steps:
A ". Encode the main video signal.
B ". Edit the supplemental video signal on the basis of the encoded main video signal, including inserting, deleting and modifying additional video objects.
C ". Transmit the encoded video signal over a transmission channel, or store the encoded video signal on a storage device.
D ". The decoder obtains the encoded video signal.
E ". Decode the main video signal.
F ". Decode the supplemental video signal.
G ". Superimpose the supplemental video signal on the main video signal.
H ". Display the output.
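The editing step B" operates only on the additional video information, never on the coded main video data. As an illustration, the deletion case can be sketched as below, assuming the additional video information of one picture has already been parsed into Python objects. The class names, the `target_id` field (the object an object is superimposed on, with None standing for the main picture) and the `bit_length` field are hypothetical; the cascade to objects superimposed directly or indirectly on the deleted one, the recomputed count and length, and the clearing of the flag follow the deletion procedure stated in the claims.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class VideoObject:
    video_object_id: int
    target_id: Optional[int]   # hypothetical: id of the superposition target, None = main picture
    bit_length: int            # size of the coded description in bits


@dataclass
class Picture:
    additional_video_flag: int = 0
    objects: List[VideoObject] = field(default_factory=list)
    coded_main_data: bytes = b""   # never decoded or modified by the editor

    @property
    def video_object_number(self) -> int:
        return len(self.objects)

    @property
    def info_length_in_bit(self) -> int:
        # Object bits plus 32, or 0 when the whole structure has been removed.
        if not self.objects:
            return 0
        return sum(o.bit_length for o in self.objects) + 32


def delete_object(picture: Picture, object_id: int) -> None:
    """Delete an object and every object superimposed on it, directly or indirectly."""
    doomed = {object_id}
    changed = True
    while changed:                      # repeat until no new dependants are found
        changed = False
        for o in picture.objects:
            if o.target_id in doomed and o.video_object_id not in doomed:
                doomed.add(o.video_object_id)
                changed = True
    picture.objects = [o for o in picture.objects if o.video_object_id not in doomed]
    if not picture.objects:
        picture.additional_video_flag = 0   # no additional video left in this picture


pic = Picture(1, [VideoObject(1, None, 40), VideoObject(2, 1, 24),
                  VideoObject(3, 2, 16), VideoObject(4, None, 8)], b"\x00\x01")
delete_object(pic, 1)
```

After the call only object 4 remains, because objects 2 and 3 were stacked on object 1; `coded_main_data` is untouched, which is the property that lets an editor work without a main-video codec.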
Fig. 4 shows the coded video signal data structure of the hierarchical video signal coding method of the present invention. As shown in the figure, a video sequence comprises several pictures. Each picture includes a picture start code, a picture data header and coded picture data. The picture data header comprises several syntax elements, an additional video flag and additional video information. The additional video information comprises an additional video information length, an additional video object count and several additional video objects.
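The nesting described for Fig. 4 can be written down as a set of record types. This is only a structural sketch: the class and field names are hypothetical, field widths and start-code values are not given in this section, and the coded objects and codec-specific syntax elements are kept opaque.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class AdditionalVideoInfo:
    length_in_bits: int                 # additional video information length
    object_count: int                   # additional video object count
    objects: List[bytes] = field(default_factory=list)   # opaque coded objects


@dataclass
class PictureHeader:
    syntax_elements: Dict[str, int]     # codec-specific header fields (opaque here)
    additional_video_flag: int          # 1 if additional video information follows
    additional_video_info: Optional[AdditionalVideoInfo] = None


@dataclass
class CodedPicture:
    start_code: bytes
    header: PictureHeader
    coded_data: bytes                   # coded main video picture data


@dataclass
class VideoSequence:
    pictures: List[CodedPicture] = field(default_factory=list)


def header_is_consistent(h: PictureHeader) -> bool:
    """Check that the flag agrees with the presence of additional video information."""
    info = h.additional_video_info
    if h.additional_video_flag == 0:
        return info is None
    return info is not None and info.object_count == len(info.objects)


plain = PictureHeader({}, 0)
with_logo = PictureHeader({}, 1, AdditionalVideoInfo(72, 1, [b"logo"]))
broken = PictureHeader({}, 1)
```

Keeping the additional video information inside the picture header, separate from `coded_data`, mirrors the point of the layout: the header can be rewritten while the main video payload is carried through unchanged.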

Claims (5)

1. A layered coding and decoding method for a video signal, characterized in that it comprises the following steps:
A. dividing the video signal into a main video signal and a supplemental video signal;
B. encoding the main video signal;
C. editing the supplemental video signal on the basis of the encoded main video signal, including inserting, deleting and modifying additional video objects;
D. transmitting the encoded video signal over a transmission channel, or storing the encoded video signal on a storage device;
E. obtaining the encoded video signal at a decoder;
F. decoding the main video signal;
G. decoding the supplemental video signal;
H. superimposing the supplemental video signal on the main video signal;
I. displaying the output.
2. The layered coding and decoding method for a video signal according to claim 1, characterized in that the method of inserting an additional video object comprises the following steps:
A. determining its class, parameters and layer, and its superposition relationship and superposition order with respect to other existing additional video objects;
B. determining the start time and end time of its display;
C. preparing the additional video object data structure and the additional video object difference structure according to the random access points and the start time;
D. determining whether the picture already has an additional video information structure and, if not, generating an additional video information structure and inserting it into the coded data of the corresponding picture;
E. inserting the additional video object data structure and the additional video object difference structure into the additional video information structure of the corresponding picture, and modifying the values of the corresponding additional video information length and additional video object count.
3. The layered coding and decoding method for a video signal according to claim 1, characterized in that the method of deleting an additional video object comprises the following steps:
A. finding the additional video information structure of the corresponding picture in the data stream of the encoded video signal;
B. deleting the additional video object data structure and the additional video object difference structure that belong to this additional video object from the additional video information structure, and at the same time deleting all objects that are superimposed, directly or indirectly, on this object;
C. modifying the values of the additional video information length and the additional video object count in the additional video information structure; if the value of the additional video object count is 0, deleting the whole additional video information structure from the data of the picture and setting the supplemental video signal presence flag to 0.
4. The layered coding and decoding method for a video signal according to claim 1, characterized in that the method of modifying an additional video object comprises the following steps:
A. finding the additional video information structure of the corresponding picture in the data stream of the encoded video signal;
B. finding, in the additional video information structure, the additional video object data structure and the additional video object difference structure that correspond to this additional video object;
C. modifying the parameters in the additional video object data structure and the additional video object difference structure; if the superposition relationships between additional video objects are affected, the relevant parameters in the additional video object data structures and additional video object difference structures of the related additional video objects also need to be modified; if the length of the additional video object data structure or the additional video object difference structure of this video object has changed, the value of the additional video information length in the additional video information structure also needs to be modified.
5. The layered coding and decoding method for a video signal according to claim 1, 2, 3 or 4, characterized in that the data structure of a video sequence comprises several pictures, wherein each picture includes a picture start code, a picture data header and coded picture data; the picture data header comprises several syntax elements, an additional video flag and additional video information; and the additional video information comprises an additional video information length, an additional video object count and several additional video objects.
CNA031512488A 2003-09-26 2003-09-26 Layering coding and decoding method for video signal Pending CN1529514A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNA031512488A CN1529514A (en) 2003-09-26 2003-09-26 Layering coding and decoding method for video signal

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CNA031512488A CN1529514A (en) 2003-09-26 2003-09-26 Layering coding and decoding method for video signal

Publications (1)

Publication Number Publication Date
CN1529514A true CN1529514A (en) 2004-09-15

Family

ID=34286984

Family Applications (1)

Application Number Title Priority Date Filing Date
CNA031512488A Pending CN1529514A (en) 2003-09-26 2003-09-26 Layering coding and decoding method for video signal

Country Status (1)

Country Link
CN (1) CN1529514A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019184501A1 (en) * 2018-03-29 2019-10-03 上海掌门科技有限公司 Video editing method, video playing method, device and computer-readable medium
WO2023035551A1 (en) * 2021-09-13 2023-03-16 Guangdong Oppo Mobile Telecommunications Corp., Ltd. Video coding by object recognition and feature extraction
WO2023035552A1 (en) * 2021-09-13 2023-03-16 Guangdong Oppo Mobile Telecommunications Corp., Ltd. Video coding by object recognition and feature unit management
CN114697670A (en) * 2022-04-02 2022-07-01 北京广播电视台 Cluster workstation editing system and editing method
CN114697670B (en) * 2022-04-02 2023-01-06 北京广播电视台 Cluster workstation editing system and editing method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication