Detailed Description of the Embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, rather than all, of the embodiments of the present invention. All other embodiments obtained by persons of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
First, SVC technology is briefly introduced. SVC (Scalable Video Coding) is a technology that divides a video stream into multiple resolution, quality, and frame-rate layers. It is an extension of the H.264 video coding standard adopted by most current video conferencing devices. A video conferencing device that uses SVC sends and receives a multi-layer video stream consisting of a small base layer and several optional layers that improve resolution, frame rate, and quality. This layered structure can greatly improve error resilience and video quality without imposing high bandwidth requirements.
As shown in Fig. 1, Embodiment 1 of the present invention provides an encoding method, including:
Step 11: Obtain one base layer image and at least one enhancement layer image of a conference site scene.
When multiple scenes of a conference site are displayed, it is often necessary to show a panoramic image of the site while clearly showing an image of a local scene within it. In such a display, the resolution of the panoramic image is relatively low and the resolution of the local scene image is relatively high. Therefore, in the embodiments of the present invention, in keeping with the characteristics of SVC, the panoramic image of the conference site scene may serve as the base layer image, and the image of a local scene of the conference site may serve as an enhancement layer image. Alternatively, when the panorama of the conference site does not need to be displayed, both the base layer image and the enhancement layer image may be images of a local scene of the conference site, differing only in resolution.
Step 12: Encode the base layer image as the base layer of Scalable Video Coding (SVC) and the enhancement layer image as an enhancement layer of the SVC, to form an encoded bitstream.
As can be seen from Embodiment 1, during encoding the base layer image and the enhancement layer image obtained from the conference site scene are encoded as the SVC base layer and enhancement layer, respectively. An encoded bitstream formed in this way makes full use of the characteristics of SVC and can contain bitstreams corresponding to images of several different resolutions. Therefore, the encoding method of this embodiment can improve the quality of the images shown in a multi-scene display of the conference site.
Corresponding to the encoding method of Embodiment 1, Embodiment 2 of the present invention provides a decoding method. As shown in Fig. 2, the method of Embodiment 2 includes:
Step 21: Receive an encoded bitstream, where the encoded bitstream is obtained by encoding a base layer image of a conference site scene as the base layer of Scalable Video Coding (SVC) and an enhancement layer image of the conference site scene as an enhancement layer of the SVC.
Step 22: Obtain a base layer bitstream and an enhancement layer bitstream from the encoded bitstream.
Step 23: Decode the base layer bitstream and the enhancement layer bitstream separately to obtain the base layer image and the enhancement layer image.
As can be seen from Embodiment 2, during encoding the base layer image and the enhancement layer image of the conference site scene are encoded as the SVC base layer and enhancement layer, respectively, to form the encoded bitstream. A bitstream formed in this way makes full use of the characteristics of SVC and can contain bitstreams corresponding to images of several different resolutions. Therefore, the decoding method of this embodiment can improve the quality of the base layer image and the enhancement layer image obtained, and thereby the quality of the images shown in a multi-scene display of the conference site.
The following describes, with reference to different embodiments, how the encoding and decoding are implemented.
With reference to the schematic diagram of the encoding and decoding process shown in Fig. 3 and the flowchart of the encoding method shown in Fig. 4, the encoding method of Embodiment 3 of the present invention includes:
Step 31: Use one or more image capture devices, such as video cameras, to obtain at least two images of the same conference site scene that have the same scene content but different resolutions. In Fig. 3, two image capture devices are used to obtain the enhancement layer images. For example, the image obtained by image capture device 1 may be called the layer 1 enhancement layer image, and the image obtained by image capture device 2 the layer 2 enhancement layer image.
Step 32: From the obtained images, select the image with the lower resolution, for example the layer 1 enhancement layer image, and down-sample it to obtain the base layer image. The lower-resolution image is chosen for down-sampling in order to reduce the loss of image quality caused by down-sampling. At the same time, the obtained images are used as the enhancement layer images.
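As a non-limiting illustration of the down-sampling in step 32, the following Python sketch halves the width and height of an image by averaging each 2x2 block of samples. The text does not specify a down-sampling filter; the box average, the function name downsample_2x, and the list-of-rows image representation are assumptions made only for this example.

```python
def downsample_2x(image):
    """Halve width and height by averaging each 2x2 block of samples.

    `image` is a list of rows of integer luma samples (hypothetical format).
    """
    height, width = len(image), len(image[0])
    out = []
    for y in range(0, height - 1, 2):
        row = []
        for x in range(0, width - 1, 2):
            total = (image[y][x] + image[y][x + 1]
                     + image[y + 1][x] + image[y + 1][x + 1])
            row.append((total + 2) // 4)  # rounded integer mean of the block
        out.append(row)
    return out


# A tiny stand-in for the layer 1 enhancement layer image.
layer1 = [[10, 12, 20, 22],
          [14, 16, 24, 26],
          [30, 30, 40, 40],
          [30, 30, 40, 40]]
base_layer = downsample_2x(layer1)
print(base_layer)
```

A practical encoder would normally apply a proper low-pass filter before decimation; the box average is used here only because it is short and easy to check by hand.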
Step 33: Determine whether the obtained enhancement layer image needs to be encoded as a full panorama. If so, the enhancement layer image is encoded directly as an enhancement layer of the SVC. If not, use a positioning device or manual designation to obtain cropping parameters for a local region of the enhancement layer image, and determine from the cropping parameters the local region of the enhancement layer image that needs to be encoded.
For example, markers may be placed manually in the conference site in advance to identify the local region that needs emphasis. An image capture device such as a video camera then captures an image of the conference site, and the region delimited by the markers is located in the captured image.
The cropping parameters include:
frame_cropping_flag: indicates to the decoding device whether the image is to be cropped before output, that is, whether a cropping region exists; 1 means yes and 0 means no.
frame_crop_left_offset: indicates to the decoding device the left boundary parameter of the cropping region; the distance between the left boundary of the cropping region and the left boundary of the original image can be calculated from this parameter.
frame_crop_right_offset: indicates to the decoding device the right boundary parameter of the cropping region; the distance between the right boundary of the cropping region and the right boundary of the original image can be calculated from this parameter.
frame_crop_top_offset: indicates to the decoding device the top boundary parameter of the cropping region; the distance between the top boundary of the cropping region and the top boundary of the original image can be calculated from this parameter.
frame_crop_bottom_offset: indicates to the decoding device the bottom boundary parameter of the cropping region; the distance between the bottom boundary of the cropping region and the bottom boundary of the original image can be calculated from this parameter.
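As a non-limiting illustration of how the boundary distances can be calculated from the parameters listed above, the following Python sketch converts the four offsets into an inclusive rectangle in luma samples. It assumes 4:2:0 video coded as frames, for which H.264 expresses the offsets in units of 2 luma samples; the CropParams class and the crop_rectangle function are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass


@dataclass
class CropParams:
    """Cropping syntax elements carried in the SPS (container name is hypothetical)."""
    frame_cropping_flag: int
    frame_crop_left_offset: int = 0
    frame_crop_right_offset: int = 0
    frame_crop_top_offset: int = 0
    frame_crop_bottom_offset: int = 0


def crop_rectangle(width, height, p, crop_unit_x=2, crop_unit_y=2):
    """Return (left, top, right, bottom) in luma samples, bounds inclusive.

    crop_unit_x and crop_unit_y default to 2, which is an assumption that the
    video is 4:2:0 and coded as frames rather than fields.
    """
    if not p.frame_cropping_flag:
        return (0, 0, width - 1, height - 1)  # no cropping region signalled
    left = crop_unit_x * p.frame_crop_left_offset
    right = width - crop_unit_x * p.frame_crop_right_offset - 1
    top = crop_unit_y * p.frame_crop_top_offset
    bottom = height - crop_unit_y * p.frame_crop_bottom_offset - 1
    return (left, top, right, bottom)


# Example: cut a centred 720x480 region out of a 1440x960 enhancement layer image.
params = CropParams(1, 180, 180, 120, 120)
rect = crop_rectangle(1440, 960, params)
print(rect)
```

With these example values the rectangle spans 720 columns and 480 rows, which matches the main scene resolution used as an example elsewhere in this description.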
According to the H.264 standard, the above cropping parameters can be carried in the sequence parameter set (SPS). H.264 supports at most 32 SPSs. The cropping parameters can be sent to the decoding device together with the encoded bitstream formed below. According to the H.264 standard, the slice bitstream of each layer carries the ID of the picture parameter set (PPS) that it uses. This ID shows which PPS the current slice bitstream uses, and the ID of the SPS used by the slice bitstream can then be obtained from that PPS. In this way the SPS used by the slice bitstream is known, and the cropping parameters of that layer's image can be obtained. The decoding device can therefore use the transmitted cropping parameters to determine, in the reconstructed image, the local scene image that needs to be displayed, thereby obtaining the local scene image of the conference site and achieving effects such as emphasizing or highlighting a local scene.
The encoding process that forms the encoded bitstream is described in detail below and includes:
Step 34: Encode the base layer image as the base layer of the SVC using the H.264 protocol. This encoding process is the same as in the prior art and is not described again here.
Step 35: For an enhancement layer image, first obtain a coding reference image for the enhancement layer image from the base layer image, and then use the coding reference image to encode the enhancement layer image as an enhancement layer of the SVC.
When forming the coding reference image of an enhancement layer image, the up-sampler of the SVC protocol may be used to up-sample the base layer image, yielding a coding reference image with the same resolution as each enhancement layer image. Because the coding reference image and the base layer image are images of the same instant, the enhancement layer image is very similar to its coding reference image. This makes the estimation in the encoding process highly accurate, which markedly improves coding efficiency, lowers the bit rate, and saves video bandwidth.
During encoding, the residual between the coding reference image and the enhancement layer image is obtained, and this residual is then encoded as the coded data.
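As a non-limiting illustration of the up-sampling and residual computation described above, the following Python sketch builds a coding reference image from a base layer image and subtracts it from an enhancement layer image. Nearest-neighbour up-sampling stands in for the up-sampler of the SVC protocol, whose interpolation filters are not reproduced here; the function names and sample values are hypothetical, and the transform, quantization, and entropy coding that would follow in a real encoder are omitted.

```python
def upsample_nearest(image, factor):
    """Enlarge an image by repeating every sample `factor` times in each direction."""
    out = []
    for row in image:
        wide = [sample for sample in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def residual(enhancement, reference):
    """Sample-wise difference between an enhancement layer image and its reference."""
    return [[e - r for e, r in zip(e_row, r_row)]
            for e_row, r_row in zip(enhancement, reference)]


base_layer = [[13, 23],
              [30, 40]]
enhancement = [[10, 12, 20, 22],
               [14, 16, 24, 26],
               [30, 30, 40, 40],
               [30, 30, 40, 40]]

reference = upsample_nearest(base_layer, 2)   # same resolution as the enhancement layer
res = residual(enhancement, reference)
print(res)
```

Because the reference closely resembles the enhancement layer image, the residual values are small, which is the property that lets the enhancement layer be coded at a lower bit rate.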
As described in step 33, if a local region of an enhancement layer image of a given resolution needs to be cropped to obtain a local scene image of the conference site, so as to achieve an emphasized or highlighted display, the cropping parameters of that enhancement layer image also need to be encoded and sent to the decoding device. These cropping parameters are all encoded as syntax elements in the sequence parameter set (SPS) of the encoded bitstream.
As mentioned above, H.264 supports up to 32 SPSs in total, so each enhancement layer image can be assigned its own SPS during encoding. The correspondence between an enhancement layer image and its SPS can be stored in the PPS and sent to the decoding device together with the encoded bitstream.
Of course, the encoded bitstream may also contain other syntax elements, such as a bitstream identifier that distinguishes the base layer bitstream from an enhancement layer bitstream.
With reference to the schematic diagram of the encoding and decoding process shown in Fig. 3 and the flowchart of Fig. 5, the decoding method of Embodiment 4 of the present invention includes:
Step 41: The decoding device receives the encoded bitstream.
Step 42: Read the bitstream identifier in the received encoded bitstream, and obtain the base layer bitstream and the enhancement layer bitstream in the encoded bitstream according to the bitstream identifier.
The bitstream identifier may be the syntax element in the SVC extension syntax that distinguishes the base layer bitstream from the enhancement layer bitstreams. This may be the syntax element dependency_id obtained when parsing nal_unit_header_svc_extension(): dependency_id = 0 indicates that the bitstream is the base layer bitstream, and dependency_id > 0 indicates that it is an enhancement layer bitstream. The dependency_id values of the individual enhancement layer bitstreams may differ, so as to distinguish which enhancement layer each bitstream corresponds to.
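As a non-limiting illustration of reading dependency_id, the following Python sketch extracts it from the header bytes of a single NAL unit. It assumes the bit layout of the H.264 SVC extension, in which NAL units of type 14 or 20 carry a three-byte header extension after the first header byte and dependency_id occupies bits 6 to 4 of the second extension byte. NAL units without the extension are treated as base layer data. The function names are hypothetical, and start-code handling is omitted.

```python
def dependency_id_of(nal_unit: bytes) -> int:
    """Return dependency_id for one NAL unit (start code already stripped)."""
    nal_unit_type = nal_unit[0] & 0x1F          # low 5 bits of the first byte
    if nal_unit_type in (14, 20):               # types that carry the header extension
        if len(nal_unit) < 4:
            raise ValueError("NAL unit too short for an SVC header extension")
        svc_extension_flag = nal_unit[1] >> 7   # first bit after the one-byte header
        if svc_extension_flag:
            # Second extension byte: no_inter_layer_pred_flag (1 bit),
            # dependency_id (3 bits), quality_id (4 bits).
            return (nal_unit[2] >> 4) & 0x07
    return 0                                    # no extension: treated as base layer


def is_base_layer(nal_unit: bytes) -> bool:
    return dependency_id_of(nal_unit) == 0


# Hand-built example headers (payload bytes omitted).
enhancement_nal = bytes([0x74, 0x80, 0x20, 0x07])  # type 20, dependency_id = 2
base_nal = bytes([0x65, 0x88])                     # type 5, no extension
print(dependency_id_of(enhancement_nal), is_base_layer(base_nal))
```

A demultiplexer built on this could route each NAL unit to a per-layer queue keyed by the returned value.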
Step 43: Decode the sequence parameter set (SPS) and the picture parameter set (PPS) in the encoded bitstream. The value of the syntax element nal_unit_type can be obtained by parsing nal_unit() of the bitstream. A NAL (Network Abstraction Layer) unit with nal_unit_type = 7 is an SPS, and a NAL unit with nal_unit_type = 8 is a PPS.
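As a non-limiting illustration of step 43, the following Python sketch reads nal_unit_type from the first byte of each NAL unit and separates SPS and PPS units from the rest. The function names and the example bytes are hypothetical, and decoding of the parameter set contents themselves is not shown.

```python
NAL_TYPE_SPS = 7
NAL_TYPE_PPS = 8


def nal_unit_type(nal_unit: bytes) -> int:
    """nal_unit_type is the low 5 bits of the first NAL unit byte."""
    return nal_unit[0] & 0x1F


def split_parameter_sets(nal_units):
    """Sort NAL units into SPS, PPS, and everything else."""
    sps, pps, other = [], [], []
    for nal in nal_units:
        kind = nal_unit_type(nal)
        if kind == NAL_TYPE_SPS:
            sps.append(nal)
        elif kind == NAL_TYPE_PPS:
            pps.append(nal)
        else:
            other.append(nal)
    return sps, pps, other


# First bytes 0x67, 0x68, 0x65 correspond to types 7, 8, and 5.
units = [bytes([0x67, 0x42]), bytes([0x68, 0xCE]), bytes([0x65, 0x88])]
sps_units, pps_units, other_units = split_parameter_sets(units)
print(len(sps_units), len(pps_units), len(other_units))
```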
NAL is briefly explained here. According to the H.264 standard, the H.264 algorithm can be divided conceptually into two layers: the Video Coding Layer (VCL), which is responsible for representing the video content efficiently, and the Network Abstraction Layer (NAL), which is responsible for packaging and transmitting the data in the manner appropriate to the network.
Step 44: The base layer bitstream can be decoded directly according to the obtained SPS and PPS to obtain the base layer image.
Step 45: For the enhancement layer bitstreams, determine, according to display requirements, whether all enhancement layer bitstreams are to be decoded. If so, decode all of them and go to step 46. If not, decode only the enhancement layer bitstreams that are needed, to save decoding resources, and go to step 46.
Step 46: For an enhancement layer bitstream that needs to be decoded, first obtain a decoding reference image for the enhancement layer bitstream from the base layer image of step 44, and then use the decoding reference image to decode the enhancement layer bitstream to obtain the enhancement layer image.
Specifically, the base layer image may be up-sampled to obtain the decoding reference image of the enhancement layer bitstream.
The slice header of the encoded bitstream is decoded to obtain the picture parameter set ID syntax element pic_parameter_set_id, which specifies the PPS used by the current bitstream. The sequence parameter set ID syntax element seq_parameter_set_id, which specifies the SPS used by the current bitstream, is then obtained from that PPS. Finally, the corresponding cropping parameters are obtained from the SPS, and the local scene image of the conference site is determined according to the cropping parameters.
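As a non-limiting illustration of the lookup chain from slice header to PPS to SPS, the following Python sketch resolves the cropping parameters for one slice. The parameter sets are represented as plain dictionaries that are assumed to have been parsed already; the table layout and the function name cropping_for_slice are hypothetical and chosen only for this example.

```python
CROP_KEYS = ("frame_crop_left_offset", "frame_crop_right_offset",
             "frame_crop_top_offset", "frame_crop_bottom_offset")


def cropping_for_slice(slice_header, pps_table, sps_table):
    """Follow slice header -> PPS -> SPS and return the cropping offsets.

    Returns None when the SPS signals no cropping region.
    """
    pps = pps_table[slice_header["pic_parameter_set_id"]]
    sps = sps_table[pps["seq_parameter_set_id"]]
    if not sps.get("frame_cropping_flag", 0):
        return None
    return {key: sps[key] for key in CROP_KEYS}


# Hypothetical tables: SPS 0 for an uncropped layer, SPS 1 for a cropped one.
sps_table = {
    0: {"frame_cropping_flag": 0},
    1: {"frame_cropping_flag": 1,
        "frame_crop_left_offset": 180, "frame_crop_right_offset": 180,
        "frame_crop_top_offset": 120, "frame_crop_bottom_offset": 120},
}
pps_table = {0: {"seq_parameter_set_id": 0}, 1: {"seq_parameter_set_id": 1}}

crop = cropping_for_slice({"pic_parameter_set_id": 1}, pps_table, sps_table)
print(crop)
```

Keeping one SPS per enhancement layer, as the description suggests, means that each layer can carry its own cropping region without affecting the others.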
Step 47: Display the decoded base layer image and enhancement layer image, or process them further.
The decoded base layer image and enhancement layer image may be displayed or processed on the same display or post-processing device, with the pictures shown or processed by switching between them in turn; alternatively, different display or post-processing devices may be used to display or further process them simultaneously.
Fig. 6 is another schematic diagram of the encoding and decoding process according to the embodiments of the present invention. Correspondingly, with reference to Fig. 6, Embodiments 5 and 6 of the present invention provide an encoding method and a decoding method, respectively. The decoding method of Embodiment 6 is the same as that described in Embodiment 4.
For the encoding method, a comparison of Fig. 3 and Fig. 6 shows that the encoding method of Embodiment 5 changes the way the base layer image and the enhancement layer images are obtained. In Fig. 6 they are obtained by a single image capture device, which saves image capture devices compared with the approach shown in Fig. 3.
In this embodiment, one image capture device first captures an image at the resolution actually required, which may be the highest of the different resolutions needed by the display devices. For example, suppose the main scene image required by the current conference site has a resolution of 720x480, and a local region of the site, such as the rostrum and its surroundings, is to be shown in close-up or given special emphasis, so that the image of this region needs to be enlarged 4 times in area, that is, 2 times in each dimension. For the local region to have this level of detail, the main scene image needs a resolution of 1440x960. The required local region is then identified by means of cropping parameters, in the manner described in Embodiment 3 above. The image capture device thus needs to capture an image with a resolution of 1440x960, from which an image with a resolution of 720x480 is obtained after processing.
The image captured in this way is down-sampled by a down-sampling device to obtain one image or images at several resolutions, such as 720p (1280x720) or 800x640. These images and the captured image can each be encoded as enhancement layer images. Alternatively, the captured image alone may be used directly as the enhancement layer image, or the captured image together with the images obtained by down-sampling it may be used as the enhancement layer images. Any enhancement layer image may have its own cropping parameters, so that the decoding device can obtain a local scene image of the conference site.
The image captured in this way is also down-sampled by the down-sampling device to obtain the base layer image. The resolution of the base layer image must not exceed the resolution of any enhancement layer image; that is, it is less than or equal to the resolution of every enhancement layer image.
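As a non-limiting illustration of the resolution constraint stated above, the following Python sketch checks a candidate base layer resolution against a set of enhancement layer resolutions. Comparing width and height separately is an assumption made for this example, since the text does not say how two resolutions are compared; the function name is hypothetical.

```python
def base_resolution_is_valid(base, enhancement_layers):
    """True if the base layer is no larger than every enhancement layer.

    Resolutions are (width, height) tuples; both dimensions are compared.
    """
    base_w, base_h = base
    return all(base_w <= w and base_h <= h for w, h in enhancement_layers)


# Resolutions taken from the example in this embodiment.
enhancement_layers = [(1440, 960), (1280, 720), (800, 640)]
print(base_resolution_is_valid((720, 480), enhancement_layers))
print(base_resolution_is_valid((1024, 768), enhancement_layers))
```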
The other processes in Embodiment 5 are the same as those described in Embodiment 3.
Fig. 7 is another schematic diagram of the encoding and decoding process according to the embodiments of the present invention. Correspondingly, with reference to Fig. 7, Embodiments 7 and 8 of the present invention provide an encoding method and a decoding method, respectively. The decoding method of Embodiment 8 is the same as that described in Embodiment 4.
For the encoding method, a comparison of Fig. 3 and Fig. 7 shows that the encoding method of Embodiment 7 changes the way the base layer image is obtained. As shown in Fig. 7, a dedicated image capture device captures a lower-resolution image to serve as the base layer image. The base layer image thus still comes from the same conference site, but no down-sampling loss is introduced, which makes the estimation more accurate.
The other processes in Embodiment 7 are the same as those described in Embodiment 3.
Fig. 8 is another schematic diagram of the encoding and decoding process according to the embodiments of the present invention. As can be seen from Fig. 8, it extends the arrangement shown in Fig. 7 to n image capture devices and n enhancement layer images. The base layer image is captured by a dedicated image capture device. Of course, in a variant of the embodiment shown in Fig. 8, the base layer image may instead be obtained by down-sampling one of the enhancement layer images with a down-sampling device. The enhancement layer images of the local scenes may all be captured by separate image capture devices; alternatively, some may be obtained by down-sampling a higher-resolution image while others are captured by image capture devices. The enhancement layer image of each local scene is encoded as one SVC enhancement layer, and the base layer image is encoded as the SVC base layer, so that the decoding device can obtain the main scene image and several local scene images of a conference site. The encoding method used with Fig. 8 is the same as that described in Embodiment 3 or 7, and the decoding method is the same as that described in Embodiment 4.
As can be seen from the above embodiments, the encoding and decoding methods of the embodiments of the present invention not only improve image quality in a multi-scene display of a conference site. Because the images of multiple scenes of the conference site are encoded and decoded together within a single SVC codec, they also effectively avoid the problems of scattered processing, high resource usage, and difficult synchronization that arise when the different scene images of one site are encoded and decoded independently. In addition, the embodiments use the characteristics of SVC to carry the video bitstreams of different scenes of the same conference site in a single bitstream, achieving an emphasized multi-scene display that is efficient, convenient, and light on resources, and reducing the resource usage and complexity of the system.
Persons of ordinary skill in the art will understand that all or part of the processes of the above method embodiments may be implemented by a computer program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, may include the processes of the above method embodiments. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.
As shown in Fig. 9, Embodiment 9 of the present invention further provides an encoding device, including:
an image acquisition unit 91, configured to obtain at least one base layer image and at least one enhancement layer image of a conference site scene; and
an encoding unit 92, configured to encode the base layer image as the base layer of Scalable Video Coding (SVC) and the enhancement layer image as an enhancement layer of the SVC, to form an encoded bitstream.
As described in the foregoing method embodiments, the image acquisition unit 91 may include: a first image acquisition module, configured to obtain at least two images of the conference site scene that have the same scene content but different resolutions, and to use the obtained images as the enhancement layer images; and a second image acquisition module, configured to select the lower-resolution image from the obtained images and down-sample it to obtain the base layer image.
Alternatively, the image acquisition unit 91 may include: a third image acquisition module, configured to obtain at least one conference site scene image; a fourth image acquisition module, configured to use the conference site scene image as the enhancement layer image, or to down-sample the conference site scene image to obtain the enhancement layer image, or to use both the conference site scene image and the image obtained by down-sampling it as enhancement layer images; and a fifth image acquisition module, configured to down-sample one of the conference site scene images to obtain the base layer image, where the resolution of the base layer image is not greater than the resolution of any enhancement layer image. When selecting a conference site scene image to down-sample, the fifth image acquisition module may likewise choose the lower-resolution image, to reduce the loss of image quality caused by down-sampling. There may be several enhancement layer images, each with a different resolution.
The encoding unit 92 includes: a coding reference image acquisition module, configured to obtain the coding reference image of the enhancement layer image by up-sampling the base layer image; and an encoding module, configured to encode the base layer image as the base layer of the SVC using the H.264 method, to encode the enhancement layer image as an enhancement layer of the SVC using the coding reference image, and to add bitstream identifiers to the base layer bitstream corresponding to the base layer image and to the enhancement layer bitstream corresponding to the enhancement layer image, respectively, to form the encoded bitstream.
To display the local scenes of the conference site more fully and better meet user needs, the encoding unit 92 may further include a parameter acquisition module, configured to obtain the cropping parameters of the enhancement layer image. The encoding module is then further configured to store the cropping parameters in the sequence parameter set of the encoded bitstream when forming the encoded bitstream.
For the working principle of the encoding device, refer to the description in the foregoing method embodiments.
During encoding, the encoding device of this embodiment encodes the base layer image and the enhancement layer image obtained from the conference site scene as the SVC base layer and enhancement layer, respectively. An encoded bitstream formed in this way makes full use of the characteristics of SVC and can contain bitstreams corresponding to images of several different resolutions. Therefore, the encoding device of this embodiment can improve the quality of the images shown in a multi-scene display of the conference site.
As shown in Fig. 10, Embodiment 10 of the present invention further provides a decoding device, including:
a bitstream receiving unit 1001, configured to receive an encoded bitstream, where the encoded bitstream is obtained by encoding a base layer image of a conference site scene as the base layer of Scalable Video Coding (SVC) and an enhancement layer image of the conference site scene as an enhancement layer of the SVC;
a bitstream acquisition unit 1002, configured to obtain a base layer bitstream and an enhancement layer bitstream from the encoded bitstream; and
a decoding unit 1003, configured to decode the base layer bitstream and the enhancement layer bitstream separately to obtain the base layer image and the enhancement layer image.
The bitstream acquisition unit 1002 may include: an identifier parsing module, configured to read the bitstream identifier in the encoded bitstream; and a bitstream acquisition module, configured to obtain the base layer bitstream and the enhancement layer bitstream in the encoded bitstream according to the bitstream identifier.
The decoding unit 1003 may include: a sixth image acquisition module, configured to up-sample the base layer image to obtain the decoding reference image of the enhancement layer bitstream; and a seventh image acquisition module, configured to decode the enhancement layer bitstream using the decoding reference image to obtain the enhancement layer image.
To save decoding resources, the decoding unit 1003 may further include a judging module, configured to determine whether the enhancement layer bitstream needs to be decoded. The criterion may be whether the display terminal at the decoding end supports local image magnification: if it does, the enhancement layer bitstream is decoded; otherwise it is not. Alternatively, even when the display terminal at the decoding end supports local image magnification, the module may further determine whether magnification is actually required, for example whether a local magnification instruction has been detected; if so, the enhancement layer bitstream corresponding to the local image to be magnified is decoded, and the other enhancement layer bitstreams are not. Other criteria are of course possible and are not enumerated here.
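As a non-limiting illustration of the second criterion described for the judging module, the following Python sketch selects which enhancement layers to decode. Layers are identified by their dependency_id values; the function name and its arguments are hypothetical, and how a magnification instruction is detected is outside the scope of the sketch.

```python
def enhancement_layers_to_decode(received_ids, display_supports_zoom, requested_ids):
    """Return the dependency_id values of the enhancement layers worth decoding.

    received_ids: dependency_id values present in the encoded bitstream.
    display_supports_zoom: whether the display terminal can magnify a local image.
    requested_ids: layers for which a magnification instruction was detected.
    """
    if not display_supports_zoom:
        return []                      # terminal cannot use enhancement layers
    wanted = set(requested_ids)
    return [d for d in received_ids if d in wanted]


print(enhancement_layers_to_decode([1, 2, 3], True, [2]))
print(enhancement_layers_to_decode([1, 2, 3], False, [2]))
```

Under the first, simpler criterion the function would instead return all received layers whenever the display terminal supports magnification.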
The decoding unit may further include a parameter parsing module, configured to decode the slice header of the encoded bitstream to obtain the picture parameter set of the encoded bitstream, and to obtain the sequence parameter set through the picture parameter set. In this case, the bitstream acquisition module 1002 is specifically configured to decode the enhancement layer bitstream using the sequence parameter set and the decoding reference image, to obtain the enhancement layer image.
In addition, to save decoding resources and improve decoding efficiency, the parameter parsing module is further configured to obtain the cropping parameters stored in the sequence parameter set. The bitstream acquisition module is then specifically configured to decode the enhancement layer bitstream using the sequence parameter set, the cropping parameters, and the decoding reference image, to obtain the enhancement layer image of the region determined by the cropping parameters.
In addition, the decoding device may further include a display unit, configured to display the base layer image and the enhancement layer image.
For the working principle of the decoding device, refer to the description in the foregoing method embodiments.
With the decoding device of this embodiment, the base layer image and the enhancement layer image of the conference site scene are encoded during encoding as the SVC base layer and enhancement layer, respectively, to form the encoded bitstream. A bitstream formed in this way makes full use of the characteristics of SVC and can contain bitstreams corresponding to images of several different resolutions. Therefore, the decoding device of this embodiment can improve the quality of the base layer image and the enhancement layer image obtained, and thereby the quality of the images shown in a multi-scene display of the conference site.
The foregoing is merely a specific implementation of the present invention, but the protection scope of the present invention is not limited to it. Any variation or replacement that a person skilled in the art can readily conceive within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.