WO2016161674A1 - Method, device, and system for video image compression and reading - Google Patents

Method, device, and system for video image compression and reading

Info

Publication number
WO2016161674A1
Authority
WO
WIPO (PCT)
Prior art keywords
target
code stream
layer
video
target layer
Prior art date
Application number
PCT/CN2015/077729
Other languages
French (fr)
Chinese (zh)
Inventor
武晓阳
浦世亮
沈林杰
俞海
Original Assignee
杭州海康威视数字技术股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 杭州海康威视数字技术股份有限公司
Publication of WO2016161674A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/20 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using video object coding
    • H04N19/23 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using video object coding with coding of regions that are present throughout a whole video segment, e.g. sprites, background or mosaic

Definitions

  • The present invention relates to the field of image processing, and in particular to a method, device, and system for video image compression and reading.
  • Digital video compression standards date from the 1980s. After more than 30 years of development, the existing standards include H.261 and H.263 from the ITU-T, MPEG-1 and MPEG-4 from ISO, and the standards developed jointly by the two organizations: MPEG-2/H.262, H.264/AVC, and HEVC (the latest, released in 2013). Other organizations have their own standards, such as China's AVS, Microsoft's VC-1, and Google's VP8. What these standards share is a block-based hybrid coding framework that combines three coding techniques: predictive coding, transform coding, and entropy coding.
  • The encoding process of the block-based hybrid coding framework is shown in FIG. 1.
  • The image to be encoded is first divided into 16x16 blocks called macroblocks (in HEVC the block size can vary from 8x8 to 64x64, and the block is called the largest coding unit, LCU).
  • As shown in FIG. 3, macroblocks are encoded in scanning order, from left to right and from top to bottom. Each macroblock first undergoes predictive coding: the reconstructed image of the previous frame, or the already encoded portion around the macroblock, serves as a reference to obtain the prediction residual data. The residual data then undergoes spatial transform coding, using a DCT or ICT on blocks of different sizes.
  • The transform yields coefficients in the frequency domain; the transform coefficients are quantized and then passed to entropy coding to obtain the final code stream.
  • To encode the next frame effectively, the currently quantized data must be processed in reverse, that is, inverse quantized and inverse transformed, and then added to the prediction data to obtain the decoded image, that is, the reconstructed image.
  • The reconstructed image is placed in the reference buffer and serves as the reference image for encoding the next frame.
  • The decoding process of the block-based hybrid coding framework is shown in FIG. 2.
  • The encoded code stream is entropy decoded, inverse quantized, and inverse transformed, and then added to the predicted image to obtain the decoded image (video signal).
  • The decoded image needs to be stored for use as a reference image when decoding the next frame.
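The predict, transform, quantize, and reconstruct loop described above can be sketched for a single row of samples. This is only an illustration under stated assumptions: real codecs work on 2-D blocks, a floating-point DCT-II stands in for the codec's DCT or ICT, entropy coding is omitted, and the function names and the quantization step `qstep` are hypothetical.

```python
import math

def dct(samples):
    """Orthonormal DCT-II of a list of samples."""
    n = len(samples)
    out = []
    for k in range(n):
        acc = sum(samples[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * acc)
    return out

def idct(coeffs):
    """Inverse of dct() above."""
    n = len(coeffs)
    out = []
    for i in range(n):
        acc = coeffs[0] * math.sqrt(1.0 / n)
        acc += sum(
            coeffs[k] * math.sqrt(2.0 / n) * math.cos(math.pi * (i + 0.5) * k / n)
            for k in range(1, n)
        )
        out.append(acc)
    return out

def encode_block(block, prediction, qstep):
    """Predict, transform, and quantize; returns integer levels for entropy coding."""
    residual = [b - p for b, p in zip(block, prediction)]
    return [round(c / qstep) for c in dct(residual)]

def reconstruct_block(levels, prediction, qstep):
    """Inverse quantize, inverse transform, and add the prediction back.

    The encoder runs this to fill its reference buffer, and the decoder runs
    the same steps, so both sides hold the same reference image.
    """
    residual = idct([level * qstep for level in levels])
    return [p + r for p, r in zip(prediction, residual)]
```

Because the transform is orthonormal and each coefficient is off by at most half a quantization step, the reconstruction error of every sample is bounded by `sqrt(n) * qstep / 2` for a block of `n` samples.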
  • Predictive coding is an important coding technique for video compression.
  • Coded images can be divided into I frames (intra-predicted frames, Intra), P frames (inter-predicted frames, Prediction), and B frames (bidirectionally predicted frames, Bi-Prediction).
  • A B frame can use a previous frame and a subsequent frame as references at the same time; these are its bidirectional reference frames.
  • A B frame can be decoded only after both its previous reference frame and its subsequent reference frame have been successfully decoded.
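The decoding constraint on B frames means that decoding order differs from display order. A minimal sketch follows, assuming a classic arrangement in which each B frame references only the nearest I or P frame on either side and is not itself used as a reference; the function name and the frame-type strings are illustrative.

```python
def decode_order(display_types):
    """Map frame types in display order to the order in which they can be decoded.

    display_types: sequence of "I", "P", or "B" in display order.
    Returns the display indices in decoding order.
    """
    order = []
    pending_b = []  # B frames waiting for their subsequent reference frame
    for index, frame_type in enumerate(display_types):
        if frame_type == "B":
            pending_b.append(index)
        else:
            # An I or P frame is decoded first; the B frames that precede it
            # in display order now have both references and can follow.
            order.append(index)
            order.extend(pending_b)
            pending_b = []
    # Trailing B frames have no subsequent reference in this sequence;
    # they are appended last here purely to keep the sketch total.
    order.extend(pending_b)
    return order
```

For the display sequence I B B P B B P, this places each P frame ahead of the two B frames that precede it in display order.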
  • Besides using other frames as references, P frames and B frames can also use data from the current frame as a reference, as an I frame does, and the better of the two options is chosen.
  • I frames can be decoded independently and are usually used for random access. For example, digital TV requires an I frame to be inserted every 1 to 1.5 seconds so that the user sees an image as soon as possible after switching channels.
  • The compression efficiency of I frames is low and their bit rate is relatively high, usually 4 to 10 times that of a P frame or even more.
  • A combination of I frames, P frames, and B frames is usually used.
  • Multiple previously reconstructed images may be used as reference frames. FIG. 5 shows the P-frame multi-frame reference case: when the second P frame is encoded, the two preceding frames are used as references. FIG. 6 shows the B-frame multi-frame reference case: the B frame has two forward reference frames and one backward reference frame. Multi-frame reference can improve compression efficiency, but it also increases computational complexity.
  • In existing video coding, region-of-interest coding is implemented by assigning different quantization coefficients to the coding blocks that belong to the region of interest.
  • The quantization coefficients of these blocks are smaller than those of other regions, so their picture quality is higher.
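The effect of a smaller quantization coefficient on quality can be seen with a small numeric sketch. The step sizes 2 and 8 and both function names are assumptions chosen for illustration; they do not come from any coding standard.

```python
def assign_qsteps(block_in_roi, roi_step=2, other_step=8):
    """Give region-of-interest blocks a smaller quantization step than other blocks."""
    return [roi_step if in_roi else other_step for in_roi in block_in_roi]

def quant_error(value, qstep):
    """Absolute error left after quantizing and dequantizing one coefficient."""
    level = round(value / qstep)
    return abs(value - level * qstep)
```

With these values, a coefficient of 13 is reconstructed with an error of 1 inside the region of interest and an error of 3 outside it, which is why the region of interest looks sharper at the cost of more bits.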
  • The order of the code stream, the dependencies between a block and its adjacent blocks, and the dependence of a block on reference image blocks remain unchanged.
  • When the user needs to retrieve the video, all pictures must be decoded to obtain the picture of the region of interest.
  • In surveillance footage, moving objects are few and the periods in which they appear are short. Decoding every image in full before retrieval seriously wastes computing resources.
  • The code stream of the target layer is combined with the code stream of the background layer.
  • During decoding, the composite code stream is decoded and the image containing the target object is retrieved directly, which improves the utilization of computing resources.
  • A first aspect provides a video image compression method, including:
  • Combining the code stream of the target layer and the code stream of the background layer specifically includes:
  • Header information is added to the code stream corresponding to the image to be encoded, and the code stream of the target layer and the code stream of the background layer are recorded after the header information.
  • Generating a code stream for each of the target layer and the background layer includes:
  • A code stream is generated separately for each of the filled target layer and the filled background layer.
  • The location information of the target in the target layer is recorded in the header information.
  • The location information of the target in the target layer is recorded as empty in the header information.
  • A separation identifier is inserted between the code stream of the target layer and the code stream of the background layer after the header information.
  • A second aspect provides a video image compression device, including:
  • a layer extracting unit configured to extract a background layer and a target layer from the image to be encoded, where the target in the target layer is the region-of-interest portion of the image to be encoded;
  • a layer coding unit configured to generate a code stream separately for each of the target layer and the background layer;
  • a code stream combining unit configured to combine the code stream of the target layer and the code stream of the background layer.
  • The code stream combining unit is specifically configured to:
  • add header information to the code stream corresponding to the image to be encoded, and record the code stream of the target layer and the code stream of the background layer after the header information.
  • The layer coding unit includes:
  • a first filling module configured to fill the area of the target layer outside the target with a fixed value;
  • a second filling module configured to fill, with a fixed value, the area of the background layer that corresponds to the target in the target layer;
  • a layer coding module configured to generate a code stream separately for each of the filled target layer and the filled background layer.
  • The location information of the target in the target layer is recorded in the header information.
  • The location information of the target in the target layer is recorded as empty in the header information.
  • A separation identifier is inserted between the code stream of the target layer and the code stream of the background layer after the header information.
  • A third aspect provides a video image reading method, including:
  • obtaining a video code stream, where the video code stream is formed by combining a code stream of a target layer and a code stream of a background layer, and the target in the target layer is the region-of-interest portion of the image;
  • decoding the related video code stream starting from the target video frame.
  • Header information is added to the video code stream, and the header information records the location information of the target in the target layer.
  • Decoding the related video code stream starting from the target video frame specifically includes:
  • decoding the region-of-interest portion from the target video frame, and compositing the region-of-interest portion onto the background layer according to the location information.
  • A separation identifier is inserted between the code stream of the target layer and the code stream of the background layer after the header information.
  • A fourth aspect provides a video image reading device, including:
  • a code stream obtaining unit configured to acquire a video code stream, where the video code stream is formed by combining a code stream of a target layer and a code stream of a background layer, and the target in the target layer is the region-of-interest portion of the image;
  • a target confirmation unit configured to confirm the target video frame in which the decoding target is located;
  • a code stream decoding unit configured to decode the related video code stream starting from the target video frame.
  • Header information is added to the video code stream, and the header information records the location information of the target in the target layer.
  • The code stream decoding unit is specifically configured to:
  • decode the region-of-interest portion from the target video frame, and composite the region-of-interest portion onto the background layer according to the location information.
  • A separation identifier is inserted between the code stream of the target layer and the code stream of the background layer after the header information.
  • A video image processing system includes the video image compression device according to any of the above aspects and the video image reading device according to any of the above aspects.
  • The beneficial effects of the invention are as follows: a background layer and a target layer are extracted from the image to be encoded, each layer is encoded separately to generate a code stream, and the code streams are then combined. During decoding, the composite code stream is decoded and the image containing the target object is retrieved directly, which improves the utilization of computing resources.
  • FIG. 1 is a schematic flow chart of encoding in the block-based hybrid coding framework in the prior art.
  • FIG. 2 is a schematic flow chart of decoding in the block-based hybrid coding framework in the prior art.
  • FIG. 3 is a schematic diagram of the scanning order of macroblocks in block-based hybrid coding in the prior art.
  • FIG. 4 is a schematic diagram of an inter-frame reference relationship in block-based hybrid coding in the prior art.
  • FIG. 5 is a schematic diagram of the reference relationship of a P-frame multi-frame reference in block-based hybrid coding in the prior art.
  • FIG. 6 is a schematic diagram of the reference relationship of a B-frame multi-frame reference in block-based hybrid coding in the prior art.
  • FIG. 7 is a schematic illustration of a region of interest in an image in the prior art.
  • FIG. 8 is a flowchart of a method of a first embodiment of a video image compression method according to an embodiment of the present invention.
  • FIG. 9 is a flowchart of a method of a second embodiment of a video image compression method according to an embodiment of the present invention.
  • FIG. 10 is a schematic diagram of an image layer and a background layer in a second embodiment of a video image compression method according to an embodiment of the present invention.
  • FIG. 11 is a schematic diagram of a code stream organization manner in a second embodiment of a video image compression method according to an embodiment of the present invention.
  • FIG. 12 is a schematic structural diagram of a first embodiment of an apparatus for compressing video images according to an embodiment of the present invention.
  • FIG. 13 is a structural block diagram of a second embodiment of an apparatus for compressing video images according to an embodiment of the present invention.
  • FIG. 14 is a flowchart of a method of a first embodiment of a video image reading method according to an embodiment of the present invention.
  • FIG. 15 is a block diagram showing the structure of a first embodiment of a video image reading apparatus according to an embodiment of the present invention.
  • FIG. 16 is a block diagram showing the configuration of a first embodiment of a video image processing system according to an embodiment of the present invention.
  • FIG. 8 is a flowchart of a first embodiment of a video image compression method according to an embodiment of the present invention.
  • The method in this embodiment is mainly used for storing various kinds of video, especially surveillance video. As shown in the figure, the method includes:
  • Step S101: Extract a background layer and a target layer from the image to be encoded, where the target in the target layer is the region-of-interest portion of the image to be encoded.
  • Step S102: Generate a code stream separately for each of the target layer and the background layer.
  • During encoding, the target layer and the background layer are encoded separately to form their own code streams; the encoding itself may follow whichever coding standard is adopted.
  • Step S103: Combine the code stream of the target layer and the code stream of the background layer.
  • The code stream of the target layer is combined with the code stream of the background layer. Compared with prior-art schemes, the combined code stream allows more precise positioning and direct access to the images of a given target, which improves decoding efficiency.
  • In summary, the background layer and the target layer are extracted from the image to be encoded and encoded separately to generate code streams, and the code streams are then combined. During decoding, the composite code stream is decoded and the image containing the target object is retrieved directly, which improves the utilization of computing resources.
  • FIG. 9 is a flowchart of a second embodiment of a video image compression method according to an embodiment of the present invention. As shown in the figure, the method includes:
  • Step S201: Extract a background layer and a target layer from the image to be encoded, where the target in the target layer is the region-of-interest portion of the image to be encoded.
  • The extraction of the background layer and the target layer is realized by image recognition or image analysis; the range of the target layer can also be selected through settings on the imaging device.
  • The specific technical solutions are already implemented in the prior art and are not described further here.
  • Step S202: Fill the area of the target layer outside the target with a fixed value.
  • Step S203: Fill the area of the background layer that corresponds to the target in the target layer with a fixed value.
  • So that the target layer is at its original position in the image when decoded, the area of the target layer outside the target is filled with a fixed value, and the area of the background layer that corresponds to the target in the target layer is also filled with a fixed value.
  • As a result, the target layer and the background layer have the same image size and resolution when encoded, which makes the subsequent compositing more accurate.
  • The specific filling method is shown in FIG. 10: two layers are extracted from the image, and in each layer the positions belonging to the other layer are filled. This is equivalent to obtaining two sub-image frames with the same resolution.
  • The two layers are then encoded separately.
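Steps S202 and S203 can be sketched as follows. This is a minimal illustration that assumes a single rectangular target, an image stored as a list of pixel rows, and a fill value of 0; the source does not fix any of these, and `split_layers` is a hypothetical name.

```python
FILL = 0  # assumed fixed fill value; the source does not specify one

def split_layers(image, box, fill=FILL):
    """Split an image into a target layer and a background layer of equal size.

    image: list of rows of pixel values.
    box:   (top, left, height, width) of the target, a hypothetical layout.
    """
    top, left, height, width = box
    # S202: the target layer starts fully filled; only the target is copied in.
    target = [[fill] * len(row) for row in image]
    # S203: the background layer starts as a copy; the target area is filled.
    background = [list(row) for row in image]
    for y in range(top, top + height):
        for x in range(left, left + width):
            target[y][x] = image[y][x]
            background[y][x] = fill
    return target, background
```

Both layers keep the size of the original image, so the target stays at its original coordinates and can later be pasted back without any offset arithmetic.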
  • Step S204: Generate a code stream separately for each of the filled target layer and the filled background layer.
  • Each block is still encoded in scanning order, from left to right and from top to bottom; the only difference is that filled portions are skipped without processing. The code streams generated for the layers are then combined.
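The scan with skipping can be sketched as below. The sketch assumes the encoder has a per-pixel mask marking which samples of a layer are real content (the complement being the filled area) and skips a block only when it lies entirely in the filled area; how partially filled blocks are handled is not stated in the source, and `blocks_to_encode` is a hypothetical name.

```python
def blocks_to_encode(mask, block_size):
    """List the (block_row, block_col) positions to encode, in raster-scan order.

    mask: list of rows of booleans, True where the layer holds real content.
    """
    rows, cols = len(mask), len(mask[0])
    kept = []
    for by in range(0, rows, block_size):          # top to bottom
        for bx in range(0, cols, block_size):      # left to right
            has_content = any(
                mask[y][x]
                for y in range(by, min(by + block_size, rows))
                for x in range(bx, min(bx + block_size, cols))
            )
            if has_content:
                kept.append((by // block_size, bx // block_size))
            # Blocks that are entirely fill are skipped without processing.
    return kept
```

Using a mask rather than comparing pixels against the fill value avoids skipping real content that happens to equal the fill value.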
  • Step S205: Add header information to the code stream corresponding to the image to be encoded, and record the code stream of the target layer and the code stream of the background layer after the header information.
  • The location information of the target in the target layer is recorded in the header information.
  • A separation identifier is inserted between the code stream of the target layer and the code stream of the background layer after the header information. Specifically, the separation identifier may be a start code identifier capable of separating the streams; that is, the code stream of each layer is given a start code identifier so that the start position of each code stream can be distinguished during decoding.
  • The specific code stream organization is shown in FIG. 11.
  • Header information is added before the video code stream and records the location of the target in the target layer. When the video is retrieved, the header information is used directly for precise positioning, which improves data processing efficiency.
  • The order of the target-layer code stream and the background-layer code stream is not limited; the layer 1 code stream and the layer 2 code stream in FIG. 11 each correspond to one of the two.
  • In summary, the background layer and the target layer are extracted from the image to be encoded and encoded separately to generate code streams, and the code streams are then combined. During decoding, the composite code stream is decoded and the image containing the target object is retrieved directly, which improves the utilization of computing resources.
  • In addition, header information is set, the code stream of the target layer and the code stream of the background layer are recorded after it, and the location information of the target in the target layer is recorded in it.
  • A separation identifier inserted between the code stream of the target layer and the code stream of the background layer distinguishes the two, which enables ordered storage and fast retrieval of the two code streams.
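One possible byte layout for the organization in step S205 is sketched below. Everything about the layout is an assumption made for illustration: the one-byte "has target" flag, the four 16-bit location fields, and the three-byte start code `00 00 01` (borrowed from the start code prefix of the H.264 byte stream format) are not specified in the source. The sketch also assumes the layer payloads never contain the start code pattern, which real codecs guarantee by other means.

```python
import struct

START_CODE = b"\x00\x00\x01"  # assumed separation identifier

def pack_frame(target_box, target_stream, background_stream):
    """Header, then the target-layer stream, then the background-layer stream.

    target_box: (x, y, width, height) of the target, or None when the frame
    has no target, in which case the location is recorded as empty.
    """
    if target_box is None:
        header = struct.pack(">B", 0)
    else:
        header = struct.pack(">B4H", 1, *target_box)
    return header + START_CODE + target_stream + START_CODE + background_stream

def unpack_frame(data):
    """Return (target_box, target_stream, background_stream)."""
    offset = 1
    box = None
    if data[0]:
        box = struct.unpack_from(">4H", data, offset)
        offset += 8
    if data[offset:offset + 3] != START_CODE:
        raise ValueError("missing start code after header")
    offset += 3
    # The next start code marks where the second layer's stream begins.
    sep = data.index(START_CODE, offset)
    return box, data[offset:sep], data[sep + 3:]
```

A reader that only wants frames containing a target can inspect the first header byte and skip the rest of the frame without touching either layer's payload.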
  • The following are embodiments of the video image compression device provided in the specific embodiments of the present invention. The device embodiments are implemented on the basis of the video image compression method embodiments described above.
  • For details not covered in the device embodiments, refer to the video image compression method embodiments above.
  • FIG. 12 is a structural block diagram of a first embodiment of a video image compression device according to an embodiment of the present invention. As shown in the figure, the device includes:
  • a layer extracting unit 310 configured to extract a background layer and a target layer from the image to be encoded, where the target in the target layer is the region-of-interest portion of the image to be encoded;
  • The targets in the video are placed in the target layer, so that when the video is viewed, the target layer can be searched directly to quickly retrieve the target of interest, which improves computing efficiency.
  • a layer coding unit 320 configured to generate a code stream separately for each of the target layer and the background layer;
  • a code stream combining unit 330 configured to combine the code stream of the target layer and the code stream of the background layer.
  • Working together, the above units extract the background layer and the target layer from the image to be encoded, encode the background layer and the target layer separately to generate code streams, and then combine the code streams.
  • During decoding, the composite code stream is decoded and the image containing the target object is retrieved directly, which improves the utilization of computing resources.
  • FIG. 13 is a structural block diagram of a second embodiment of a video image compression device according to an embodiment of the present invention. As shown in the figure, the device includes:
  • a layer extracting unit 310 configured to extract a background layer and a target layer from the image to be encoded, where the target in the target layer is the region-of-interest portion of the image to be encoded;
  • a layer coding unit 320 configured to generate a code stream separately for each of the target layer and the background layer;
  • a code stream combining unit 330 configured to combine the code stream of the target layer and the code stream of the background layer.
  • The code stream combining unit 330 is specifically configured to:
  • add header information to the code stream corresponding to the image to be encoded, and record the code stream of the target layer and the code stream of the background layer after the header information.
  • The layer coding unit 320 includes:
  • a first filling module 321 configured to fill the area of the target layer outside the target with a fixed value;
  • a second filling module 322 configured to fill, with a fixed value, the area of the background layer that corresponds to the target in the target layer;
  • a layer coding module 323 configured to generate a code stream separately for each of the filled target layer and the filled background layer.
  • Each block is still encoded in scanning order, from left to right and from top to bottom; the only difference is that filled portions are skipped without processing. The code streams generated for the layers are then combined.
  • The location information of the target in the target layer is recorded in the header information.
  • The location information of the target in the target layer is recorded as empty in the header information.
  • A separation identifier is inserted between the code stream of the target layer and the code stream of the background layer after the header information. Specifically, the code stream of each layer is given a start code identifier so that the start position of each code stream can be distinguished during decoding.
  • Working together, the above functional modules extract the background layer and the target layer from the image to be encoded, encode the background layer and the target layer separately to generate code streams, and then combine the code streams.
  • During decoding, the composite code stream is decoded and the image containing the target object is retrieved directly, which improves the utilization of computing resources.
  • In addition, header information is set, the code stream of the target layer and the code stream of the background layer are recorded after it, and the location information of the target in the target layer is recorded in it.
  • A separation identifier inserted between the code stream of the target layer and the code stream of the background layer distinguishes the two, which enables ordered storage and fast retrieval of the two code streams.
  • FIG. 14 is a flowchart of a first embodiment of a video image reading method according to an embodiment of the present invention. As shown in the figure, the method includes:
  • Step S401: Acquire a video code stream, where the video code stream is formed by combining a code stream of a target layer and a code stream of a background layer, and the target in the target layer is the region-of-interest portion of the image.
  • The video code stream is a composite of the code stream of the target layer and the code stream of the background layer.
  • Reading is directed at the target to be retrieved.
  • Step S402: Identify as the target video frame the frame whose header information records the location information of the decoding target.
  • Header information is added to the video code stream and records the location information of the target in the target layer, so the target can be accessed directly from the location recorded in the header information.
  • Confirming the target video frame in which the decoding target is located can also be implemented by other schemes, for example by not setting header information and accessing the video frame by frame.
  • Step S403: Decode the region-of-interest portion from the target video frame, and composite the region-of-interest portion onto the background layer according to the location information.
  • A separation identifier is inserted between the code stream of the target layer and the code stream of the background layer after the header information.
  • The code stream of the target layer can be accessed directly at the boundary marked by the separation identifier.
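Steps S402 and S403 can be sketched at the frame level. The sketch assumes the headers have already been parsed into a list holding one location per frame (`None` when the location is recorded as empty) and that both layers have already been decoded into lists of pixel rows; the function names and the `(top, left, height, width)` layout are hypothetical.

```python
def find_target_frames(header_boxes):
    """S402: indices of frames whose header records a target location."""
    return [i for i, box in enumerate(header_boxes) if box is not None]

def composite(background, target, box):
    """S403: paste the decoded target region onto the background layer.

    Both layers have the full image size, so the recorded location indexes
    them identically and no coordinate translation is needed.
    """
    top, left, height, width = box
    out = [list(row) for row in background]
    for y in range(top, top + height):
        for x in range(left, left + width):
            out[y][x] = target[y][x]
    return out
```

Scanning the headers first means that frames without a target never have to be decoded when the goal is to retrieve the target.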
  • The following is an embodiment of the video image reading device provided in the specific embodiments of the present invention.
  • The device embodiment is implemented on the basis of the video image reading method embodiment described above; for details not covered here, refer to that method embodiment.
  • FIG. 15 is a structural block diagram of a video image reading device according to a specific embodiment of the present invention. As shown in the figure, the device includes:
  • a code stream obtaining unit 510 configured to acquire a video code stream, where the video code stream is formed by combining a code stream of a target layer and a code stream of a background layer, and the target in the target layer is the region-of-interest portion of the image;
  • a target confirmation unit 520 configured to confirm the target video frame in which the decoding target is located;
  • a code stream decoding unit 530 configured to decode the related video code stream starting from the target video frame.
  • Header information is added to the video code stream, and the header information records the location information of the target in the target layer.
  • The code stream decoding unit 530 is specifically configured to:
  • decode the region-of-interest portion from the target video frame, and composite the region-of-interest portion onto the background layer according to the location information.
  • A separation identifier is inserted between the code stream of the target layer and the code stream of the background layer after the header information.
  • Working together, the above units read the composite video code stream and gain fast access to the target layer, which improves decoding efficiency and reduces computational complexity.
  • The video image processing system comprises the above-mentioned video image compression device 30 and video image reading device 50.
  • The video image compression device 30 includes:
  • a layer extracting unit 310 configured to extract a background layer and a target layer from the image to be encoded, where the target in the target layer is the region-of-interest portion of the image to be encoded;
  • a layer coding unit 320 configured to generate a code stream separately for each of the target layer and the background layer;
  • a code stream combining unit 330 configured to combine the code stream of the target layer and the code stream of the background layer.
  • The video image reading device 50 includes:
  • a code stream obtaining unit 510 configured to acquire a video code stream, where the video code stream is formed by combining a code stream of a target layer and a code stream of a background layer, and the target in the target layer is the region-of-interest portion of the image;
  • a target confirmation unit 520 configured to confirm the target video frame in which the decoding target is located;
  • a code stream decoding unit 530 configured to decode the related video code stream starting from the target video frame.
  • Working together, the above units extract the background layer and the target layer from the image to be encoded, encode the background layer and the target layer separately to generate code streams, and then combine the code streams.
  • During decoding, the composite code stream is decoded and the image containing the target object is retrieved directly, which improves the utilization of computing resources.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

Disclosed are a method, device, and system for video image compression and reading. The method for video image compression and reading comprises: extracting a background layer and a target layer from images to be encoded, where a target in the target layer is an area of interest in the images to be encoded; encoding the target layer and the background layer separately to produce respective streams; and compounding the stream of the target layer and the stream of the background layer. Because the background layer and the target layer are extracted from the images to be encoded, encoded separately to produce respective streams, and the streams are then compounded, decoding operates on the compounded streams and images comprising a target object are retrieved directly, thus increasing the utilization rate of computing resources.

Description

Method, device and system for video image compression and reading
Technical field
The present invention relates to the field of image processing, and in particular to a method, device and system for video image compression and reading.
Background art
Digital video compression standards date back to the 1980s. After more than 30 years of development, the existing standards include H.261 and H.263 of the ITU-T series, MPEG-1 and MPEG-4 of ISO, and MPEG-2/H.262, H.264/AVC and HEVC (the most recent, released in 2013), which were developed jointly by the two organizations. Other organizations have their own standards, such as the Chinese AVS, Microsoft's VC-1 and Google's VP8. What these standards have in common is a block-based hybrid coding framework that combines three major coding techniques: predictive coding, transform coding and entropy coding.
The encoding flow of the block-based hybrid coding framework is shown in FIG. 1. The image to be encoded is first partitioned into 16x16 blocks called macroblocks (in HEVC the block size can vary from 8x8 to 64x64, and the largest block is called the largest coding unit, LCU). As shown in FIG. 3, macroblocks are encoded in scan order from left to right and from top to bottom. Each macroblock first undergoes predictive coding, using the reconstructed image of the previous frame or the already encoded portion around the macroblock as a reference, to obtain the prediction residual. The residual data then undergoes spatial transform coding, in which DCT or ICT is applied to blocks of different sizes to obtain transform coefficients in the frequency domain. The transform coefficients are quantized and passed to entropy coding to produce the final code stream. To encode the next frame effectively, the currently quantized data must be processed in reverse, that is, inverse quantized and inverse transformed, and then added to the prediction data to obtain the decoded image, i.e., the reconstructed image, which is placed in the reference buffer as the reference image for encoding the next frame. The decoding flow of the block-based hybrid coding framework is shown in FIG. 2: the encoded code stream is entropy decoded, inverse quantized and inverse transformed, and then added to the predicted image to obtain the decoded image (video signal). The decoded image must be stored for use as the reference image when decoding the next frame.
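The closed prediction loop described above can be illustrated with a minimal Python sketch. It is not the codec of any standard: the transform and entropy coding stages are omitted, each frame is a flat list of samples, and the names encode_frame, encode_sequence, decode_sequence and the step size are assumptions chosen for illustration. The point it shows is that the encoder predicts from its own reconstructed image, so the encoder and the decoder keep identical references.

```python
def encode_frame(frame, reference, step):
    """Quantize the prediction residual and rebuild the frame as a decoder would."""
    levels = [round((cur - ref) / step) for cur, ref in zip(frame, reference)]
    reconstructed = [ref + lvl * step for ref, lvl in zip(reference, levels)]
    return levels, reconstructed


def encode_sequence(frames, step=4):
    """Encode frames one after another, predicting each from the previous reconstruction."""
    reference = [0] * len(frames[0])  # assumed all-zero reference for the first frame
    stream = []
    for frame in frames:
        levels, reference = encode_frame(frame, reference, step)
        stream.append(levels)
    return stream


def decode_sequence(stream, step=4):
    """Mirror of the encoder loop: inverse quantize and add to the prediction."""
    reference = [0] * len(stream[0])
    decoded = []
    for levels in stream:
        reference = [ref + lvl * step for ref, lvl in zip(reference, levels)]
        decoded.append(reference)
    return decoded
```

Because the residual is always measured against the reconstructed reference, the quantization error of each sample in this sketch stays within half a step and does not accumulate from frame to frame.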
Predictive coding is an important coding technique in video compression. Depending on the source of the prediction data, coded images are divided into I frames (intra-predicted frames), P frames (inter-predicted frames) and B frames (bi-predicted frames). As shown in FIG. 4, an I frame is predicted using only the data of the frame itself, so it can be decoded independently of other frames. A P frame is predicted using the reconstructed image of a previously encoded frame as a reference, so it can be decoded only after its reference frame has been decoded. A B frame can use both a preceding frame and a following frame as references, which makes it a bidirectionally predicted frame; it can be decoded only after both the preceding and the following reference frames have been decoded successfully. Besides referencing other frames, P frames and B frames can also reference data of the current frame, as an I frame does, and choose whichever of the two gives the better result. Because I frames can be decoded independently, they are typically inserted as random access points; for example, digital television requires an I frame to be inserted every 1 to 1.5 seconds so that the picture appears as soon as possible when the user switches channels. However, I frames compress poorly and have a high bit rate, usually 4 to 10 times, or even dozens of times, that of a P frame. In terms of compression efficiency, generally I frame < P frame < B frame; in terms of computational complexity, generally I frame < P frame < B frame as well.
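The decoding dependencies described above can be made concrete with a short Python sketch. The reference table below is a hypothetical five-frame sequence, not one taken from the figures, and frames_needed is an illustrative helper name. It computes which frames must be decoded before a given frame can be displayed.

```python
def frames_needed(refs, target):
    """Return, in index order, every frame that must be decoded to obtain `target`.

    `refs` maps a frame index to the list of frame indices it references.
    """
    needed = set()
    stack = [target]
    while stack:
        idx = stack.pop()
        if idx in needed:
            continue
        needed.add(idx)
        stack.extend(refs[idx])  # follow references transitively
    return sorted(needed)


# Hypothetical sequence in display order: I0, B1, P2, B3, P4.
refs = {0: [], 1: [0, 2], 2: [0], 3: [2, 4], 4: [2]}
```

With this table, the I frame depends only on itself, while the B frame at index 3 requires frames 0, 2 and 4 in addition to itself.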
Inter-frame prediction can use several previously reconstructed images as reference frames. FIG. 5 shows multi-frame reference for P frames: when the second P frame is encoded, the two preceding frames are used as references. FIG. 6 shows multi-frame reference for B frames: the B frame has two forward reference frames and one backward reference frame. Multi-frame reference improves compression efficiency but also increases computational complexity.
In practical applications, especially video surveillance, users are often interested in specific targets in the picture, such as people, vehicles, or entrance and exit areas, and want these areas to have clear picture quality; this is region-of-interest coding. The image shown in FIG. 7 contains three regions of interest. In addition, because surveillance video covers many camera positions and long periods of time, the amount of data is large, and users want to locate a target quickly by retrieval rather than by watching the entire video.
In existing video coding, region-of-interest coding is implemented by assigning different quantization coefficients to the coding blocks inside the region of interest; these coefficients are usually smaller than those of other regions, so the picture quality is higher. However, the order of the code stream, the dependencies between a block and its neighboring blocks, and the dependencies between a block and the blocks of its reference image remain unchanged. As a result, a user who needs to search the video must decode every picture in order to obtain the pictures of the region of interest. In surveillance footage there are usually few moving objects, and the periods that contain moving objects are also few, so decoding all images completely before searching wastes a great deal of computing resources.
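As an illustration of the existing approach, the following Python sketch builds a per-block map of quantization steps in which blocks inside the region of interest receive a smaller step. The function name and the step values 2 and 8 are assumptions for illustration only. Note that the map changes only the quality of each block; every block is still part of a single code stream and must be decoded.

```python
def quantization_steps(blocks_wide, blocks_high, roi_blocks,
                       roi_step=2, background_step=8):
    """Return a 2-D list of quantization steps, one per block, in raster order.

    `roi_blocks` is a set of (x, y) block coordinates inside the region of interest.
    """
    return [
        [roi_step if (x, y) in roi_blocks else background_step
         for x in range(blocks_wide)]
        for y in range(blocks_high)
    ]
```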
Summary of the invention
An object of the present invention is to provide a method, device and system for video image compression and reading, in which a background layer and a target layer are extracted from the image to be encoded, the two layers are encoded separately to generate their respective code streams, and the code streams are then combined; at decoding time the composite code stream is decoded and the images containing the target object are retrieved directly, which improves the utilization of computing resources.
To achieve the above object, the following technical solutions are adopted:
A first aspect provides a video image compression method, comprising:
extracting a background layer and a target layer from an image to be encoded, where the target in the target layer is the region-of-interest portion of the image to be encoded;
encoding the target layer and the background layer separately to generate their respective code streams;
combining the code stream of the target layer and the code stream of the background layer.
Wherein combining the code stream of the target layer and the code stream of the background layer specifically comprises:
adding header information to the code stream corresponding to the image to be encoded, and recording the code stream of the target layer and the code stream of the background layer after the header information.
Wherein encoding the target layer and the background layer separately to generate their respective code streams comprises:
filling the area outside the target in the target layer with a fixed value;
filling the area of the background layer that corresponds to the target in the target layer with a fixed value;
encoding the filled target layer and the filled background layer separately to generate their respective code streams.
Wherein the header information records the position information of the target in the target layer.
Wherein, when extraction of the target layer from the image to be encoded fails, the position information of the target in the target layer is recorded as empty in the header information.
Wherein a separation identifier is inserted between the code stream of the target layer and the code stream of the background layer that follow the header information.
A second aspect provides a video image compression device, comprising:
a layer extraction unit, configured to extract a background layer and a target layer from an image to be encoded, where the target in the target layer is the region-of-interest portion of the image to be encoded;
a layer encoding unit, configured to encode the target layer and the background layer separately to generate their respective code streams;
a code stream combining unit, configured to combine the code stream of the target layer and the code stream of the background layer.
Wherein the code stream combining unit is specifically configured to:
add header information to the code stream corresponding to the image to be encoded, and record the code stream of the target layer and the code stream of the background layer after the header information.
Wherein the layer encoding unit comprises:
a first filling module, configured to fill the area outside the target in the target layer with a fixed value;
a second filling module, configured to fill the area of the background layer that corresponds to the target in the target layer with a fixed value;
a layer encoding module, configured to encode the filled target layer and the filled background layer separately to generate their respective code streams.
Wherein the header information records the position information of the target in the target layer.
Wherein, when extraction of the target layer from the image to be encoded fails, the position information of the target in the target layer is recorded as empty in the header information.
Wherein a separation identifier is inserted between the code stream of the target layer and the code stream of the background layer that follow the header information.
A third aspect provides a video image reading method, comprising:
acquiring a video code stream, where the video code stream is formed by combining a code stream of a target layer and a code stream of a background layer, and the target in the target layer is the region-of-interest portion of an image;
confirming the target video frame in which the decoding target is located;
decoding the related video code stream starting from the target video frame.
Wherein header information is added to the video code stream, and the header information records the position information of the target in the target layer;
decoding the related video code stream starting from the target video frame specifically comprises:
decoding the region-of-interest portion from the target video frame, and compositing the region-of-interest portion onto the background layer according to the position information.
Wherein a separation identifier is inserted between the code stream of the target layer and the code stream of the background layer that follow the header information.
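The compositing step of the reading method can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the decoder of the invention: both decoded layers are 2-D lists of the same size, the position information is a rectangle (x, y, width, height), and an empty position (None) means that no target was recorded for the frame. The name composite is hypothetical.

```python
def composite(background, target, box):
    """Paste the region of interest from `target` onto a copy of `background`.

    `box` is (x, y, width, height) taken from the header, or None when the
    header records no target position.
    """
    out = [row[:] for row in background]  # leave the decoded background untouched
    if box is None:
        return out
    x0, y0, width, height = box
    for y in range(y0, y0 + height):
        out[y][x0:x0 + width] = target[y][x0:x0 + width]
    return out
```

Only the samples inside the recorded rectangle are copied, so the fixed fill values of either layer never reach the output image.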
A fourth aspect provides a video image reading device, comprising:
a code stream obtaining unit, configured to acquire a video code stream, where the video code stream is formed by combining a code stream of a target layer and a code stream of a background layer, and the target in the target layer is the region-of-interest portion of an image;
a target confirmation unit, configured to confirm the target video frame in which the decoding target is located;
a code stream decoding unit, configured to decode the related video code stream starting from the target video frame.
Wherein header information is added to the video code stream, and the header information records the position information of the target in the target layer;
the code stream decoding unit is specifically configured to:
decode the region-of-interest portion from the target video frame, and composite the region-of-interest portion onto the background layer according to the position information.
Wherein a separation identifier is inserted between the code stream of the target layer and the code stream of the background layer that follow the header information.
A fifth aspect provides a video image processing system, comprising the video image compression device according to any one of the above and the video image reading device according to any one of the above.
The beneficial effects of the present invention are as follows: a background layer and a target layer are extracted from the image to be encoded, the two layers are encoded separately to generate their respective code streams, and the code streams are then combined; at decoding time the composite code stream is decoded and the images containing the target object are retrieved directly, which improves the utilization of computing resources.
Brief description of the drawings
FIG. 1 is a schematic flowchart of encoding in the block-based hybrid coding framework of the prior art;
FIG. 2 is a schematic flowchart of decoding in the block-based hybrid coding framework of the prior art;
FIG. 3 is a schematic diagram of the macroblock scan order in block-based hybrid coding of the prior art;
FIG. 4 is a schematic diagram of inter-frame reference relationships in block-based hybrid coding of the prior art;
FIG. 5 is a schematic diagram of the reference relationships of P-frame multi-frame reference in block-based hybrid coding of the prior art;
FIG. 6 is a schematic diagram of the reference relationships of B-frame multi-frame reference in block-based hybrid coding of the prior art;
FIG. 7 is a schematic diagram of regions of interest in an image in the prior art;
FIG. 8 is a flowchart of a first embodiment of a video image compression method provided in the detailed description of the present invention;
FIG. 9 is a flowchart of a second embodiment of a video image compression method provided in the detailed description of the present invention;
FIG. 10 is a schematic diagram of the image layer and the background layer in the second embodiment of the video image compression method provided in the detailed description of the present invention;
FIG. 11 is a schematic diagram of the organization of the code stream in the second embodiment of the video image compression method provided in the detailed description of the present invention;
FIG. 12 is a schematic structural diagram of a first embodiment of a video image compression device provided in the detailed description of the present invention;
FIG. 13 is a structural block diagram of a second embodiment of a video image compression device provided in the detailed description of the present invention;
FIG. 14 is a flowchart of a first embodiment of a video image reading method provided in the detailed description of the present invention;
FIG. 15 is a structural block diagram of a first embodiment of a video image reading device provided in the detailed description of the present invention;
FIG. 16 is a structural block diagram of a first embodiment of a video image processing system provided in the detailed description of the present invention.
Detailed description
To make the objects, technical solutions and advantages of the present invention clearer, the present invention is described in further detail below with reference to specific embodiments and the accompanying drawings. It should be understood that these descriptions are exemplary only and are not intended to limit the scope of the invention. In addition, descriptions of well-known structures and techniques are omitted below to avoid unnecessarily obscuring the concepts of the invention.
Please refer to FIG. 8, which is a flowchart of the first embodiment of a video image compression method provided in the detailed description of the present invention. The method of this embodiment is mainly used for storing various kinds of video, especially surveillance video. As shown in the figure, the method includes:
Step S101: extract a background layer and a target layer from the image to be encoded, where the target in the target layer is the region-of-interest portion of the image to be encoded.
In this solution, particularly for surveillance video, the picture contains specific targets such as people, vehicles, and entrance and exit locations and areas. Because surveillance video covers many camera positions and long periods of time, it is usually desirable to locate these areas quickly without watching the entire video. The image to be encoded is therefore divided into a background layer and a target layer, and the targets in the video are assigned to the target layer. When the video is viewed, the target layer is searched directly, so that the targets to be retrieved are found quickly and computing efficiency is improved.
Step S102: encode the target layer and the background layer separately to generate their respective code streams.
So that the layers can be decoded selectively and separately, the target layer and the background layer are encoded separately to form their own code streams; the encoding itself can follow whichever coding standard is adopted.
Step S103: combine the code stream of the target layer and the code stream of the background layer.
The code stream of the target layer and the code stream of the background layer are combined. Compared with prior-art solutions, the combined code stream allows more precise positioning and direct access to the images in which a given target is located, which improves decoding efficiency.
In summary, a background layer and a target layer are extracted from the image to be encoded, the two layers are encoded separately to generate their respective code streams, and the code streams are then combined; at decoding time the composite code stream is decoded and the images containing the target object are retrieved directly, which improves the utilization of computing resources.
Please refer to FIG. 9, which is a flowchart of the second embodiment of a video image compression method provided in the detailed description of the present invention. As shown in the figure, the method includes:
Step S201: extract a background layer and a target layer from the image to be encoded, where the target in the target layer is the region-of-interest portion of the image to be encoded.
The background layer and the target layer are extracted by image recognition or image analysis; the extent of the target layer can also be selected through the settings of the camera device. Specific technical solutions already exist in the prior art and are not described further here.
Step S202: fill the area outside the target in the target layer with a fixed value.
Step S203: fill the area of the background layer that corresponds to the target in the target layer with a fixed value.
So that the target layer is at its original position in the image when decoded, the area outside the target in the target layer is filled with a fixed value, and the area of the background layer that corresponds to the target in the target layer is also filled with a fixed value. The target layer and the background layer therefore have the same image size and resolution when encoded, which makes the subsequent compositing more precise. The filling is illustrated in FIG. 10: two layers are extracted from the image, and in each layer the positions that belong to the other layer are filled. This is equivalent to obtaining two sub-image frames with the same resolution, which are then encoded separately.
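The filling in steps S202 and S203 can be sketched in Python as follows. This is a minimal illustration, assuming a rectangular target described by (x, y, width, height), a 2-D list of samples as the image, and a fill value of 0; the text only requires the fill to be a fixed value, and split_layers is a hypothetical name.

```python
FILL = 0  # assumed fixed fill value


def split_layers(image, box, fill=FILL):
    """Return (target_layer, background_layer), both the same size as `image`."""
    x0, y0, width, height = box
    target = [[fill] * len(row) for row in image]  # everything outside the target is fill
    background = [row[:] for row in image]         # start from a copy of the image
    for y in range(y0, y0 + height):
        for x in range(x0, x0 + width):
            target[y][x] = image[y][x]             # keep the target samples
            background[y][x] = fill                # blank the target area in the background
    return target, background
```

Both layers keep the resolution of the source image, so the target remains at its original coordinates and no offset needs to be applied when the layers are composited again.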
Step S204: encode the filled target layer and the filled background layer separately to generate their respective code streams.
When the target layer and the background layer are encoded, each block is still encoded in scan order from left to right and from top to bottom; the only difference is that filled portions are skipped without processing. The code streams generated for the individual layers are then combined.
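One way to picture the block skipping in step S204 is the following Python sketch, which lists the blocks of a layer that still need to be encoded. It assumes a rectangular target (x, y, width, height) and decides from that rectangle, rather than from sample values, whether a block consists entirely of fill; the function name and the string arguments "target" and "background" are hypothetical.

```python
def blocks_to_encode(width, height, block, box, layer):
    """Return block coordinates (bx, by) to encode, in raster scan order.

    For the target layer a block is kept when it overlaps the target rectangle;
    for the background layer it is kept unless it lies entirely inside it.
    """
    x0, y0, w, h = box
    kept = []
    for top in range(0, height, block):          # top to bottom
        for left in range(0, width, block):      # left to right
            right = min(left + block, width)
            bottom = min(top + block, height)
            overlaps = left < x0 + w and right > x0 and top < y0 + h and bottom > y0
            inside = left >= x0 and right <= x0 + w and top >= y0 and bottom <= y0 + h
            keep = overlaps if layer == "target" else not inside
            if keep:
                kept.append((left // block, top // block))
    return kept
```

For an 8x8 image with 4x4 blocks and a target occupying the bottom-right block, the target layer needs one block and the background layer the other three, so the two layers together cost no more blocks than the original image.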
Step S205: add header information to the code stream corresponding to the image to be encoded, and record the code stream of the target layer and the code stream of the background layer after the header information.
The header information records the position information of the target in the target layer. A separation identifier is inserted between the code stream of the target layer and the code stream of the background layer that follow the header information. Specifically, the separation identifier can be a start code that serves as a separator, i.e., the code stream of each layer is given a start code so that the starting position of each code stream can be distinguished during decoding. The organization of the code stream is shown in FIG. 11. Header information is added before the video stream and records the position of the target in the target layer; when the video is searched, the header information is used directly for precise positioning, which improves data processing efficiency. The relative order of the code stream of the target layer and the code stream of the background layer is not restricted; it is sufficient that the layer 1 code stream and the layer 2 code stream in FIG. 11 each correspond to one of them.
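The organization described for step S205 can be sketched in Python as follows. The byte layout is an assumption for illustration and is not the format of FIG. 11: a one-byte flag, an optional rectangle of four 16-bit values, and a three-byte start code before each layer. The sketch also assumes that the layer payloads never contain the start code; real codecs use emulation prevention for that purpose. The names pack_frame and unpack_frame are hypothetical.

```python
import struct

START_CODE = b"\x00\x00\x01"  # assumed separator placed before each layer's code stream


def pack_frame(box, target_stream, background_stream):
    """Header (target position or empty) followed by the two layer code streams."""
    if box is None:
        header = struct.pack(">B", 0)            # position recorded as empty
    else:
        header = struct.pack(">B4H", 1, *box)    # flag plus x, y, width, height
    return header + START_CODE + target_stream + START_CODE + background_stream


def unpack_frame(data):
    """Return (box, target_stream, background_stream) from a packed frame."""
    box, offset = None, 1
    if data[0]:
        box = struct.unpack_from(">4H", data, 1)
        offset += 8
    # The payload begins with a start code, so the first split part is empty.
    _, target_stream, background_stream = data[offset:].split(START_CODE, 2)
    return box, target_stream, background_stream
```

A search tool built on such a layout could read only the header of each frame to learn whether and where a target is present, and decode the layer code streams only for the frames of interest.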
In summary, a background layer and a target layer are extracted from the image to be encoded, the two layers are encoded separately to generate their respective code streams, and the code streams are then combined; at decoding time the composite code stream is decoded and the images containing the target object are retrieved directly, which improves the utilization of computing resources. In addition, header information is provided: the code stream of the target layer and the code stream of the background layer are recorded after the header information, the position information of the target in the target layer is recorded in the header information, and a separation identifier is inserted between the two code streams to distinguish them, which achieves ordered storage and fast retrieval of the two code streams.
The following are embodiments of a video image compression device provided in the detailed description of the present invention. The device embodiments are implemented on the basis of the video image compression method embodiments described above; for details not covered in the device embodiments, please refer to the method embodiments.
Please refer to FIG. 12, which is a structural block diagram of the first embodiment of a video image compression device provided in the detailed description of the present invention. As shown in the figure, the device includes:
a layer extraction unit 310, configured to extract a background layer and a target layer from the image to be encoded, where the target in the target layer is the region-of-interest portion of the image to be encoded;
The image to be encoded is divided into a background layer and a target layer, and the targets in the video are assigned to the target layer. When the video is viewed, the target layer is searched directly, so that the targets to be retrieved are found quickly and computing efficiency is improved.
a layer encoding unit 320, configured to encode the target layer and the background layer separately to generate their respective code streams;
a code stream combining unit 330, configured to combine the code stream of the target layer and the code stream of the background layer.
In summary, through the cooperation of the above units, a background layer and a target layer are extracted from the image to be encoded, the two layers are encoded separately to generate their respective code streams, and the code streams are then combined; at decoding time the composite code stream is decoded and the images containing the target object are retrieved directly, which improves the utilization of computing resources.
请参考图13,其是本发明具体实施方式中提供的一种视频图像压缩的装置的第二实施例的结构方框图,如图所示,该装置,包括:Please refer to FIG. 13 , which is a structural block diagram of a second embodiment of a video image compression apparatus according to an embodiment of the present invention. As shown, the apparatus includes:
图层提取单元310,用于从待编码图像提取出背景图层和目标图层,所述目标图层中的目标为待编码图像中的兴趣区域部分;a layer extracting unit 310, configured to extract a background layer and a target layer from the image to be encoded, where the target in the target layer is an area of the interest region in the image to be encoded;
图层编码单元320,用于分别对目标图层和背景图层编码各自生成码流;a layer coding unit 320, configured to separately generate a code stream for each of the target layer and the background layer;
码流复合单元330,用于将目标图层的码流和背景图层的码流进行复合。The code stream combining unit 330 is configured to combine the code stream of the target layer and the code stream of the background layer.
其中,所述码流复合单元330,具体用于:The code stream recombining unit 330 is specifically configured to:
为待编码图像对应的码流添加头信息,在头信息后记录目标图层的码流和背景图层的码流。The header information is added to the code stream corresponding to the image to be encoded, and the code stream of the target layer and the code stream of the background layer are recorded after the header information.
其中,所述图层编码单元320,包括:The layer coding unit 320 includes:
第一填充模块321,用于将所述目标图层中目标之外的区域用固定值填充;a first filling module 321 , configured to fill an area other than the target in the target layer with a fixed value;
第二填充模块322,用于将所述背景图层中对应所述目标图层中目标所在的区域用固定值填充;a second filling module 322, configured to fill, in the background layer, an area corresponding to the target in the target layer with a fixed value;
图层编码模块323,用于分别对填充后的目标图层和背景图层编码各自生成码流。The layer coding module 323 is configured to respectively generate a code stream for each of the filled target layer and the background layer.
对目标图层和背景图层进行编码时,依然按照从左至右、从上之下的扫描方式,对每个分块进行编码,只有在遇到填充部分时,直接跳过不用处理,每个图层产生的码流复合在一起。When encoding the target layer and the background layer, each block is still encoded according to the scanning method from left to right and from top to bottom, and only when the filled portion is encountered, skipping without processing, each The code streams generated by the layers are combined.
The header information records the position information of the target in the target layer.
When extraction of the target layer from the image to be encoded fails, the position information of the target in the target layer is recorded as empty in the header information.
A separation identifier is inserted between the code stream of the target layer and the code stream of the background layer that follow the header information. Specifically, the code stream of each layer carries a start code identifier so that the starting position of each code stream can be distinguished during decoding.
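One possible byte layout for such a composite code stream is sketched below. The source does not define a concrete format, so everything here is hypothetical: the three-byte start code, the layer identifiers, the rectangle encoding of the position information, and the explicit length field (added so that a payload containing the start code bytes cannot be mistaken for a layer boundary).

```python
import struct

START_CODE = b"\x00\x00\x01"  # hypothetical start code marking each layer
TARGET_ID = 0                 # hypothetical layer identifiers
BACKGROUND_ID = 1


def pack_frame(rects, target_stream, background_stream):
    """Build one composite frame: header, then target and background streams.

    rects is a list of (x, y, width, height) tuples giving the target
    positions; an empty list records that target extraction failed.
    """
    # Header: rectangle count followed by the rectangles, big-endian.
    header = struct.pack(">H", len(rects))
    for x, y, w, h in rects:
        header += struct.pack(">4H", x, y, w, h)

    # Each layer: start code, layer id (1 byte), payload length (4 bytes).
    body = b""
    for layer_id, payload in ((TARGET_ID, target_stream),
                              (BACKGROUND_ID, background_stream)):
        body += START_CODE + struct.pack(">BI", layer_id, len(payload)) + payload
    return header + body
```

With this layout a reader can learn from the first two bytes whether a frame contains any target, without touching either layer's code stream.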
In summary, the functional modules above work together as follows: a background layer and a target layer are extracted from the image to be encoded, the two layers are encoded separately into their own code streams, and the code streams are then combined. During decoding, the composite code stream is decoded and images containing the target object are retrieved directly, which improves the utilization of computing resources. In addition, header information is provided, and the code stream of the target layer and the code stream of the background layer are recorded after it. The header information records the position information of the target in the target layer, and a separation identifier inserted between the two code streams distinguishes them, which enables ordered storage and fast retrieval of the two code streams.
The following is an embodiment of a video image reading method provided in the specific embodiments of the present invention. The solution in this embodiment is used to read the video code stream obtained in the foregoing embodiments. Please refer to FIG. 14, which is a flowchart of a first embodiment of a video image reading method provided in the specific embodiments of the present invention. As shown in the figure, the method includes:
Step S401: Acquire a video code stream, where the video code stream is formed by combining the code stream of a target layer and the code stream of a background layer, and the target in the target layer is the region-of-interest portion of the image.
Because the video code stream is a combination of the code stream of the target layer and the code stream of the background layer, the target information to be read can be located selectively during reading.
Step S402: Determine that a frame whose header information records the position information of the decoding target is a target video frame.
Header information is added to the video code stream, and the header information records the position information of the target in the target layer. The target can therefore be accessed directly through the position recorded in the header information.
The target video frame containing the decoding target may also be determined by other schemes, for example by accessing the video frame by frame without setting header information.
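Step S402 can be sketched as a scan over frame headers only. The header layout assumed here (a big-endian 16-bit rectangle count followed by `(x, y, width, height)` rectangles of four 16-bit values each) is hypothetical, as are the function names; the source only states that the header records the target's position information and that it is empty when no target was extracted.

```python
import struct


def read_positions(frame_bytes):
    """Return the list of (x, y, width, height) rectangles in a frame header.

    Assumed layout: 2-byte rectangle count, then 8 bytes per rectangle.
    """
    (count,) = struct.unpack_from(">H", frame_bytes, 0)
    return [struct.unpack_from(">4H", frame_bytes, 2 + 8 * i)
            for i in range(count)]


def find_target_frames(frames):
    """Return indices of frames whose header records at least one target."""
    return [i for i, frame in enumerate(frames) if read_positions(frame)]
```

Only the first few bytes of each frame are inspected, so frames without a target are skipped without decoding either layer. The frame-by-frame alternative mentioned above would instead have to decode every frame to find the target.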
Step S403: Decode the region-of-interest portion from the target video frame, and composite the region-of-interest portion onto the background layer according to the position information.
A separation identifier is inserted between the code stream of the target layer and the code stream of the background layer that follow the header information. When the video image is accessed, the code stream of the target layer can be accessed directly by using the separation identifier as the boundary.
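Step S403 can be sketched in two parts: locating one layer's code stream by its separator, and pasting the decoded region of interest onto the background at the recorded position. The layout is an assumption made for illustration (a three-byte start code, a one-byte layer identifier, and a four-byte payload length before each layer), the function names are hypothetical, and the actual decoding of the payload is left out.

```python
import struct

START_CODE = b"\x00\x00\x01"  # hypothetical separator before each layer
TARGET_ID = 0                 # hypothetical layer identifiers
BACKGROUND_ID = 1


def extract_layer(body, wanted_id):
    """Return the payload of the layer with the given id, or None.

    body is the part of the frame that follows the header information.
    """
    pos = 0
    while pos < len(body):
        if body[pos:pos + 3] != START_CODE:
            raise ValueError("missing start code at offset %d" % pos)
        layer_id, length = struct.unpack_from(">BI", body, pos + 3)
        payload_start = pos + 8  # 3-byte start code + 1-byte id + 4-byte length
        if layer_id == wanted_id:
            return body[payload_start:payload_start + length]
        pos = payload_start + length  # jump over a layer that is not wanted
    return None


def composite_roi(background, roi, rect):
    """Paste decoded ROI pixels onto a copy of the background at rect.

    background and roi are lists of pixel rows; rect is (x, y, width, height).
    """
    x, y, w, h = rect
    out = [row[:] for row in background]
    for r in range(h):
        out[y + r][x:x + w] = roi[r][:w]
    return out
```

Because each layer is skipped by its length rather than parsed, a reader that needs only the target layer never has to decode the background code stream.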
In summary, reading the composite video code stream in this way gives fast access to the target layer, improves decoding efficiency, and reduces computational complexity.
The following is an embodiment of a video image reading apparatus provided in the specific embodiments of the present invention. The apparatus embodiment is implemented on the basis of the video image reading method embodiment above; for anything not fully described in the apparatus embodiment, refer to the method embodiment above.
Please refer to FIG. 15, which is a structural block diagram of a video image reading apparatus provided in the specific embodiments of the present invention. As shown in the figure, the apparatus includes:
a code stream acquisition unit 510, configured to acquire a video code stream, where the video code stream is formed by combining the code stream of a target layer and the code stream of a background layer, and the target in the target layer is the region-of-interest portion of the image;
a target confirmation unit 520, configured to determine the target video frame in which the decoding target is located;
a code stream decoding unit 530, configured to decode the related video code stream starting from the target video frame.
Header information is added to the video code stream, and the header information records the position information of the target in the target layer.
The code stream decoding unit 530 is specifically configured to:
decode the region-of-interest portion from the target video frame, and composite the region-of-interest portion onto the background layer according to the position information.
A separation identifier is inserted between the code stream of the target layer and the code stream of the background layer that follow the header information.
In summary, the units above work together to read the composite video code stream, which gives fast access to the target layer, improves decoding efficiency, and reduces computational complexity.
Finally, the specific embodiments of the present invention also provide an embodiment of a video image processing system, which comprises two parts: the video image compression apparatus 30 and the video image reading apparatus 50 described above. As shown in FIG. 16, the video image compression apparatus 30 includes:
a layer extraction unit 310, configured to extract a background layer and a target layer from an image to be encoded, where the target in the target layer is the region-of-interest portion of the image to be encoded;
a layer encoding unit 320, configured to encode the target layer and the background layer separately, each producing its own code stream;
a code stream combining unit 330, configured to combine the code stream of the target layer with the code stream of the background layer.
The video image reading apparatus 50 includes:
a code stream acquisition unit 510, configured to acquire a video code stream, where the video code stream is formed by combining the code stream of a target layer and the code stream of a background layer, and the target in the target layer is the region-of-interest portion of the image;
a target confirmation unit 520, configured to determine the target video frame in which the decoding target is located;
a code stream decoding unit 530, configured to decode the related video code stream starting from the target video frame.
In summary, the units above work together as follows: a background layer and a target layer are extracted from the image to be encoded, the two layers are encoded separately into their own code streams, and the code streams are then combined. During decoding, the composite code stream is decoded and images containing the target object are retrieved directly, which improves the utilization of computing resources. Reading the composite video code stream in this way gives fast access to the target layer, improves decoding efficiency, and reduces computational complexity.
It should be understood that the specific embodiments of the present invention described above are intended only to illustrate or explain the principles of the present invention and do not limit it. Therefore, any modification, equivalent substitution, or improvement made without departing from the spirit and scope of the present invention shall fall within the scope of protection of the present invention. Furthermore, the appended claims are intended to cover all changes and modifications that fall within the scope and boundaries of the appended claims, or within equivalents of such scope and boundaries.
Although the embodiments of the present invention have been described in detail, it should be understood that various changes, substitutions, and alterations may be made to them without departing from the spirit and scope of the present invention.

Claims (26)

  1. A video image compression method, comprising:
    extracting a background layer and a target layer from an image to be encoded, wherein the target in the target layer is the region-of-interest portion of the image to be encoded;
    encoding the target layer and the background layer separately, each producing its own code stream;
    combining the code stream of the target layer with the code stream of the background layer.
  2. The video image compression method according to claim 1, wherein combining the code stream of the target layer with the code stream of the background layer comprises:
    adding header information to the code stream corresponding to the image to be encoded, and recording the code stream of the target layer and the code stream of the background layer after the header information.
  3. The video image compression method according to claim 1, wherein encoding the target layer and the background layer separately, each producing its own code stream, comprises:
    filling the area of the target layer outside the target with a fixed value;
    filling, with a fixed value, the area of the background layer that corresponds to the target in the target layer;
    encoding the filled target layer and the filled background layer separately, each producing its own code stream.
  4. The video image compression method according to claim 2, wherein the header information records position information of the target in the target layer.
  5. The video image compression method according to claim 4, wherein, when extraction of the target layer from the image to be encoded fails, the position information of the target in the target layer is recorded as empty in the header information.
  6. The video image compression method according to claim 2, wherein a separation identifier is inserted between the code stream of the target layer and the code stream of the background layer that follow the header information.
  7. A video image compression apparatus, comprising:
    a layer extraction unit, configured to extract a background layer and a target layer from an image to be encoded, wherein the target in the target layer is the region-of-interest portion of the image to be encoded;
    a layer encoding unit, configured to encode the target layer and the background layer separately, each producing its own code stream;
    a code stream combining unit, configured to combine the code stream of the target layer with the code stream of the background layer.
  8. The video image compression apparatus according to claim 7, wherein the code stream combining unit is configured to:
    add header information to the code stream corresponding to the image to be encoded, and record the code stream of the target layer and the code stream of the background layer after the header information.
  9. The video image compression apparatus according to claim 7, wherein the layer encoding unit comprises:
    a first filling module, configured to fill the area of the target layer outside the target with a fixed value;
    a second filling module, configured to fill, with a fixed value, the area of the background layer that corresponds to the target in the target layer;
    a layer encoding module, configured to encode the filled target layer and the filled background layer separately, each producing its own code stream.
  10. The video image compression apparatus according to claim 8, wherein the header information records position information of the target in the target layer.
  11. The video image compression apparatus according to claim 10, wherein, when extraction of the target layer from the image to be encoded fails, the position information of the target in the target layer is recorded as empty in the header information.
  12. The video image compression apparatus according to claim 8, wherein a separation identifier is inserted between the code stream of the target layer and the code stream of the background layer that follow the header information.
  13. A video image reading method, comprising:
    acquiring a video code stream, wherein the video code stream is formed by combining the code stream of a target layer and the code stream of a background layer, and the target in the target layer is the region-of-interest portion of the image;
    determining the target video frame in which the decoding target is located;
    decoding the related video code stream starting from the target video frame.
  14. The video image reading method according to claim 13, wherein header information is added to the video code stream, and the header information records position information of the target in the target layer;
    decoding the related video code stream starting from the target video frame comprises:
    decoding the region-of-interest portion from the target video frame, and compositing the region-of-interest portion onto the background layer according to the position information.
  15. The video image reading method according to claim 14, wherein a separation identifier is inserted between the code stream of the target layer and the code stream of the background layer that follow the header information.
  16. A video image reading apparatus, comprising:
    a code stream acquisition unit, configured to acquire a video code stream, wherein the video code stream is formed by combining the code stream of a target layer and the code stream of a background layer, and the target in the target layer is the region-of-interest portion of the image;
    a target confirmation unit, configured to determine the target video frame in which the decoding target is located;
    a code stream decoding unit, configured to decode the related video code stream starting from the target video frame.
  17. The video image reading apparatus according to claim 16, wherein header information is added to the video code stream, and the header information records position information of the target in the target layer;
    the code stream decoding unit is configured to:
    decode the region-of-interest portion from the target video frame, and composite the region-of-interest portion onto the background layer according to the position information.
  18. The video image reading apparatus according to claim 17, wherein a separation identifier is inserted between the code stream of the target layer and the code stream of the background layer that follow the header information.
  19. A video image processing system, comprising a video image compression apparatus and a video image reading apparatus;
    the video image compression apparatus comprises:
    a layer extraction unit, configured to extract a background layer and a target layer from an image to be encoded, wherein the target in the target layer is the region-of-interest portion of the image to be encoded;
    a layer encoding unit, configured to encode the target layer and the background layer separately, each producing its own code stream;
    a code stream combining unit, configured to combine the code stream of the target layer with the code stream of the background layer;
    the video image reading apparatus comprises:
    a code stream acquisition unit, configured to acquire a video code stream, wherein the video code stream is formed by combining the code stream of a target layer and the code stream of a background layer, and the target in the target layer is the region-of-interest portion of the image;
    a target confirmation unit, configured to determine the target video frame in which the decoding target is located;
    a code stream decoding unit, configured to decode the related video code stream starting from the target video frame.
  20. The video image processing system according to claim 19, wherein the code stream combining unit is configured to:
    add header information to the code stream corresponding to the image to be encoded, and record the code stream of the target layer and the code stream of the background layer after the header information.
  21. The video image processing system according to claim 19, wherein the layer encoding unit comprises:
    a first filling module, configured to fill the area of the target layer outside the target with a fixed value;
    a second filling module, configured to fill, with a fixed value, the area of the background layer that corresponds to the target in the target layer;
    a layer encoding module, configured to encode the filled target layer and the filled background layer separately, each producing its own code stream.
  22. The video image processing system according to claim 20, wherein the header information records position information of the target in the target layer.
  23. The video image processing system according to claim 22, wherein, when extraction of the target layer from the image to be encoded fails, the position information of the target in the target layer is recorded as empty in the header information.
  24. The video image processing system according to claim 20, wherein a separation identifier is inserted between the code stream of the target layer and the code stream of the background layer that follow the header information.
  25. The video image processing system according to claim 19, wherein header information is added to the video code stream, and the header information records position information of the target in the target layer;
    the code stream decoding unit is configured to:
    decode the region-of-interest portion from the target video frame, and composite the region-of-interest portion onto the background layer according to the position information.
  26. The video image processing system according to claim 25, wherein a separation identifier is inserted between the code stream of the target layer and the code stream of the background layer that follow the header information.
PCT/CN2015/077729 2015-04-08 2015-04-28 Method, device, and system for video image compression and reading WO2016161674A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510164120.XA CN106162190A (en) 2015-04-08 2015-04-08 A kind of video image compression and the method for reading, Apparatus and system
CN201510164120.X 2015-04-08

Publications (1)

Publication Number Publication Date
WO2016161674A1 true WO2016161674A1 (en) 2016-10-13

Family

ID=57071735

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2015/077729 WO2016161674A1 (en) 2015-04-08 2015-04-28 Method, device, and system for video image compression and reading

Country Status (2)

Country Link
CN (1) CN106162190A (en)
WO (1) WO2016161674A1 (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108668130A (en) * 2017-03-31 2018-10-16 晨星半导体股份有限公司 The method for recombinating image file
JP2021500764A (en) 2017-08-29 2021-01-07 Line株式会社 Improving video quality for video calls
CN108924557B (en) * 2018-06-11 2022-02-08 海信视像科技股份有限公司 Image processing method, image processing device, electronic equipment and storage medium
CN110460855B (en) * 2019-07-22 2023-04-18 西安万像电子科技有限公司 Image processing method and system
CN113012657A (en) * 2019-12-19 2021-06-22 北京嗨动视觉科技有限公司 Layer processing method and device, video processing equipment and computer readable storage medium
CN113660495A (en) * 2021-08-11 2021-11-16 易谷网络科技股份有限公司 Real-time video stream compression method and device, electronic equipment and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101755460A (en) * 2007-07-20 2010-06-23 富士胶片株式会社 Image processing apparatus, image processing method, image processing system and program
US20140267583A1 (en) * 2013-03-13 2014-09-18 Futurewei Technologies, Inc. Augmented Video Calls on Mobile Devices
CN104335588A (en) * 2012-07-04 2015-02-04 英特尔公司 A region of interest based framework for 3D video coding

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20120118781A (en) * 2011-04-19 2012-10-29 삼성전자주식회사 Method and apparatus for unified scalable video encoding for multi-view video, method and apparatus for unified scalable video decoding for multi-view video
CN103402087A (en) * 2013-07-23 2013-11-20 北京大学 Video encoding and decoding method based on gradable bit streams

Also Published As

Publication number Publication date
CN106162190A (en) 2016-11-23

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15888229

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 21/06/2018)

122 Ep: pct application non-entry in european phase

Ref document number: 15888229

Country of ref document: EP

Kind code of ref document: A1