WO2016173277A1 - Video encoding method, decoding method, and device thereof - Google Patents

Video encoding method, decoding method, and device thereof

Info

Publication number
WO2016173277A1
Authority
WO
WIPO (PCT)
Prior art keywords
video
foreground
image
metadata
target
Prior art date
Application number
PCT/CN2015/098060
Other languages
English (en)
French (fr)
Other versions
WO2016173277A9 (zh)
Inventor
郭斌
蔡巍伟
Original Assignee
杭州海康威视数字技术股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 杭州海康威视数字技术股份有限公司
Priority to US15/569,840 (US10638142B2)
Priority to EP15890651.1A (EP3291558B1)
Publication of WO2016173277A1
Publication of WO2016173277A9

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/20Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using video object coding
    • H04N19/23Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using video object coding with coding of regions that are present throughout a whole video segment, e.g. sprites, background or mosaic
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/20Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using video object coding
    • H04N19/25Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using video object coding with scene description coding, e.g. binary format for scenes [BIFS] compression
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/20Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using video object coding
    • H04N19/27Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using video object coding involving both synthetic and natural picture components, e.g. synthetic natural hybrid coding [SNHC]
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/46Embedding additional information in the video signal during the compression process
    • H04N19/463Embedding additional information in the video signal during the compression process by compressing encoding parameters before transmission
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/117Filters, e.g. for pre-processing or post-processing
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/136Incoming video signal characteristics or properties
    • H04N19/137Motion inside a coding unit, e.g. average field, frame or block difference
    • H04N19/139Analysis of motion vectors, e.g. their magnitude, direction, variance or reliability
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/167Position within a video image, e.g. region of interest [ROI]
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51Motion estimation or motion compensation
    • H04N19/527Global motion vector estimation
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51Motion estimation or motion compensation
    • H04N19/537Motion estimation other than block-based
    • H04N19/54Motion estimation other than block-based using feature points or meshes
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/85Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression

Definitions

  • The present application relates to the field of video surveillance, and in particular to a video encoding method, a video decoding method, and devices thereof.
  • The clearer the collected images, the more video data is generated. If the video data is not processed, transmitting and storing it requires a large amount of network bandwidth and storage space, so the cost of network transmission or data storage is high. The prior art therefore proposes compressing and encoding the video data and then transmitting the compressed, encoded video file, which reduces the network bandwidth occupied by video data transmission and lowers the cost.
  • One of the technical problems to be solved by the present application is to provide a video encoding apparatus capable of effectively reducing the amount of video data and alleviating network bandwidth limitations during transmission when the moving targets are numerous or large.
  • An embodiment of the present application first provides a video encoding apparatus, including: a video collecting unit for acquiring a video image; a processing unit for compression-encoding a background image in the video image to obtain video compressed data, and for structuring the foreground moving target in the video image to obtain foreground target metadata; and a data transmission unit configured to transmit the video compressed data and the foreground target metadata, where the foreground target metadata is data in which video structured semantic information is stored.
  • The processing unit is further configured to perform background modeling on the video image, and to detect the foreground moving target based on the established background model so as to separate the background image and the foreground moving target in the current frame of video image.
  • The data transmission unit transmits the video compressed data corresponding to the background image at intervals of a set time period, and transmits the foreground target metadata corresponding to the foreground moving target in real time.
  • When structuring the foreground moving target in the video image, the structuring algorithms adopted by the processing unit include a structuring algorithm that does not set a target type and a structuring algorithm that sets a target type.
  • A video decoding apparatus, comprising: a data receiving unit for receiving video compressed data and foreground target metadata; and a processing unit for decoding the video compressed data and interpreting the foreground target metadata.
  • The apparatus further includes a storage unit configured to store images; the processing unit further selects, according to the information in the foreground target metadata, a corresponding foreground target image from the storage unit as the foreground moving target, thereby interpreting the foreground target metadata.
  • The processing unit, according to the information in the foreground target metadata, uses a display drawing technique to superimpose the foreground moving target described by the foreground target metadata on the decoded background image, thereby interpreting the foreground target metadata.
  • The apparatus further includes a video display unit configured to display the decoded background image and the interpreted foreground moving target as a composite.
  • A video transmission display system, including the video encoding device described above and the video decoding device described above.
  • A video encoding method, including: collecting a video image to be transmitted; compressing and encoding a background image in the video image to obtain video compressed data, and structuring the foreground moving target in the video image to obtain foreground target metadata; and transmitting the video compressed data and the foreground target metadata, wherein the foreground target metadata is data storing video structured semantic information.
  • The method further includes: performing background modeling on the video image, and detecting the foreground moving target based on the established background model to separate the background image and the foreground moving target in the current frame of video image.
  • The video compressed data corresponding to the background image is transmitted at intervals of a set time period, and the foreground target metadata corresponding to the foreground moving target is transmitted in real time.
  • The structuring algorithms employed in structuring the foreground moving target in the video image include a structuring algorithm that does not set a target type and a structuring algorithm that sets a target type.
  • A video decoding method, including: receiving video compressed data and foreground target metadata; decoding the video compressed data and interpreting the foreground target metadata; and displaying the decoded background image and the interpreted foreground moving target as a composite.
  • The method further includes: selecting, according to the information in the foreground target metadata, a corresponding foreground target image from pre-stored images as the foreground moving target, and displaying the foreground target image and the decoded background image as a composite.
  • The method further includes: using a display drawing technique to superimpose, according to the information in the foreground target metadata, the foreground moving target described by the foreground target metadata on the decoded background image.
  • A video encoding method for a highway, comprising: collecting video images on a highway; separating, according to a background model, a frame of video image into a background image containing the still scene and a foreground image containing the moving target vehicle; compression-encoding the background image into video compressed data in digital array form, and structuring the foreground image of the moving target vehicle to obtain foreground target metadata, wherein the foreground target metadata is data storing video structured semantic information; and mixing the video compressed data and the foreground target metadata to obtain a mixed stream of video data with metadata, and transmitting the mixed stream.
  • The foreground target metadata includes at least: vehicle type, vehicle color, vehicle brand, vehicle model, license plate number, the position of the foreground target in the frame of video image, and the time of the frame of video image.
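As an illustration only, the metadata fields listed above could be held in a small record type and serialized as text. The class name `VehicleMetadata`, the field names, the bounding-box and timestamp representations, and the choice of JSON are all assumptions made for this sketch; the application does not define a concrete format.

```python
import json
from dataclasses import asdict, dataclass
from typing import Tuple


@dataclass
class VehicleMetadata:
    """Hypothetical record for one foreground target in one frame."""
    vehicle_type: str
    color: str
    brand: str
    model: str
    plate: str
    bbox: Tuple[int, int, int, int]  # assumed (x, y, width, height) in pixels
    timestamp_ms: int                # assumed capture time of the frame


def to_text(meta: VehicleMetadata) -> str:
    """Serialize the record as JSON text (one possible text encoding)."""
    return json.dumps(asdict(meta), ensure_ascii=False)


def from_text(text: str) -> VehicleMetadata:
    """Rebuild the record from its JSON text form."""
    fields = json.loads(text)
    fields["bbox"] = tuple(fields["bbox"])  # JSON stores tuples as lists
    return VehicleMetadata(**fields)
```

A record of this kind occupies a few hundred bytes at most, which is the point of transmitting semantic information instead of foreground pixels.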
  • A video decoding method for a highway, comprising: parsing a mixed stream of video data with metadata to obtain video compressed data and foreground target metadata; decoding the video compressed data to obtain a background image, and interpreting the foreground target metadata to obtain a foreground image; and, according to the position information and the time information in the metadata, superimposing the foreground image on the corresponding position of the background image and displaying the composite to restore the captured video image.
  • The step of obtaining the foreground image by interpreting the foreground target metadata includes: selecting, according to the information in the foreground target metadata, a corresponding foreground target image as the foreground moving target; or, according to the information in the foreground target metadata, using a display drawing technique to superimpose the foreground moving target described by the foreground target metadata on the decoded background image.
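The superimposing step can be pictured with a minimal sketch. Here images are plain lists of grayscale rows, and `overlay` is a hypothetical helper that pastes a foreground image (for example one selected from pre-stored images) at the position given by the metadata; none of these names come from the application.

```python
from typing import List

Image = List[List[int]]  # grayscale image as rows of pixel values


def overlay(background: Image, sprite: Image, x: int, y: int) -> Image:
    """Return a copy of background with sprite pasted at column x, row y."""
    frame = [row[:] for row in background]  # leave the decoded background intact
    height, width = len(frame), len(frame[0])
    for dy, sprite_row in enumerate(sprite):
        for dx, value in enumerate(sprite_row):
            row, col = y + dy, x + dx
            if 0 <= row < height and 0 <= col < width:  # clip at the borders
                frame[row][col] = value
    return frame
```

Because the background is copied instead of modified, the same decoded background can be reused for many frames while only the foreground positions change.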
  • the embodiment of the present application further provides a storage medium, where the storage medium is used to store an application, and the application is used to execute a video encoding method described in the present application at runtime.
  • the embodiment of the present application further provides a storage medium, where the storage medium is used to store an application, and the application is used to execute a video decoding method described in the present application at runtime.
  • The embodiment of the present application further provides a storage medium, where the storage medium is used to store an application, and the application is used to execute, at runtime, a video encoding method for a highway described in the present application.
  • The embodiment of the present application further provides a storage medium, where the storage medium is used to store an application, and the application is used to execute, at runtime, a video decoding method for a highway described in the present application.
  • an embodiment of the present application further provides an application, where the application is used to execute a video encoding method described in the present application at runtime.
  • an embodiment of the present application further provides an application, where the application is used to execute a video decoding method described in the present application at runtime.
  • an embodiment of the present application further provides an application, where the application is used to execute a video encoding method for a highway described in the present application at runtime.
  • the embodiment of the present application further provides an application, where the application is used to execute a video decoding method for a highway described in the present application at runtime.
  • an embodiment of the present application further provides an encoding device, where the encoding device includes: a processor, a memory, a communication interface, and a bus;
  • the processor, the memory, and the communication interface are connected by the bus and complete communication with each other;
  • the memory stores executable program code
  • the processor runs a program corresponding to the executable program code by reading executable program code stored in the memory for:
  • collecting a video image to be transmitted; compressing and encoding the background image in the video image to obtain video compressed data, and structuring the foreground moving target in the video image to obtain foreground target metadata; and transmitting the video compressed data and the foreground target metadata, wherein the foreground target metadata is data storing video structured semantic information.
  • the embodiment of the present application further provides a decoding device, where the decoding device includes: a processor, a memory, a communication interface, and a bus;
  • the processor, the memory, and the communication interface are connected by the bus and complete communication with each other;
  • the memory stores executable program code
  • the processor runs a program corresponding to the executable program code by reading executable program code stored in the memory for:
  • an embodiment of the present application further provides an encoding device, where the encoding device includes: a processor, a memory, a communication interface, and a bus;
  • the processor, the memory, and the communication interface are connected by the bus and complete communication with each other;
  • the memory stores executable program code
  • the processor runs a program corresponding to the executable program code by reading executable program code stored in the memory for:
  • structuring the foreground image of the vehicle to obtain foreground target metadata, wherein the foreground target metadata is data storing video structured semantic information; and mixing the video compressed data and the foreground target metadata to obtain a mixed stream of video data with metadata, and transmitting the mixed stream.
  • the embodiment of the present application further provides a decoding device, where the decoding device includes: a processor, a memory, a communication interface, and a bus;
  • the processor, the memory, and the communication interface are connected by the bus and complete communication with each other;
  • the memory stores executable program code
  • the processor runs a program corresponding to the executable program code by reading executable program code stored in the memory for:
  • One or more of the above aspects may have the following advantages or benefits compared to the prior art.
  • The method of the present application is a video transmission method based on structuring the foreground moving target (or foreground target). It is mainly applied to video monitoring of a fixed scene in which the overall situation in the scene is of interest, such as monitoring the traffic situation on a highway.
  • By structuring the foreground moving target (or foreground target), the data traffic can be effectively reduced and network bandwidth can be saved.
  • FIG. 1 is a schematic structural diagram of a video encoding apparatus according to an embodiment of the present application.
  • FIG. 2 is a schematic flowchart of a video encoding method according to an embodiment of the present application.
  • FIG. 3 is a schematic flowchart of performing foreground detection processing according to an embodiment of the present application.
  • FIG. 4 is a schematic flowchart of performing blob labeling processing according to an embodiment of the present application.
  • FIG. 5 is a schematic flowchart of performing size normalization processing according to an embodiment of the present application.
  • FIG. 6 is a schematic diagram of a contour scan line according to an embodiment of the present application.
  • FIG. 7 is a schematic structural diagram of a video decoding apparatus according to an embodiment of the present application.
  • FIG. 9 is a schematic flowchart diagram of a video decoding method according to an embodiment of the present application.
  • Background refers to the pixel region in a video image that remains stable for a certain period of time relative to the moving foreground.
  • Foreground refers to the pixel region in a video image that changes to some degree relative to the background.
  • Structuring refers to extracting, through video content analysis, the semantic information present in the discrete digital image array (e.g., a structured description of a frame of image: "there is a red car in the image").
  • Metadata refers to data that stores video structured information.
  • the transmitted video data may be a video, a still image, or an animation, or a combination of the foregoing, and is not limited.
  • the video transmission method of the present application is a video transmission method based on foreground moving target structure, which is mainly applied in the case of video monitoring of a fixed scene and an overall situation in the scene, for example, monitoring road traffic conditions of a highway.
  • FIG. 1 is a schematic structural diagram of a video encoding apparatus according to an embodiment of the present application.
  • the video encoding apparatus 101 of this embodiment can transmit video data by wired or wireless means.
  • the apparatus 101 includes a video collection unit 101a, a processing unit 101b, and a data transmission unit 101c.
  • the video collection unit 101a is configured to collect video images to be transmitted.
  • the processing unit 101b performs compression encoding on the background image in the video image to be transmitted to obtain video compressed data, and performs structural processing on the foreground moving target (or foreground target) in the video image to obtain foreground target metadata.
  • the data transmission unit 101c transmits video compression data and foreground target metadata.
  • The video collection unit 101a can be, for example, a video capture card, which converts the analog video output by a video monitoring device such as an analog camera or a video recorder into binary digital information through an analog-to-digital converter and saves it as an editable digital video file.
  • The processing unit 101b performs background modeling on the video image, and detects the foreground moving target based on the established background model to separate the background image and the foreground moving target in the current frame of video image.
  • The background and the foreground are relative concepts. Taking the expressway as an example, when people pay attention to the cars coming and going on the expressway, these vehicles are the foreground, and the road surface and the surrounding environment are the background; when the focus is on pedestrians entering the expressway, the intruder is the foreground, and other things, such as cars, become the background.
  • detecting the foreground target is the basis for the target analysis.
  • the common method for foreground target detection is the background subtraction method, and the key of the background subtraction method is how to establish the background model from the video sequence.
  • A variety of background modeling methods have been proposed for different application environments, such as methods based on a single Gaussian model, methods based on a Gaussian mixture model, statistics-based background modeling methods, and codebook-based modeling methods.
  • The processing unit 101b preferably uses a codebook-based modeling method. The basic idea is to first generate an initial codebook from a video sequence, and then perform temporal filtering using one parameter of each codeword, the longest time for which the codeword has not appeared. The purpose of this is to filter out those codewords in the codebook that may represent foreground images. Finally, through spatial filtering, the codewords that represent less frequently occurring background states and were wrongly deleted in the previous step are restored to the codebook. The background model can be established by this method.
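The temporal-filtering idea can be sketched for a single grayscale pixel as follows. This is a simplified illustration in the style of commonly published codebook models, not the application's own algorithm: the names `Codeword`, `train_pixel`, and `temporal_filter`, the matching tolerance `eps`, and the "at most half the sequence" cutoff are assumptions, and the spatial-filtering recovery step is not shown.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Codeword:
    low: int      # smallest value matched so far
    high: int     # largest value matched so far
    first: int    # index of the first frame that matched
    last: int     # index of the most recent frame that matched
    max_gap: int  # longest run of frames in which the codeword did not appear


def train_pixel(values: List[int], eps: int = 10) -> List[Codeword]:
    """Build the initial codebook for one pixel from its training values."""
    book: List[Codeword] = []
    for t, v in enumerate(values, start=1):
        for cw in book:
            if cw.low - eps <= v <= cw.high + eps:
                cw.low, cw.high = min(cw.low, v), max(cw.high, v)
                cw.max_gap = max(cw.max_gap, t - cw.last - 1)
                cw.last = t
                break
        else:
            # No codeword matched: start a new one; it was absent for t - 1 frames.
            book.append(Codeword(v, v, t, t, t - 1))
    n = len(values)
    for cw in book:
        # Wrap-around gap: frames after the last match plus frames before the first.
        cw.max_gap = max(cw.max_gap, n - cw.last + cw.first - 1)
    return book


def temporal_filter(book: List[Codeword], n_frames: int) -> List[Codeword]:
    """Keep codewords whose longest absence is at most half the sequence."""
    return [cw for cw in book if cw.max_gap <= n_frames // 2]
```

A codeword created by a passing vehicle appears briefly and then stays absent for most of the sequence, so its `max_gap` is large and the filter drops it, while the codeword for the road surface survives.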
  • Foreground moving target detection is then performed based on the background model, that is, the moving foreground target is extracted from the background in the current frame of video image.
  • The processing unit 101b uses the background difference method to detect the foreground target. Specifically, the processing unit 101b subtracts the background model from the current frame of video image. If the difference for a pixel is greater than a certain threshold, the pixel is determined to belong to the foreground moving target; otherwise it belongs to the background image.
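A minimal sketch of this background difference test follows, assuming grayscale images stored as lists of rows. The function name `foreground_mask` and the use of the absolute difference are assumptions for illustration.

```python
from typing import List

Image = List[List[int]]  # grayscale image as rows of pixel values


def foreground_mask(frame: Image, background: Image, threshold: int) -> Image:
    """Mark a pixel 1 (foreground) when it differs from the background by more than threshold."""
    return [
        [1 if abs(p - b) > threshold else 0 for p, b in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]
```

For example, with a uniform background of value 10 and a threshold of 20, only pixels brighter than 30 (or darker than -10, which cannot occur) are marked as foreground.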
  • Using the difference between the current frame of video image and the background model to detect the motion region generally provides relatively complete feature data. The method is simple to operate, and the moving target can be segmented completely and accurately against a fixed background.
  • Moving target detection methods for dynamic backgrounds can also be used, such as the matching method, the optical flow method, or the global motion estimation method; these are not described further here.
  • The processing unit 101b can also eliminate noise by performing morphological opening and closing operations on the foreground image and then discarding the smaller contours.
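Opening and closing can be illustrated on a binary foreground mask with a 3x3 neighbourhood. This sketch treats pixels outside the image as background, which is an assumption of the example; the helper names are likewise hypothetical, and the later step of discarding small contours is not shown.

```python
from typing import List

Mask = List[List[int]]  # binary mask: 1 = foreground, 0 = background


def _morph(mask: Mask, want_all: bool) -> Mask:
    """3x3 erosion (want_all=True) or dilation (want_all=False)."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            window = [
                mask[rr][cc] if 0 <= rr < h and 0 <= cc < w else 0
                for rr in (r - 1, r, r + 1)
                for cc in (c - 1, c, c + 1)
            ]
            out[r][c] = int(all(window)) if want_all else int(any(window))
    return out


def erode(mask: Mask) -> Mask:
    return _morph(mask, True)


def dilate(mask: Mask) -> Mask:
    return _morph(mask, False)


def opening(mask: Mask) -> Mask:
    """Erode then dilate: removes specks smaller than the 3x3 window."""
    return dilate(erode(mask))


def closing(mask: Mask) -> Mask:
    """Dilate then erode: fills small holes inside a region."""
    return erode(dilate(mask))
```

Opening removes isolated noise pixels, and closing fills pinholes inside a detected target, which is why the two are usually applied together before contours are extracted.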
  • In this way, the processing unit 101b can cleanly separate the background image and the foreground target of the current frame of video image. After that, the processing unit 101b compression-encodes the background image in the video and structures the foreground target in the video.
  • The background image may be compression-encoded using Huffman coding, predictive coding, transform coding, and the like. These prior-art techniques are relatively mature and are therefore not described here.
  • the unnecessary data can be removed to reduce the amount of data required to represent the digital image, which facilitates image storage and transmission, and reduces storage space and transmission bandwidth.
  • the processing unit 101b also needs to perform structured processing on the foreground target in the video to obtain foreground target metadata.
  • the metadata is not a large amount of video data, but information that semantically describes the foreground target in the video.
  • If the foreground moving target is a car on the road, the metadata obtained by structuring this target can be as shown in the following table.
  • The extent to which the foreground target can be structured, and how much metadata is obtained, depends on various factors such as the video environment, the video resolution, the image clarity, and the structuring analysis algorithm.
  • The algorithm used to structure the video, as well as the specific definition of the metadata, is not the focus of this solution. Any structuring algorithm that can obtain the above types of metadata can be used.
  • The foreground moving target, which has a larger amount of data than the background image, is structured, so the resulting metadata is not video data but structured semantic information, which can be transmitted as text or as binary data in a designed data structure. Therefore, compared with existing video coding techniques that compress the entire video image, the amount of data is greatly reduced and network bandwidth consumption is further reduced.
  • The data transmission unit 101c of the video encoding apparatus 101 transmits the video compressed data and the foreground target metadata.
  • The compression-encoded video compressed data and the structured foreground target metadata may be mixed into new video data, and the new video data is then transmitted by wireless or wired means.
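One simple way to mix the two kinds of data into a single unit is a length-prefixed container. The layout below (two big-endian 32-bit lengths followed by the two payloads) and the names `mux` and `demux` are assumptions made for this sketch; the application does not specify a container format.

```python
import struct
from typing import Tuple

HEADER = struct.Struct(">II")  # assumed layout: video length, metadata length


def mux(video: bytes, metadata: bytes) -> bytes:
    """Pack compressed background data and metadata into one packet."""
    return HEADER.pack(len(video), len(metadata)) + video + metadata


def demux(packet: bytes) -> Tuple[bytes, bytes]:
    """Split a packet produced by mux back into (video, metadata)."""
    video_len, meta_len = HEADER.unpack_from(packet, 0)
    start = HEADER.size
    video = packet[start:start + video_len]
    metadata = packet[start + video_len:start + video_len + meta_len]
    return video, metadata
```

A receiver can then pass the first part to the video decoder and the second part to the metadata interpreter.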
  • Alternatively, the data transmission unit 101c directly transmits the compression-encoded video compressed data and the structured foreground target metadata as two separate, independent pieces of data by wireless or wired means.
  • When the second transmission mode is used, that is, when the two are transmitted separately as independent data, the video compressed data only needs to be transmitted once, or once every set interval.
  • The foreground moving target differs more or less between frames, so the foreground target metadata needs to be transmitted in real time. Since the background image of each frame of video image does not need to be compression-encoded and transmitted in real time, the data processing load of the processing unit is reduced, and so are the network resources occupied by data transmission.
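The resulting schedule can be sketched as a small generator: metadata goes out with every frame, and the background goes out only when the set interval has elapsed. The function name `plan_transmission`, the string labels, and the use of frame timestamps in seconds are assumptions of this example.

```python
from typing import Iterable, Iterator, List, Optional, Tuple


def plan_transmission(
    frame_times: Iterable[float], background_interval: float
) -> Iterator[Tuple[float, List[str]]]:
    """Yield (time, items): metadata for every frame, background once per interval."""
    last_background: Optional[float] = None
    for t in frame_times:
        items = ["metadata"]
        if last_background is None or t - last_background >= background_interval:
            items.insert(0, "background")  # send the background before its metadata
            last_background = t
        yield t, items
```

With frames at 0, 1, 2, and 3 seconds and a 2-second interval, the background is sent at 0 and 2 seconds only, while metadata is sent with all four frames.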
  • The video encoding apparatus 101 of this embodiment may further include a storage unit that stores the background video compressed data and the foreground target metadata.
  • The data transmission unit 101c may retrieve the data from the storage unit.
  • the specific type of storage unit is not limited herein.
  • FIG. 2 shows a flow of a video encoding method according to an embodiment of the present application, which may be performed in the above apparatus, and the method includes the following steps.
  • step S210 the video collection unit 101a collects a video image to be transmitted.
  • step S220 the processing unit 101b performs compression encoding on the background image in the video image to be transmitted to obtain video compression data, and performs structural processing on the foreground moving target (or foreground target) in the video image to obtain foreground target metadata.
  • the processing flow for obtaining the metadata of the target type by the processing unit 101b will be described in detail below by taking a structured algorithm that does not set the target type as an example.
  • First, the processing unit 101b performs foreground detection.
  • Foreground detection means determining the foreground points in the current input image (the pixels that are in motion against the background). To decide whether a pixel is in motion, the difference between the current input pixel and the corresponding background image pixel is computed; when the difference exceeds the set range, the pixel is considered a foreground point.
  • x denotes the pixel value of pixel X
  • b(x) denotes the pixel value of the background image corresponding to pixel X
  • T denotes the set threshold; a result of 1 indicates that pixel X is a foreground point.
  • The background image is obtained by maintaining a background model that combines several kinds of image information. It absorbs illumination changes in the environment and filters out interference such as rain and snow. The foreground detection result is shown in FIG. 3, where white points in the foreground map are foreground points and black points are background points.
  • Then, the processing unit 101b performs foreground-point blob labeling.
  • The foreground detection stage only determines whether each pixel in the input image is a foreground point (a moving point); it does not determine which target each foreground point belongs to.
  • The foreground points belonging to a moving target are usually spatially contiguous and appear in the image as a foreground blob, and the contours of these blobs are usually closed.
  • Blob labeling can be regarded as a process of contour search and contour tracing. Each blob corresponds to a unique contour, and by finding these contours each foreground blob in the foreground image can be labeled.
  • The result of blob labeling is shown in FIG. 4, in which five blobs, numbered 1 to 5, are obtained.
  • Next, the processing unit 101b performs target tracking and extraction.
  • The length of the trajectory should meet the requirement; a trajectory that is too short indicates a short-lived disturbance in the background.
  • The trajectory of the blob should match the motion characteristics of a normal moving target: it should be regular rather than erratic.
  • The speed of the blob should not be too high.
  • The initialization steps include the following:
  • The trajectory information of the blob over its first 15 frames (the position of the target at each time point) is saved, and the Kalman filter corresponding to the target is updated.
  • The YUV color histogram of the blob within its current image region is computed, and the histogram is saved as the feature template of the target.
  • The purpose of target tracking is to establish the positional correspondence of the target over time (its trajectory).
  • the process of determining the position of the target in the current image is described as follows:
  • The Kalman filter corresponding to the target is used to predict the target's position.
  • The Kalman filter stores the speed and direction of the target and can predict where the target will be at the next time step:
  • state_post denotes the currently predicted position
  • T denotes the Kalman transition matrix
  • state_pre denotes the corrected values of the target's coordinates, velocity, and acceleration from the Kalman filter at the previous time step.
  • Mean Shift target tracking is performed at the predicted position of the target.
  • Foreground blobs are searched for at the final position of the Mean Shift tracking. If no foreground blob exists at that position, tracking fails. If one exists, its size and center position are taken as the tracking result.
  • The Kalman filter parameters and state_pre are corrected using the position from the tracking result.
  • The YUV color histogram over the target's current extent is computed and used to update the target's feature template, and the target's size is updated.
  • At the tracked position, within the range bounded by the target's size, the foreground image is extracted; it represents the position and shape of the target.
  • The YUV color histogram represents the probability with which each YUV value occurs in the image. For a YUV value (y, u, v), its probability of occurrence P(y, u, v) in the image is:
  • YUV(x, y) represents the YUV value of the image at position (x, y)
  • M and N respectively indicate the height and width of the image
  • U(x, y) represents the U component value of the image at position (x, y)
  • V(x, y) represents the V component value of the image at position (x, y)
  • M and N respectively indicate the height and width of the image
  • The color histogram of the target is a compressed histogram computed from the joint probability distribution; it occupies less memory than a conventional color histogram, and the amount of data involved in computation is greatly reduced.
  • Next, the processing unit 101b normalizes the size of the target.
  • The moving targets obtained after extraction and tracking can differ greatly in size.
  • A special scaling method is adopted: scaling factors for the width and height directions are set according to the target's aspect ratio, so that the aspect ratio is the same before and after size normalization.
  • Suppose the current target has width w and height h; its size is normalized as follows:
  • scale_w and scale_h denote the scaling factors in the width and height directions, respectively
  • Next, the processing unit 101b extracts the contour features of the target.
  • At each point on the sides of the rectangle, a contour scan line perpendicular to that side is drawn, and the distance along the scan line from the rectangle boundary to the target contour is recorded.
  • One feature value (the average length of four contour scan lines) can be saved for every four points. This not only greatly reduces the amount of data but also filters out the effect of some image noise points on the feature data.
  • FIG. 6 is a schematic diagram of the contour scan lines; the white lines are the contour scan lines at the corresponding points of the target's bounding rectangle.
  • Finally, the processing unit 101b performs classification using an SVM classifier.
  • The extracted contour features are numerically normalized so that each feature value is scaled to the range 0 to 1; the feature values are then input into a pre-trained SVM classifier, and the target type is determined from the classifier output.
  • Depending on the application, structuring algorithms for targets of a preset type are also available: a vehicle structuring algorithm (including license plate recognition, body color classification, vehicle sub-brand recognition, and so on), a person structuring algorithm (including height, age group, gender, whether glasses are worn, clothing color, and so on), and a moving-target structuring algorithm (including target type, speed, direction of motion, position, and so on).
  • step S230 the data transmission unit 101c transmits the video compression data and the foreground target metadata.
  • Before transmitting the video image, the video transmission apparatus 101 of this embodiment processes the background image and the foreground moving target in the image separately: it compression-encodes the background image to obtain compressed video data and structures the foreground moving target to obtain foreground target metadata.
  • Because metadata is not video data but structured semantic information, it can be transmitted as text or, with a designed data structure, as binary data, so the amount of video data is greatly reduced and network bandwidth consumption is reduced further.
  • FIG. 7 is a schematic structural diagram of a video decoding apparatus according to an embodiment of the present application.
  • the video decoding device 201 of this embodiment can decode and display video compressed data and foreground target metadata.
  • the device 201 includes a video display unit 201a, a processing unit 201b, and a data receiving unit 201c.
  • the data receiving unit 201c receives the video compressed data and the foreground target metadata transmitted from the data transmission unit 101c by wire or wirelessly.
  • the processing unit 201b decodes the video compressed data and interprets the foreground target metadata.
  • the video display unit 201a performs composite display on the decoded background image and the interpreted foreground moving target.
  • the processing unit 201b decodes the background video data to obtain a background image, and the video display unit 201a displays the decoded background image. It should be noted that the decoding method used by the processing unit 201b corresponds to the method for encoding the background image, and the decoding process and the decoding algorithm specifically involved are not described herein.
  • the processing unit 201b interprets the foreground target metadata to obtain a foreground image.
  • As to how the processing unit 201b interprets the foreground target metadata, this embodiment gives the following two methods.
  • In Method 1, the device 201 further includes a storage unit 201d that pre-stores pictures of various types of foreground targets. For example, if vehicles on a highway are monitored, the storage unit 201d can store a large number of pictures of vehicles of different colors, models, and brands.
  • When the processing unit 201b interprets the received foreground target metadata, it can use the information in the metadata to find, among the pictures pre-stored in the storage unit 201d, the foreground target picture that matches or is closest to the metadata description.
  • That foreground target picture is used as the foreground moving target.
  • The video display unit 201a superimposes the foreground target picture on the decoded and displayed background image according to the target position and the time of appearance described by the metadata, producing a composite display of the background image and the foreground moving target.
  • Figure 8(a) is a background image.
  • the brand is Volkswagen
  • the sub-brand is Touran
  • the direction of motion is 45 degrees toward the upper left.
  • A corresponding picture is found among the pre-stored pictures (FIG. 8(b)); it is the one closest to the moving foreground described by the metadata.
  • The vehicle picture of FIG. 8(b) is superimposed on the background image, giving the result shown in FIG. 8(c).
  • This method interprets the foreground target metadata quickly, so data processing is fast.
  • Where the moving target need not be rendered very clearly, for example when monitoring vehicle traffic, the desired result is obtained quickly.
  • The processing unit 201b of the video decoding device 201 interprets the received foreground target metadata and, according to the information provided in the metadata, uses display drawing techniques to draw the foreground moving target described by the metadata directly over the decoded and displayed background image, producing a composite display of the background image and the foreground image.
  • This method does not need to store a large number of target pictures.
  • Although its data processing is slower than that of Method 1, the target picture it produces is more accurate and can restore the original video image more faithfully.
  • FIG. 9 shows the flow of a video decoding method according to an embodiment of the present application.
  • The method can be performed in the apparatus described above and includes the following steps.
  • In step S410, the data receiving unit 201c receives the compressed video data and the foreground target metadata.
  • In step S420, the processing unit 201b decodes the compressed video data and interprets the foreground target metadata.
  • In step S430, the video display unit 201a displays the decoded background image composited with the interpreted foreground moving target.
  • The image is one of a series of images captured while a camera monitors highway traffic flow.
  • When monitoring highway traffic, the main concern is information such as how many vehicles are on the road and what models they are.
  • The video image of FIG. 10(a) includes a still scene (e.g., trees and buildings) and moving targets (e.g., a small passenger car and a large passenger car).
  • The video frame is separated into a background image containing only the still scene (as shown in the upper diagram of FIG. 10(b)) and a foreground image containing only the moving targets, namely the small passenger car and the large passenger car (as shown in FIG. 10(b)).
  • The background image is compression-encoded into compressed video data in digital array form.
  • The small passenger car image and the large passenger car image are structured separately (see FIG. 10(d)).
  • the foreground target metadata includes at least: vehicle type, vehicle color, vehicle brand, vehicle model, license plate number, position of the foreground target in the frame video image, and time of the frame video image.
  • the foreground target metadata obtained after structured processing is: model: small passenger car; color: red; brand: Audi; model: A4; license plate: xxxxx; location: xxxxx; time: xxxxxx.
  • the foreground target metadata obtained after structured processing is: model: large passenger car; color: red; brand: Yutong; model: xx; license plate: xxxxx; location: xxxxx; time: xxxxxx.
  • The compressed video data A and the foreground target metadata B+C are mixed to obtain a mixed video data stream D carrying metadata, and the mixed stream D is transmitted or stored.
  • The mixed video data stream D carrying metadata is parsed to obtain the compressed video data A and the foreground target metadata B+C, and the compressed video data A is then decoded to obtain the background image.
  • The foreground target metadata B+C is interpreted to obtain the foreground image.
  • The foreground image is superimposed at the corresponding position on the background image for composite display, restoring the captured video image.
  • Before transmitting the video image, the present application processes the background image and the foreground moving target in the image separately: it compression-encodes the background image to obtain compressed video data and structures the foreground moving target to obtain foreground target metadata.
  • Because metadata is not video data but structured semantic information, it can be transmitted as text or, with a designed data structure, as binary data, so the amount of video data is greatly reduced and network bandwidth consumption is reduced further.
  • the embodiment of the present application further provides a storage medium, where the storage medium is used to store an application, and the application is used to execute a video encoding method described in the present application at runtime.
  • a video encoding method as described in the present application includes:
  • the embodiment of the present application further provides a storage medium, where the storage medium is used to store an application, and the application is used to execute a video decoding method described in the present application at runtime.
  • the video decoding method of the present application includes:
  • the background image obtained after decoding and the interpreted foreground moving target are synthesized and displayed.
  • The embodiment of the present application further provides a storage medium, where the storage medium is used to store an application, and the application is used to execute, at runtime, a video encoding method for a highway described in the present application.
  • The video encoding method for a highway described in the present application includes:
  • the video compressed data and the foreground target metadata are mixed to obtain a mixed stream of video data with metadata, and the mixed stream is transmitted.
  • The embodiment of the present application further provides a storage medium, where the storage medium is used to store an application, and the application is used to execute, at runtime, a video decoding method for a highway described in the present application.
  • The video decoding method for a highway described in the present application includes:
  • the foreground image is superimposed on the corresponding position of the background image, and the combined display is performed to restore the captured video image.
  • an embodiment of the present application further provides an application, where the application is used to execute a video encoding method described in the present application at runtime.
  • a video encoding method as described in the present application includes:
  • an embodiment of the present application further provides an application, where the application is used to execute a video decoding method described in the present application at runtime.
  • the video decoding method of the present application includes:
  • the background image obtained after decoding and the interpreted foreground moving target are synthesized and displayed.
  • an embodiment of the present application further provides an application, where the application is used to execute a video encoding method for a highway described in the present application at runtime.
  • the video coding method for the expressway described in the present application includes:
  • the video compressed data and the foreground target metadata are mixed to obtain a mixed stream of video data with metadata, and the mixed stream is transmitted.
  • the embodiment of the present application further provides an application, where the application is used to execute a video decoding method for a highway described in the present application at runtime.
  • the video decoding method for the expressway described in the present application includes:
  • the foreground image is superimposed on the corresponding position of the background image, and the combined display is performed to restore the captured video image.
  • an embodiment of the present application further provides an encoding device, where the encoding device includes: a processor, a memory, a communication interface, and a bus;
  • the processor, the memory, and the communication interface are connected by the bus and complete communication with each other;
  • the memory stores executable program code
  • the processor runs a program corresponding to the executable program code by reading executable program code stored in the memory for:
  • the embodiment of the present application further provides a decoding device, where the decoding device includes: a processor, a memory, a communication interface, and a bus;
  • the processor, the memory, and the communication interface are connected by the bus and complete communication with each other;
  • the memory stores executable program code
  • the processor runs a program corresponding to the executable program code by reading executable program code stored in the memory for:
  • the background image obtained after decoding and the interpreted foreground moving target are synthesized and displayed.
  • an embodiment of the present application further provides an encoding device, where the encoding device includes: a processor, a memory, a communication interface, and a bus;
  • the processor, the memory, and the communication interface are connected by the bus and complete communication with each other;
  • the memory stores executable program code
  • the processor runs a program corresponding to the executable program code by reading executable program code stored in the memory for:
  • the video compressed data and the foreground target metadata are mixed to obtain a mixed stream of video data with metadata, and the mixed stream is transmitted.
  • the embodiment of the present application further provides a decoding device, where the decoding device includes: a processor, a memory, a communication interface, and a bus;
  • the processor, the memory, and the communication interface are connected by the bus and complete communication with each other;
  • the memory stores executable program code
  • the processor runs a program corresponding to the executable program code by reading executable program code stored in the memory for:
  • the foreground image is superimposed on the corresponding position of the background image, and the combined display is performed to restore the captured video image.
  • The components of the apparatus and the steps of the method provided by the embodiments of the present application may be concentrated on a single computing device or distributed across a network of multiple computing devices. Alternatively, they may be implemented in program code executable by a computing device, so that they may be stored in a storage device and executed by the computing device; or they may each be fabricated as individual integrated circuit modules, or several of the modules or steps may be fabricated as a single integrated circuit module. Thus, the application is not limited to any particular combination of hardware and software.

Abstract

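The present application provides a video encoding method, a video decoding method, and apparatuses therefor. The video encoding apparatus includes: a video capture unit for capturing video images; a processing unit for compression-encoding the background image in the video images to obtain compressed video data, and for structuring the foreground moving targets in the video images to obtain foreground target metadata; and a data transmission unit for transmitting the compressed video data and the foreground target metadata, where the foreground target metadata is data that stores structured semantic information about the video. When moving targets are numerous or large, the amount of video data can be reduced effectively, easing the network bandwidth constraint during transmission.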

Description

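Video encoding method, decoding method, and apparatuses therefor
This application claims priority to Chinese Patent Application No. 201510216640.0, entitled "Video encoding method, decoding method, and apparatuses therefor", filed with the Chinese Patent Office on April 30, 2015, the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the field of video surveillance, and in particular to a video encoding method, a video decoding method, and apparatuses therefor.
Background
With the continuous development of multimedia information technology, video information has emerged in large quantities. As a comprehensive medium for expressing information, video data has become an important information carrier in everyday life.
Take surveillance imaging equipment as an example: the clearer the images it captures, the more video data it produces. If this video data is transmitted without any processing, it occupies a large amount of network bandwidth, and the storage space required to store it is correspondingly large, so the cost of both network transmission and data storage is high. The prior art therefore proposes a method in which the video data is compression-encoded and the compressed video file is then transmitted, reducing the network bandwidth occupied during transmission and lowering cost.
Although this method reduces the network bandwidth occupied during transmission to some extent, it still applies a conventional video encoding method to the video data as a whole and then transmits the encoded data. Consequently, when the moving targets in the video image are numerous and large, considerable network bandwidth is still occupied.
A solution is therefore urgently needed that, when moving targets are numerous or large, effectively reduces the amount of video data and eases the network bandwidth constraint during transmission.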
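Summary
One of the technical problems to be solved by the present application is to provide a video encoding apparatus that, when moving targets are numerous or large, effectively reduces the amount of video data and eases the network bandwidth constraint during transmission.
To solve the above technical problem, an embodiment of the present application first provides a video encoding apparatus, including: a video capture unit for capturing video images; a processing unit for compression-encoding the background image in the video images to obtain compressed video data, and for structuring the foreground moving target in the video images to obtain foreground target metadata; and a data transmission unit for transmitting the compressed video data and the foreground target metadata, where the foreground target metadata is data that stores structured semantic information about the video.
In one embodiment, the processing unit is further configured to perform background modeling on the video images and to detect the foreground moving target based on the established background model, so as to separate the background image and the foreground moving target in the current video frame.
In one embodiment, the data transmission unit transmits the compressed video data corresponding to the background image at set time intervals, and transmits the foreground target metadata corresponding to the foreground moving target in real time.
In one embodiment, when structuring the foreground moving target in the video images, the processing unit uses structuring algorithms that include a structuring algorithm with no preset target type and structuring algorithms for targets of a preset type.
According to another aspect of the present application, a video decoding apparatus is further provided, including: a data receiving unit for receiving compressed video data and foreground target metadata; and a processing unit for decoding the compressed video data and interpreting the foreground target metadata.
In one embodiment, the apparatus further includes a storage unit for storing images; the processing unit further selects, according to the information in the foreground target metadata, a corresponding foreground target image from the storage unit as the foreground moving target, thereby interpreting the foreground target metadata.
In one embodiment, the processing unit, according to the information in the foreground target metadata, uses display drawing techniques to draw the foreground moving target described by the foreground target metadata over the decoded background image, thereby interpreting the foreground target metadata.
In one embodiment, the apparatus further includes a video display unit for displaying the decoded background image composited with the interpreted foreground moving target.
According to another aspect of the present application, a video transmission and display system is further provided, including the video encoding apparatus described above and the video decoding apparatus described above.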
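According to another aspect of the present application, a video encoding method is further provided, including: capturing a video image to be transmitted; compression-encoding the background image in the video image to obtain compressed video data, and structuring the foreground moving target in the video image to obtain foreground target metadata; and transmitting the compressed video data and the foreground target metadata, where the foreground target metadata is data that stores structured semantic information about the video.
In one embodiment, the method further includes: performing background modeling on the video image and detecting the foreground moving target based on the established background model, so as to separate the background image and the foreground moving target in the current video frame.
In one embodiment, the compressed video data corresponding to the background image is transmitted at set time intervals, and the foreground target metadata corresponding to the foreground moving target is transmitted in real time.
In one embodiment, when the foreground moving target in the video image is structured, the structuring algorithms used include a structuring algorithm with no preset target type and structuring algorithms for targets of a preset type.
According to another aspect of the present application, a video decoding method is further provided, including: receiving compressed video data and foreground target metadata; decoding the compressed video data and interpreting the foreground target metadata; and displaying the decoded background image composited with the interpreted foreground moving target.
In one embodiment, the step of displaying the decoded background image composited with the interpreted foreground moving target further includes: selecting, according to the information in the foreground target metadata, a corresponding foreground target image from pre-stored images as the foreground moving target, and displaying that foreground target image composited with the decoded background image.
In one embodiment, the step of displaying the decoded background image composited with the interpreted foreground moving target further includes: according to the information in the foreground target metadata, using display drawing techniques to draw the foreground moving target described by the foreground target metadata over the decoded background image.
According to another aspect of the present application, a video encoding method for a highway is further provided, including: capturing video images of the highway; separating a video frame, according to a background model, into a background image containing the still scene and a foreground image containing the moving target vehicles; compression-encoding the background image into compressed video data in digital array form, and structuring the foreground image of the moving target vehicles to obtain foreground target metadata, where the foreground target metadata is data that stores structured semantic information about the video; and mixing the compressed video data and the foreground target metadata to obtain a mixed video data stream carrying metadata, and transmitting the mixed stream.
In one embodiment, the foreground target metadata includes at least: vehicle type, vehicle color, vehicle brand, vehicle model, license plate number, the position of the foreground target in the video frame, and the time of the video frame.
According to another aspect of the present application, a video decoding method for a highway is further provided, including: parsing a mixed video data stream carrying metadata to obtain compressed video data and foreground target metadata; decoding the compressed video data to obtain a background image, and interpreting the foreground target metadata to obtain a foreground image; and, according to the position information and time information in the metadata, superimposing the foreground image at the corresponding position on the background image for composite display, restoring the captured video image.
In one embodiment, the step of interpreting the foreground target metadata to obtain a foreground image includes: selecting, according to the information in the foreground target metadata, a corresponding foreground target image as the foreground moving target; or, according to the information in the foreground target metadata, using display drawing techniques to draw the foreground moving target described by the foreground target metadata over the decoded background image.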
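To solve the above technical problem, an embodiment of the present application further provides a storage medium for storing an application program, where the application program, when run, executes the video encoding method described in the present application.
To solve the above technical problem, an embodiment of the present application further provides a storage medium for storing an application program, where the application program, when run, executes the video decoding method described in the present application.
To solve the above technical problem, an embodiment of the present application further provides a storage medium for storing an application program, where the application program, when run, executes the video encoding method for a highway described in the present application.
To solve the above technical problem, an embodiment of the present application further provides a storage medium for storing an application program, where the application program, when run, executes the video decoding method for a highway described in the present application.
To solve the above technical problem, an embodiment of the present application further provides an application program that, when run, executes the video encoding method described in the present application.
To solve the above technical problem, an embodiment of the present application further provides an application program that, when run, executes the video decoding method described in the present application.
To solve the above technical problem, an embodiment of the present application further provides an application program that, when run, executes the video encoding method for a highway described in the present application.
To solve the above technical problem, an embodiment of the present application further provides an application program that, when run, executes the video decoding method for a highway described in the present application.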
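To solve the above technical problem, an embodiment of the present application further provides an encoding device, including: a processor, a memory, a communication interface, and a bus;
the processor, the memory, and the communication interface are connected through the bus and communicate with one another;
the memory stores executable program code;
the processor runs a program corresponding to the executable program code by reading the executable program code stored in the memory, so as to:
capture a video image to be transmitted; compression-encode the background image in the video image to obtain compressed video data, and structure the foreground moving target in the video image to obtain foreground target metadata; and transmit the compressed video data and the foreground target metadata, where the foreground target metadata is data that stores structured semantic information about the video.
To solve the above technical problem, an embodiment of the present application further provides a decoding device, including: a processor, a memory, a communication interface, and a bus;
the processor, the memory, and the communication interface are connected through the bus and communicate with one another;
the memory stores executable program code;
the processor runs a program corresponding to the executable program code by reading the executable program code stored in the memory, so as to:
receive compressed video data and foreground target metadata; decode the compressed video data and interpret the foreground target metadata; and display the decoded background image composited with the interpreted foreground moving target.
To solve the above technical problem, an embodiment of the present application further provides an encoding device, including: a processor, a memory, a communication interface, and a bus;
the processor, the memory, and the communication interface are connected through the bus and communicate with one another;
the memory stores executable program code;
the processor runs a program corresponding to the executable program code by reading the executable program code stored in the memory, so as to:
capture video images of a highway; separate a video frame, according to a background model, into a background image containing the still scene and a foreground image containing the moving target vehicles; compression-encode the background image into compressed video data in digital array form, and structure the foreground image of the moving target vehicles to obtain foreground target metadata, where the foreground target metadata is data that stores structured semantic information about the video; and mix the compressed video data and the foreground target metadata to obtain a mixed video data stream carrying metadata, and transmit the mixed stream.
To solve the above technical problem, an embodiment of the present application further provides a decoding device, including: a processor, a memory, a communication interface, and a bus;
the processor, the memory, and the communication interface are connected through the bus and communicate with one another;
the memory stores executable program code;
the processor runs a program corresponding to the executable program code by reading the executable program code stored in the memory, so as to:
parse a mixed video data stream carrying metadata to obtain compressed video data and foreground target metadata; decode the compressed video data to obtain a background image, and interpret the foreground target metadata to obtain a foreground image; and, according to the position information and time information in the metadata, superimpose the foreground image at the corresponding position on the background image for composite display, restoring the captured video image.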
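Compared with the prior art, one or more embodiments of the above solution may have the following advantages or beneficial effects.
The method of the present application is a video transmission method based on structuring foreground moving targets (also called foreground targets). It is mainly applied to video surveillance of a fixed scene and of the overall situation within that scene, for example monitoring traffic flow on a highway. By compression-encoding the background in the video image, structuring the foreground targets, and then transmitting the resulting compressed video data and metadata, data traffic can be reduced effectively and network bandwidth saved.
Other features and advantages of the present application will be set forth in the description that follows and will in part become apparent from the description or be understood by practicing the technical solution of the present application. The objectives and other advantages of the present application can be realized and obtained through the structures and/or processes particularly pointed out in the description, the claims, and the drawings.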
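Brief Description of the Drawings
To describe the technical solutions of the embodiments of the present application and of the prior art more clearly, the drawings needed for the embodiments and the prior art are briefly introduced below. Evidently, the drawings described below show only some embodiments of the present application, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
FIG. 1 is a schematic structural diagram of a video encoding apparatus according to an embodiment of the present application;
FIG. 2 is a schematic flowchart of a video encoding method according to an embodiment of the present application;
FIG. 3 is a schematic diagram of foreground detection processing according to an embodiment of the present application;
FIG. 4 is a schematic diagram of blob labeling processing according to an embodiment of the present application;
FIG. 5 is a schematic diagram of size normalization processing according to an embodiment of the present application;
FIG. 6 is a schematic diagram of contour scan lines according to an embodiment of the present application;
FIG. 7 is a schematic structural diagram of a video decoding apparatus according to an embodiment of the present application;
FIGS. 8(a)-(c) are explanatory diagrams of compositing a video image;
FIG. 9 is a schematic flowchart of a video decoding method according to an embodiment of the present application;
FIGS. 10(a)-(f) are explanatory diagrams of transmitting and displaying a video image.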
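Detailed Description
The embodiments of the present application are described in detail below with reference to the drawings and examples, so that how the application applies technical means to solve technical problems and achieve the corresponding technical effects can be fully understood and implemented. Provided they do not conflict, the embodiments and the individual features of the embodiments may be combined with one another, and the resulting technical solutions all fall within the scope of protection of the present application.
In addition, the steps shown in the flowcharts of the drawings may be executed in a computer system, for example as a set of computer-executable instructions. Moreover, although a logical order is shown in the flowcharts, in some cases the steps shown or described may be executed in a different order.
The terms used in the present application are explained below. "Background" refers to a pixel region in the video image that remains stable for a certain period of time relative to the moving foreground. "Foreground" refers to a pixel region in the video image that has changed to some degree relative to the background. "Structuring" refers to extracting, through video content analysis, the semantic information present in a discrete digital image array (for example, a structured description of one frame: "there is a red car in the image"). "Metadata" refers to data that stores the structured information of the video.
In the embodiments of the present application, the transmitted video data may be video, still images, animation, or a combination of these, without limitation.
The video transmission method of the present application is a video transmission method based on structuring foreground moving targets. It is mainly applied to video surveillance of a fixed scene and of the overall situation within that scene, for example monitoring traffic flow on a highway. By compression-encoding the background image in the video image, structuring the foreground targets, and then transmitting the resulting compressed video data and metadata, data traffic can be reduced effectively and network bandwidth saved.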
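(Embodiment 1)
FIG. 1 is a schematic structural diagram of a video encoding apparatus according to an embodiment of the present application. The video encoding apparatus 101 of this embodiment can transmit video data over a wired or wireless link.
The apparatus 101 includes a video capture unit 101a, a processing unit 101b, and a data transmission unit 101c. The video capture unit 101a captures the video image to be transmitted. The processing unit 101b compression-encodes the background image in the video image to be transmitted to obtain compressed video data, and structures the foreground moving target (also called the foreground target) in the video image to obtain foreground target metadata. The data transmission unit 101c transmits the compressed video data and the foreground target metadata.
The video capture unit 101a may be, for example, a video capture card, which mainly converts the analog video output by a video surveillance device such as an analog camera or video recorder into binary digital information through an analog-to-digital converter, and saves it as an editable digital video file.
To process the background image and the foreground target in the video image separately, the processing unit 101b performs background modeling on the video image and detects the foreground moving target based on the established background model, so as to separate the background image and the foreground target in the current video frame.
It should be noted that background and foreground are relative concepts. Taking a highway as an example: when the cars travelling on the highway are of interest, those vehicles are the foreground, and the road surface and surroundings are the background; when pedestrians intruding onto the highway are of interest, the intruders are the foreground, and everything else, including the cars, becomes the background.
Moreover, detecting foreground targets is the basis for target analysis. A common method for foreground target detection is background subtraction, and the key to background subtraction is how to build a background model from the video sequence. Various background modeling methods have been proposed for different application environments; commonly used ones include methods based on a single Gaussian model, methods based on a Gaussian mixture model, statistics-based background modeling, and codebook-based modeling.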
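In this embodiment, the processing unit 101b preferably uses the codebook-based modeling method. Its basic idea is first to generate an initial codebook from the video sequence, and then to apply temporal filtering using one parameter of each codeword, the "longest non-occurrence time". The purpose is to filter out codewords in the codebook that may represent the foreground image. Finally, spatial filtering restores to the codebook those codewords that represent rarely occurring background states and were wrongly deleted in the previous step. The background model is built in this way.
After the processing unit 101b has built the background model, it performs foreground moving target detection based on that model, that is, it extracts the moving foreground targets from the background image in the current video frame.
Because the video surveillance device in this embodiment, for example an analog camera, does not move during the whole monitoring process, the processing unit 101b preferably uses background differencing to detect foreground targets. Specifically, the processing unit 101b subtracts the background model from the current video frame; if the pixel difference exceeds a threshold, the pixel is judged to belong to a foreground moving target, and otherwise to the background image. Detecting the motion region from the difference between the current video frame and the background model generally provides fairly complete feature data; the method is also simple to operate and, with a fixed background, segments moving targets completely and accurately.
It is easy to see that where the camera moves during monitoring, for example by translation, rotation, or multi-degree-of-freedom motion, a moving target detection method for dynamic backgrounds may be used instead, such as matching, optical flow, or global motion estimation; these are not described further here.
In general, the detected foreground contains a good deal of noise, so the processing unit 101b may also apply morphological opening and closing to the foreground image and then discard relatively small contours to remove the noise. Through these operations the processing unit 101b separates the background image and the foreground targets of the current video frame well. After that, the processing unit 101b compression-encodes the background image in the video and structures the foreground targets in the video.
The background image may be compression-encoded using Huffman coding, predictive coding, transform coding, and the like; since the prior art is mature, these are not described further here. Compression-encoding the background image removes redundant data and reduces the amount of data needed to represent the digital image, which facilitates storage and transmission and reduces storage space and transmission bandwidth.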
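In addition, the processing unit 101b needs to structure the foreground targets in the video to obtain foreground target metadata.
It should be noted that metadata is not bulk video data but information that describes the foreground targets in the video semantically. For example, if the foreground moving target is a car on the road, the metadata obtained by structuring this target may be as shown in the table below.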
Figure PCTCN2015098060-appb-000001
Figure PCTCN2015098060-appb-000002
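It should be noted that the degree to which a foreground target can be structured, and how much metadata is obtained, depends on many factors such as the video environment, video resolution, clarity, and the structuring analysis algorithm. The structuring algorithm itself and the specific definition of the metadata are not the focus of this solution; any structuring algorithm capable of obtaining metadata of the above types may be used.
On top of the compression encoding of the background image, the processing unit 101b also structures the foreground moving targets, which carry a larger amount of data than the background image. Because the resulting metadata is not video data but structured semantic information, it can be transmitted as text or, with a designed data structure, as binary data. Compared with existing approaches that use video coding to compress the entire video image, the amount of data is greatly reduced, and network bandwidth consumption can be reduced further.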
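To give a rough sense of how compact such metadata can be, the sketch below serializes one metadata record both as JSON text and as a fixed-layout binary structure. The field names, the example values, and the binary layout are all assumptions made for illustration; the application does not define a metadata format.

```python
import json
import struct

# Hypothetical metadata record for one foreground target. The field names
# are illustrative assumptions, not a format defined by the application.
record = {
    "type": "car",
    "color": "red",
    "x": 320, "y": 180,            # position in the frame, in pixels
    "w": 96, "h": 54,              # bounding-box size, in pixels
    "timestamp_ms": 1430352000000,
}

# Assumed binary layout: two 8-byte strings, four uint16 fields, one uint64,
# little-endian with no padding (32 bytes in total).
BINARY_LAYOUT = "<8s8sHHHHQ"


def to_text(rec):
    """Encode the record as compact UTF-8 JSON text."""
    return json.dumps(rec, separators=(",", ":")).encode("utf-8")


def to_binary(rec):
    """Pack the record into the fixed-size binary structure."""
    return struct.pack(
        BINARY_LAYOUT,
        rec["type"].encode("ascii"), rec["color"].encode("ascii"),
        rec["x"], rec["y"], rec["w"], rec["h"], rec["timestamp_ms"],
    )


def from_binary(blob):
    """Inverse of to_binary; strips the zero padding from the strings."""
    t, c, x, y, w, h, ts = struct.unpack(BINARY_LAYOUT, blob)
    return {
        "type": t.rstrip(b"\0").decode("ascii"),
        "color": c.rstrip(b"\0").decode("ascii"),
        "x": x, "y": y, "w": w, "h": h, "timestamp_ms": ts,
    }
```

For this example record the JSON form is under a hundred bytes and the binary form is 32 bytes per target, far less than the pixel data of the image region the record describes.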
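The data transmission unit 101c of the video transmission apparatus 101 transmits the compressed video data and the foreground target metadata. In one transmission mode, the compression-encoded video data and the structured foreground target metadata are mixed into new video data, which is then transmitted over a wired or wireless link. Alternatively, the data transmission unit 101c transmits the compression-encoded video data and the structured foreground target metadata separately, as two independent kinds of data, over a wired or wireless link.
It should be noted that in the second transmission mode, in which the two kinds of data are transmitted separately, the compressed video data needs to be transmitted only once, or once per set time interval, because the video scene of this embodiment is fixed. The foreground moving target, however, differs more or less from frame to frame, so the foreground target metadata must be transmitted in real time. Because the background image of each video frame does not have to be compression-encoded and transmitted in real time, the processing load on the processing unit and the network resources occupied by data transmission are both reduced.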
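The second transmission mode can be sketched as a simple per-frame schedule. This is a minimal illustration only: the function name `plan_transmission` is hypothetical, and expressing the "set time interval" as a number of frames is an assumption.

```python
def plan_transmission(num_frames, background_interval):
    """Return, for each frame, the list of payloads that would be sent.

    The background is sent on frame 0 and then once every
    `background_interval` frames; foreground metadata is sent on every frame.
    """
    if background_interval <= 0:
        raise ValueError("background_interval must be positive")
    plan = []
    for frame_index in range(num_frames):
        payloads = ["metadata"]
        if frame_index % background_interval == 0:
            payloads.insert(0, "background")
        plan.append(payloads)
    return plan
```

With `plan_transmission(5, 3)`, for example, the background payload appears only on frames 0 and 3, while every frame carries metadata.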
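In addition, the video transmission apparatus 101 of this embodiment may further include a storage unit that stores the compressed background video data and the foreground target metadata; when transmission is required, the data transmission unit 101c retrieves them from the storage unit. The specific type of the storage unit is not limited here.
FIG. 2 shows the flow of a video encoding method according to an embodiment of the present application. The method may be performed in the apparatus described above and includes the following steps.
In step S210, the video capture unit 101a captures the video image to be transmitted.
In step S220, the processing unit 101b compression-encodes the background image in the video image to be transmitted to obtain compressed video data, and structures the foreground moving target (also called the foreground target) in the video image to obtain foreground target metadata.
Taking a structuring algorithm with no preset target type as an example, the flow by which the processing unit 101b obtains metadata describing the target type is described in detail below.
First, the processing unit 101b performs foreground detection.
Foreground detection means determining the foreground points in the current input image (the pixels that are in motion against the background). To decide whether a pixel is in motion, the difference between the current input pixel and the corresponding background image pixel is computed; when the difference exceeds the set range, the pixel is considered a foreground point.
Specifically, let the current pixel be X; then: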
Figure PCTCN2015098060-appb-000003
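Here x denotes the pixel value of pixel X, b(x) denotes the pixel value of the background image corresponding to pixel X, and T denotes the set threshold; a result of 1 indicates that pixel X is a foreground point.
By evaluating the difference between each pixel in the input image and its corresponding background pixel, it can be determined which points in the input image are foreground points and which are stationary background points.
The background image is obtained by maintaining a background model that combines several kinds of image information; it absorbs illumination changes in the environment and filters out interference such as rain and snow. The foreground detection result is shown in FIG. 3, where white points in the foreground map are foreground points and black points are background points.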
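The per-pixel test described above can be sketched as follows. This is a minimal illustration that assumes a single-channel (grayscale) image stored as nested lists and takes the "difference" to be the absolute difference |x - b(x)|; the function name `detect_foreground` is hypothetical and does not come from the application.

```python
def detect_foreground(frame, background, threshold):
    """Return a binary mask with 1 where |x - b(x)| > threshold, else 0.

    `frame` and `background` are lists of rows of pixel values with the
    same shape (assumed grayscale for simplicity).
    """
    if len(frame) != len(background):
        raise ValueError("frame and background must have the same height")
    mask = []
    for frame_row, background_row in zip(frame, background):
        if len(frame_row) != len(background_row):
            raise ValueError("frame and background must have the same width")
        mask.append([
            1 if abs(x - b) > threshold else 0
            for x, b in zip(frame_row, background_row)
        ])
    return mask
```

In the resulting mask, 1 corresponds to the white foreground points and 0 to the black background points of the foreground map.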
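Then, the processing unit 101b performs foreground-point blob labeling.
The foreground detection stage only determines whether each pixel in the input image is a foreground point (a moving point); it does not determine which target each foreground point belongs to. The foreground points belonging to a moving target are usually spatially contiguous and appear in the image as a foreground blob, and the contours of these blobs are usually closed. Blob labeling can be regarded as a process of contour search and contour tracing: each blob corresponds to a unique contour, and by finding these contours each foreground blob in the foreground image can be labeled. The result of blob labeling is shown in FIG. 4, in which five blobs, numbered 1 to 5, are obtained.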
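One simple way to group foreground points into blobs is connected-component labeling by flood fill, sketched below. Note that this swaps in a different technique from the contour search and contour tracing described above; it is shown only because it produces a comparable grouping of contiguous foreground points. The use of 8-connectivity and the name `label_blobs` are assumptions.

```python
from collections import deque


def label_blobs(mask):
    """Label 8-connected foreground blobs in a binary mask.

    Returns (labels, blob_count), where `labels` has the same shape as
    `mask`, 0 marks background, and blobs are numbered from 1 in
    row-major order of their first pixel.
    """
    height = len(mask)
    width = len(mask[0]) if height else 0
    labels = [[0] * width for _ in range(height)]
    count = 0
    for r in range(height):
        for c in range(width):
            if mask[r][c] != 1 or labels[r][c] != 0:
                continue
            # Found an unlabeled foreground point: start a new blob.
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:
                cr, cc = queue.popleft()
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        nr, nc = cr + dr, cc + dc
                        if (0 <= nr < height and 0 <= nc < width
                                and mask[nr][nc] == 1
                                and labels[nr][nc] == 0):
                            labels[nr][nc] = count
                            queue.append((nr, nc))
    return labels, count
```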
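Next, the processing unit 101b performs target tracking and extraction.
In practice, not every foreground blob is a moving target of interest to intelligent video analysis. In many cases, background disturbances in the monitored scene produce foreground blobs in the foreground detection output; if these false targets are not filtered out, a large number of false alarms will be generated. Whether a foreground blob is a real moving target can usually be judged as follows.
The generated foreground blobs are tracked and their trajectories are recorded. If a blob corresponds to a real moving target, it should satisfy the following:
1. The length of the trajectory should meet the requirement; a trajectory that is too short indicates a short-lived disturbance in the background.
2. The trajectory of the blob should match the motion characteristics of a normal moving target: it should be regular rather than erratic.
3. The speed of the blob should not be too high.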
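The three criteria above can be sketched as a simple trajectory filter. The application gives no numeric limits, so every threshold below is an assumption, as is the choice of the turning angle between successive steps as the measure of how "regular" a trajectory is; the name `is_real_target` is hypothetical.

```python
import math


def is_real_target(track, min_length=15, max_speed=50.0, max_turn_deg=90.0):
    """Decide whether a blob trajectory looks like a real moving target.

    `track` is a list of (x, y) blob centres, one per frame. All three
    thresholds are illustrative assumptions.
    """
    # Criterion 1: the trajectory must be long enough.
    if len(track) < min_length:
        return False
    steps = [(x1 - x0, y1 - y0)
             for (x0, y0), (x1, y1) in zip(track, track[1:])]
    # Criterion 3: the per-frame displacement (speed) must not be too large.
    if any(math.hypot(dx, dy) > max_speed for dx, dy in steps):
        return False
    # Criterion 2: the direction must not change erratically between steps.
    for (ax, ay), (bx, by) in zip(steps, steps[1:]):
        if (ax, ay) == (0, 0) or (bx, by) == (0, 0):
            continue  # no direction is defined for a stationary step
        cross = ax * by - ay * bx
        dot = ax * bx + ay * by
        if abs(math.degrees(math.atan2(cross, dot))) > max_turn_deg:
            return False
    return True
```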
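After disturbance filtering has been applied to a foreground blob, target tracking of the blob begins. The target must be initialized before tracking; the initialization steps are as follows:
The trajectory information of the blob over its first 15 frames (the position of the target at each time point) is saved, and the Kalman filter corresponding to the target is updated.
The YUV color histogram of the blob within its current image region is computed, and the histogram is saved as the feature template of the target.
Through blob detection and trajectory analysis, moving targets have been defined in the video image. The purpose of target tracking is to establish the positional correspondence of the target over time (its trajectory). The process of determining the position of the target in the current image is as follows:
The Kalman filter corresponding to the target is used to predict the target's position. The Kalman filter stores the speed and direction of the target and can predict where the target will be at the next time step:
state_post = T × state_pre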
Figure PCTCN2015098060-appb-000004
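Here state_post denotes the currently predicted position, T denotes the Kalman transition matrix, and
state_pre denotes the corrected values of the target's coordinates, velocity, and acceleration from the Kalman filter at the previous time step.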
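The prediction step state_post = T × state_pre can be sketched as a plain matrix-vector product. The transition matrix used by the application is given only as an image, so the constant-acceleration matrix below, the state layout [x, y, vx, vy, ax, ay], and the function names are assumptions. Only the prediction step is shown; the covariance handling and correction step of a full Kalman filter are omitted.

```python
def make_transition(dt=1.0):
    """Constant-acceleration transition matrix (an assumed form of T).

    The state vector is laid out as [x, y, vx, vy, ax, ay].
    """
    half = 0.5 * dt * dt
    return [
        [1, 0, dt, 0, half, 0],
        [0, 1, 0, dt, 0, half],
        [0, 0, 1, 0, dt, 0],
        [0, 0, 0, 1, 0, dt],
        [0, 0, 0, 0, 1, 0],
        [0, 0, 0, 0, 0, 1],
    ]


def predict(transition, state_pre):
    """Compute state_post = T x state_pre."""
    return [
        sum(t * s for t, s in zip(row, state_pre))
        for row in transition
    ]
```

For a target at (10, 20) moving 2 pixels right and 1 pixel up per frame with no acceleration, `predict(make_transition(), [10, 20, 2, -1, 0, 0])` places it at (12, 19) in the next frame.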
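Mean Shift target tracking is performed at the predicted position of the target. Foreground blobs are then searched for at the final position of the Mean Shift tracking. If no foreground blob exists at that position, tracking fails. If one exists, its size and center position are taken as the tracking result.
The Kalman filter parameters and state_pre are corrected using the position from the tracking result. The YUV color histogram over the target's current extent is computed and used to update the target's feature template, and the target's size is updated.
At the tracked position, within the range bounded by the target's size, the foreground image is extracted; it represents the position and shape of the target.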
其中YUV彩色直方图表示的是图像中各YUV值在图像中出现的概率,假设有YUV值(y,u,v),则它在图像中出现的概率P(y,u,v)为:
Figure PCTCN2015098060-appb-000005
其中
Figure PCTCN2015098060-appb-000006
YUV(x,y)表示图像在位置(x,y)上的YUV值
M与N分别表示图像的高度与宽度
表示图像中各YUV值概率一共需要256x256x256=16777216个存储位置,需要大量的存储空间。考虑到YUV空间中各个分量的独立性,在这里我们对彩色直方图进行一个近似的描述:
P(y,u,v)=P(y)*P(u)*P(v)
Figure PCTCN2015098060-appb-000007
其中
Figure PCTCN2015098060-appb-000008
Figure PCTCN2015098060-appb-000009
其中
Figure PCTCN2015098060-appb-000010
U(x,y)表示图像在位置(x,y)上的U分量值
Figure PCTCN2015098060-appb-000011
其中
Figure PCTCN2015098060-appb-000012
V(x,y)表示图像在位置(x,y)上的V分量值
M与N分别表示图像的高度与宽度
由此我们对彩色直方图进行了压缩,描述一个彩色直方图只需要256+256+256=768个存储位置。目标的颜色直方图是利用联合概率分布计算得到的压缩直方图,比传统的颜色直方图占用更少的内存空间,运算过程中涉及的数据量大大降低。
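The compressed histogram can be sketched as three per-component histograms whose product approximates P(y, u, v). The sketch assumes 8-bit components and pixels supplied as (y, u, v) tuples; the function names are hypothetical.

```python
def channel_hist(values, bins=256):
    """Normalized histogram of one 8-bit component."""
    hist = [0.0] * bins
    for v in values:
        hist[v] += 1.0
    n = len(values)
    return [h / n for h in hist]


def compressed_histogram(pixels):
    """Return (P(y), P(u), P(v)) for a list of (y, u, v) pixels."""
    ys, us, vs = zip(*pixels)
    return channel_hist(ys), channel_hist(us), channel_hist(vs)


def approx_probability(hists, y, u, v):
    """Approximate P(y, u, v) as P(y) * P(u) * P(v)."""
    p_y, p_u, p_v = hists
    return p_y[y] * p_u[u] * p_v[v]


pixels = [(16, 128, 128), (16, 128, 128), (235, 128, 128), (235, 90, 240)]
hists = compressed_histogram(pixels)
storage = sum(len(h) for h in hists)            # 256 + 256 + 256 entries
p = approx_probability(hists, 16, 128, 128)     # 0.5 * 0.75 * 0.75
```

In this toy data the exact joint frequency of (16, 128, 128) is 0.5, while the product gives about 0.28. This illustrates that the independence assumption trades accuracy for the much smaller storage footprint.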
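Next, the processing unit 101b normalizes the size of the target.
The moving targets obtained after extraction and tracking vary considerably in size, so they are normalized to a target template 40 pixels wide and 40 pixels high. To preserve the target's aspect ratio during scaling, the scaling factors for the width and height directions are derived from the target's aspect ratio so that the ratio is the same before and after normalization. For a target of width w and height h, size normalization proceeds as follows:
if (w > h)
    scale_w = 40/w
    scale_h = 40/w
if (w <= h)
    scale_w = 40/h
    scale_h = 40/h
scale_w and scale_h denote the scaling ratios in the width and height directions, respectively.
The result of size normalization is shown in FIG. 5.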
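As a brief sketch of the rule above, the longer side of the target is mapped to 40 pixels and the shorter side is scaled by the same factor. The function names and the rounding to whole pixels are assumptions for illustration.

```python
def normalization_scales(w, h, size=40):
    """Return (scale_w, scale_h); the longer side maps to `size` pixels."""
    scale = size / w if w > h else size / h
    return scale, scale


def normalized_size(w, h, size=40):
    """Width and height of the target after normalization, in whole pixels."""
    scale_w, scale_h = normalization_scales(w, h, size)
    return round(w * scale_w), round(h * scale_h)


wide = normalized_size(80, 40)   # wide target: width becomes 40
tall = normalized_size(30, 60)   # tall target: height becomes 40
```

A wide 80 × 40 target becomes 40 × 20 and a tall 30 × 60 target becomes 20 × 40, so both fit inside the 40 × 40 template with their proportions unchanged.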
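Next, the processing unit 101b extracts the target's contour features.
Starting from the upper-left corner of the size-normalized target's bounding rectangle and proceeding counterclockwise, a contour scan line perpendicular to the rectangle's side is drawn at each point along the sides, and the distance along the scan line from the rectangle boundary to the target contour is recorded. These distances serve as the target's contour feature values, giving 40 + 40 + 40 + 40 = 160 feature values in total. In practice, to reduce the amount of feature data, one feature value can be stored per 4 points (the average length of 4 contour scan lines). This not only greatly reduces the data volume but also filters out the influence of some image noise points on the feature data. FIG. 6 illustrates the contour scan lines; the white lines are the contour scan lines at the corresponding points of the target's bounding rectangle.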
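The scan-line features could be computed roughly as follows. The sketch assumes a square binary template (1 = target) and walks the sides in the order left (top to bottom), bottom (left to right), right (bottom to top), top (right to left), which is counterclockwise from the upper-left corner. Where a scan line never meets the target, the full side length is recorded; that convention and the function name are assumptions.

```python
def contour_features(mask, group=4):
    """Scan-line distances from the bounding rectangle to the target.

    mask is an n x n binary template. Returns the 4n raw distances averaged
    in groups of `group` (40 values for a 40 x 40 template with group=4).
    """
    n = len(mask)

    def first_hit(cells):
        for d, v in enumerate(cells):
            if v:
                return d
        return len(cells)  # scan line never meets the target (assumption)

    left = [first_hit(mask[r]) for r in range(n)]
    bottom = [first_hit([mask[r][c] for r in range(n - 1, -1, -1)])
              for c in range(n)]
    right = [first_hit(mask[r][::-1]) for r in range(n - 1, -1, -1)]
    top = [first_hit([mask[r][c] for r in range(n)])
           for c in range(n - 1, -1, -1)]
    raw = left + bottom + right + top
    return [sum(raw[i:i + group]) / group for i in range(0, len(raw), group)]


# A 20 x 20 square target centered in the 40 x 40 template.
mask = [[1 if 10 <= r < 30 and 10 <= c < 30 else 0 for c in range(40)]
        for r in range(40)]
features = contour_features(mask)
```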
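Finally, the processing unit 101b performs classification using an SVM classifier.
The extracted contour features are numerically normalized so that each feature value is scaled to between 0 and 1; the feature values are then fed into a pre-trained SVM classifier, and the target's type is determined from the classifier's output.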
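The numerical normalization step might look like the sketch below. The text does not say how the scaling to the range 0 to 1 is done, so dividing by the largest possible scan distance (the 40-pixel template side) is an assumption, as is the function name. The SVM itself is not shown, since its training data and parameters are not described.

```python
def normalize_features(features, max_distance=40.0):
    """Scale scan-line distances into [0, 1] by the maximum possible distance."""
    return [min(max(f / max_distance, 0.0), 1.0) for f in features]


scaled = normalize_features([0, 10, 25, 40])
```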
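It should be noted that, besides the structuring algorithm that does not preset a target type, structuring algorithms for preset target types may also be used depending on the application: vehicle structuring algorithms (including license plate recognition, body color classification, vehicle sub-brand recognition, and so on), person structuring algorithms (including height, age group, gender, whether glasses are worn, clothing color, and so on), and moving-target structuring algorithms (including target type, speed, direction of motion, position, and so on).
In step S230, the data transmission unit 101c transmits the video compressed data and the foreground target metadata.
In summary, before transmitting a video image, the video transmission apparatus 101 of this embodiment processes the background image and the foreground moving targets separately: the background image is compression-encoded to obtain video compressed data, and the foreground moving targets are structured to obtain foreground target metadata. Because the metadata is structured semantic information rather than video data, it can be transmitted as text or, with a designed data structure, as binary data, which greatly reduces the amount of video data and in turn the network bandwidth consumed.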
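To make the "text or binary" remark concrete, the sketch below serializes one metadata record both ways. The field names, the numeric codes, and the 18-byte record layout are all hypothetical; the source does not define a metadata format.

```python
import json
import struct

# Hypothetical metadata record for one foreground target.
metadata = {
    "type": "small passenger car",
    "color": "red",
    "x": 320, "y": 180, "w": 64, "h": 48,
    "timestamp_ms": 1430352000000,
}

# Text form: JSON encoded as UTF-8.
as_text = json.dumps(metadata).encode("utf-8")

# Binary form: enumerated codes plus fixed-width integers (assumed layout):
# type (1 byte), color (1 byte), x, y, w, h (2 bytes each), time (8 bytes).
TYPE_CODES = {"small passenger car": 1, "large passenger car": 2}
COLOR_CODES = {"red": 1, "black": 2}
RECORD = struct.Struct("<BBHHHHQ")

as_binary = RECORD.pack(
    TYPE_CODES[metadata["type"]],
    COLOR_CODES[metadata["color"]],
    metadata["x"], metadata["y"], metadata["w"], metadata["h"],
    metadata["timestamp_ms"],
)
decoded = RECORD.unpack(as_binary)
```

Either form is a few dozen to a few hundred bytes per target per frame, which is far smaller than the encoded pixels of the target region would typically be.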
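(Embodiment 2)
FIG. 7 is a schematic structural diagram of the video decoding apparatus according to an embodiment of the present application. The video decoding apparatus 201 of this embodiment can decode the video compressed data and the foreground target metadata and display the resulting image.
The apparatus 201 includes a video display unit 201a, a processing unit 201b, and a data receiving unit 201c. The data receiving unit 201c receives, over a wired or wireless connection, the video compressed data and foreground target metadata transmitted by the data transmission unit 101c. The processing unit 201b decodes the video compressed data and interprets the foreground target metadata. The video display unit 201a composites and displays the decoded background image and the interpreted foreground moving target.
The processing unit 201b decodes the background video data to obtain the background image, and the video display unit 201a displays the decoded background image. Note that the decoding method used by the processing unit 201b corresponds to the method used to encode the background image; the specific decoding process and algorithm are not described further here.
The processing unit 201b interprets the foreground target metadata to obtain the foreground image. This embodiment gives the following two methods for interpreting the foreground target metadata.
In method 1, the apparatus 201 further includes a storage unit 201d in which pictures of various types of foreground targets are stored in advance. For example, when monitoring vehicles on a highway, the storage unit 201d may store a large number of pictures of vehicles of different colors, models, and brands. When interpreting the received foreground target metadata, the processing unit 201b uses the information in the metadata to find, among the pre-stored pictures in the storage unit 201d, the foreground target picture that matches or is closest to the metadata description, and takes that picture as the foreground moving target. The video display unit 201a then overlays the foreground target picture on the decoded and displayed background image according to the target position and time of appearance described in the metadata, thereby compositing the background image and the foreground moving target for display.
For example, FIG. 8(a) is a background image. Suppose the metadata describes, at a certain time and within a certain rectangular region, a small black car of brand Volkswagen (大众), sub-brand Touran (途安), moving toward the upper left at 45 degrees. Based on this description, the corresponding picture (FIG. 8(b)), which is closest to the moving foreground described in the metadata, is found among the pre-stored pictures. The vehicle picture of FIG. 8(b) is then overlaid on the background image to obtain the result shown in FIG. 8(c).
Although the foreground target picture obtained by this method differs somewhat from the actual foreground target, the method interprets foreground target metadata quickly and has a high data processing speed. Where the moving target does not need to be shown in great detail, for example when monitoring traffic flow, the required result is obtained quickly.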
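Method 1 amounts to a nearest-match lookup over the pre-stored pictures. The sketch below scores each stored picture by the number of attributes it shares with the metadata and picks the best one. The attribute keys, the file names, and the simple count-based score are all hypothetical, since the source does not specify how "closest" is measured.

```python
def match_score(metadata, candidate, keys=("type", "color", "brand", "model")):
    """Count the attributes on which a stored picture agrees with the metadata."""
    return sum(
        1 for k in keys
        if metadata.get(k) is not None and metadata.get(k) == candidate.get(k)
    )


def select_picture(metadata, library):
    """Return the stored picture entry that best matches the metadata."""
    return max(library, key=lambda item: match_score(metadata, item))


# Hypothetical contents of the storage unit.
library = [
    {"file": "car_black_vw_touran.png", "type": "small car",
     "color": "black", "brand": "Volkswagen", "model": "Touran"},
    {"file": "car_red_audi_a4.png", "type": "small car",
     "color": "red", "brand": "Audi", "model": "A4"},
    {"file": "bus_red_yutong.png", "type": "large passenger car",
     "color": "red", "brand": "Yutong", "model": None},
]

metadata = {"type": "small car", "color": "black", "brand": "Volkswagen",
            "model": "Touran", "direction_deg": 45}
best = select_picture(metadata, library)

# No exact match stored: a black Audi A4 falls back to the closest picture.
near = select_picture(
    {"type": "small car", "color": "black", "brand": "Audi", "model": "A4"},
    library,
)
```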
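Alternatively, in method 2, the processing unit 201b of the video decoding apparatus 201 interprets the received foreground target metadata and, based on the information in the metadata, uses display drawing techniques to draw the foreground moving target described by the metadata directly on top of the decoded and displayed background image, thereby compositing the background and foreground images for display. This method does not require storing a large number of target pictures. Although its data processing speed is somewhat slower than that of method 1, the resulting target picture is more precise, so the original video image can be restored more faithfully.
Display drawing techniques include DirectDraw, Direct3D, OpenGL, and the like. In practice, as with film and television visual effects, they can render 2D/3D images that closely resemble real objects.
FIG. 9 shows the flow of the video decoding method according to an embodiment of the present application. The method can be performed in the apparatus described above and includes the following steps.
In step S410, the data receiving unit 201c receives the video compressed data and the foreground target metadata.
In step S420, the processing unit 201b decodes the video compressed data and interprets the foreground target metadata.
In step S430, the video display unit 201a composites and displays the decoded background image and the interpreted foreground moving target.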
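The compositing in step S430 can be pictured as pasting the foreground pixels onto a copy of the decoded background at the position given in the metadata. The sketch uses nested lists of pixel values and treats `None` as a transparent foreground pixel; both conventions and the function name are assumptions.

```python
def composite(background, foreground, top, left):
    """Overlay foreground onto a copy of background at (top, left).

    Foreground pixels equal to None are transparent; pixels falling outside
    the background are clipped.
    """
    out = [row[:] for row in background]
    for r, fg_row in enumerate(foreground):
        for c, px in enumerate(fg_row):
            rr, cc = top + r, left + c
            if px is not None and 0 <= rr < len(out) and 0 <= cc < len(out[0]):
                out[rr][cc] = px
    return out


background = [[0, 0, 0, 0] for _ in range(3)]
foreground = [[9, None],
              [9, 9]]
frame = composite(background, foreground, top=1, left=2)
```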
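(Example)
FIGS. 10(a) to 10(f) illustrate the transmission and display of a video image. The following uses this series of figures to describe how a video image is transmitted and displayed.
As shown in FIG. 10(a), the image is one frame of a series of images captured by a camera monitoring traffic flow on a highway. Highway traffic monitoring is mainly concerned with how many vehicles are on the road and what types of vehicles they are. The video image of FIG. 10(a) contains a static scene (trees, buildings, and so on) and moving targets (for example, a small passenger car and a large passenger car).
Based on the background model, this frame is separated into a background image containing only the static scene (upper part of FIG. 10(b)) and a foreground image containing only the moving targets, namely the small passenger car and the large passenger car (lower part of FIG. 10(b)).
As shown in FIG. 10(c), the background image is compression-encoded into video compressed data in the form of a digital array. The small passenger car image and the large passenger car image are then structured separately (see FIG. 10(d)). In this application scenario, the foreground target metadata includes at least: vehicle type, vehicle color, vehicle brand, vehicle model, license plate number, the position of the foreground target in the frame, and the time of the frame. For the small passenger car image, the foreground target metadata obtained by structuring is: type: small passenger car; color: red; brand: Audi; model: A4; license plate: xxxxx; position: xxxxx; time: xxxxxx. For the large passenger car image, it is: type: large passenger car; color: red; brand: Yutong; model: xx; license plate: xxxxx; position: xxxxx; time: xxxxxx.
Then, referring to FIG. 10(e), the video compressed data A and the foreground target metadata B+C are mixed to obtain a mixed video data stream D carrying metadata, and the mixed stream D is transmitted or stored.
Finally, as shown in FIG. 10(f), the mixed stream D is parsed to obtain the video compressed data A and the foreground target metadata B+C. The video compressed data A is decoded to obtain the background image, and the foreground target metadata B+C is interpreted to obtain the foreground image. Then, according to the position and time information in the metadata, the foreground image is overlaid at the corresponding position on the background image and the composite is displayed, restoring the captured video image.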
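The mixing of A with B+C into stream D, and the parsing of D back into its parts, could be sketched as follows. The container layout (two little-endian 32-bit lengths followed by the video bytes and a JSON metadata block) is purely hypothetical; the source does not describe the format of the mixed stream.

```python
import json
import struct

HEADER = struct.Struct("<II")  # assumed header: video length, metadata length


def mux(video_data, metadata_records):
    """Combine compressed video bytes (A) and metadata records (B+C) into D."""
    meta = json.dumps(metadata_records).encode("utf-8")
    return HEADER.pack(len(video_data), len(meta)) + video_data + meta


def demux(stream):
    """Split a mixed stream D back into video bytes and metadata records."""
    video_len, meta_len = HEADER.unpack_from(stream, 0)
    start = HEADER.size
    video = stream[start:start + video_len]
    meta_bytes = stream[start + video_len:start + video_len + meta_len]
    return video, json.loads(meta_bytes.decode("utf-8"))


video_a = bytes([0, 1, 2, 3])  # stand-in for the compressed background data
meta_bc = [
    {"type": "small passenger car", "color": "red", "brand": "Audi"},
    {"type": "large passenger car", "color": "red", "brand": "Yutong"},
]
stream_d = mux(video_a, meta_bc)
video_out, meta_out = demux(stream_d)
```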
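In summary, before transmitting a video image, the present application processes the background image and the foreground moving targets separately: the background image is compression-encoded to obtain video compressed data, and the foreground moving targets are structured to obtain foreground target metadata. Because the metadata is structured semantic information rather than video data, it can be transmitted as text or, with a designed data structure, as binary data, which greatly reduces the amount of video data and in turn the network bandwidth consumed.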
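To solve the above technical problem, an embodiment of the present application further provides a storage medium for storing an application program which, when run, performs the video encoding method described in the present application. The video encoding method comprises:
acquiring a video image to be transmitted;
compression-encoding a background image in the video image to obtain video compressed data, and performing structuring processing on a foreground moving target in the video image to obtain foreground target metadata; and
transmitting the video compressed data and the foreground target metadata, wherein the foreground target metadata is data storing structured semantic information of the video.
To solve the above technical problem, an embodiment of the present application further provides a storage medium for storing an application program which, when run, performs the video decoding method described in the present application. The video decoding method comprises:
receiving video compressed data and foreground target metadata;
decoding the video compressed data and interpreting the foreground target metadata; and
compositing and displaying the decoded background image and the interpreted foreground moving target.
To solve the above technical problem, an embodiment of the present application further provides a storage medium for storing an application program which, when run, performs the video encoding method for a highway described in the present application. The video encoding method for a highway comprises:
acquiring video images of a highway;
separating a frame of the video images, according to a background model, into a background image containing the static scene and a foreground image containing moving target vehicles;
compression-encoding the background image into video compressed data in the form of a digital array, and performing structuring processing on the foreground image of the moving target vehicles to obtain foreground target metadata, wherein the foreground target metadata is data storing structured semantic information of the video; and
mixing the video compressed data and the foreground target metadata to obtain a mixed video data stream carrying metadata, and transmitting the mixed stream.
To solve the above technical problem, an embodiment of the present application further provides a storage medium for storing an application program which, when run, performs the video decoding method for a highway described in the present application. The video decoding method for a highway comprises:
parsing a mixed video data stream carrying metadata to obtain video compressed data and foreground target metadata;
decoding the video compressed data to obtain a background image, and interpreting the foreground target metadata to obtain a foreground image; and
according to position information and time information in the metadata, overlaying the foreground image at the corresponding position on the background image and displaying the composite, thereby restoring the captured video image.
To solve the above technical problem, an embodiment of the present application further provides an application program which, when run, performs the video encoding method described in the present application. The video encoding method comprises:
acquiring a video image to be transmitted;
compression-encoding a background image in the video image to obtain video compressed data, and performing structuring processing on a foreground moving target in the video image to obtain foreground target metadata; and
transmitting the video compressed data and the foreground target metadata, wherein the foreground target metadata is data storing structured semantic information of the video.
To solve the above technical problem, an embodiment of the present application further provides an application program which, when run, performs the video decoding method described in the present application. The video decoding method comprises:
receiving video compressed data and foreground target metadata;
decoding the video compressed data and interpreting the foreground target metadata; and
compositing and displaying the decoded background image and the interpreted foreground moving target.
To solve the above technical problem, an embodiment of the present application further provides an application program which, when run, performs the video encoding method for a highway described in the present application. The video encoding method for a highway comprises:
acquiring video images of a highway;
separating a frame of the video images, according to a background model, into a background image containing the static scene and a foreground image containing moving target vehicles;
compression-encoding the background image into video compressed data in the form of a digital array, and performing structuring processing on the foreground image of the moving target vehicles to obtain foreground target metadata, wherein the foreground target metadata is data storing structured semantic information of the video; and
mixing the video compressed data and the foreground target metadata to obtain a mixed video data stream carrying metadata, and transmitting the mixed stream.
To solve the above technical problem, an embodiment of the present application further provides an application program which, when run, performs the video decoding method for a highway described in the present application. The video decoding method for a highway comprises:
parsing a mixed video data stream carrying metadata to obtain video compressed data and foreground target metadata;
decoding the video compressed data to obtain a background image, and interpreting the foreground target metadata to obtain a foreground image; and
according to position information and time information in the metadata, overlaying the foreground image at the corresponding position on the background image and displaying the composite, thereby restoring the captured video image.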
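To solve the above technical problem, an embodiment of the present application further provides an encoding device comprising a processor, a memory, a communication interface, and a bus;
the processor, the memory, and the communication interface are connected through the bus and communicate with one another over it;
the memory stores executable program code;
the processor runs a program corresponding to the executable program code by reading the executable program code stored in the memory, so as to perform:
acquiring a video image to be transmitted;
compression-encoding a background image in the video image to obtain video compressed data, and performing structuring processing on a foreground moving target in the video image to obtain foreground target metadata; and
transmitting the video compressed data and the foreground target metadata, wherein the foreground target metadata is data storing structured semantic information of the video.
To solve the above technical problem, an embodiment of the present application further provides a decoding device comprising a processor, a memory, a communication interface, and a bus;
the processor, the memory, and the communication interface are connected through the bus and communicate with one another over it;
the memory stores executable program code;
the processor runs a program corresponding to the executable program code by reading the executable program code stored in the memory, so as to perform:
receiving video compressed data and foreground target metadata;
decoding the video compressed data and interpreting the foreground target metadata; and
compositing and displaying the decoded background image and the interpreted foreground moving target.
To solve the above technical problem, an embodiment of the present application further provides an encoding device comprising a processor, a memory, a communication interface, and a bus;
the processor, the memory, and the communication interface are connected through the bus and communicate with one another over it;
the memory stores executable program code;
the processor runs a program corresponding to the executable program code by reading the executable program code stored in the memory, so as to perform:
acquiring video images of a highway;
separating a frame of the video images, according to a background model, into a background image containing the static scene and a foreground image containing moving target vehicles;
compression-encoding the background image into video compressed data in the form of a digital array, and performing structuring processing on the foreground image of the moving target vehicles to obtain foreground target metadata, wherein the foreground target metadata is data storing structured semantic information of the video; and
mixing the video compressed data and the foreground target metadata to obtain a mixed video data stream carrying metadata, and transmitting the mixed stream.
To solve the above technical problem, an embodiment of the present application further provides a decoding device comprising a processor, a memory, a communication interface, and a bus;
the processor, the memory, and the communication interface are connected through the bus and communicate with one another over it;
the memory stores executable program code;
the processor runs a program corresponding to the executable program code by reading the executable program code stored in the memory, so as to perform:
parsing a mixed video data stream carrying metadata to obtain video compressed data and foreground target metadata;
decoding the video compressed data to obtain a background image, and interpreting the foreground target metadata to obtain a foreground image; and
according to position information and time information in the metadata, overlaying the foreground image at the corresponding position on the background image and displaying the composite, thereby restoring the captured video image.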
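Those skilled in the art should understand that the components of the apparatus provided by the embodiments of the present application, and the steps of the methods, may be concentrated on a single computing device or distributed over a network of multiple computing devices. Optionally, they may be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device; alternatively, they may each be fabricated as individual integrated circuit modules, or several of the modules or steps may be fabricated as a single integrated circuit module. The present application is thus not limited to any particular combination of hardware and software.
It should be noted that, in the claims and description of this patent, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "comprise", "include", and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device comprising a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "comprising a" does not exclude the presence of additional identical elements in the process, method, article, or device that comprises the element.
Although the embodiments disclosed in the present application are as described above, they are presented only to facilitate understanding of the technical solution of the present application and are not intended to limit it. Any person skilled in the art to which the present application pertains may make modifications and changes in the form and details of implementation without departing from the spirit and scope disclosed herein, but the scope of patent protection of the present application shall still be defined by the appended claims.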

Claims (32)

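  1. A video encoding apparatus, comprising:
    a video acquisition unit configured to acquire a video image;
    a processing unit configured to compression-encode a background image in the video image to obtain video compressed data, and to perform structuring processing on a foreground moving target in the video image to obtain foreground target metadata; and
    a data transmission unit configured to transmit the video compressed data and the foreground target metadata, wherein the foreground target metadata is data storing structured semantic information of the video.
  2. The apparatus according to claim 1, wherein
    the processing unit is further configured to perform background modeling on the video image and to detect the foreground moving target based on the established background model, so as to separate the background image and the foreground moving target in the current frame of the video image.
  3. The apparatus according to claim 1, wherein
    the data transmission unit transmits the video compressed data corresponding to the background image at set time intervals, and transmits the foreground target metadata corresponding to the foreground moving target in real time.
  4. The apparatus according to any one of claims 1 to 3, wherein
    the structuring algorithms used by the processing unit when performing structuring processing on the foreground moving target in the video image include a structuring algorithm that does not preset a target type and a structuring algorithm for a preset target type.
  5. A video decoding apparatus, comprising:
    a data receiving unit configured to receive video compressed data and foreground target metadata; and
    a processing unit configured to decode the video compressed data and to interpret the foreground target metadata.
  6. The apparatus according to claim 5, further comprising:
    a storage unit configured to store images,
    wherein the processing unit further selects, according to the information in the foreground target metadata, a corresponding foreground target image from the storage unit as the foreground moving target, thereby interpreting the foreground target metadata.
  7. The apparatus according to claim 5, wherein
    the processing unit, according to the information in the foreground target metadata, uses display drawing techniques to draw the foreground moving target described by the foreground target metadata on top of the decoded background image, thereby interpreting the foreground target metadata.
  8. The apparatus according to any one of claims 5 to 7, further comprising:
    a video display unit configured to composite and display the decoded background image and the interpreted foreground moving target.
  9. A video transmission and display system, comprising:
    the video encoding apparatus according to any one of claims 1 to 4, and
    the video decoding apparatus according to any one of claims 5 to 8.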
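  10. A video encoding method, comprising:
    acquiring a video image to be transmitted;
    compression-encoding a background image in the video image to obtain video compressed data, and performing structuring processing on a foreground moving target in the video image to obtain foreground target metadata; and
    transmitting the video compressed data and the foreground target metadata, wherein the foreground target metadata is data storing structured semantic information of the video.
  11. The method according to claim 10, further comprising:
    performing background modeling on the video image and detecting the foreground moving target based on the established background model, so as to separate the background image and the foreground moving target in the current frame of the video image.
  12. The method according to claim 10, wherein
    the video compressed data corresponding to the background image is transmitted at set time intervals, and the foreground target metadata corresponding to the foreground moving target is transmitted in real time.
  13. The method according to any one of claims 10 to 12, wherein
    the structuring algorithms used when performing structuring processing on the foreground moving target in the video image include a structuring algorithm that does not preset a target type and a structuring algorithm for a preset target type.
  14. A video decoding method, comprising:
    receiving video compressed data and foreground target metadata;
    decoding the video compressed data and interpreting the foreground target metadata; and
    compositing and displaying the decoded background image and the interpreted foreground moving target.
  15. The method according to claim 14, wherein the step of compositing and displaying the decoded background image and the interpreted foreground moving target further comprises:
    selecting, according to the information in the foreground target metadata, a corresponding foreground target image from pre-stored images as the foreground moving target, and compositing and displaying the foreground target image with the decoded background image.
  16. The method according to claim 14, wherein the step of compositing and displaying the decoded background image and the interpreted foreground moving target further comprises:
    according to the information in the foreground target metadata, using display drawing techniques to draw the foreground moving target described by the foreground target metadata on top of the decoded background image.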
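  17. A video encoding method for a highway, comprising:
    acquiring video images of a highway;
    separating a frame of the video images, according to a background model, into a background image containing the static scene and a foreground image containing moving target vehicles;
    compression-encoding the background image into video compressed data in the form of a digital array, and performing structuring processing on the foreground image of the moving target vehicles to obtain foreground target metadata, wherein the foreground target metadata is data storing structured semantic information of the video; and
    mixing the video compressed data and the foreground target metadata to obtain a mixed video data stream carrying metadata, and transmitting the mixed stream.
  18. The method according to claim 17, wherein
    the foreground target metadata includes at least: vehicle type, vehicle color, vehicle brand, vehicle model, license plate number, the position of the foreground target in the frame of the video image, and the time of the frame of the video image.
  19. A video decoding method for a highway, comprising:
    parsing a mixed video data stream carrying metadata to obtain video compressed data and foreground target metadata;
    decoding the video compressed data to obtain a background image, and interpreting the foreground target metadata to obtain a foreground image; and
    according to position information and time information in the metadata, overlaying the foreground image at the corresponding position on the background image and displaying the composite, thereby restoring the captured video image.
  20. The method according to claim 19, wherein the step of interpreting the foreground target metadata to obtain the foreground image comprises:
    selecting, according to the information in the foreground target metadata, a corresponding foreground target image as the foreground moving target, or
    according to the information in the foreground target metadata, using display drawing techniques to draw the foreground moving target described by the foreground target metadata on top of the decoded background image.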
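  21. A storage medium, wherein the storage medium is configured to store an application program which, when run, performs the following steps:
    acquiring a video image to be transmitted;
    compression-encoding a background image in the video image to obtain video compressed data, and performing structuring processing on a foreground moving target in the video image to obtain foreground target metadata; and
    transmitting the video compressed data and the foreground target metadata, wherein the foreground target metadata is data storing structured semantic information of the video.
  22. A storage medium, wherein the storage medium is configured to store an application program which, when run, performs the following steps:
    receiving video compressed data and foreground target metadata;
    decoding the video compressed data and interpreting the foreground target metadata; and
    compositing and displaying the decoded background image and the interpreted foreground moving target.
  23. A storage medium, wherein the storage medium is configured to store an application program which, when run, performs the following steps:
    acquiring video images of a highway;
    separating a frame of the video images, according to a background model, into a background image containing the static scene and a foreground image containing moving target vehicles;
    compression-encoding the background image into video compressed data in the form of a digital array, and performing structuring processing on the foreground image of the moving target vehicles to obtain foreground target metadata, wherein the foreground target metadata is data storing structured semantic information of the video; and
    mixing the video compressed data and the foreground target metadata to obtain a mixed video data stream carrying metadata, and transmitting the mixed stream.
  24. A storage medium, wherein the storage medium is configured to store an application program which, when run, performs the following steps:
    parsing a mixed video data stream carrying metadata to obtain video compressed data and foreground target metadata;
    decoding the video compressed data to obtain a background image, and interpreting the foreground target metadata to obtain a foreground image; and
    according to position information and time information in the metadata, overlaying the foreground image at the corresponding position on the background image and displaying the composite, thereby restoring the captured video image.
  25. An application program, wherein the application program, when run, performs the following steps:
    acquiring a video image to be transmitted;
    compression-encoding a background image in the video image to obtain video compressed data, and performing structuring processing on a foreground moving target in the video image to obtain foreground target metadata; and
    transmitting the video compressed data and the foreground target metadata, wherein the foreground target metadata is data storing structured semantic information of the video.
  26. An application program, wherein the application program, when run, performs the following steps:
    receiving video compressed data and foreground target metadata;
    decoding the video compressed data and interpreting the foreground target metadata; and
    compositing and displaying the decoded background image and the interpreted foreground moving target.
  27. An application program, wherein the application program, when run, performs the following steps:
    acquiring video images of a highway;
    separating a frame of the video images, according to a background model, into a background image containing the static scene and a foreground image containing moving target vehicles;
    compression-encoding the background image into video compressed data in the form of a digital array, and performing structuring processing on the foreground image of the moving target vehicles to obtain foreground target metadata, wherein the foreground target metadata is data storing structured semantic information of the video; and
    mixing the video compressed data and the foreground target metadata to obtain a mixed video data stream carrying metadata, and transmitting the mixed stream.
  28. An application program, wherein the application program, when run, performs the following steps:
    parsing a mixed video data stream carrying metadata to obtain video compressed data and foreground target metadata;
    decoding the video compressed data to obtain a background image, and interpreting the foreground target metadata to obtain a foreground image; and
    according to position information and time information in the metadata, overlaying the foreground image at the corresponding position on the background image and displaying the composite, thereby restoring the captured video image.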
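  29. An encoding device, wherein the encoding device comprises a processor, a memory, a communication interface, and a bus;
    the processor, the memory, and the communication interface are connected through the bus and communicate with one another over it;
    the memory stores executable program code;
    the processor runs a program corresponding to the executable program code by reading the executable program code stored in the memory, so as to perform:
    acquiring a video image to be transmitted;
    compression-encoding a background image in the video image to obtain video compressed data, and performing structuring processing on a foreground moving target in the video image to obtain foreground target metadata; and
    transmitting the video compressed data and the foreground target metadata, wherein the foreground target metadata is data storing structured semantic information of the video.
  30. A decoding device, wherein the decoding device comprises a processor, a memory, a communication interface, and a bus;
    the processor, the memory, and the communication interface are connected through the bus and communicate with one another over it;
    the memory stores executable program code;
    the processor runs a program corresponding to the executable program code by reading the executable program code stored in the memory, so as to perform:
    receiving video compressed data and foreground target metadata;
    decoding the video compressed data and interpreting the foreground target metadata; and
    compositing and displaying the decoded background image and the interpreted foreground moving target.
  31. An encoding device, wherein the encoding device comprises a processor, a memory, a communication interface, and a bus;
    the processor, the memory, and the communication interface are connected through the bus and communicate with one another over it;
    the memory stores executable program code;
    the processor runs a program corresponding to the executable program code by reading the executable program code stored in the memory, so as to perform:
    acquiring video images of a highway;
    separating a frame of the video images, according to a background model, into a background image containing the static scene and a foreground image containing moving target vehicles;
    compression-encoding the background image into video compressed data in the form of a digital array, and performing structuring processing on the foreground image of the moving target vehicles to obtain foreground target metadata, wherein the foreground target metadata is data storing structured semantic information of the video; and
    mixing the video compressed data and the foreground target metadata to obtain a mixed video data stream carrying metadata, and transmitting the mixed stream.
  32. A decoding device, wherein the decoding device comprises a processor, a memory, a communication interface, and a bus;
    the processor, the memory, and the communication interface are connected through the bus and communicate with one another over it;
    the memory stores executable program code;
    the processor runs a program corresponding to the executable program code by reading the executable program code stored in the memory, so as to perform:
    parsing a mixed video data stream carrying metadata to obtain video compressed data and foreground target metadata;
    decoding the video compressed data to obtain a background image, and interpreting the foreground target metadata to obtain a foreground image; and
    according to position information and time information in the metadata, overlaying the foreground image at the corresponding position on the background image and displaying the composite, thereby restoring the captured video image.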
PCT/CN2015/098060 2015-04-30 2015-12-21 视频编码方法、解码方法及其装置 WO2016173277A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US15/569,840 US10638142B2 (en) 2015-04-30 2015-12-21 Video coding and decoding methods and apparatus
EP15890651.1A EP3291558B1 (en) 2015-04-30 2015-12-21 Video coding and decoding methods and apparatus

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510216640.0A CN106210612A (zh) 2015-04-30 2015-04-30 视频编码方法、解码方法及其装置
CN201510216640.0 2015-04-30

Publications (2)

Publication Number Publication Date
WO2016173277A1 true WO2016173277A1 (zh) 2016-11-03
WO2016173277A9 WO2016173277A9 (zh) 2016-12-22

Family

ID=57198093

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2015/098060 WO2016173277A1 (zh) 2015-04-30 2015-12-21 视频编码方法、解码方法及其装置

Country Status (4)

Country Link
US (1) US10638142B2 (zh)
EP (1) EP3291558B1 (zh)
CN (1) CN106210612A (zh)
WO (1) WO2016173277A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110602504A (zh) * 2019-10-09 2019-12-20 山东浪潮人工智能研究院有限公司 一种基于YOLOv2目标检测算法的视频解压缩方法及系统

Families Citing this family (32)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108089152B (zh) * 2016-11-23 2020-07-03 杭州海康威视数字技术股份有限公司 一种设备控制方法、装置及系统
CN106781498A (zh) * 2017-01-10 2017-05-31 成都通甲优博科技有限责任公司 一种高速公路的车流量统计方法
CN106603999A (zh) * 2017-02-17 2017-04-26 上海创米科技有限公司 视频监控报警方法与系统
CN109218660B (zh) * 2017-07-07 2021-10-12 中兴通讯股份有限公司 一种视频处理方法及装置
CN107835381B (zh) * 2017-10-17 2019-11-19 浙江大华技术股份有限公司 一种生成动检录像预览图的方法及装置
JP2019103067A (ja) 2017-12-06 2019-06-24 キヤノン株式会社 情報処理装置、記憶装置、画像処理装置、画像処理システム、制御方法、及びプログラム
CN108898072A (zh) * 2018-06-11 2018-11-27 东莞中国科学院云计算产业技术创新与育成中心 一种面向公安刑侦应用的视频图像智能研判系统
DE102018221920A1 (de) 2018-12-17 2020-06-18 Robert Bosch Gmbh Inhaltsadaptive verlustbehaftete Kompression von Messdaten
CN109831638B (zh) * 2019-01-23 2021-01-08 广州视源电子科技股份有限公司 视频图像传输方法、装置、交互智能平板和存储介质
CN109951710B (zh) * 2019-03-26 2021-07-02 中国民航大学 基于深度学习的机坪监控视频压缩方法及系统
US11245909B2 (en) * 2019-04-29 2022-02-08 Baidu Usa Llc Timestamp and metadata processing for video compression in autonomous driving vehicles
CN110677670A (zh) * 2019-08-21 2020-01-10 咪咕视讯科技有限公司 一种图像压缩、解压的方法、装置、电子设备及系统
CN110517215B (zh) * 2019-08-28 2022-03-25 咪咕视讯科技有限公司 一种视频压缩处理方法、电子设备及存储介质
CN110784672B (zh) * 2019-10-11 2021-05-14 腾讯科技(深圳)有限公司 一种视频数据传输方法、装置、设备及存储介质
US10771272B1 (en) * 2019-11-01 2020-09-08 Microsoft Technology Licensing, Llc Throttling and prioritization for multichannel audio and/or multiple data streams for conferencing
CN110868600B (zh) * 2019-11-11 2022-04-26 腾讯云计算(北京)有限责任公司 目标跟踪视频推流方法、显示方法、装置和存储介质
CN111246176A (zh) * 2020-01-20 2020-06-05 北京中科晶上科技股份有限公司 一种节带化视频传输方法
CN113301337A (zh) * 2020-02-24 2021-08-24 北京三星通信技术研究有限公司 编解码方法和装置
CN111372062B (zh) * 2020-05-02 2021-04-20 北京花兰德科技咨询服务有限公司 人工智能图像通信系统及记录方法
DE102020206963A1 (de) 2020-06-04 2021-12-09 Robert Bosch Gesellschaft mit beschränkter Haftung Verfahren und System zur Verdichtung eines Datenstroms zwischen Netzwerkvorrichtungen
CN112052351A (zh) * 2020-07-28 2020-12-08 上海工程技术大学 一种用于动态环境的监控系统
CN111935487B (zh) * 2020-08-12 2022-08-12 北京广慧金通教育科技有限公司 一种基于视频流检测的图像压缩方法及系统
CN112367492A (zh) * 2020-11-06 2021-02-12 江西经济管理干部学院 一种低带宽人工智能人像视频传输方法
US11425412B1 (en) * 2020-11-10 2022-08-23 Amazon Technologies, Inc. Motion cues for video encoding
CN112509146B (zh) * 2020-11-23 2023-06-20 歌尔科技有限公司 图像处理方法、装置、电子设备及存储介质
CN112822497B (zh) * 2020-12-01 2024-02-02 青岛大学 基于边缘计算的视频压缩编码处理方法及相关组件
CN113965749A (zh) * 2020-12-14 2022-01-21 深圳市云数链科技有限公司 静态摄像机视频传输方法及系统
CN112929662B (zh) * 2021-01-29 2022-09-30 中国科学技术大学 解决码流结构化图像编码方法中对象重叠问题的编码方法
CN117857816A (zh) * 2022-09-30 2024-04-09 中国电信股份有限公司 视频传输方法、装置、电子设备及存储介质
CN116209069B (zh) * 2023-04-25 2023-07-21 北京邮电大学 基于语义域的多址接入方法及相关设备
CN116634178B (zh) * 2023-07-26 2023-10-31 清华大学 一种极低码率的安防场景监控视频编解码方法及系统
CN116708725B (zh) * 2023-08-07 2023-10-31 清华大学 基于语义编解码的低带宽人群场景安防监控方法及系统

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110064309A1 (en) * 2009-09-15 2011-03-17 Ricoh Company, Limited Image processing apparatus and image processing method
CN102982311A (zh) * 2012-09-21 2013-03-20 公安部第三研究所 基于视频结构化描述的车辆视频特征提取系统及方法
CN103020624A (zh) * 2011-09-23 2013-04-03 杭州海康威视系统技术有限公司 混合车道监控视频智能标记、检索回放方法及其装置
CN103475882A (zh) * 2013-09-13 2013-12-25 北京大学 监控视频的编码、识别方法和监控视频的编码、识别系统
CN104301735A (zh) * 2014-10-31 2015-01-21 武汉大学 城市交通监控视频全局编码方法及系统

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
FR2932046B1 (fr) * 2008-06-03 2010-08-20 Thales Sa Procede et systeme permettant de crypter visuellement les objets mobiles au sein d'un flux video compresse
EP2338278B1 (en) * 2008-09-16 2015-02-25 Intel Corporation Method for presenting an interactive video/multimedia application using content-aware metadata
CN102006473B (zh) * 2010-11-18 2013-03-13 无锡中星微电子有限公司 视频编码器和编码方法以及视频解码器和解码方法
CN103179402A (zh) * 2013-03-19 2013-06-26 中国科学院半导体研究所 一种视频压缩编码与解码方法及其装置

Also Published As

Publication number Publication date
EP3291558A1 (en) 2018-03-07
WO2016173277A9 (zh) 2016-12-22
EP3291558A4 (en) 2018-09-12
EP3291558B1 (en) 2021-05-12
US10638142B2 (en) 2020-04-28
CN106210612A (zh) 2016-12-07
US20180131950A1 (en) 2018-05-10

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15890651

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 15569840

Country of ref document: US

NENP Non-entry into the national phase

Ref country code: DE