WO2020192095A1 - Encoding method and apparatus for surveillance video background frames, electronic device, and medium - Google Patents

Encoding method and apparatus for surveillance video background frames, electronic device, and medium

Info

Publication number
WO2020192095A1
Authority
WO
WIPO (PCT)
Prior art keywords
frame
background
detection model
frames
monitoring element
Prior art date
Application number
PCT/CN2019/111948
Other languages
English (en)
French (fr)
Inventor
严柯森
吴辉
Original Assignee
浙江宇视科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 浙江宇视科技有限公司
Publication of WO2020192095A1 publication Critical patent/WO2020192095A1/zh

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/10: using adaptive coding
    • H04N 19/103: Selection of coding mode or of prediction mode
    • H04N 19/136: Incoming video signal characteristics or properties
    • H04N 19/137: Motion inside a coding unit, e.g. average field, frame or block difference
    • H04N 19/142: Detection of scene cut or scene change
    • H04N 19/167: Position within a video image, e.g. region of interest [ROI]
    • H04N 19/172: characterised by the coding unit, the unit being an image region, the region being a picture, frame or field
    • H04N 19/176: characterised by the coding unit, the unit being an image region, the region being a block, e.g. a macroblock
    • H04N 19/177: characterised by the coding unit, the unit being a group of pictures [GOP]
    • H04N 7/00: Television systems
    • H04N 7/18: Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast

Definitions

  • This application relates to the technical field of video coding, and relates, for example, to an encoding method and apparatus for surveillance video background frames, an electronic device, and a computer-readable storage medium.
  • Surveillance video differs from other types of video in one respect: because the appearance time of monitoring elements (non-fixed objects such as people and vehicles) cannot be predicted, surveillance cameras usually work 24 hours a day. For the user, however, the function of surveillance video is to record the behavior of the monitoring elements, so the parts that contain no monitoring elements are usually skipped or discarded as invalid data that hinders the capture of valid information.
  • In other words, the user does not care about the parts of the surveillance video that contain no monitoring elements, only about the parts that do. Based on this characteristic, the industry usually reduces the picture quality of background frames that contain no monitoring elements, balancing high picture quality against the amount of transmitted data as far as possible, so that picture quality improves without a marked increase in data volume.
  • Among mainstream video compression coding standards, the SKIP coding mode has the highest compression rate and is therefore widely used for coding continuous background frames (that is, frames that contain no monitoring elements).
  • Under the related-art SKIP coding mode, a P frame (also called an inter-frame predictive coded frame) is coded with reference to the previous frame and records the deviation of this frame from the previous frame in the image, while a B frame (also called a bidirectional predictive coded frame) references both the previous and the next frame and records the deviation of this frame from both.
  • Consequently, when SKIP coding is performed, the previous frame, or the previous and next frames, of the current frame take part in the coding process as necessary associated frames, and because the frames are continuous, every background frame is SKIP-coded against associated frames different from those of every other frame; the loss of any one frame therefore greatly affects the coding of subsequent frames, giving poor resistance to packet loss and frame loss. A schematic diagram of coding based on the related art is shown in FIG. 1.
  • This application provides an encoding method and apparatus for surveillance video background frames, an electronic device, and a computer-readable storage medium, which can solve the problem of poor resistance to packet loss and frame loss caused by continuity under the mechanism in which every non-I frame in a continuous background frame group must be SKIP-coded against a different associated frame.
  • This application provides an encoding method for surveillance video background frames.
  • The encoding method includes:
  • selecting a continuous background frame group from the frames of a surveillance video, where the continuous background frame group includes multiple background frames, only one of the multiple background frames is an I frame, and the I frame is the first frame of the continuous background frame group; and
  • SKIP-coding every frame in the continuous background frame group other than the I frame with the I frame as the reference object.
  • This application further provides an encoding apparatus for surveillance video background frames, which includes:
  • a continuous background frame group selection unit configured to select a continuous background frame group from the frames of a surveillance video, where the continuous background frame group includes multiple background frames, only one of the multiple background frames is an I frame, and the I frame is the first frame of the continuous background frame group; and
  • a frame-level SKIP coding unit configured to SKIP-code every frame in the continuous background frame group other than the I frame with the I frame as the reference object.
  • This application further provides an electronic device, which includes:
  • a memory configured to store a computer program; and
  • a processor configured to implement, when executing the computer program, the encoding method for surveillance video background frames described above.
  • This application further provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the encoding method for surveillance video background frames described above.
  • FIG. 1 is a schematic diagram of the coding of each background frame in a continuous background frame group in the related art;
  • FIG. 2 is a flowchart of an encoding method for surveillance video background frames provided by an embodiment of this application;
  • FIG. 3 is a schematic diagram of the coding of each background frame in a continuous background frame group corresponding to the encoding method shown in FIG. 2;
  • FIG. 4 is a flowchart of another encoding method for surveillance video background frames provided by an embodiment of this application;
  • FIG. 5 is a flowchart of yet another encoding method for surveillance video background frames provided by an embodiment of this application;
  • FIG. 6 is a flowchart of still another encoding method for surveillance video background frames provided by an embodiment of this application;
  • FIG. 7 is a flowchart of a further encoding method for surveillance video background frames provided by an embodiment of this application;
  • FIG. 8 is a flowchart of a method for encoding monitoring element frames in a surveillance video provided by an embodiment of this application;
  • FIG. 9 is a schematic flowchart of a frame encoding method for a surveillance video provided by an embodiment of this application;
  • FIG. 10 is a structural block diagram of an encoding apparatus for surveillance video background frames provided by an embodiment of this application;
  • FIG. 11 is a structural block diagram of an electronic device provided by an embodiment of this application.
  • This application provides an encoding method and apparatus for surveillance video background frames, an electronic device, and a computer-readable storage medium, which can solve the problem of poor resistance to packet loss and frame loss caused by continuity under the mechanism in which every non-I frame in a continuous background frame group must be SKIP-coded against a different associated frame.
  • FIG. 2 is a flowchart of an encoding method for surveillance video background frames provided by an embodiment of this application, which includes the following steps:
  • S101: Select a continuous background frame group from the frames of a surveillance video.
  • In this step, a continuous background frame group may be selected from the surveillance video captured by a surveillance camera of the target monitored area.
  • Since what actually constitutes a surveillance video is a sequence of image frames, the continuous background frame group is selected from the frames actually constituting the surveillance video. Each continuous background frame group usually contains only one I frame (that is, an intra-frame coded frame), which usually exists as the first frame of the group; the other frames in the group may be B frames or P frames depending on the actual application scenario and user settings.
  • A new I frame in the frame sequence usually exists as the first frame of another continuous background frame group rather than being placed into the group where the previous I frame is located.
  • To select a continuous background frame group, background frames may first be selected from the frames actually constituting the surveillance video, and continuous background frames with small differences and high similarity are then formed into a continuous background frame group. The number of background frames contained in a group is not fixed; whether a frame can form a group with other continuous background frames mainly depends on the similarity and difference between the current background frame and the other continuous background frames, and the number of background frames in a group can be adjusted flexibly according to any special requirements of the actual application scenario.
  • Methods for selecting background frames from the frames constituting a surveillance video may include the following: a background frame selection method based on background features, a background frame selection method based on monitoring elements, and a background frame selection method combining both background features and monitoring elements.
  • The background frame selection method based on background features judges whether the current frame is a background frame by judging whether the current frame contains the same background features as a preset background frame; the background frame selection method based on monitoring elements judges whether the current frame is a background frame by judging whether the current frame contains a monitoring element consistent with the features of a preset monitoring element.
  • It should be noted that judging whether a frame is a background frame purely by the presence or absence of background features or monitoring element features may lead to misjudgment: even a frame with the same background features is not necessarily a background frame, since a misjudgment occurs when a monitoring element appears in the current frame in a way that does not affect the background features; likewise, a frame without monitoring elements is not necessarily a background frame.
  • Therefore, the two methods, which judge from different angles whether the current frame is a background frame, can be combined, so that only a current frame that contains no monitoring element and contains background features identical to the preset background features can be selected as a background frame.
  • Algorithms for extracting and detecting background features/monitoring element features may include the following:
  • Feature extraction methods include the simplest feature area delineation method (a method that extracts the features in an area by determining the area where the features are located), the motion/behavior feature determination method (a method that extracts the corresponding features based on the relatively large motion of monitoring elements relative to the background), and the ROI (Region Of Interest) determination method (a method that automatically delineates a region of interest).
  • Feature detection is usually realized by comparing similarity or difference, but many algorithms build on this principle, for example similarity comparison based on gray-level parameters such as gray-level distribution, gray-level difference, and gray-level mean; other parameters with the same or similar functions may also be used. This content is well known to those skilled in the art and is not repeated here.
  • It should be understood that, compared with the mostly surface-level features extracted by conventional feature extraction algorithms, an extraction model based on a deep learning algorithm can extract more and deeper features through an algorithm structure imitating the structure of human neurons; similarity comparison based on deep features works better and has a lower misjudgment rate than similarity comparison of conventional surface features. A minimal sketch of the gray-level similarity comparison mentioned above is given below.
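
As an illustration of the gray-level similarity comparison described above, here is a minimal sketch, assuming 8-bit grayscale frames held as NumPy arrays; the function names and the 0.95 threshold are illustrative, not taken from the application:

```python
import numpy as np

def gray_similarity(frame_a: np.ndarray, frame_b: np.ndarray) -> float:
    # Similarity in [0, 1] derived from the mean absolute gray-level
    # difference between two 8-bit grayscale frames of the same size.
    diff = np.abs(frame_a.astype(np.int16) - frame_b.astype(np.int16))
    return 1.0 - diff.mean() / 255.0

def is_background_candidate(frame: np.ndarray,
                            preset_background: np.ndarray,
                            threshold: float = 0.95) -> bool:
    # A frame whose gray levels stay close to the preset background frame
    # is treated as a background-frame candidate.
    return gray_similarity(frame, preset_background) >= threshold
```
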
  • S102: SKIP-code every frame in the continuous background frame group other than the I frame with the I frame as the reference object.
  • On the basis of S101, in this step every frame in the continuous background frame group that is not the I frame may be SKIP-coded with the I frame as the reference object.
  • A schematic diagram of the new coding scheme is shown in FIG. 3, which is drawn with the frame structure customized to contain only I frames and P frames; in the continuous background frame group shown in FIG. 3, therefore, all frames except the leading I frame are P frames. In some embodiments, all frames in the continuous background frame group except the leading I frame are P frames or B frames.
  • As FIG. 3 shows, every P frame in the continuous background frame group has a line pointing to the I frame, and the arrow of the line points to the reference object used during SKIP coding. Compared with FIG. 1 of the related art, it can be seen that under the new coding scheme provided by this application the reference object of every non-I frame in the continuous background frame group is the same, unlike the related art in which every non-I frame has its own exclusive reference object.
  • The coding therefore no longer has continuity: as long as the I frame is not lost, losing the coding result of any non-I frame does not affect the coding of the other non-I frames, and the frames whose stable existence must be guaranteed change from every background frame in the continuous background frame group to the single I frame, which markedly improves resistance to packet loss and frame loss.
  • Since it is no longer necessary to reference the previous frame, or the previous and next frames, no deviation needs to be computed between two P frames or between a P frame and a B frame, which also reduces, to a certain extent, the amount of computation and coding in the coding process.
  • In the related art, the current frame can only be coded with the previous frame, or the previous and next frames, as the reference object, so the coding of each background frame in the continuous background frame group is linear and continuous and must be processed serially, which cannot fully exploit the multi-process concurrent processing capability of the processor. Under the new scheme, because the reference object is fixed, the coding of the background frames can be non-linear and discontinuous, making it possible to accelerate the coding process with multi-process concurrency technology and effectively improve the coding speed.
  • This application provides a new coding mechanism in which every non-I frame in a continuous background frame group is SKIP-coded with the same I frame as the reference object: the non-I frames no longer need mutually different associated frames as reference objects for SKIP coding, and every non-I frame is coded against the same associated frame. Because the differences between the background frames in a continuous background frame group are extremely small, the picture quality obtained with the new coding scheme shows no obvious gap from the related art, while the unification of the associated frames markedly improves resistance to frame loss and packet loss. A sketch of the changed reference structure is given below.
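
The following minimal sketch contrasts the two reference structures; the Frame type and function names are illustrative only:

```python
from dataclasses import dataclass

@dataclass
class Frame:
    index: int
    kind: str  # "I" or "P"

def related_art_references(group: list) -> dict:
    # Related art: each non-I frame is SKIP-coded against its predecessor,
    # so losing one frame breaks the chain for all later frames.
    return {f.index: group[k].index for k, f in enumerate(group[1:])}

def frame_level_skip_references(group: list) -> dict:
    # This application: every non-I frame references the single leading
    # I frame, so only the I frame has to be guaranteed to survive.
    i_frame = group[0]
    assert i_frame.kind == "I", "the I frame must be the group's first frame"
    return {f.index: i_frame.index for f in group[1:]}

group = [Frame(0, "I")] + [Frame(k, "P") for k in range(1, 6)]
print(related_art_references(group))       # {1: 0, 2: 1, 3: 2, 4: 3, 5: 4}
print(frame_level_skip_references(group))  # {1: 0, 2: 0, 3: 0, 4: 0, 5: 0}
```
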
  • FIG. 4 is a flowchart of another encoding method for surveillance video background frames provided by an embodiment of this application. On the basis of Embodiment One, this embodiment provides a method of improving the coding speed through multi-process concurrency technology, including the following steps:
  • S201: Select a continuous background frame group from the frames of a surveillance video.
  • S202: Extract all frames except the I frame from the continuous background frame group.
  • This step may take all non-I frames out of the continuous background frame group so that they can be assigned to the multiple threads or coroutines provided by the multi-threaded concurrency technology to perform the coding operations simultaneously.
  • S203: SKIP-code every frame in the continuous background frame group other than the I frame simultaneously through the multi-threaded concurrency technology, with the I frame as the reference object.
  • The extracted frames may be assigned to the multiple processes or coroutines provided by the multi-threaded concurrency technology, so that the overall coding speed is improved by executing the relatively independent processes or coroutines simultaneously.
  • When the multi-threaded concurrency technology cannot provide a process or coroutine for every non-I frame of the continuous background frame group at once, the non-I frames can also be grouped by the number of processes or coroutines actually available, so that each process or coroutine is responsible for coding the non-I frames in its group. This approach can also be called a micro-batch approach.
  • Using multi-threaded concurrency helps improve the overall coding speed but may to some extent disturb the arrangement of the frames in the time sequence. To keep the arrangement of the coded frames unchanged, a special sorting mark or timestamp can be attached during batch coding according to each non-I frame's position in the time sequence, so that after coding the frames can be rearranged into the same sequence according to the timestamp or sorting mark.
  • The new coding scheme provided in Embodiment One removes the requirement for coding continuity and thus makes a parallel mode possible. Starting from how to improve overall coding efficiency, this embodiment batch-processes the unrelated coding operations with multi-process concurrency technology, markedly improving overall coding efficiency and making full use of the processor's multi-process processing capability. A sketch of such a concurrent encode with order restoration is given below.
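
Here is a minimal sketch of S202/S203 under the above assumptions, using a Python thread pool as the concurrency mechanism; skip_encode is a stand-in for the real encoder, and the frame index doubles as the sorting mark mentioned above:

```python
from concurrent.futures import ThreadPoolExecutor

def skip_encode(frame_index: int, i_frame: bytes) -> tuple:
    # Stand-in for SKIP-encoding one non-I frame against the shared I frame;
    # the returned index is the sorting mark used to restore temporal order.
    return (frame_index, b"skip-coded payload")

def encode_group_concurrently(i_frame: bytes, non_i_indices: list,
                              workers: int = 4) -> list:
    # All encodes reference the same I frame, so they are mutually
    # independent and can be dispatched to the pool as one micro-batch.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda k: skip_encode(k, i_frame), non_i_indices)
    # Rearrange the results back into the original time sequence.
    return [payload for _, payload in sorted(results, key=lambda r: r[0])]

print(encode_group_concurrently(b"i-frame", [3, 1, 4, 2]))
```
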
  • FIG. 5 is a flowchart of yet another encoding method for surveillance video background frames provided by an embodiment of this application. To improve the selection accuracy of background frames and continuous background frame groups and reduce the misjudgment rate, this embodiment, on the basis of any of the above embodiments, provides a method in which a special structure based on a deep learning algorithm enables both feature extraction and detection to cover more deep-level features, including the following steps:
  • S301: Select a continuous background frame group from the frames of the actual surveillance video by using a background frame detection model.
  • The background frame detection model used in this step is a detection model obtained after training with a deep learning algorithm on real background frames. The process of obtaining such a model based on a deep learning algorithm may be:
  • obtaining a large amount of real, targeted sample data (to build a background frame detection model for detecting whether the current frame is a background frame, for example, a large number of real background frames need to be obtained as sample data) and using the sample data as the input of the deep learning algorithm;
  • the deep learning algorithm extracting, through its internal multi-layer structure, the common target features hidden behind the sample data; and
  • constructing a classifier based on the target features, so as to use the classifier to distinguish actual frames that contain features identical or similar to the target features from actual frames that do not.
  • Deep learning algorithms are further divided into supervised and unsupervised according to whether guidance information needs to be provided. Supervised means that targeted guidance information is given along with the sample data; it suits application scenarios with clear requirements on the features and achieves better classification and detection results. Unsupervised is the opposite: because no targeted guidance information is provided, the extracted features may deviate considerably from expectations, which suits scenarios that have no clear requirements on the features or need this approach to find a suitable feature by itself. Given this application's requirements on background features, a supervised deep learning algorithm can achieve better results.
  • The deep learning algorithm may be a common convolutional neural network, a deep residual network, and so on. Since different algorithms target different problems, which one is more suitable can be concluded from a limited number of tests in the actual application scenario; in addition, differences in the activation function and loss function used may also lead to differences in detection performance.
  • In some embodiments, after high-quality background frame detection has been completed with the background frame detection model built on a deep learning algorithm, a similarity determination algorithm can additionally be used to determine which continuous background frames can form a continuous background frame group. The determination algorithm may be part of the model or exist independently; integration is higher when it exists as part of the model. A sketch of such a supervised classifier is given below.
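
As a concrete illustration, here is a minimal supervised background-frame classifier sketched in PyTorch; the layer sizes, input resolution, and class layout are assumptions for illustration and are not specified by the application:

```python
import torch
import torch.nn as nn

class BackgroundFrameNet(nn.Module):
    # Minimal binary classifier: background frame vs. non-background frame.
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),        # multi-layer feature extraction
        )
        self.classifier = nn.Linear(32, 2)  # classes: [non-background, background]

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

# Supervised training would pair real background frames with labels and use,
# e.g., nn.CrossEntropyLoss with torch.optim.Adam; inference looks like this:
model = BackgroundFrameNet()
logits = model(torch.randn(1, 3, 224, 224))
p_background = torch.softmax(logits, dim=1)[0, 1].item()
```
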
  • S302: SKIP-code every frame in the continuous background frame group other than the I frame with the I frame as the reference object.
  • The above presents a method of improving the accuracy of background-feature-based background frame detection by introducing a deep learning algorithm; detection accuracy can also be improved by applying the same deep learning algorithm to the features of monitoring elements, as shown in FIG. 6:
  • S401: Select a continuous background frame group from the frames of the actual surveillance video by using a monitoring element detection model.
  • The monitoring element detection model is a detection model obtained after training with a deep learning algorithm on the features of real monitoring elements. Unlike the steps shown in FIG. 5, the steps shown in FIG. 6 exploit the same ability of the deep learning algorithm to extract deep-level features, except that the extracted features are the features of monitoring elements rather than background features; the features of monitoring elements are another kind of feature that can be used to judge whether a frame is a background frame. The other parts are the same as in FIG. 5 and are not repeated here.
  • S402: SKIP-code every frame in the continuous background frame group other than the I frame with the I frame as the reference object.
  • FIGS. 5 and 6 introduce the deep learning algorithm's extraction of deep-level features into background features and monitoring element features respectively, but detection is still completed on the basis of one type of feature alone. As explained in the S101 section, a detection result based on a single type of feature may be inaccurate, so this application also provides an implementation that detects jointly with both types of features, with each feature's detection model built on a deep learning algorithm, in order to obtain detection results that are as accurate as possible:
  • S501: Use the background frame detection model to detect whether each frame of the actual surveillance video contains background features, where the background frame detection model is a detection model obtained after training with a deep learning algorithm on real background frames, and the background features are extracted from the real background frames by the deep learning algorithm.
  • S502: Use the monitoring element detection model to detect whether each frame of the actual surveillance video contains monitoring elements.
  • The monitoring element detection model is a detection model obtained after training with a deep learning algorithm on the features of real monitoring elements.
  • S503: Select a set of continuous frames that contain the background features and contain no monitoring elements as the continuous background frame group.
  • Although both of the methods given above for distinguishing whether the current frame is a background frame can be used on their own, their different approaches often contribute differently to the same conclusion. For example, when determining a parameter affected by multiple factors, each factor often causes a different amount of change in the parameter for the same amount of its own change; that is, its influence on the parameter or conclusion differs. Likewise, when background features and monitoring element features are combined, the two types of features influence the conclusion that the current frame is a background frame to different degrees in different application scenarios. Corresponding weights can therefore be assigned to the background frame detection model and the monitoring element detection model according to their respective accuracy in discriminating background frames in the actual surveillance video, so that background frames are selected more accurately through a weight-based weighted calculation.
  • For example, assume that the accuracy of using the background frame detection model alone to judge whether a frame is a background frame is 80%, and that the accuracy of using the monitoring element detection model alone is 70%. The accuracies can be used as the respective weights, each weight serving as a multiplication factor for the background frame evaluation probability produced by the corresponding model for that frame; the values obtained from the two models are then added to give a comprehensive background frame evaluation probability, on the basis of which the frame is evaluated as a background frame or not.
  • Assume the background frame detection model judges the target frame to be a background frame with an evaluation probability of 85%, and the monitoring element detection model judges it with an evaluation probability of 80%. One computation of the weighted calculation is: comprehensive background frame evaluation probability = 0.85 × 0.8 + 0.8 × 0.7 = 1.24. On this basis, a comprehensive evaluation probability threshold of 1.15 can be set, so that only frames whose weighted value exceeds 1.15 are discriminated as background frames; the size of the threshold can be set according to the actual situation, as the sketch below shows.
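
The weighted calculation above is small enough to verify directly; a minimal sketch with the numbers from the example (the function name is illustrative):

```python
def comprehensive_probability(p_bg: float, p_elem: float,
                              w_bg: float = 0.8, w_elem: float = 0.7) -> float:
    # Weights are the stand-alone accuracies of the two detection models;
    # each multiplies that model's background-frame evaluation probability.
    return p_bg * w_bg + p_elem * w_elem

score = comprehensive_probability(0.85, 0.80)  # 0.68 + 0.56
print(score, score > 1.15)                     # 1.24 True -> background frame
```
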
  • It should be noted that in practice the background frame detection model or the monitoring element detection model may also be trained as a binary classifier; the weighting approach above does not apply when a binary classifier can only output a verdict of background frame or not background frame.
  • On the other hand, even supervised deep learning algorithms come in many variants with different tendencies, so to improve detection accuracy as much as possible, the background frame detection model and/or the monitoring element detection model may include a preset number of sub-detection models, where the different sub-detection models are obtained by training different deep learning algorithms on the same training samples, in the hope of obtaining a more accurate detection result through the joint action of the sub-models.
  • Each sub-detection model may also be assigned a weight according to its degree of influence on the overall detection result, so that a comprehensive result is obtained through weighted calculation and a corresponding conclusion is drawn from it.
  • From the perspective of improving the accuracy of background frame discrimination, this embodiment introduces a deep learning algorithm modeled on the structure of human neurons into the feature extraction part and uses its ability to mine deep-level features to improve the discrimination accuracy of background frames during feature comparison.
  • FIGS. 5 and 6 each provide an implementation based on one type of feature, and FIG. 7, building on them, provides an optional implementation that combines both types of features for joint discrimination.
  • FIG. 8 is a flowchart of a method for encoding monitoring element frames in a surveillance video provided by an embodiment of this application. In addition to the detection of background frames realized in any of the above embodiments, this embodiment provides a method for coding non-background frames, which cooperates with the coding method for background frames to jointly form a method for coding all types of frames of a surveillance video, including the following steps:
  • S601: Select the frames of the surveillance video that are non-background frames as monitoring element frames.
  • This step obtains the monitoring element frames opposite to the background frames selected in S101 that constitute the continuous background frame group. When only the background frame detection model is used, a non-background frame may, but does not necessarily, contain monitoring elements; when both types of features are used for discrimination, a monitoring element frame may be an actual frame that does not contain the background features but contains the features of monitoring elements.
  • S602: Mark the area where the monitoring elements are located in the monitoring element frame as the monitoring element area.
  • S603: Code the monitoring element area by inter-frame or intra-frame coding.
  • Although a frame is discriminated as a monitoring element frame, the monitoring elements it contains do not necessarily occupy the whole image. To reduce the coding volume and the size of the coded data to be transmitted as much as possible, S602 and S603 determine the area where the monitoring elements are located and code that area with high-quality intra-frame or inter-frame coding (in a specific implementation, the better of the two coding modes can be chosen according to the actual situation), which helps provide high-quality pictures for the monitoring element parts that users care about.
  • S604: Code the areas of the monitoring element frame other than the monitoring element area in the macroblock-level SKIP mode.
  • The area outside the monitoring element area is usually the relatively fixed background part of the monitoring element frame. For this background area, which is not of interest (understandably, the relatively fixed background part of a monitoring element frame and the background area of the monitoring element frame are not necessarily the same), this step continues to use ordinary macroblock-level SKIP coding to reduce the coding volume and data volume as much as possible.
  • Macroblock-level SKIP coding first splits the target area into multiple macroblocks of a preset size; that is, ordinary macroblock-level SKIP coding takes the macroblock as the object for judging whether SKIP coding can be performed. The reference object of each macroblock may be the macroblock at the same or an adjacent position in the previous frame, and the coding result is obtained with motion estimation and motion vector algorithms.
  • The size of the split macroblocks can be chosen from 16 × 16, 8 × 16, and 8 × 8 according to the actual situation, as the partitioning sketch below illustrates.
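
A minimal sketch of the macroblock partition described above, assuming a 16 × 16 default block size; whether each block can actually be SKIP-coded would then be judged per block against the co-located block of the previous frame:

```python
def split_into_macroblocks(height: int, width: int,
                           mb_h: int = 16, mb_w: int = 16):
    # Yield (top, left, bottom, right) rectangles covering the target area;
    # edge blocks are clipped when the area is not a multiple of the block size.
    for top in range(0, height, mb_h):
        for left in range(0, width, mb_w):
            yield (top, left, min(top + mb_h, height), min(left + mb_w, width))

blocks = list(split_into_macroblocks(1080, 1920))
print(len(blocks))  # 68 rows x 120 columns = 8160 blocks for a 1080p frame
```
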
  • In some embodiments, to help the monitoring station distinguish the content of the areas monitored by different surveillance cameras, OSD (On Screen Display) technology is usually also used to superimpose information such as the camera name, the monitored area name, and the time on the picture. To keep SKIP coding from affecting the changing time information, the areas that display OSD information in the continuous background frame group and outside the monitoring element area in the monitoring element frame can be marked as OSD change areas and coded by inter-frame or intra-frame coding, so that the changing information is displayed better.
  • Since the OSD change area is usually located at a corner, the top, or the bottom of the picture, it is easy to partition, the coding volume does not increase significantly because of this operation, and the impact is small.
  • To deepen understanding of the inventive points of this application and their role in the overall surveillance video coding process, this application further provides a schematic flowchart of a practical frame encoding method for a surveillance video; see FIG. 9.
  • As shown in FIG. 9, this embodiment uses a background frame detection model and a monitoring element detection model, both built on deep learning algorithms, to jointly determine the frame type of each actual frame. Each actual frame is discriminated as a background frame or not, and continuous background frame groups whose similarity stays within a certain range are selected on the basis of the background frames.
  • Each background frame contained in a continuous background frame group is coded in the frame-level SKIP manner (that is, every non-I frame is SKIP-coded with the same I frame as the reference object, a term used to distinguish it from the conventional coding manner called macroblock-level SKIP coding).
  • The OSD change area in each background frame, and the monitoring element area and OSD change area in each monitoring element frame, are coded by intra-frame or inter-frame coding, while the background area of a monitoring element frame is coded in the ordinary macroblock-level SKIP manner, thereby completing the coding process for all types of frames that constitute the surveillance video.
  • FIG. 10 is a structural block diagram of an encoding apparatus for surveillance video background frames provided by an embodiment of this application. The encoding apparatus may include:
  • a continuous background frame group selection unit 100 configured to select a continuous background frame group from the frames of a surveillance video, where the continuous background frame group includes multiple background frames, only one of the multiple background frames is an I frame, and the I frame is the first frame of the continuous background frame group; and
  • a frame-level SKIP coding unit 200 configured to SKIP-code every frame in the continuous background frame group other than the I frame with the I frame as the reference object.
  • The frame-level SKIP coding unit 200 may include:
  • an other-frame extraction subunit configured to extract all frames except the I frame from the continuous background frame group; and
  • a multi-threaded concurrent coding subunit configured to SKIP-code every frame in the continuous background frame group other than the I frame simultaneously through the multi-threaded concurrency technology, with the I frame as the reference object.
  • The continuous background frame group selection unit 100 may include:
  • a background frame detection model selection subunit configured to select a continuous background frame group from the frames of the actual surveillance video by using the background frame detection model, where the background frame detection model is a detection model obtained after training with a deep learning algorithm on real background frames; or
  • a monitoring element detection model selection subunit configured to select a continuous background frame group from the frames of the actual surveillance video by using the monitoring element detection model, where the monitoring element detection model is a detection model obtained after training with a deep learning algorithm on the features of real monitoring elements.
  • The continuous background frame group selection unit 100 may include:
  • a background feature detection subunit configured to use the background frame detection model to detect whether each frame of the actual surveillance video contains background features, where the background frame detection model is a detection model obtained after training with a deep learning algorithm on real background frames, and the background features are extracted from the real background frames by the deep learning algorithm;
  • a monitoring element detection subunit configured to use the monitoring element detection model to detect whether each frame of the actual surveillance video contains monitoring elements, where the monitoring element detection model is a detection model obtained after training with a deep learning algorithm on the features of real monitoring elements; and
  • a multi-detection-model selection subunit configured to select a set of continuous frames that contain the background features and contain no monitoring elements as the continuous background frame group.
  • In some embodiments, the encoding apparatus for surveillance video background frames may further include:
  • a weight assignment unit configured to assign corresponding weights to the background frame detection model and the monitoring element detection model according to their respective accuracy in discriminating background frames in the actual surveillance video, so that the continuous background frame group can be selected more accurately through a weight-based weighted calculation.
  • In some embodiments, the encoding apparatus for surveillance video background frames may further include:
  • a monitoring element frame selection unit configured to select, from the surveillance video, the frames other than the background frames as monitoring element frames;
  • a monitoring element area marking unit configured to mark the area where the monitoring elements are located in the monitoring element frame as the monitoring element area;
  • a monitoring element area high-quality coding unit configured to code the monitoring element area by inter-frame or intra-frame coding; and
  • a macroblock-level SKIP coding unit configured to code the areas of the monitoring element frame other than the monitoring element area in the macroblock-level SKIP mode.
  • In some embodiments, the encoding apparatus for surveillance video background frames may further include:
  • an OSD change area marking unit configured to mark, as OSD change areas, the areas that display OSD information in the continuous background frame group and in the monitoring element frame outside the monitoring element area; and
  • an OSD change area coding unit configured to code the OSD change area by inter-frame or intra-frame coding.
  • The encoding apparatus for surveillance video background frames provided in this embodiment corresponds to the encoding method given above; it exists as a product embodiment corresponding to the method embodiments and has the same beneficial effects as the method embodiments. For explanations of the functional units, refer to the method embodiments above, which are not repeated here.
  • FIG. 11 is a block diagram of an electronic device 300 according to an exemplary embodiment.
  • The electronic device 300 may include a processor 301 and a memory 302, and may further include one or more of a multimedia component 303, an information input/information output (I/O) interface 304, and a communication component 305.
  • The processor 301 is configured to control the overall operation of the electronic device 300 to complete all or part of the steps of the above encoding method for surveillance video background frames; the memory 302 is configured to store various types of data to support the operations the processor 301 needs to perform, and such data may include, for example, instructions for any application program or method operated on the electronic device 300, as well as application-related data.
  • The memory 302 may be implemented by any type of volatile or non-volatile storage device or a combination thereof, for example one or more of a Static Random Access Memory (SRAM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), an Erasable Programmable Read-Only Memory (EPROM), a Programmable Read-Only Memory (PROM), a Read-Only Memory (ROM), a magnetic memory, a flash memory, a magnetic disk, or an optical disk.
  • The multimedia component 303 may include a camera for collecting images and a microphone for collecting audio signals.
  • The collected images and received audio signals may be stored in the memory 302 or sent through the communication component 305.
  • The I/O interface 304 provides an interface between the processor 301 and other interface modules, such as a keyboard or a mouse.
  • The communication component 305 is used for wired or wireless communication between the electronic device 300 and other devices. The wireless communication may be, for example, Wi-Fi, Bluetooth, Near Field Communication (NFC), 2G, 3G, or 4G, or a combination of one or more of them, so the corresponding communication component 305 may include a Wi-Fi module, a Bluetooth module, and an NFC module.
  • Exemplarily, the electronic device may be a surveillance camera with encoding capability.
  • In an exemplary embodiment, the electronic device 300 may be implemented by one or more Application Specific Integrated Circuits (ASIC), Digital Signal Processors (DSP), Digital Signal Processing Devices (DSPD), Programmable Logic Devices (PLD), Field Programmable Gate Arrays (FPGA), controllers, microcontrollers, microprocessors, or other electronic components, for executing the encoding method for surveillance video background frames given in the foregoing embodiments.
  • Also provided is a computer-readable storage medium storing program instructions which, when executed by a processor, implement the corresponding operations.
  • The computer-readable storage medium may be the aforementioned memory 302 including program instructions, which can be executed by the processor 301 of the electronic device 300 to complete the encoding method for surveillance video background frames given in the foregoing embodiments.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Closed-Circuit Television Systems (AREA)

Abstract

This application discloses an encoding method and apparatus for surveillance video background frames, an electronic device, and a medium. The method includes: selecting a continuous background frame group from the frames of a surveillance video, where the continuous background frame group includes multiple background frames, only one of the multiple background frames is an I frame, and the I frame is the first frame of the continuous background frame group; and SKIP-coding every frame in the continuous background frame group other than the I frame with the I frame as the reference object.

Description

Encoding method, apparatus, electronic device and medium for surveillance video background frames
This disclosure claims priority to the Chinese patent application No. 201910221762.7 filed with the Chinese Patent Office on March 22, 2019, the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the technical field of video coding, and relates, for example, to an encoding method and apparatus for surveillance video background frames, an electronic device, and a computer-readable storage medium.
Background
Surveillance cameras spread across streets and alleys provide people with a reliable ability to trace back information. However, as requirements on picture quality and definition keep rising, the growing data volume exerts greater pressure on the data transmission network; the added pressure causes a surge in the error rate of the data transmission network, which shows up in surveillance video data as a rising packet loss rate.
Compared with other types of video, surveillance video has a distinguishing characteristic: because the appearance time of monitoring elements (non-fixed objects such as people and vehicles) cannot be predicted, surveillance cameras usually work 24 hours a day; yet for the user, the purpose of surveillance video is to record the behavior of the monitoring elements, and the parts that contain no monitoring elements are usually skipped or discarded as invalid data that hinders the capture of valid information. In other words, the user does not care about the parts of the surveillance video that contain no monitoring elements, only about the parts that do. Based on this characteristic, the industry usually reduces the picture quality of background frames that contain no monitoring elements, balancing high picture quality against the amount of transmitted data as far as possible, so that picture quality improves without a marked increase in data volume.
Among mainstream video compression coding standards, the SKIP coding mode has the highest compression rate and is therefore widely used to code continuous background frames (that is, frames containing no monitoring elements). Under the related-art SKIP coding mode, however, a P frame (also called an inter-frame predictive coded frame) can only be coded with reference to the previous frame and records the deviation of this frame from the previous frame in the image, while a B frame (also called a bidirectional predictive coded frame) must reference both the previous frame and the next frame and records the deviation of this frame from both. It follows that when SKIP coding is performed, the previous frame, or the previous and next frames, of the current frame take part in the coding process as necessary associated frames, and because the frames are continuous, every background frame has associated frames different from those of every other frame when it is SKIP-coded. The loss of any one frame therefore greatly affects the coding of subsequent frames, giving poor resistance to packet loss and frame loss. A schematic diagram of coding based on the related art is shown in FIG. 1.
Summary
This application provides an encoding method and apparatus for surveillance video background frames, an electronic device, and a computer-readable storage medium, which can solve the problem of poor resistance to packet loss and frame loss caused by continuity under the mechanism in which every non-I frame in a continuous background frame group must be SKIP-coded against a different associated frame.
This application provides an encoding method for surveillance video background frames. The encoding method includes:
selecting a continuous background frame group from the frames of a surveillance video, where the continuous background frame group includes multiple background frames, only one of the multiple background frames is an I frame, and the I frame is the first frame of the continuous background frame group; and
SKIP-coding every frame in the continuous background frame group other than the I frame with the I frame as the reference object.
This application further provides an encoding apparatus for surveillance video background frames. The encoding apparatus includes:
a continuous background frame group selection unit configured to select a continuous background frame group from the frames of a surveillance video, where the continuous background frame group includes multiple background frames, only one of the multiple background frames is an I frame, and the I frame is the first frame of the continuous background frame group; and
a frame-level SKIP coding unit configured to SKIP-code every frame in the continuous background frame group other than the I frame with the I frame as the reference object.
This application further provides an electronic device. The electronic device includes:
a memory configured to store a computer program; and
a processor configured to implement, when executing the computer program, the encoding method for surveillance video background frames described above.
This application further provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the encoding method for surveillance video background frames described above.
Brief Description of the Drawings
The drawings needed in the description of the embodiments or the related art are briefly introduced below. Obviously, the drawings described below are merely embodiments of this application, and those of ordinary skill in the art can obtain other drawings from the provided drawings without creative work.
FIG. 1 is a schematic diagram of the coding of each background frame in a continuous background frame group in the related art;
FIG. 2 is a flowchart of an encoding method for surveillance video background frames provided by an embodiment of this application;
FIG. 3 is a schematic diagram of the coding of each background frame in a continuous background frame group corresponding to the encoding method shown in FIG. 2;
FIG. 4 is a flowchart of another encoding method for surveillance video background frames provided by an embodiment of this application;
FIG. 5 is a flowchart of yet another encoding method for surveillance video background frames provided by an embodiment of this application;
FIG. 6 is a flowchart of still another encoding method for surveillance video background frames provided by an embodiment of this application;
FIG. 7 is a flowchart of a further encoding method for surveillance video background frames provided by an embodiment of this application;
FIG. 8 is a flowchart of a method for encoding monitoring element frames in a surveillance video provided by an embodiment of this application;
FIG. 9 is a schematic flowchart of a frame encoding method for a surveillance video provided by an embodiment of this application;
FIG. 10 is a structural block diagram of an encoding apparatus for surveillance video background frames provided by an embodiment of this application;
FIG. 11 is a structural block diagram of an electronic device provided by an embodiment of this application.
Detailed Description
This application provides an encoding method and apparatus for surveillance video background frames, an electronic device, and a computer-readable storage medium, which can solve the problem of poor resistance to packet loss and frame loss caused by continuity under the mechanism in which every non-I frame in a continuous background frame group must be SKIP-coded against a different associated frame.
The technical solutions in the embodiments of this application are described clearly and completely below with reference to the drawings in the embodiments of this application. Obviously, the described embodiments are only part of the embodiments of this application, not all of them. Based on the embodiments of this application, all other embodiments obtained by those of ordinary skill in the art without creative work fall within the protection scope of this application.
Embodiment One
Referring to FIG. 2, FIG. 2 is a flowchart of an encoding method for surveillance video background frames provided by an embodiment of this application, which includes the following steps:
S101: Select a continuous background frame group from the frames of a surveillance video.
In this step, a continuous background frame group may be selected from the surveillance video captured by a surveillance camera of the target monitored area.
Since what actually constitutes a surveillance video is a sequence of image frames, the continuous background frame group is selected from the frames actually constituting the surveillance video. A continuous background frame group is a set of multiple continuous background frames; each group usually contains only one I frame (that is, an intra-frame coded frame), which usually exists in the group as its first frame, while the other frames in the group may be B frames or P frames depending on the actual application scenario and user settings. A new I frame in the frame sequence usually exists as the first frame of another continuous background frame group rather than being placed into the group where the previous I frame is located.
To select a continuous background frame group, background frames may first be selected from the frames actually constituting the surveillance video, and continuous background frames with small differences and high similarity are then formed into a continuous background frame group. The number of background frames contained in a group is not fixed; whether a frame can form a group with other continuous background frames mainly depends on the similarity and difference between the current background frame and the other continuous background frames, and the number of background frames in a group can be adjusted flexibly according to any special requirements of the actual application scenario.
Methods for selecting background frames from the frames constituting a surveillance video may include the following: a background frame selection method based on background features, a background frame selection method based on monitoring elements, and a background frame selection method combining both background features and monitoring elements.
The background frame selection method based on background features judges whether the current frame is a background frame by judging whether the current frame contains the same background features as a preset background frame; the background frame selection method based on monitoring elements judges whether the current frame is a background frame by judging whether the current frame contains a monitoring element consistent with the features of a preset monitoring element. It should be noted that judging whether a frame is a background frame purely by the presence or absence of background features or monitoring element features may lead to misjudgment: even a frame with the same background features is not necessarily a background frame, since a misjudgment occurs when a monitoring element appears in the current frame in a way that does not affect the background features; likewise, a frame without monitoring elements is not necessarily a background frame.
Therefore, to reduce the misjudgment rate as far as possible, the above two methods, which judge from different angles whether the current frame is a background frame, can also be combined, so that only a current frame that contains no monitoring element and contains background features identical to the preset background features can be selected as a background frame.
Algorithms for extracting and detecting background features/monitoring element features may include the following:
Feature extraction methods include the simplest feature area delineation method (a method that extracts the features in an area by determining the area where the features are located), the motion/behavior feature determination method (a method that extracts the corresponding features based on the relatively large motion of monitoring elements relative to the background), the ROI (Region Of Interest) determination method (a method that automatically delineates a region of interest), and other similar algorithms. Feature detection is usually realized by comparing similarity or difference, but many algorithms build on this principle, for example similarity comparison based on gray-level parameters such as gray-level distribution, gray-level difference, and gray-level mean; other parameters with the same or similar functions may also be used. This content is well known to those skilled in the art and is not repeated here.
It should be understood that, compared with the mostly surface-level features extracted by conventional feature extraction algorithms, an extraction model based on a deep learning algorithm can extract more and deeper features through an algorithm structure imitating the structure of human neurons; similarity comparison based on deep features works better and has a lower misjudgment rate than similarity comparison of conventional surface features.
S102: SKIP-code every frame in the continuous background frame group other than the I frame with the I frame as the reference object.
On the basis of S101, in this step every frame in the continuous background frame group that is not the I frame may be SKIP-coded with the I frame as the reference object. A schematic diagram of this new coding scheme is shown in FIG. 3, which is drawn with the frame structure customized to contain only I frames and P frames; in the continuous background frame group shown in FIG. 3, therefore, all frames except the leading I frame are P frames. In some embodiments, all frames in the continuous background frame group except the leading I frame are P frames or B frames.
As FIG. 3 shows, every P frame in the continuous background frame group has a line pointing to the I frame, and the arrow of the line points to the reference object used during SKIP coding. Compared with FIG. 1 of the related art, it can be seen that under the new coding scheme provided by this application the reference object of every non-I frame in the continuous background frame group is the same, unlike the related art in which every non-I frame has its own exclusive reference object; the coding no longer has continuity, so as long as the I frame is not lost, losing the coding result of any non-I frame does not affect the coding of the other non-I frames. The frames whose stable existence must be guaranteed change from every background frame in the continuous background frame group to the single I frame, which markedly improves resistance to packet loss and frame loss. Moreover, since it is no longer necessary to reference the previous frame, or the previous and next frames, no deviation needs to be computed between two P frames or between a P frame and a B frame, which also reduces, to a certain extent, the amount of computation and coding in the coding process.
Meanwhile, in the related art the current frame can only be coded with the previous frame, or the previous and next frames, as the reference object, so the coding of each background frame in the continuous background frame group is linear and continuous and must be processed serially, which cannot fully exploit the multi-process concurrent processing capability of the processor. Under the new coding scheme, because the reference object is fixed, the coding of the background frames can be non-linear and discontinuous, which makes it possible to use multi-process concurrency technology to accelerate the coding process and effectively improve the coding speed.
This application provides a new coding mechanism in which every non-I frame in a continuous background frame group is SKIP-coded with the same I frame as the reference object: the non-I frames in a continuous background frame group no longer need mutually different associated frames as the reference objects for SKIP coding, and every non-I frame is coded against the same associated frame. Because the differences between the background frames in a continuous background frame group are extremely small, the picture quality obtained with the new coding scheme shows no obvious gap from the related art, while the unification of the associated frames markedly improves resistance to frame loss and packet loss.
Embodiment Two
Referring to FIG. 4, FIG. 4 is a flowchart of another encoding method for surveillance video background frames provided by an embodiment of this application. On the basis of Embodiment One and from the perspective of improving the overall coding performance, this embodiment provides a method of improving the coding speed through multi-process concurrency technology, including the following steps:
S201: Select a continuous background frame group from the frames of a surveillance video.
S202: Extract all frames except the I frame from the continuous background frame group.
On the basis of S201, this step may take all non-I frames out of the continuous background frame group so that these non-I frames can be assigned to the multiple threads or coroutines provided by the multi-threaded concurrency technology to perform the coding operations simultaneously.
S203: SKIP-code every frame in the continuous background frame group other than the I frame simultaneously through the multi-threaded concurrency technology, with the I frame as the reference object.
On the basis of S202, this step may assign the extracted frames to the multiple processes or coroutines provided by the multi-threaded concurrency technology, so as to improve the overall coding speed by executing the relatively independent processes or coroutines simultaneously. When the multi-threaded concurrency technology cannot provide a process or coroutine for every non-I frame of the continuous background frame group at once, the non-I frames can also be grouped by the number of processes or coroutines actually available, so that each process or coroutine is responsible for coding the non-I frames in its group; this approach can also be called a micro-batch approach.
Using multi-threaded concurrency technology helps improve the overall coding speed, but may to some extent disturb the arrangement of the frames in the time sequence. Therefore, to keep the arrangement of the coded frames in the time sequence unchanged, a special sorting mark or timestamp can be attached during batch coding according to the position of each non-I frame in the time sequence, so that after coding is completed the frames can be rearranged into the same sequence according to the timestamp or sorting mark.
The new coding scheme provided in Embodiment One removes the requirement for coding continuity during the coding process and thus makes the application of a parallel mode possible. Starting from how to improve overall coding efficiency, this embodiment provides a way to batch-process the unrelated coding operations with multi-process concurrency technology, markedly improving overall coding efficiency and making full use of the processor's multi-process processing capability.
Embodiment Three
Referring to FIG. 5, FIG. 5 is a flowchart of yet another encoding method for surveillance video background frames provided by an embodiment of this application. To improve the selection accuracy of background frames and continuous background frame groups and to reduce the misjudgment rate, this embodiment, on the basis of any of the above embodiments, provides a method in which a special structure based on a deep learning algorithm enables both feature extraction and feature detection to cover more deep-level features, including the following steps:
S301: Select a continuous background frame group from the frames of the actual surveillance video by using a background frame detection model.
The background frame detection model used in this step is a detection model obtained after training with a deep learning algorithm on real background frames. The process of obtaining such a model based on a deep learning algorithm may be as follows:
obtain a large amount of real, targeted sample data; taking the construction of a background frame detection model for detecting whether the current frame is a background frame as an example, this step needs to obtain a large number of real background frames as the sample data;
use the sample data as the input data of the deep learning algorithm;
the deep learning algorithm extracts, through its internal multi-layer structure, the common target features hidden behind the sample data; and
a classifier based on the target features is constructed, so that the classifier can distinguish actual frames containing features identical or similar to the target features from actual frames that do not.
Deep learning algorithms are further divided into supervised and unsupervised according to whether guidance information needs to be provided. Supervised means that targeted guidance information is given along with the sample data; it suits application scenarios with clear requirements on the features and can achieve better classification and detection results. Unsupervised is the opposite: because no targeted guidance information is provided, the extracted features may deviate considerably from expectations, which suits application scenarios that have no clear requirements on the features or that need this approach to find a suitable feature by itself. Given this application's requirements on background features, a supervised deep learning algorithm can achieve better results.
The deep learning algorithm may be a common convolutional neural network, a deep residual network, and so on. Since different algorithms target different problems, which one is more suitable can also be concluded from a limited number of tests in the actual application scenario; in addition, differences in the activation function and loss function used may also lead to differences in detection performance.
In some embodiments, after high-quality background frame detection has been completed with the background frame detection model built on a deep learning algorithm, a similarity determination algorithm can additionally be used to determine which continuous background frames can form a continuous background frame group. The determination algorithm may be part of the model or exist independently; integration is higher when it exists as part of the model.
S302: SKIP-code every frame in the continuous background frame group other than the I frame with the I frame as the reference object.
The above presents a method of improving the accuracy of background-feature-based background frame detection by introducing a deep learning algorithm. Of course, detection accuracy can also be improved by applying the same deep learning algorithm to the features of monitoring elements; the implementation steps are shown in FIG. 6:
S401: Select a continuous background frame group from the frames of the actual surveillance video by using a monitoring element detection model.
The monitoring element detection model is a detection model obtained after training with a deep learning algorithm on the features of real monitoring elements. It can be seen that, unlike the steps shown in FIG. 5, the steps shown in FIG. 6 likewise exploit the deep learning algorithm's ability to extract deep-level features, except that the extracted features are the features of monitoring elements rather than background features; the features of monitoring elements are another kind of feature that can be used to judge whether a frame is a background frame. The other parts are the same as those shown in FIG. 5 and are not repeated here.
S402: SKIP-code every frame in the continuous background frame group other than the I frame with the I frame as the reference object.
FIGS. 5 and 6 introduce the deep learning algorithm's extraction of deep-level features into background features and into the features of monitoring elements respectively, but detection is still completed on the basis of one type of feature alone. As explained in the S101 section, a detection result based on a single type of feature may be inaccurate, so this application also provides an implementation that detects jointly with both types of features, with each feature's detection model built on a deep learning algorithm, in order to obtain detection results that are as accurate as possible:
S501: Use the background frame detection model to detect whether each frame of the actual surveillance video contains background features, where the background frame detection model is a detection model obtained after training with a deep learning algorithm on real background frames, and the background features are extracted from the real background frames by the deep learning algorithm.
S502: Use the monitoring element detection model to detect whether each frame of the actual surveillance video contains monitoring elements.
The monitoring element detection model is a detection model obtained after training with a deep learning algorithm on the features of real monitoring elements.
S503: Select a set of continuous frames that contain the background features and contain no monitoring elements as the continuous background frame group.
It should be noted that although both of the methods given above for distinguishing whether the current frame is a background frame can be used on their own, their different approaches often contribute differently to reaching the same conclusion. For example, when determining a parameter affected by multiple factors, each factor often causes a different amount of change in the parameter for the same amount of its own change; that is, its influence on the parameter or conclusion differs. Likewise, when background features and the features of monitoring elements are combined, the two types of features influence the conclusion that the current frame is a background frame to different degrees in different application scenarios. Corresponding weights can therefore be assigned to the background frame detection model and the monitoring element detection model according to their respective accuracy in discriminating background frames in the actual surveillance video, so that continuous background frames are selected more accurately through a weight-based weighted calculation.
For ease of understanding, an example is given here:
Assume that the accuracy of using the background frame detection model alone to judge whether a frame is a background frame is 80%, and that the accuracy of using the monitoring element detection model alone is 70%. The accuracies can be used as the respective weights, each weight serving as a multiplication factor for the background frame evaluation probability produced by the corresponding model for that frame; the values obtained from the two models are then added to give a comprehensive background frame evaluation probability, on the basis of which the frame is evaluated as a background frame or not.
Assume the background frame detection model judges the target frame to be a background frame with an evaluation probability of 85%, and the monitoring element detection model judges it with an evaluation probability of 80%. One computation of the weighted calculation is: comprehensive background frame evaluation probability = 0.85 × 0.8 + 0.7 × 0.8 = 1.24. On this basis, a comprehensive evaluation probability threshold of 1.15 can be set, so that only frames whose value computed by the weighted calculation exceeds 1.15 are discriminated as background frames. The size of the comprehensive evaluation probability threshold can be set according to the actual situation.
It should be noted that in practice the background frame detection model or the monitoring element detection model may also be trained as a binary classifier; the weighting approach above does not apply when a binary classifier can only output a verdict of belonging or not belonging to background frames.
On the other hand, even supervised deep learning algorithms still come in many variants with different tendencies. To improve detection accuracy as much as possible, the background frame detection model and/or the monitoring element detection model may therefore be set to include a preset number of sub-detection models, where the different sub-detection models are detection models obtained by training different deep learning algorithms on the same training samples, in the hope of obtaining a more accurate detection result through the joint action of multiple sub-models.
Of course, each sub-detection model may also be assigned a weight according to its degree of influence on the overall detection result, so that a comprehensive result is obtained through weighted calculation and a corresponding conclusion is drawn from it.
On the basis of any of the above embodiments and from the perspective of improving the accuracy of background frame discrimination, this embodiment introduces a deep learning algorithm modeled on the structure of human neurons into the feature extraction part and uses its ability to mine deep-level features to improve the discrimination accuracy of background frames during feature comparison. FIGS. 5 and 6 each provide an implementation starting from one type of feature, and FIG. 7, building on them, provides an optional implementation that combines both types of features for joint discrimination.
实施例四
请参见图8,图8为本申请实施例提供的一种监控视频中监控要素帧的编码方法的流程图,本实施例在上述任意实施例实现对背景帧的检测之外,额外提供了一种对非背景帧的编码方法,以与背景帧的编码方法相互配合,共同形成对监控视频所有类型的帧进行编码的方法,包括如下步骤:
S601:将监控视频中为非背景帧的帧选取为监控要素帧;
本步骤可以得到与S101中选取出的构成连续背景帧组的背景帧相对的监控要素帧,当仅基于背景帧检测模型判别当前是否为背景帧时,非背景帧可以但不一定为包含监控要素的监控要素帧。当然,上面以及提及过,无论单纯使用哪一种类型的判别方法,都可能不太准确,因此结合实施例三给出的内容,当同时采用两类特征判别背景帧时,与之相对的监控要素帧的选取方式可以为不包含背景特征、包含监控要素的特征的实际帧。
S602: marking the region of the monitoring element frame where the monitoring element is located as the monitoring element region;
S603: encoding the monitoring element region with an inter-frame or intra-frame encoding mode;
Although a frame may be judged to be a monitoring element frame, the monitoring element it contains will not necessarily occupy the whole of the image. To reduce the amount of encoding, and the size of the encoded data awaiting transmission, as much as possible, S602 and S603 determine the region where the monitoring element is located and encode that monitoring element region with high-quality intra-frame or inter-frame encoding (in a concrete implementation, whichever of the two is optimal for the actual situation can be chosen), which helps provide a high-quality picture for the monitoring elements the user cares about.
S604: encoding the region of the monitoring element frame other than the monitoring element region in the macroblock-level SKIP mode.
The region outside the monitoring element region is usually the relatively fixed background portion of the monitoring element frame. For this background region of the monitoring element frame that is of no interest (understandably, the relatively fixed background portion of a monitoring element frame and the background region of that frame are not necessarily identical), this step reuses ordinary macroblock-level SKIP encoding to keep the amount of encoding and data as small as possible. Macroblock-level SKIP encoding first requires splitting the target region into multiple macroblocks of a preset size; that is, ordinary macroblock-level SKIP encoding takes the macroblock as the object on which the SKIP decision is made, the reference of each macroblock may be the macroblock at the same or an adjacent position in the previous frame, and the encoding result is obtained in combination with motion estimation and motion vector algorithms. The size of the split macroblocks can be chosen from 16×16, 8×16 and 8×8 as the actual situation requires.
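The sketch below illustrates only the first step named above, tiling a region into 16×16 macroblocks, and reduces the per-block SKIP decision to a plain difference test against the co-located block of the previous frame; real motion estimation and motion vectors are deliberately left out.

import numpy as np

def split_macroblocks(frame, size=16):
    # Tile the frame (or region) into macroblocks of the preset size.
    h, w = frame.shape[:2]
    return [((y, x), frame[y:y + size, x:x + size])
            for y in range(0, h, size) for x in range(0, w, size)]

def skippable(block, ref_block, tol=2.0):
    # A block whose co-located reference is nearly identical may be SKIPped.
    return np.abs(block.astype(float) - ref_block.astype(float)).mean() < tol

prev = np.zeros((64, 64), dtype=np.uint8)   # hypothetical previous frame
cur = prev.copy()                           # identical frame: all blocks SKIP
for (y, x), block in split_macroblocks(cur):
    assert skippable(block, prev[y:y + 16, x:x + 16])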
In some embodiments, to help the monitoring station distinguish the content of the areas watched by different surveillance cameras, OSD (On Screen Display, an on-screen menu overlay) technology is usually also used to superimpose information such as the camera name, the name of the monitored area and the time on the picture. To keep SKIP encoding from affecting this changing time information, the regions that display OSD information (i.e., on-screen menu information) in the continuous background frame groups, and in the monitoring element frames outside the monitoring element regions, may additionally be marked as OSD change regions (i.e., on-screen menu change regions) and encoded with an inter-frame or intra-frame encoding mode, so that the changing information is displayed properly. Since the OSD change region usually sits in a corner or at the top or bottom of the picture, the region is easy to segment, so this operation does not noticeably increase the amount of encoding and its impact is very small.
To deepen the understanding of the inventive points of the present application and of their role in the overall surveillance video encoding process, the present application further provides, building on the above, a schematic flowchart of a practical method for encoding the frames of a surveillance video; see FIG. 9:
As shown in FIG. 9, this embodiment uses a background frame detection model and a monitoring element detection model, both built on deep learning algorithms, to jointly judge the frame type of each actual frame, discriminating which actual frames are background frames, and from the background frames selects continuous background frame groups whose similarity stays within a certain range. Every background frame contained in a continuous background frame group is encoded in the frame-level SKIP mode (that is, every non-I frame takes the same I frame as its reference for SKIP encoding, which distinguishes this mode from the encoding mode conventionally called macroblock-level SKIP encoding); the OSD change region located in each background frame, and the monitoring element region and OSD change region in each monitoring element frame, are encoded with intra-frame or inter-frame encoding; and the background region of each monitoring element frame is encoded in the ordinary macroblock-level SKIP mode, thereby completing the encoding of all the types of frames that make up the surveillance video.
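As a bird's-eye illustration of this dispatch, the sketch below routes each frame type to its encoding path. Every encode routine and both region extractors are stub placeholders invented here; only the routing logic reflects the flow just described.

def frame_level_skip(frame):      return "skip(%s)" % frame
def intra_or_inter(region):       return "hq(%s)" % region
def macroblock_skip(region):      return "mb_skip(%s)" % region

def encode_frame(frame, is_background, element_region, osd_region):
    parts = []
    if is_background(frame):
        parts.append(frame_level_skip(frame))                # vs. the group's I frame
    else:
        parts.append(intra_or_inter(element_region(frame)))  # element region
        parts.append(macroblock_skip("bg(%s)" % frame))      # remaining background
    parts.append(intra_or_inter(osd_region(frame)))          # changing OSD area
    return parts

# Dummy run: frames 0 and 1 are background frames, frame 2 contains an element.
out = [encode_frame(f, lambda x: x < 2, lambda x: "elem(%s)" % x,
                    lambda x: "osd(%s)" % x) for f in range(3)]
print(out)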
Since the possible situations are too complicated to enumerate and explain one by one, those skilled in the art should realize that many examples can exist once the basic method principles provided by the present application are combined with actual situations, and that, absent sufficient creative effort, all of them shall fall within the protection scope of the present application.
Embodiment 5
Referring now to FIG. 10, FIG. 10 is a structural block diagram of an apparatus for encoding background frames of a surveillance video according to an embodiment of the present application. The encoding apparatus may include:
a continuous background frame group selection unit 100, configured to select a continuous background frame group from the frames of a surveillance video, where the continuous background frame group includes multiple background frames, the multiple background frames contain only one I frame, and the I frame is the first frame of the continuous background frame group; and
a frame-level SKIP encoding unit 200, configured to perform SKIP encoding on every frame of the continuous background frame group other than the I frame, with the I frame as the reference.
The frame-level SKIP encoding unit 200 may include:
an other-frame extraction subunit, configured to extract all frames other than the I frame from the continuous background frame group; and
a multi-threaded concurrent encoding subunit, configured to perform, through multi-threaded concurrency, SKIP encoding on every frame of the continuous background frame group other than the I frame simultaneously, with the I frame as the reference.
The continuous background frame group selection unit 100 may include:
a background frame detection model selection subunit, configured to select the continuous background frame group from the frames of the actual surveillance video by using a background frame detection model, where the background frame detection model is a detection model obtained by training with a deep learning algorithm on real background frames;
or,
a monitoring element detection model selection subunit, configured to select the continuous background frame group from the frames of the actual surveillance video by using a monitoring element detection model, where the monitoring element detection model is a detection model obtained by training with a deep learning algorithm on the features of real monitoring elements.
Alternatively, the continuous background frame group selection unit 100 may include:
a background feature detection subunit, configured to detect, with a background frame detection model, whether each frame of the actual surveillance video contains background features, where the background frame detection model is a detection model obtained by training with a deep learning algorithm on real background frames, and the background features are extracted by the deep learning algorithm from the real background frames;
a monitoring element detection subunit, configured to detect, with a monitoring element detection model, whether each frame of the actual surveillance video contains a monitoring element, where the monitoring element detection model is a detection model obtained by training with a deep learning algorithm on the features of real monitoring elements; and
a multi-detection-model selection subunit, configured to select a set of consecutive frames that contain the background features and contain no monitoring element as the continuous background frame group.
In some embodiments, the apparatus for encoding background frames of a surveillance video may further include:
a weight assignment unit, configured to assign corresponding weights to the background frame detection model and the monitoring element detection model according to the accuracy with which each discriminates background frames in the actual surveillance video, so that the continuous background frame group can be selected more accurately through a weight-based weighted calculation.
In some embodiments, the apparatus for encoding background frames of a surveillance video may further include:
a monitoring element frame selection unit, configured to select the frames of the surveillance video other than the background frames as monitoring element frames;
a monitoring element region marking unit, configured to mark the region of the monitoring element frame where the monitoring element is located as the monitoring element region;
a monitoring element region high-quality encoding unit, configured to encode the monitoring element region with an inter-frame or intra-frame encoding mode; and
a macroblock-level SKIP encoding unit, configured to encode the region of the monitoring element frame other than the monitoring element region in the macroblock-level SKIP mode.
In some embodiments, the apparatus for encoding background frames of a surveillance video may further include:
an OSD change region marking unit, configured to mark the regions that display OSD information in the continuous background frame group and in the monitoring element frames outside the monitoring element regions as OSD change regions; and
an OSD change region encoding unit, configured to encode the OSD change regions with an inter-frame or intra-frame encoding mode.
The apparatus for encoding background frames of a surveillance video provided in this embodiment corresponds to the encoding method given above; it exists as a product embodiment corresponding to the method embodiments and has the same beneficial effects as the method embodiments. For the explanation of each functional unit, reference may be made to the above method embodiments, which is not repeated here.
FIG. 11 is a block diagram of an electronic device 300 according to an exemplary embodiment. As shown in FIG. 11, the electronic device 300 may include a processor 301 and a memory 302, and may further include one or more of a multimedia component 303, an information input/information output (I/O) interface 304 and a communication component 305.
The processor 301 is configured to control the overall operation of the electronic device 300 so as to complete all or part of the steps of the above method for encoding background frames of a surveillance video. The memory 302 is configured to store various types of data to support the operations to be executed by the processor 301; such data may include, for example, instructions of any application program or method operated on the electronic device 300, as well as data related to the application programs. The memory 302 may be implemented by any type of volatile or non-volatile storage device or a combination thereof, for example one or more of Static Random Access Memory (SRAM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Erasable Programmable Read-Only Memory (EPROM), Programmable Read-Only Memory (PROM), Read-Only Memory (ROM), magnetic memory, flash memory, a magnetic disk or an optical disk.
The multimedia component 303 may include a camera for capturing images and a microphone for capturing audio signals; the captured images and received audio signals may be stored in the memory 302 or sent through the communication component 305. The I/O interface 304 provides an interface between the processor 301 and other interface modules, and the other interface modules may be a keyboard or a mouse. The communication component 305 is used for wired or wireless communication between the electronic device 300 and other devices. The wireless communication may be, for example, Wi-Fi, Bluetooth, Near Field Communication (NFC), 2G, 3G or 4G, or a combination of one or more of them; accordingly, the communication component 305 may include a Wi-Fi module, a Bluetooth module and an NFC module.
By way of example, the electronic device may be a surveillance camera with encoding capability.
In an exemplary embodiment, the electronic device 300 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, microcontrollers, microprocessors or other electronic components, so as to perform the method for encoding background frames of a surveillance video given in the above embodiments.
In another exemplary embodiment, a computer-readable storage medium storing program instructions is further provided, where the program instructions, when executed by a processor, implement the operations corresponding to them. For example, the computer-readable storage medium may be the above memory 302 including program instructions, and the program instructions may be executed by the processor 301 of the electronic device 300 to complete the method for encoding background frames of a surveillance video given in the above embodiments.
It can be understood that the electronic device and the computer-readable storage medium provided by the present application have the same beneficial effects as the method embodiments, which are not repeated here.
Specific examples have been used herein to expound the principles and implementations of the present disclosure. The embodiments are described in a progressive manner: each embodiment focuses on its differences from the other embodiments, and for the parts that the embodiments share, reference may be made from one to another. For the apparatus disclosed in the embodiments, reference may be made to the description of the corresponding method part.
It should also be noted that, in this specification, relational terms such as first and second are used solely to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between those entities or operations. Moreover, the terms "comprise", "include" or any other variant thereof are intended to cover a non-exclusive inclusion, so that a process, method, article or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or further includes elements inherent to such a process, method, article or device. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article or device that includes the element.

Claims (13)

  1. A method for encoding background frames of a surveillance video, comprising:
    selecting a continuous background frame group from frames of the surveillance video, wherein the continuous background frame group comprises multiple background frames, the multiple background frames contain only one I frame, and the I frame is the first frame of the continuous background frame group; and
    performing SKIP encoding on every frame of the continuous background frame group other than the I frame, with the I frame as a reference.
  2. The encoding method according to claim 1, wherein performing SKIP encoding on every frame of the continuous background frame group other than the I frame, with the I frame as the reference, comprises:
    extracting all frames other than the I frame from the continuous background frame group; and
    performing SKIP encoding on every frame of the continuous background frame group other than the I frame simultaneously, with the I frame as the reference.
  3. The encoding method according to claim 1, wherein performing SKIP encoding on every frame of the continuous background frame group other than the I frame simultaneously, with the I frame as the reference, comprises:
    performing, through multi-threaded concurrency, SKIP encoding on every frame of the continuous background frame group other than the I frame simultaneously, with the I frame as the reference.
  4. The encoding method according to claim 1, wherein selecting a continuous background frame group from frames of the surveillance video comprises:
    selecting the continuous background frame group from the frames of the surveillance video by using a background frame detection model, wherein the background frame detection model is a detection model obtained by training with a deep learning algorithm on real background frames;
    or,
    selecting the continuous background frame group from the frames of the surveillance video by using a monitoring element detection model, wherein the monitoring element detection model is a detection model obtained by training with a deep learning algorithm on features of real monitoring elements.
  5. The encoding method according to claim 1, wherein selecting a continuous background frame group from frames of the surveillance video comprises:
    detecting, with a background frame detection model, whether each frame of the surveillance video contains background features, wherein the background frame detection model is a detection model obtained by training with a deep learning algorithm on real background frames;
    detecting, with a monitoring element detection model, whether each frame of the surveillance video contains a monitoring element, wherein the monitoring element detection model is a detection model obtained by training with a deep learning algorithm on features of real monitoring elements; and
    selecting a set of consecutive frames that contain the background features and contain no monitoring element as the continuous background frame group.
  6. The encoding method according to claim 1, wherein selecting a continuous background frame group from frames of the surveillance video comprises:
    taking, as a first evaluation probability, an evaluation probability with which a background frame detection model judges a target frame to be a background frame, wherein the target frame is a frame of the surveillance video, and the background frame detection model is a detection model obtained by training with a deep learning algorithm on real background frames;
    taking, as a second evaluation probability, an evaluation probability with which a monitoring element detection model judges the target frame to be a background frame, wherein the monitoring element detection model is a detection model obtained by training with a deep learning algorithm on features of real monitoring elements;
    calculating a composite background frame evaluation probability according to an accuracy with which the background frame detection model alone judges whether a frame is a background frame, an accuracy with which the monitoring element detection model alone judges whether a frame is a background frame, the first evaluation probability and the second evaluation probability; and
    judging the target frame to be a background frame based on a determination result that the composite background frame evaluation probability exceeds a composite evaluation probability threshold.
  7. The encoding method according to claim 1 or 6, further comprising:
    assigning corresponding weights to a background frame detection model and a monitoring element detection model according to an accuracy with which each of them discriminates background frames in the surveillance video, wherein the background frame detection model is a detection model obtained by training with a deep learning algorithm on real background frames, and the monitoring element detection model is a detection model obtained by training with a deep learning algorithm on features of real monitoring elements.
  8. The encoding method according to claim 5, wherein at least one of the background frame detection model and the monitoring element detection model comprises a preset number of sub-detection models, wherein different sub-detection models are detection models obtained by training with different deep learning algorithms on the same training samples.
  9. The encoding method according to any one of claims 1 to 8, further comprising:
    selecting frames of the surveillance video other than the background frames as monitoring element frames;
    marking a region of the monitoring element frame where a monitoring element is located as a monitoring element region;
    encoding the monitoring element region with an inter-frame or intra-frame encoding mode; and
    encoding a region of the monitoring element frame other than the monitoring element region in a macroblock-level SKIP mode, wherein macroblock-level SKIP encoding is an encoding mode that takes the macroblock as the object on which whether SKIP encoding is possible is judged.
  10. The encoding method according to claim 9, further comprising:
    marking regions that display on-screen menu information in the continuous background frame group and in the monitoring element frames outside the monitoring element regions as on-screen menu change regions; and
    encoding the on-screen menu change regions with an inter-frame or intra-frame encoding mode.
  11. An apparatus for encoding background frames of a surveillance video, comprising:
    a continuous background frame group selection unit, configured to select a continuous background frame group from frames of a surveillance video, wherein the continuous background frame group comprises multiple background frames, the multiple background frames contain only one I frame, and the I frame is the first frame of the continuous background frame group; and
    a frame-level SKIP encoding unit, configured to perform SKIP encoding on every frame of the continuous background frame group other than the I frame, with the I frame as a reference.
  12. An electronic device, comprising:
    a memory, configured to store a computer program; and
    a processor, configured to implement, when executing the computer program, the method for encoding background frames of a surveillance video according to any one of claims 1 to 10.
  13. A computer-readable storage medium, storing a computer program which, when executed by a processor, implements the method for encoding background frames of a surveillance video according to any one of claims 1 to 10.
PCT/CN2019/111948 2019-03-22 2019-10-18 Method, apparatus, electronic device and medium for encoding background frames of a surveillance video WO2020192095A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910221762.7A 2019-03-22 2019-03-22 Method, apparatus, electronic device and medium for encoding background frames of a surveillance video
CN201910221762.7 2019-03-22

Publications (1)

Publication Number Publication Date
WO2020192095A1 true WO2020192095A1 (zh) 2020-10-01

Family

ID=72563512

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/111948 WO2020192095A1 (zh) 2019-03-22 2019-10-18 Method, apparatus, electronic device and medium for encoding background frames of a surveillance video

Country Status (2)

Country Link
CN (1) CN111726620A (zh)
WO (1) WO2020192095A1 (zh)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112040232B (zh) * 2020-11-04 2021-06-22 北京金山云网络技术有限公司 Transmission method and apparatus for real-time communication, and processing method and apparatus for real-time communication
CN113038133B (zh) * 2021-05-24 2021-12-24 星航互联(北京)科技有限公司 Video compression transmission system based on satellite transmission

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR100669634B1 (ko) * 2004-12-06 2007-01-15 엘지전자 주식회사 Video compression and restoration method
CN101321287B (zh) * 2008-07-08 2012-03-28 浙江大学 Video encoding method based on moving object detection

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7139409B2 (en) * 2000-09-06 2006-11-21 Siemens Corporate Research, Inc. Real-time crowd density estimation from video
US20020168175A1 (en) * 2001-05-14 2002-11-14 Green Dustin L. Systems and methods for playing digital video in reverse and fast forward modes
CN101192903A (zh) * 2007-11-28 2008-06-04 腾讯科技(深圳)有限公司 Data frame encoding and decoding control method
CN101207813A (zh) * 2007-12-18 2008-06-25 中兴通讯股份有限公司 Method and system for encoding and decoding video sequences
CN101216942A (zh) * 2008-01-14 2008-07-09 浙江大学 Incremental characteristic background modeling algorithm with adaptive weight selection
CN102222349A (zh) * 2011-07-04 2011-10-19 江苏大学 Foreground frame detection method based on edge models
CN103546747A (zh) * 2013-09-29 2014-01-29 北京航空航天大学 Fractal encoding method for depth map sequences based on color video encoding modes
CN104077757A (zh) * 2014-06-09 2014-10-01 中山大学 Road background extraction and updating method fusing real-time traffic state information

Also Published As

Publication number Publication date
CN111726620A (zh) 2020-09-29

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19921504

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19921504

Country of ref document: EP

Kind code of ref document: A1