CN110572665A - Static background video self-adaptive compression method based on background subtraction - Google Patents


Info

Publication number
CN110572665A
Authority
CN
China
Prior art keywords
submodule
video
background
module
picture
Prior art date
Legal status
Granted
Application number
CN201910905961.XA
Other languages
Chinese (zh)
Other versions
CN110572665B (en)
Inventor
蓝龙
徐传福
尹晓尧
张翔
车永刚
高翔
李超
吴诚堃
郭晓威
骆志刚
Current Assignee
National University of Defense Technology
Original Assignee
National University of Defense Technology
Priority date
Filing date
Publication date
Application filed by National University of Defense Technology
Priority to CN201910905961.XA
Publication of CN110572665A
Application granted
Publication of CN110572665B
Legal status: Active
Anticipated expiration


Classifications

    • G06T 5/20: Image enhancement or restoration by the use of local operators
    • G06T 5/30: Erosion or dilatation, e.g. thinning
    • G06T 7/11: Region-based segmentation
    • G06T 7/194: Segmentation; edge detection involving foreground-background segmentation
    • G06T 7/254: Analysis of motion involving subtraction of images
    • G06T 7/277: Analysis of motion involving stochastic approaches, e.g. using Kalman filters
    • H04N 19/42: Video coding/decoding characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
    • H04N 19/85: Video coding/decoding using pre-processing or post-processing specially adapted for video compression
    • H04N 19/87: Pre- or post-processing involving scene cut or scene change detection in combination with video compression
    • G06T 2207/10016: Video; image sequence (indexing scheme for image analysis or image enhancement; image acquisition modality)

Abstract

The invention discloses a background-subtraction-based self-adaptive compression method for static background video, aiming to solve the problems of the low compression ratio and high storage cost of static background video data. The technical scheme is to construct a static background video compression system composed of a background subtraction module, a target tracking module, a distribution planning module, and a video compression synthesis module. The background subtraction module performs background subtraction on each video frame to obtain the foreground-object bounding boxes and the background picture of the frame; the target tracking module tracks the foreground-object bounding boxes across video frames to obtain the motion-trajectory pipeline set T of the foreground objects; the distribution planning module assigns labels to T to obtain the label set L of T in the compression result; and the video compression synthesis module synthesizes and compresses T, L, and the background picture set to obtain the compressed picture set. The invention achieves extreme compression of static background video, with a high compression ratio, a low information-loss rate, and greatly reduced storage space.

Description

Static background video self-adaptive compression method based on background subtraction
Technical Field
The invention relates to a video data compression method, and in particular to a compression method for static background video.
Background
With the development of digital media technology, the cost of video recording keeps decreasing, and video recording devices of all kinds are deployed everywhere to record people or things of interest. Among them the fixed camera is the most common, and some of these recording devices operate 24 hours a day, so a large amount of static background video is generated. Static background video refers to video data shot by a fixed camera, such as surveillance video. The acquisition of massive static background video poses a great challenge to storage. In fact, much of the information in static background video is redundant; for example, the background information in consecutive video frames is highly similar, which occupies a lot of storage space and makes browsing the video content very inconvenient. According to statistics, only one percent of the video information recorded by existing surveillance cameras is utilized, and the remaining ninety-nine percent of the surveillance information is discarded without any processing. Video compression technology makes it possible to browse all the effective information in a surveillance video in a short time, greatly saving manpower and material resources. For example, through video compression a day of surveillance video can be compressed into a one-minute segment without affecting the integrity of the original information or the viewing experience. Video compression techniques are widely used across industries, for example: public security (intelligent compression of video from key roads and checkpoints), traffic (intelligent compression of video from key road sections and toll gates), prisons (intelligent compression of video from key cells and personnel shifts), and large exhibitions (intelligent compression of video from stadiums and entrances).
In fact, in applications that require video storage, video compression can reduce storage overhead, improve processing efficiency, and extend the information retention period.
Research on video compression has attracted a large number of computer-vision researchers. The spatio-temporal video montage method proposed by Kang et al. of the Chinese University of Hong Kong [Kang et al., Space-time video montage, published in the 2006 Computer Vision and Pattern Recognition conference, see pages 1331 to 1338 of the CVPR 2006 proceedings] simultaneously takes into account the temporal and spatial distribution of all moving objects in the whole video and rearranges their temporal and spatial order, so that no information is lost at a compression ratio of 2, but obvious image mosaicing and gaps appear, and the temporal and spatial coherence of the same object is lost. The video mosaic proposed by Irani et al. of the David Sarnoff Research Center, USA [Irani et al., Efficient representations of video sequences and their applications, published in 1996 in the journal Signal Processing: Image Communication, vol. 8, no. 4, pages 327 to 351] tries to present the original video in the form of a panoramic picture; because overlapping information in consecutive frames is removed when constructing the panorama, the original video is greatly compressed, but the method considers each frame separately and therefore cannot associate objects across adjacent frames. The ribbon-carving method [Zhuang et al., Video condensation based on ribbon carving, published in 2009 in IEEE Transactions on Image Processing, vol. 18, no. 11, pages 2572 to 2583] treats each motion trajectory as a ribbon and takes this ribbon as the smallest processing unit, solving the problem that the same object cannot be associated, but the method fails when adjacent objects have different motion speeds and directions. More seriously, the ribbon-carving method always produces a significant gap in either the vertical or the horizontal direction.
Nie Yongwei et al. [Nie et al., Compact video synopsis via global spatiotemporal optimization, published in 2013 in IEEE Transactions on Visualization and Computer Graphics, vol. 19, no. 10, pages 1664 to 1676] proposed transforming the space-time arrangement of moving objects, and further proposed [Nie et al., Object movements synopsis via part assembling and stitching, published in 2014 in IEEE Transactions on Visualization and Computer Graphics, vol. 20, no. 9, pages 1303 to 1315] synthesizing a complex background with a multi-plane reconstruction method and pasting the moving objects into that background to synthesize the compressed video. This effectively eliminates obvious gaps in the compressed video, but it cannot handle well the conflicts that exist between object motion trajectories in the video. The same team then divided the trajectory of a moving object into successive segments, selected a representative frame from each segment to paste into the background to form the final compressed video, and introduced a structure-completion method to compensate for the holes appearing between moving objects and the background. This effectively solves the conflict problem, but its compression ratio is still not high (only 5 to 10 times). He et al. [He et al., Fast online video synopsis based on potential collision graph, published in 2016 in IEEE Signal Processing Letters, vol. 24, no. 1, pages 22 to 26] proposed taking potential collisions into account before rearranging moving objects, but the temporal and spatial interference among moving objects makes associating the same object within a compressed video a difficult problem.
Most currently known video compression methods and patents fall into three categories. The first category is compressed storage of high-definition video. It exploits the fact that between consecutive frames most pixels are essentially unchanged while only a few change, and stores the changed part with sparse coding, thereby reducing the video's storage-space requirement; it does not, however, help video browsing and retrieval. Typical works include the video compression system of Felodian Corporation (patent publication No. CN208027742U), the video compression method and video compression device of the institute of technology of the finance and corporate industry (patent publication No. CN103533375A), and the video compression method and video compression device of Chenxing Semiconductor Co., Ltd. (patent publication No. CN109040758A). The second category is based on joint data compression across multiple cameras at the same site: using the known relative positions and spatial transformations between the cameras and the motion directions of the moving objects, it associates moving objects across multiple cameras and extracts and stores the objects. Examples include the video synopsis method of Li Zhongwei et al. (patent publication No. CN104284198A) and the video synopsis method (patent publication No. CN105306880A) and video synopsis system (patent publication No. CN105323547A) of Sichuan Haote Communications Co., Ltd. However, these methods select a fixed number of moving objects by default, and when there are more moving objects in the field of view or their motion speeds differ, information loss occurs.
The third category compresses a long video into a short video or into pictures, which greatly improves the efficiency of video browsing and retrieval while reducing the storage-space requirement. Typical work includes the online video synopsis method proposed at Shandong Jianzhu University (patent publication No. CN104093001A), which obtains moving-object trajectories by dynamic background subtraction and online tracking and then pastes the objects into the background while ensuring no collision between objects; however, the method requires the synopsis ratio to be specified in advance and cannot adaptively select a suitable ratio, which easily causes information redundancy or information loss. A video compression method based on video summarization (patent publication No. CN108260025A) of Beijing Non-fighting Science and Technology Development Co., Ltd. extracts key video frames by detecting abrupt-change frames in the video, thereby compressing the video, but this may lose a large amount of information. A video compression method and technique (patent publication No. CN103686095A) proposed by Qin Hongde et al. of Zhong'anning Technology Co., Ltd. adopts similar techniques and proposes a new optimization method to rearrange the motion trajectories, but its compression results contain a large amount of redundant information.
How to increase the compression ratio of video compression while retaining important information is a technical problem of great interest to those skilled in the art.
Disclosure of Invention
The technical problem to be solved by the invention is as follows: static background video data collected by fixed cameras currently suffers from a low compression ratio, high storage cost, extremely slow browsing, and heavy consumption of manpower and material resources. The invention provides a compression method for static background video that extracts the effective information in the video and, while preserving the original amount of information, compresses a video several hours long into a short video playable in one minute or into dozens to hundreds of pictures. This achieves extreme compression of static background video, effectively improves its storage efficiency, and reduces storage cost; at the same time, it improves the efficiency of browsing and reading the video content, reducing labor and time costs.
The technical scheme of the invention is as follows:
The method comprises the following steps. First, construct a static background video compression system consisting of a background subtraction module, a target tracking module, and a video compression module, where the video compression module consists of a distribution planning module and a video compression synthesis module.
The background subtraction module is connected to the original video, the target tracking module, and the video compression synthesis module. It performs background subtraction on the original video frame by frame (assume the original video has N frames), obtains the bounding-box information and the background picture of the foreground objects in each video frame, and sends the foreground-object bounding-box information of the N video frames to the target tracking module. After the N video frames are processed, it sends the background pictures of the N video frames to the video compression synthesis module.
The background subtraction module consists of a background segmentation submodule, a binarization submodule, an erosion submodule, a dilation submodule, and a bounding-box acquisition submodule. The background segmentation submodule is connected to the original video, the video compression synthesis module, and the binarization submodule. It segments a video frame (say the i-th) read from the original video to obtain a foreground picture F_i and a background picture B_i, sends F_i to the binarization submodule, and saves B_i into the background set Background. After the N video frames are processed, Background = {B_1, ..., B_i, ..., B_N}, 1 ≤ i ≤ N, is sent to the video compression synthesis module.
The binarization submodule is connected to the background segmentation submodule and the erosion submodule. It binarizes each pixel of the foreground picture F_i received from the background segmentation submodule to obtain a binarized foreground picture F_ib, and sends F_ib to the erosion submodule.
The erosion submodule is connected to the binarization submodule and the dilation submodule. It applies erosion to the F_ib obtained from the binarization submodule, producing an eroded binarized foreground picture F_ibe, and sends F_ibe to the dilation submodule.
The dilation submodule is connected to the erosion submodule and the bounding-box acquisition submodule. It applies dilation to the F_ibe obtained from the erosion submodule, producing an eroded-and-dilated binarized foreground picture F_ibed, and sends F_ibed to the bounding-box acquisition submodule.
The bounding-box acquisition submodule is connected to the dilation submodule and the target tracking module (specifically, to the IOU computation submodule of the target tracking module). From the F_ibed obtained from the dilation submodule it extracts bounding boxes, obtaining the bounding-box set Bboxes_i of the N_i foreground objects in the eroded-and-dilated binarized foreground picture, and sends Bboxes_i to the IOU computation submodule of the target tracking module.
The target tracking module is connected to the background subtraction module and the distribution planning module. From the background subtraction module it receives the bounding-box set Bboxes_i of the N_i foreground objects in the i-th of the N video frames and uses a multi-object tracking algorithm to obtain the motion trajectories of the N_i foreground objects in the i-th frame. After the N video frames are processed, it obtains the motion-trajectory pipeline set T of the foreground objects and sends T to the distribution planning module. The target tracking module consists of a Kalman filtering prediction submodule, an IOU computation submodule, and a trajectory association submodule.
The Kalman filtering prediction submodule is connected to the bounding-box acquisition submodule, the IOU computation submodule, and the trajectory association submodule. It applies Kalman filtering to the motion-trajectory pipeline set T_{i-1} = {T_1, ..., T_p, ..., T_{temp_{i-1}}}, 1 ≤ p ≤ temp_{i-1}, of the (i-1)-th video frame obtained from the trajectory association submodule, predicting the likely positions of the current motion-trajectory pipelines in the i-th frame. Here T_p denotes the motion-trajectory pipeline of the p-th foreground object, where p is the pipeline index. Each pipeline records the start time t_p^start of the p-th pipeline, the end time t_p^end of the p-th pipeline, the time t_p^j of the j-th occurrence of the p-th pipeline, and the position of the j-th occurrence of the p-th pipeline; f_j denotes the video frame of the j-th occurrence in the p-th pipeline, and b_p^j = (x_p^j, y_p^j, w_p^j, h_p^j) denotes the bounding box of the j-th occurrence of the p-th pipeline, where (x_p^j, y_p^j) are the top-left vertex coordinates of the bounding box, w_p^j is the width of the bounding box, and h_p^j is its height. From T_{i-1} the submodule predicts the likely trajectory-pipeline bounding-box set Bboxes_{i,pred} of the i-th frame and sends Bboxes_{i,pred} to the IOU computation submodule.
The IOU computation submodule is connected to the bounding-box acquisition submodule, the Kalman filtering prediction submodule, and the trajectory association submodule. It performs IOU computation between the bounding boxes in Bboxes_i obtained from the bounding-box acquisition submodule and those in Bboxes_{i,pred} obtained from the Kalman filtering prediction submodule, obtains the IOU matrix Mat_IOU, and sends Mat_IOU to the trajectory association submodule.
The trajectory association submodule is connected to the IOU computation submodule, the Kalman filtering prediction submodule, and the distribution planning module. It performs assignment on the Mat_IOU obtained from the IOU computation submodule to obtain the trajectory association result, namely the motion-trajectory pipeline set T_i at the time of the i-th video frame, and sends T_i to the Kalman filtering prediction submodule. After the N video frames are processed, the motion-trajectory pipeline set T of the foreground objects in the N video frames is obtained: T = {T_1, ..., T_p, ..., T_M}, 1 ≤ p ≤ temp_N. Let M denote the number of motion-trajectory pipelines in the N video frames, M = temp_N; T is sent to the distribution planning module.
The video compression module consists of a distribution planning module and a video compression synthesis module. Specifically:
The distribution planning module is connected to the target tracking module and the video compression synthesis module. It receives the motion-trajectory pipeline set T of the foreground objects in the N video frames from the target tracking module, constructs an objective function E over T, and optimizes E to obtain the label set L of the motion-trajectory pipelines of the foreground objects in the N video frames; it then sends T and L to the video compression synthesis module.
The distribution planning module comprises an objective-function construction submodule and an optimization solving submodule. The objective-function construction submodule is connected to the trajectory association submodule and the optimization solving submodule; it constructs an energy-minimization function E for the T received from the trajectory association submodule and passes E to the optimization solving submodule. The optimization solving submodule is connected to the objective-function construction submodule and the video compression synthesis module; by solving E it obtains L and SN, where L is the label set of the motion-trajectory pipelines T, L = {f_p}, 1 ≤ p ≤ M, 1 ≤ f_p ≤ SN, SN is the number of pictures into which the original video (N frames) is to be compressed, and f_p is the label of the p-th motion-trajectory pipeline, i.e. the optimization result places the p-th motion-trajectory pipeline into the f_p-th picture of the compression result. T and L are sent to the time-span acquisition submodule.
The video compression synthesis module is connected to the original video, the background segmentation submodule, and the distribution planning module. It receives T and L from the distribution planning module and Background from the background segmentation submodule, obtains the pixel values of the foreground motion trajectories in T from the original video, and pastes them onto the background pictures in Background to obtain the video compression result.
The video compression synthesis module consists of a time-span acquisition submodule, an average background submodule, a pasting submodule, and a linear interpolation submodule.
The time-span acquisition submodule is connected to the optimization solving submodule of the distribution planning module and to the average background submodule. It receives T and L from the optimization solving submodule and computes, for any T_p ∈ T, the time span [t_p^start, t_p^end] of T_p. According to the labels L, it computes from the time spans of the N1_q motion-trajectory pipelines assigned to the q-th compression-result picture the time span T_span,q of the q-th compression-result picture, 1 ≤ q ≤ SN, 1 ≤ z ≤ N1_q, where N1_q denotes the number of motion-trajectory pipelines in the q-th compression-result picture and [t_z^start, t_z^end] is the time span of the z-th motion-trajectory pipeline appearing in the q-th compression-result picture. The set of compression-result-picture time spans formed by the SN pictures, T_span = {T_span,1, ..., T_span,q, ..., T_span,SN}, is passed to the average background submodule.
The average background submodule is connected to the original video, the time-span acquisition submodule, and the background segmentation submodule. It obtains T_span from the time-span acquisition submodule and Background from the background segmentation submodule. According to T_span,q it gathers the set of background pictures within that time period, averages their pixels at each position to obtain the background picture of the q-th compression-result picture, and passes the background picture set formed for the SN compression-result pictures, together with T and L, to the pasting submodule.
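The per-position pixel averaging performed by the average background submodule can be sketched as follows. This is an illustrative pure-Python sketch over grayscale pictures stored as nested lists, not the patented implementation (which would operate on the full-color background pictures in Background):

```python
def average_background(backgrounds):
    """Per-pixel average of the grayscale background pictures that fall
    inside one compression-result picture's time span T_span,q.
    `backgrounds` is a non-empty list of equally sized 2-D lists."""
    n = len(backgrounds)
    h, w = len(backgrounds[0]), len(backgrounds[0][0])
    return [[sum(b[y][x] for b in backgrounds) / n for x in range(w)]
            for y in range(h)]

# Two 1x2 background pictures average position-by-position.
print(average_background([[[0, 2]], [[4, 6]]]))  # [[2.0, 4.0]]
```

Averaging over the span, rather than picking a single frame, damps transient lighting changes in the synthesized background.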
The pasting submodule is connected to the average background submodule and the linear interpolation submodule. From the average background submodule it obtains the background picture set together with T and L; according to the labels L it pastes the pipelines in T onto the corresponding positions of the background pictures to obtain the coarse pasting result set S_coarse, and then passes S_coarse, T, and L to the linear interpolation submodule.
The linear interpolation submodule is connected to the pasting submodule. It obtains S_coarse, T, and L from the pasting submodule and performs linear interpolation at the edge positions of the pasting results to soften the edges, obtaining the final compressed picture set S_fine.
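The edge softening performed by the linear interpolation submodule amounts to linear (alpha) blending between pasted foreground pixels and the background. A minimal sketch; the shape of the alpha ramp across the seam is an assumption left to the caller, not specified by the patent:

```python
def blend_edge(fg, bg, alpha):
    """Linearly interpolate a foreground pixel into the background:
    alpha = 1 keeps the pasted foreground pixel, alpha = 0 keeps the
    background pixel, and intermediate values soften the seam.
    Ramping alpha from 0 to 1 over a few pixels around each pasted
    bounding box is one simple way to realise the edge softening."""
    return alpha * fg + (1 - alpha) * bg

print(blend_edge(200, 100, 0.5))  # 150.0
```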
In the second step, the background subtraction module reads the video frames of the original video (containing N video frames) frame by frame and performs background subtraction on them, obtaining the foreground-object bounding boxes and background pictures of the N video frames; the target tracking module then tracks the foreground-object bounding boxes of the N video frames to obtain the motion-trajectory pipelines of the foreground objects. The specific method is as follows:
2.1 Let variable i = 1 and let the temporary motion-trajectory pipeline set T_0 be the empty set.
2.2 The background segmentation submodule uses the pixel-based adaptive segmenter (PBAS) algorithm [Hofmann et al., Background segmentation with feedback: the pixel-based adaptive segmenter, published in 2012 in the Computer Vision and Pattern Recognition conference workshops] to perform foreground-background segmentation on the i-th video frame read from the original video, obtaining the foreground picture F_i and the background picture B_i of the i-th video frame; it sends F_i to the binarization submodule and saves B_i into Background.
2.3 The binarization submodule binarizes the F_i received from the background segmentation submodule. The binarization uses the threshold method in the opencv computer-vision library (calling the cv2.threshold function, whose input is the foreground picture and whose output is the binarized foreground picture; version 3.4.3 and above, open-source library website https://opencv.org/releases/). From the foreground picture F_i of the i-th video frame it obtains the binarized foreground picture F_ib and sends F_ib to the erosion submodule. Each pixel in F_ib is 0 or 1, where 0 means the corresponding position is background and 1 means it is foreground.
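For illustration, the effect of the cv2.threshold call in step 2.3 can be sketched in pure Python on a grayscale picture stored as nested lists; the threshold value 127 is an assumed parameter for illustration, not one specified by the patent:

```python
def binarize(foreground, thresh=127):
    """Binarize a grayscale foreground picture: pixels above `thresh`
    become 1 (foreground), all others 0 (background), mirroring
    cv2.threshold with THRESH_BINARY up to the 0/1 scale."""
    return [[1 if px > thresh else 0 for px in row] for row in foreground]

F_i = [[0, 200, 30],
       [180, 255, 10]]
print(binarize(F_i))  # [[0, 1, 0], [1, 1, 0]]
```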
2.4 The erosion submodule applies erosion to the F_ib received from the binarization submodule, obtaining the eroded binarized foreground picture F_ibe, and sends F_ibe to the dilation submodule. The erosion uses the erode method in the opencv computer-vision library (calling the cv2.erode function, whose input is the binarized foreground picture and whose output is the eroded binarized foreground picture).
2.5 The dilation submodule receives F_ibe from the erosion submodule and applies dilation to obtain the eroded-and-dilated binarized foreground picture F_ibed, which it sends to the bounding-box acquisition submodule. The dilation uses the dilate method in the opencv computer-vision library (calling the cv2.dilate function, whose input is the eroded binarized foreground picture and whose output is the eroded-and-dilated binarized foreground picture).
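Steps 2.4 and 2.5 call cv2.erode and cv2.dilate; their effect on a binarized picture can be sketched in pure Python with a 3x3 structuring element (the kernel size is an assumption for illustration). Erosion removes small noise specks, and the subsequent dilation restores the surviving foreground regions to roughly their original extent:

```python
def erode(img):
    """3x3 binary erosion: a pixel stays 1 only if every pixel in its
    3x3 neighbourhood (clamped at the border) is 1."""
    h, w = len(img), len(img[0])
    return [[int(all(img[yy][xx]
                     for yy in range(max(0, y - 1), min(h, y + 2))
                     for xx in range(max(0, x - 1), min(w, x + 2))))
             for x in range(w)] for y in range(h)]

def dilate(img):
    """3x3 binary dilation: a pixel becomes 1 if any pixel in its
    3x3 neighbourhood (clamped at the border) is 1."""
    h, w = len(img), len(img[0])
    return [[int(any(img[yy][xx]
                     for yy in range(max(0, y - 1), min(h, y + 2))
                     for xx in range(max(0, x - 1), min(w, x + 2))))
             for x in range(w)] for y in range(h)]

speck = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
print(erode(speck))   # isolated pixel removed: all zeros
print(dilate(speck))  # isolated pixel grown: all ones
```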
2.6 The bounding-box acquisition submodule receives F_ibed from the dilation submodule and detects from F_ibed the bounding-box set Bboxes_i of the N_i foreground objects in the i-th video frame, Bboxes_i = {(x_{j1}, y_{j1}, w_{j1}, h_{j1})}, 1 ≤ j1 ≤ N_i, which it sends to the IOU computation submodule. Here N_i denotes the number of foreground objects in the i-th video frame, (x_{j1}, y_{j1}) are the top-left vertex coordinates of the bounding box of the j1-th foreground object in the i-th video frame, w_{j1} is the width of that bounding box, and h_{j1} is its height. The detection uses the findContours method in the opencv computer-vision library (calling the cv2.findContours function, whose input is the eroded-and-dilated binarized foreground picture and whose output is the bounding-box set of the foreground objects in the current frame).
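A hedged pure-Python sketch of what step 2.6 obtains from cv2.findContours plus bounding-rectangle extraction: connected components of foreground pixels are found by flood fill, and each component yields an (x, y, w, h) box. This approximates, and is not, the opencv implementation:

```python
from collections import deque

def bounding_boxes(mask):
    """Return (x, y, w, h) bounding boxes of the 4-connected components
    of 1-pixels in a binary mask, in row-major discovery order."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y0 in range(h):
        for x0 in range(w):
            if mask[y0][x0] and not seen[y0][x0]:
                # Flood-fill one component, tracking its extents.
                q = deque([(y0, x0)])
                seen[y0][x0] = True
                xs, ys = [x0], [y0]
                while q:
                    y, x = q.popleft()
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        yy, xx = y + dy, x + dx
                        if 0 <= yy < h and 0 <= xx < w and mask[yy][xx] and not seen[yy][xx]:
                            seen[yy][xx] = True
                            xs.append(xx); ys.append(yy)
                            q.append((yy, xx))
                boxes.append((min(xs), min(ys),
                              max(xs) - min(xs) + 1, max(ys) - min(ys) + 1))
    return boxes

mask = [[1, 1, 0, 0],
        [1, 1, 0, 1],
        [0, 0, 0, 1]]
print(bounding_boxes(mask))  # [(0, 0, 2, 2), (3, 1, 1, 2)]
```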
2.7 The Kalman filtering prediction submodule obtains the motion-trajectory pipeline set T_{i-1} from the trajectory association submodule. If T_{i-1} is empty, let Bboxes_{i,pred} be empty and go to 2.8. If T_{i-1} is not empty, use Kalman filtering [see Kalman, A new approach to linear filtering and prediction problems, published in 1960 in the Journal of Basic Engineering, vol. 82, pages 35 to 45] to predict the i-th-frame predicted trajectory-pipeline bounding-box set Bboxes_{i,pred}. The specific method is:
2.7.1 let p be 1;
2.7.2 obtain from T^(i-1) the bounding box of the p-th motion track pipeline T_p in the (i-1)-th frame;
2.7.3 use Kalman filtering to predict the possible position of T_p in the i-th frame, and put the predicted bounding box into the i-th frame predicted track pipeline bounding box set Bboxes_i,pred;
2.7.4 determine whether p is less than temp_(i-1), the number of motion track pipelines at frame i-1; if so, let p = p + 1 and go to 2.7.2; otherwise go to 2.8.
2.8 Send Bboxes_i,pred to the compute IOU submodule.
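A minimal sketch of the prediction in steps 2.7 to 2.8, assuming a constant-velocity state [x, y, w, h, vx, vy]; the state layout and noise values are our assumptions, not the patent's:

```python
import numpy as np

# Constant-velocity Kalman predict step for one bounding box.
# Only the predict half used in step 2.7.3 is shown.
F = np.eye(6)
F[0, 4] = 1.0  # x += vx per frame
F[1, 5] = 1.0  # y += vy per frame

def predict(state, P, Q=None):
    """Return predicted state and covariance for the next frame."""
    Q = np.eye(6) * 0.01 if Q is None else Q
    return F @ state, F @ P @ F.T + Q

state = np.array([10.0, 20.0, 5.0, 8.0, 2.0, -1.0])  # box at (10, 20), moving (+2, -1)
P = np.eye(6)
state_pred, P_pred = predict(state, P)
print(state_pred[:4])  # predicted box: x=12, y=19, w=5, h=8
```

In a full tracker, the update half of the filter would correct this prediction with the associated detection from step 2.10.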
2.9 The compute IOU submodule performs IOU (intersection over union) computation between the bounding boxes in Bboxes_i obtained from the outer frame acquisition submodule and the bounding boxes in Bboxes_i,pred obtained from the Kalman filtering prediction submodule, obtaining the IOU matrix Mat_IOU, whose entry in row k, column l is the IOU of the k-th box of Bboxes_i and the l-th box of Bboxes_i,pred, with 1 ≤ k ≤ N_i and 1 ≤ l ≤ N_i,pred; Bboxes_i contains N_i bounding boxes and Bboxes_i,pred contains N_i,pred bounding boxes. Mat_IOU is sent to the track association submodule. Specifically,
2.9.1 let k = 1;
2.9.2 obtain the k-th bounding box of Bboxes_i;
2.9.3 let l = 1;
2.9.4 obtain the l-th bounding box of Bboxes_i,pred;
2.9.5 compute the IOU of the k-th circumscribed rectangle of Bboxes_i and the l-th circumscribed rectangle of Bboxes_i,pred, i.e., the area of the overlap region of the two circumscribed rectangles divided by the area of their union, and put the result into row k, column l of Mat_IOU;
2.9.6 determine whether l is less than N_i,pred; if so, let l = l + 1 and go to 2.9.4; otherwise go to 2.9.7.
2.9.7 determine whether k is less than N_i; if so, let k = k + 1 and go to 2.9.2; otherwise go to 2.10.
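The IOU matrix of steps 2.9.1 to 2.9.7 can be sketched as follows, with boxes given as (x, y, w, h) top-left/width/height tuples as in step 2.6:

```python
def iou(a, b):
    """Intersection over union of two (x, y, w, h) boxes."""
    ax1, ay1, ax2, ay2 = a[0], a[1], a[0] + a[2], a[1] + a[3]
    bx1, by1, bx2, by2 = b[0], b[1], b[0] + b[2], b[1] + b[3]
    iw = max(0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union else 0.0

def iou_matrix(boxes, boxes_pred):
    """Mat_IOU: row k, column l holds IOU(boxes[k], boxes_pred[l])."""
    return [[iou(a, b) for b in boxes_pred] for a in boxes]

bboxes_i = [(0, 0, 4, 4), (10, 10, 4, 4)]   # detections in frame i
bboxes_pred = [(2, 0, 4, 4)]                # Kalman-predicted boxes
m = iou_matrix(bboxes_i, bboxes_pred)
print(m)  # [[0.333...], [0.0]]
```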
2.10 The track association submodule adopts the Hungarian algorithm [see Munkres, algorithms for the assignment and transportation problems, Journal of the Society for Industrial and Applied Mathematics, Vol. 5, No. 1, pp. 32 to 38, 1957] to associate, based on the intersection-over-union matrix Mat_IOU, the foreground object bounding boxes in Bboxes_i with those in Bboxes_i,pred, obtaining a set Tupple_i of M_i associated tuples. Based on the bounding box information in Tupple_i, the motion track pipeline set T^i of the i-th frame is generated and sent to the Kalman filtering prediction submodule.
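A sketch of the association in step 2.10, assuming SciPy is available for the Hungarian algorithm; the IOU threshold used to reject weak matches is our addition, not stated in the patent:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Associate detections (rows) with predictions (columns) by maximizing
# total IOU. linear_sum_assignment minimizes cost, so the IOU matrix
# is negated.
mat_iou = np.array([
    [0.8, 0.1, 0.0],
    [0.0, 0.7, 0.1],
    [0.1, 0.0, 0.0],
])
rows, cols = linear_sum_assignment(-mat_iou)
# Keep only pairs whose IOU is above a (hypothetical) threshold.
pairs = [(k, l) for k, l in zip(rows, cols) if mat_iou[k, l] > 0.3]
print(pairs)  # detections 0 and 1 are matched; detection 2 starts a new track
```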
2.11 The background segmentation submodule determines whether a video frame can still be received from the camera; if so, let i = i + 1 and go to 2.2. If not, then i = N, meaning N video frames have been obtained from the camera, and the background segmentation submodule obtains the background picture set Background composed of N color background pictures, Background = {B_i, 1 ≤ i ≤ N}, and sends Background to the average background submodule of the video compression synthesis module; the track association submodule obtains the motion track pipeline set T of the foreground objects in the N video frames and sends T to the objective function construction submodule of the distribution planning module; then go to the third step.
Thirdly, the distribution planning module assigns labels to the motion track pipeline set T of the foreground objects received from the target tracking module, obtaining the label set L of the compression result of T, L = {f_p, 1 ≤ p ≤ M, 1 ≤ f_p ≤ SN}, where f_p is the label of the p-th motion trail pipeline. T and L are sent to the acquisition time span submodule.
3.1 The objective function submodule constructs an energy minimization function E. Three issues need to be considered in constructing it. The first is that the pipelines should be kept in the compression result as far as possible. Since most existing video compression methods manually preset the length of the compression result (e.g., compress the video to one tenth of its original length), some pipelines are partially or completely discarded in the final result in order to avoid overlapping foreground objects. The second is that foreground object motion trajectory pipelines appearing simultaneously in the same frame should overlap as little as possible. Since severe overlap gives a visually poor impression, motion trajectory pipelines of different objects appearing at the same location should be assigned different timestamps, thereby avoiding overlap conflicts. The third is to preserve the temporal order of the foreground object motion trajectories as far as possible. Unlike traditional video compression methods, which try to compress a longer video into a video of fixed proportional length, the invention aims to compress the original video into a certain number of pictures, where the specific number depends on the number and positions of the foreground objects tracked in the images (in brief, the more foreground object tracks are tracked, the more pictures the final compression contains, ensuring that all motion tracks appear in the final result; and the more motion trajectory pipelines of different objects appear at the same position, the more pictures the final compression contains, avoiding conflicts among the motion tracks).
Therefore, it is desirable to find as few pictures as possible that reproduce the information in the original video; the motion trajectory pipelines in the pictures should overlap as little as possible with one another, and the same motion trajectory pipeline should appear in the same picture as far as possible. In summary, the objective function submodule constructs the energy minimization function E as shown in equation (1):
Where M is the number of motion trajectory pipelines, SN is the number of pictures to be compressed into, and f_p is the label of the p-th motion track pipeline (at initialization each motion track pipeline gets its own label, i.e., f_p ∈ {1, 2, ..., M}).
The first term is the collision loss function over pairs of pipelines, 1 ≤ p, q ≤ M, which prevents the p-th and q-th motion track pipelines from overlapping too severely. Specifically,
where b_p is the bounding box of the p-th motion track pipeline, b_q is the bounding box of the q-th motion track pipeline, and b_p ∩ b_q denotes the overlap area of the p-th and q-th motion trajectories. This collision term in the energy minimization function E causes objects with overlapping motion trajectories to be assigned to different compressed pictures, so that as few conflicts as possible, which would spoil the viewing of the compressed pictures, occur between objects.
The second term is a similarity loss function that prevents different parts of the same pipeline from being assigned to different pictures when the motion trajectory of one object breaks and is tracked as two or even several segments. By measuring the similarity between different trajectory pipelines, it constrains similar objects to be assigned the same label (i.e., placed into the same compressed picture). Specifically,
the indicator in the similarity term requires 2 conditions to hold: f_p ≠ f_q, meaning that motion trajectory pipeline p and motion trajectory pipeline q are assigned to different pictures; and that the start time of motion trajectory pipeline p is after the end time of motion trajectory pipeline q. The symbol ∧ denotes that these two requirements hold simultaneously. The term also involves a function that computes the center position of the object with a given bounding box, a function that measures the Euclidean distance between the two objects' center points, a function that computes the color histogram of the object with a given bounding box, and a function that measures the appearance similarity between the two objects' color histograms.
The last term is a label loss function that constrains the number of compressed pictures finally synthesized. The hyperparameter h_l is used as the weight of this loss and is generally set to 1000 empirically. δ_l(f) determines whether label l is used by some motion trajectory pipeline p; specifically, δ_l(f) = 1 if label l is used and 0 if it is not.
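Equation (1) itself does not survive in this text (it appears as an image in the original patent). From the three loss terms described above, a plausible reconstruction, in our own notation rather than the patent's exact symbols, is:

```latex
E(f) = \sum_{1 \le p < q \le M} h_c(f_p, f_q)
     + \sum_{1 \le p < q \le M} h_s(f_p, f_q)
     + \sum_{l=1}^{SN} h_l \, \delta_l(f)
```

where the first sum is the collision loss, the second the similarity loss, and the third the label loss weighted by the hyperparameter h_l.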
3.2 The optimization solving submodule adopts the QPBO (quadratic pseudo-Boolean optimization) method [see Hammer et al., roof duality, complementation and persistency in quadratic 0-1 optimization, Mathematical Programming, Vol. 28, pp. 121 to 155, 1984] to optimize E, obtaining the optimal solution L, i.e., assigning to each motion track pipeline in T a label f_p, which means the p-th motion trajectory pipeline should be put into the f_p-th picture of the compression result. The optimization solving submodule sends L and T to the video compression synthesis module.
Fourthly, the video compression synthesis module receives T and L from the optimization solving submodule and the background picture set Background of the N video frames from the background subtraction module, and performs synthesis compression to obtain the video compression result picture set. The specific method is as follows:
4.1 The acquisition time span submodule receives T and L from the optimization solving submodule; for any T_p ∈ T, it obtains the time span of T_p, and according to T and L obtains the compression result picture time span set T_span = {T_span,1, ..., T_span,q, ..., T_span,SN}, which it passes to the average background submodule. Specifically,
4.1.1 let q = 1;
4.1.2 let i2 = 1;
4.1.3 determine whether the label of the i2-th motion track pipeline in T is q; if so, put the i2-th motion track pipeline T_i2 into the sub-motion-track-pipeline set TS_q with label q, obtain the time span of T_i2 and put it into the motion trajectory pipeline time span set T_q,span; 4.1.4 determine whether i2 is less than M; if so, let i2 = i2 + 1 and go to 4.1.3; otherwise go to 4.1.5;
4.1.5 based on T_q,span, obtain the union T_span,q of the time spans for the q-th compression result picture, where N1_q denotes the number of motion track pipelines in the q-th compression result picture; put T_span,q into T_span;
4.1.6 determine whether q is less than SN; if so, let q = q + 1 and go to 4.1.2; otherwise go to 4.2.
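Steps 4.1.1 to 4.1.6 amount to grouping pipelines by label and taking the union of their time spans; a minimal sketch under an assumed (label, start, end) pipeline layout:

```python
# Sketch of step 4.1 (our data layout, not the patent's): each pipeline
# is (label, start_frame, end_frame); spans of pipelines sharing a label
# are unioned into that compression picture's time span.
def picture_spans(pipelines, sn):
    """Return, per compression picture q in 1..sn, the set of frame
    indices covered by the pipelines assigned label q."""
    spans = {q: set() for q in range(1, sn + 1)}
    for label, start, end in pipelines:
        spans[label] |= set(range(start, end + 1))
    return spans

pipelines = [(1, 0, 4), (1, 3, 6), (2, 10, 12)]
spans = picture_spans(pipelines, sn=2)
print(sorted(spans[1]))  # [0, 1, 2, 3, 4, 5, 6]
print(sorted(spans[2]))  # [10, 11, 12]
```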
4.2 The average background submodule obtains T_span from the acquisition time span submodule and Background from the background segmentation submodule; based on T_span and Background, it obtains the background picture set of the compression result pictures, whose q-th element is the background picture of the q-th compression result picture, and passes this set together with T and L to the paste submodule. Specifically,
4.2.1 let q = 1;
4.2.2 obtain from T_span the union T_span,q of the motion track pipeline time spans appearing in the q-th compression result picture;
4.2.3 according to the time information in T_span,q, obtain the set of corresponding frames from Background;
4.2.4 average the pixels at the same position of these frames to obtain the background picture of the q-th compression result picture and put it into the background picture set;
4.2.5 determine whether q is less than SN; if so, let q = q + 1 and go to 4.2.2; otherwise go to 4.3.
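Steps 4.2.1 to 4.2.5 can be sketched as a pixel-wise mean over the background frames inside a picture's time span:

```python
import numpy as np

# Step 4.2 sketch: average the background frames that fall inside the
# q-th picture's time span, pixel-wise, to synthesize its background.
def average_background(frames, span):
    """frames: list of HxWx3 uint8 arrays indexed by frame number;
    span: iterable of frame indices belonging to this picture."""
    stack = np.stack([frames[i] for i in span]).astype(np.float64)
    return stack.mean(axis=0).astype(np.uint8)

frames = [np.full((2, 2, 3), v, dtype=np.uint8) for v in (10, 20, 30)]
bg = average_background(frames, span=[0, 1, 2])
print(bg[0, 0])  # every pixel is the mean of 10, 20, 30, i.e. 20
```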
4.3 The paste submodule obtains the background picture set and T, L from the average background submodule and pastes the motion track pipelines in T onto the corresponding positions of the background pictures according to the labels in L, the corresponding positions being determined by the bounding box information in T, obtaining the preliminary pasting result set S_corse. S_corse, T and L are passed to the linear interpolation submodule. Specifically,
4.3.1 let the variable q = 1;
4.3.2 obtain the background picture of the q-th compression result picture from the background picture set;
4.3.3 let the variable r = 1;
4.3.4 obtain the r-th motion track pipeline T_r from TS_q; let the number of tuples in T_r be N_rt, i.e., the r-th motion track pipeline T_r contains N_rt bounding boxes;
4.3.5 let the variable u = 1;
4.3.6 obtain the u-th tuple of T_r;
4.3.7 set the pixels of the background picture region corresponding to the bounding box in the u-th tuple to zero;
4.3.8 find in the original video the frame recorded in the u-th tuple and paste its bounding box area into the zeroed area of the background picture;
4.3.9 determine whether u is less than N_rt; if so, let u = u + 1 and go to 4.3.6; otherwise go to 4.3.10;
4.3.10 determine whether r is less than N1_q (the number of motion track pipelines with label q); if so, let r = r + 1 and go to 4.3.4; otherwise go to 4.3.11;
4.3.11 put the pasted background picture into S_corse and determine whether q is less than SN; if so, let q = q + 1 and go to 4.3.2; otherwise S_corse containing the SN pasted background pictures is obtained; go to 4.4.
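Steps 4.3.7 and 4.3.8, zeroing the bounding-box region of the background and pasting the original frame's pixels into it, can be sketched as:

```python
import numpy as np

# Step 4.3 sketch: clear the bounding-box region of the averaged
# background, then paste the foreground pixels from the original frame.
def paste(background, frame, box):
    """box = (x, y, w, h); returns a new picture with the region replaced."""
    x, y, w, h = box
    out = background.copy()
    out[y:y + h, x:x + w] = 0                        # step 4.3.7: zero the region
    out[y:y + h, x:x + w] = frame[y:y + h, x:x + w]  # step 4.3.8: paste the frame
    return out

bg = np.full((4, 4), 100, dtype=np.uint8)
frame = np.full((4, 4), 7, dtype=np.uint8)
result = paste(bg, frame, (1, 1, 2, 2))
print(result[1, 1], result[0, 0])  # 7 inside the box, 100 outside
```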
4.4 The linear interpolation submodule obtains S_corse, T, L from the paste submodule; based on the labels in L and the position information in T, it linearly interpolates the edge positions of the foreground pictures in each pasted background picture of S_corse to soften the edges, obtaining the final compressed picture set S_fine, and then saves S_fine to a local folder. Specifically,
4.4.1 let q = 1;
4.4.2 as in 4.3.2, obtain the background picture of the q-th compression result picture from the background picture set;
4.4.3 let r = 1;
4.4.4 obtain T_r from the sub-motion-track-pipeline set TS_q with label q;
4.4.5 let u = 1;
4.4.6 obtain the u-th tuple of T_r; let the bounding box in this tuple have M_u edge pixels;
4.4.7 let v = 1;
4.4.8 interpolate the pasting result in S_corse of the u-th tuple of motion track pipeline T_r; specifically, obtain the v-th pixel point p_v on the bounding box, obtain the pixel point p_v,-1 adjacent to p_v inside the bounding box and the pixel point p_v,+1 adjacent to p_v outside the bounding box, and let the v-th pixel point p_v = (p_v,-1 + p_v,+1)/2;
4.4.9 determine whether v is less than M_u; if so, let v = v + 1 and go to 4.4.8; otherwise go to 4.4.10;
4.4.10 determine whether u is less than N_rt; if so, let u = u + 1 and go to 4.4.6; otherwise go to 4.4.11;
4.4.11 determine whether r is less than N1_q; if so, let r = r + 1 and go to 4.4.4; otherwise go to 4.4.12;
4.4.12 store the interpolation result of the q-th picture of S_corse into S_fine and determine whether q is less than SN; if so, let q = q + 1 and go to 4.4.2; otherwise interpolation is finished, and S_fine containing the SN picture interpolation results (the SN compression result pictures) is obtained.
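The edge softening of step 4.4.8 replaces each border pixel with the average of its inner and outer neighbours; a sketch for the top edge only, the other three edges being analogous:

```python
import numpy as np

# Step 4.4 sketch: soften a pasted box's border by replacing each border
# pixel p_v with the mean of its inner (p_v,-1) and outer (p_v,+1)
# neighbours. Only the top edge is handled here for brevity.
def soften_top_edge(img, box):
    x, y, w, h = box
    out = img.astype(np.float64)
    for col in range(x, x + w):
        inner, outer = out[y + 1, col], out[y - 1, col]
        out[y, col] = (inner + outer) / 2.0
    return out.astype(np.uint8)

img = np.full((5, 5), 100, dtype=np.uint8)
img[2:4, 1:4] = 20                  # a pasted 3x2 foreground patch
soft = soften_top_edge(img, (1, 2, 3, 2))
print(soft[2, 1])  # (20 + 100) / 2 = 60
```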
The invention can achieve the following technical effects:
The compression ratio is high. A short video can be further compressed into a few pictures, compressing the original video to the greatest extent without affecting the integrity of the video information. Whereas the compression ratio of the prior art is about 20 to 50, compressing a static background video with this method achieves a compression ratio of 200 to 2000, with an average compression ratio of no less than 500;
The information loss rate is low. While achieving a high compression ratio, the invention retains the meaningful information in the original video and causes essentially no information loss; the loss rate of real motion information in actual tests is below 3%.
The method supports compression from video to images, improves the video compression ratio, compresses a static background video lasting several minutes into several pictures, and saves storage space.
The compression results on the five test videos are shown in Table 1, covering six different scenes in total; resolution is the video frame size, N is the number of frames of the video, M is the number of motion tracks in the video frames, SN is the number of pictures in the compression result, the compression ratio is computed as N/SN, and the information loss rate is the information loss caused by objects missed by the detection and tracking algorithm. It can be seen that, for video sequences recorded in different scenes, the compression ratio can reach 200 to 2000 times, depending on the length of the original video and the number of object tracks tracked in it, thereby greatly saving storage space and speeding up the browsing and retrieval of video content. The compression ratio of the highway video sequence is relatively low: because vehicles run in specific lanes on the highway, the overlap rate of the foreground objects' motion trajectory pipelines is higher, so relatively many pictures are retained in the compression result (i.e., the compression ratio in the table is small, only 200.0). In the hall scene, people generally do not walk along the same path, so the probability of conflicts among the motion trajectory pipelines of different people is low, the compression result needs only a small number of pictures, and the compression ratio is high (2003.2).
TABLE 1 compression ratio of different video sequences
The invention detects and tracks moving objects in a static background video based on a background subtraction algorithm. The recognition and tracking results of the different objects are optimized by the quadratic pseudo-Boolean optimization (QPBO) method, the motion trajectory pipelines are pasted into background pictures, and a static background video lasting several minutes is compressed into several pictures, realizing data compression of the static background video. With this method, a day's video can be viewed in a few minutes without fast-forwarding, while activity details play at real speed;
In summary, the present invention provides a background-subtraction-based adaptive compression method for static background video, which uses a background subtraction method to detect objects in the field of view and a tracking method to track them. Furthermore, minimizing an energy function adaptively finds the appropriate number of compressed pictures and the picture label that should be assigned to each track pipeline. The track pipelines are then pasted into the background pictures to compress the video; a small number of pictures can summarize a monitoring scene of hours or even a day while retaining nearly all its information. The method is simple to implement and highly general.
Drawings
FIG. 1 is a general flow diagram of the present invention.
FIG. 2 is a logic diagram of the constructed static background video compression system.
Detailed Description
As shown in fig. 1, the present invention comprises the steps of:
In the first step, a static background video compression system is constructed. As shown in fig. 2, the static background video compression system is composed of a background subtraction module, a target tracking module and a video compression module, and the video compression module is composed of a distribution planning module and a video compression composition module.
The background subtraction module is connected with the original video, the target tracking module and the video compression synthesis module. It performs background subtraction on the original video frame by frame to obtain the bounding box information and background pictures of the foreground objects in the video frames, and sends the foreground object bounding box information of the N video frames to the target tracking module. After the N video frames are processed, the background pictures of the N video frames are sent to the video compression synthesis module.
The background subtraction module consists of the background segmentation submodule, the binarization submodule, the corrosion submodule, the expansion submodule and the external frame acquisition submodule. The background segmentation submodule is connected with the original video, the video compression synthesis module and the binarization submodule; it segments the i-th frame read from the original video to obtain the foreground picture F_i and background picture B_i, sends F_i to the binarization submodule, and stores B_i in the background set Background; after the N video frames are processed, Background is sent to the video compression synthesis module, Background = {B_1, ..., B_i, ..., B_N}, 1 ≤ i ≤ N.
The binarization submodule is connected with the background segmentation submodule and the corrosion submodule; it binarizes each pixel value of the foreground picture F_i received from the background segmentation submodule to obtain the binarized foreground picture F_ib and sends F_ib to the corrosion submodule.
The corrosion submodule is connected with the binarization submodule and the expansion submodule; it erodes F_ib obtained from the binarization submodule to obtain the eroded binarized foreground picture F_ibe and sends F_ibe to the expansion submodule.
The expansion submodule is connected with the corrosion submodule and the external frame acquisition submodule; it dilates F_ibe obtained from the corrosion submodule to obtain the eroded and dilated binarized foreground picture F_ibed and sends F_ibed to the external frame acquisition submodule.
The external frame acquisition submodule is connected with the expansion submodule and the target tracking module; from F_ibed obtained from the expansion submodule it extracts the bounding box set Bboxes_i of the N_i foreground objects in the eroded and dilated binarized foreground picture and sends Bboxes_i to the compute IOU submodule of the target tracking module.
The target tracking module is connected with the background subtraction module and the distribution planning module. It receives from the background subtraction module the bounding box set Bboxes_i of the N_i foreground objects in the i-th of the N video frames, obtains the motion tracks of the N_i foreground objects in the i-th frame with a multi-target tracking algorithm, and, after the N video frames are processed, obtains the motion track pipeline set T of the foreground objects and sends T to the distribution planning module. The target tracking module consists of the Kalman filtering prediction submodule, the compute IOU submodule and the track association submodule.
The Kalman filtering prediction submodule is connected with the external frame acquisition submodule, the compute IOU submodule and the track association submodule. It performs Kalman filtering on the motion track pipeline set T^(i-1) of the (i-1)-th video frame obtained from the track association submodule to predict the possible positions of the current motion track pipelines in the i-th frame. T_p denotes the motion trajectory pipeline of the p-th foreground object, where p is the number of the pipeline; each pipeline records its start time, its end time and, for the j-th occurrence of the p-th pipeline, the time of occurrence, the position, the video frame in which it occurred and its bounding box, the bounding box being given by the coordinates of its top-left vertex, its width and its height. According to T^(i-1), the predicted track pipeline bounding box set Bboxes_i,pred of the i-th frame is obtained and sent to the compute IOU submodule.
The compute IOU submodule is connected with the external frame acquisition submodule, the Kalman filtering prediction submodule and the track association submodule; it performs IOU computation between the bounding boxes in Bboxes_i obtained from the external frame acquisition submodule and those in Bboxes_i,pred obtained from the Kalman filtering prediction submodule to obtain the IOU matrix Mat_IOU, and sends Mat_IOU to the track association submodule.
The track association submodule is connected with the compute IOU submodule, the Kalman filtering prediction submodule and the distribution planning module; it performs assignment on Mat_IOU obtained from the compute IOU submodule to obtain the track association result, i.e., the motion track pipeline set T^i of the i-th video frame, and sends T^i to the Kalman filtering prediction submodule. After the N video frames are processed, the motion track pipeline set T of the foreground objects in the N video frames is obtained, 1 ≤ p ≤ temp_N; let M denote the number of motion trajectory pipelines in the N video frames, M = temp_N; T is sent to the distribution planning module.
The video compression module consists of the distribution planning module and the video compression synthesis module. Specifically,
The distribution planning module is connected with the target tracking module and the video compression synthesis module; it receives the motion track pipeline set T of the foreground objects in the N video frames from the target tracking module, constructs the objective function E for T, optimizes E to obtain the label set L of the motion track pipelines of the foreground objects in the N video frames, and sends T and L to the video compression synthesis module.
The distribution planning module comprises the construction objective function submodule and the optimization solving submodule. The construction objective function submodule is connected with the track association submodule and the optimization solving submodule; it constructs the energy minimization function E for the T received from the track association submodule and passes E to the optimization solving submodule. The optimization solving submodule is connected with the construction objective function submodule and the video compression synthesis module; by optimizing E it obtains L and SN, where L is the label set of the motion track pipeline set T, L = {f_p, 1 ≤ p ≤ M, 1 ≤ f_p ≤ SN}, SN is the number of pictures into which the original video (N frames) is to be compressed, and f_p is the label of the p-th motion trail pipeline, i.e., the optimization result considers that the p-th motion trail pipeline should be placed into the f_p-th picture of the compression result. T and L are sent to the acquisition time span submodule.
The video compression synthesis module is connected with the original video, the background segmentation submodule and the distribution planning module; it receives T and L from the distribution planning module and Background from the background segmentation submodule, obtains the pixel values of the foreground motion tracks in T from the original video, pastes them onto the background pictures from Background, and obtains the video compression result.
The video compression synthesis module consists of an acquisition time span submodule, an average background submodule, a pasting submodule and a linear interpolation submodule.
The acquisition time span submodule is connected with the optimization solving submodule of the distribution planning module and the average background submodule. It receives T and L from the optimization solving submodule, and for any T_p ∈ T obtains the time span of T_p. According to the labels L, it computes for the q-th compression result picture the union of the time spans of its N1_q motion track pipelines, obtaining the time span T_span,q of the q-th compression result picture, 1 ≤ q ≤ SN, 1 ≤ z ≤ N1_q, where N1_q is the number of motion trajectory pipelines in the q-th compression result picture and the z-th time span is that of the z-th motion trail pipeline appearing in the q-th compression result picture. The compression result picture time span set T_span formed by the time spans of the SN compression result pictures is passed to the average background submodule, T_span = {T_span,1, ..., T_span,q, ..., T_span,SN};
The average background submodule is connected with the original video, the acquisition time span submodule and the background segmentation submodule. It obtains T_span from the acquisition time span submodule and Background from the background segmentation submodule; according to T_span,q it obtains the set of background pictures in that time period, averages the pixels at the same position to obtain the background picture of the q-th compression result picture, and passes the background picture set composed for the SN compression result pictures, together with T and L, to the paste submodule.
The paste submodule is connected with the average background submodule and the linear interpolation submodule; it obtains the background picture set and T, L from the average background submodule, pastes T onto the corresponding positions of the background pictures to obtain the preliminary pasting result set S_corse, and passes S_corse, T, L to the linear interpolation submodule.
The linear interpolation submodule is connected with the paste submodule; it obtains S_corse, T, L from the paste submodule and linearly interpolates the edge positions of the pasting results to soften the edges, obtaining the final compressed picture set S_fine.
Secondly, the background subtraction module reads in the video frames of the original video (containing N video frames) frame by frame and performs background subtraction on them to obtain the foreground object bounding boxes and background pictures of the N video frames, and the target tracking module tracks the foreground object bounding boxes of the N video frames to obtain the motion track pipelines of the foreground objects of the N video frames. The specific method is as follows:
2.1 Let the variable i = 1 and initialize the temporary motion track pipeline set to be empty;
2.2 The background segmentation submodule adopts the pixel-based adaptive segmenter (PBAS) algorithm to segment the foreground and background of the i-th video frame read from the original video, obtaining the foreground picture F_i and the background picture B_i of the i-th video frame; F_i is sent to the binarization submodule and B_i is saved into Background.
2.3 The binarization submodule binarizes F_i received from the background segmentation submodule. The binarization uses the threshold method of the OpenCV computer vision library (e.g., version 3.4.3; the website of the open source algorithm library is https://opencv.org), calling the threshold function, whose input is a foreground picture and whose output is a binarized foreground picture, to obtain the binarized foreground picture F_ib of F_i, which is sent to the corrosion submodule. Each pixel in F_ib is 0 or 1, where 0 indicates that the position corresponding to the pixel is background and 1 indicates that it is foreground.
2.4 The erosion submodule applies erosion to the F_ib received from the binarization submodule, obtaining the eroded binarized foreground picture F_ibe, which is sent to the dilation submodule. The erosion adopts the erode method in the opencv library (calling the cv2.erode function, with a binarized foreground picture as input and an eroded binarized foreground picture as output).
2.5 The dilation submodule receives F_ibe from the erosion submodule and processes it by dilation, obtaining the eroded-and-dilated binarized foreground picture F_ibed, which is sent to the bounding box acquisition submodule. The dilation adopts the dilate method in the opencv library (calling the cv2.dilate function, with an eroded binarized foreground picture as input and an eroded-and-dilated binarized foreground picture as output).
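Steps 2.3 to 2.5 form a standard mask clean-up pipeline: threshold, erode, dilate. The patent calls cv2.threshold, cv2.erode and cv2.dilate; the sketch below re-implements the three operations in plain NumPy (a hypothetical stand-in with an assumed 3x3 kernel, not the patent's code) so the effect on a noisy mask can be seen without OpenCV.

```python
import numpy as np

def binarize(foreground, thresh=127):
    # Step 2.3: threshold a grayscale foreground picture to a 0/1 mask
    # (what cv2.threshold with THRESH_BINARY yields, rescaled to 1).
    return (foreground > thresh).astype(np.uint8)

def erode(binary, k=3):
    # Step 2.4: a pixel survives only if its whole k x k neighbourhood
    # is foreground -- this removes isolated noise pixels.
    pad = k // 2
    padded = np.pad(binary, pad, mode='edge')
    out = np.zeros_like(binary)
    for y in range(binary.shape[0]):
        for x in range(binary.shape[1]):
            out[y, x] = padded[y:y + k, x:x + k].min()
    return out

def dilate(binary, k=3):
    # Step 2.5: a pixel becomes foreground if any pixel in its k x k
    # neighbourhood is foreground -- this restores the eroded outline.
    pad = k // 2
    padded = np.pad(binary, pad, mode='edge')
    out = np.zeros_like(binary)
    for y in range(binary.shape[0]):
        for x in range(binary.shape[1]):
            out[y, x] = padded[y:y + k, x:x + k].max()
    return out
```

On a 10x10 mask holding a 4x4 object plus one salt-noise pixel, erode removes the noise and shrinks the object; dilate grows the object back to its original extent while the noise stays gone, which is exactly why the two operations are applied in this order.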
2.6 The bounding box acquisition submodule receives F_ibed from the dilation submodule and detects from F_ibed the bounding box set Bboxes_i of the N_i foreground objects in the i-th video frame, and sends Bboxes_i to the compute-IOU submodule, where N_i denotes the number of foreground objects in the i-th video frame, and the bounding box of the j1-th foreground object is described by its top-left vertex coordinates, its width and its height. The detection adopts the findContours method in the opencv library (calling the cv2.findContours function, whose input is an eroded-and-dilated binarized foreground picture and whose output is the bounding box set of the foreground objects in the current frame).
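Step 2.6 extracts one (x, y, w, h) bounding box per foreground blob via cv2.findContours. As an illustration only, the NumPy sketch below obtains equivalent boxes by 8-connected component labelling with a BFS; the function name and the labelling approach are assumptions, not the patent's code, but for solid blobs the result matches cv2.findContours followed by cv2.boundingRect.

```python
import numpy as np
from collections import deque

def bounding_boxes(binary):
    # Label 8-connected foreground components of a 0/1 mask and
    # report each component as an (x, y, w, h) bounding box.
    h, w = binary.shape
    seen = np.zeros((h, w), dtype=bool)
    boxes = []
    for sy in range(h):
        for sx in range(w):
            if binary[sy, sx] and not seen[sy, sx]:
                seen[sy, sx] = True
                queue = deque([(sy, sx)])
                ys, xs = [sy], [sx]
                while queue:  # BFS over one connected component
                    y, x = queue.popleft()
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            ny, nx = y + dy, x + dx
                            if (0 <= ny < h and 0 <= nx < w
                                    and binary[ny, nx] and not seen[ny, nx]):
                                seen[ny, nx] = True
                                ys.append(ny)
                                xs.append(nx)
                                queue.append((ny, nx))
                boxes.append((min(xs), min(ys),
                              max(xs) - min(xs) + 1, max(ys) - min(ys) + 1))
    return boxes
```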
2.7 The Kalman filtering prediction submodule obtains the motion track pipeline set of the (i-1)-th frame from the track association submodule. If the set is empty, the predicted bounding box set Bboxes_i,pred is set empty and 2.8 is executed. If the set is not empty, let temp_i-1 be the number of its motion track pipelines, and the i-th frame predicted track pipeline bounding box set Bboxes_i,pred is obtained by Kalman filtering prediction. The specific method comprises the following steps:
2.7.1 Let p = 1;
2.7.2 Obtain from the motion track pipeline set the bounding box information of the p-th motion track pipeline T_p in the (i-1)-th frame;
2.7.3 Predict by Kalman filtering the possible position of T_p in the i-th frame, and put it into the i-th frame predicted track pipeline bounding box set Bboxes_i,pred;
2.7.4 Determine whether p is less than temp_i-1; if so, let p = p + 1 and go to 2.7.2; otherwise, go to 2.8.
2.8 Send Bboxes_i,pred to the compute-IOU submodule.
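Step 2.7 predicts each pipeline's next position with a Kalman filter. The patent does not fix the state vector or noise parameters, so the sketch below assumes a constant-velocity model over the box centre, state [cx, cy, vx, vy], with illustrative noise covariances; width and height would be carried over unchanged. All names and values are assumptions for illustration.

```python
import numpy as np

class BoxKalman:
    # Constant-velocity Kalman filter over a bounding-box centre.
    def __init__(self, cx, cy):
        self.x = np.array([cx, cy, 0.0, 0.0])  # state [cx, cy, vx, vy]
        self.P = np.eye(4) * 10.0              # state covariance
        self.F = np.array([[1, 0, 1, 0],       # motion model: pos += vel
                           [0, 1, 0, 1],
                           [0, 0, 1, 0],
                           [0, 0, 0, 1]], dtype=float)
        self.H = np.eye(2, 4)                  # we observe (cx, cy) only
        self.Q = np.eye(4) * 1e-2              # process noise (assumed)
        self.R = np.eye(2) * 1.0               # measurement noise (assumed)

    def predict(self):
        # Step 2.7.3: project the state one frame ahead.
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:2]

    def update(self, cx, cy):
        # Correct the state with an associated detection (step 2.10 feeds these).
        z = np.array([cx, cy])
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.x = self.x + K @ (z - self.H @ self.x)
        self.P = (np.eye(4) - K @ self.H) @ self.P
```

Feeding a target that moves 2 pixels per frame, the filter's prediction converges toward the true next centre after a few updates.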
2.9 The compute-IOU submodule performs IOU (intersection-over-union) calculation between the bounding boxes in the Bboxes_i obtained from the bounding box acquisition submodule and the bounding boxes in the Bboxes_i,pred obtained from the Kalman filtering prediction submodule, obtaining the IOU matrix Mat_IOU, where the element in row k and column l of Mat_IOU is the IOU of the k-th box of Bboxes_i and the l-th box of Bboxes_i,pred, 1 ≤ k ≤ N_i, 1 ≤ l ≤ N_i,pred; the number of bounding boxes in Bboxes_i is N_i and the number in Bboxes_i,pred is N_i,pred. Mat_IOU is sent to the track association submodule. Specifically,
2.9.1 Let k = 1;
2.9.2 Obtain the k-th bounding box of Bboxes_i;
2.9.3 Let l = 1;
2.9.4 Obtain the l-th bounding box of Bboxes_i,pred;
2.9.5 Calculate the IOU, i.e. the intersection area of the k-th bounding rectangle of Bboxes_i and the l-th bounding rectangle of Bboxes_i,pred divided by the area of their union, and put the result into row k, column l of Mat_IOU;
2.9.6 Determine whether l is less than N_i,pred; if so, let l = l + 1 and go to 2.9.4; otherwise, go to 2.9.7;
2.9.7 Determine whether k is less than N_i; if so, let k = k + 1 and go to 2.9.2; otherwise, go to 2.10.
2.10 The track association submodule adopts the Hungarian algorithm to associate the foreground object bounding boxes in Bboxes_i and Bboxes_i,pred based on the intersection-over-union matrix Mat_IOU, obtaining a set Tupple_i of M_i associated tuples; based on the bounding box information in Tupple_i, the motion track pipeline set of the i-th frame is generated and sent to the Kalman filtering prediction submodule.
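The patent uses the Hungarian algorithm for step 2.10. As a simplified stand-in (greedy matching, not the optimal assignment the Hungarian algorithm guarantees), the sketch below repeatedly pairs the detection and prediction with the largest remaining IOU; the iou_min cutoff is an assumed parameter, not from the patent.

```python
def associate(mat_iou, iou_min=0.3):
    # Greedy IOU association: take the best remaining (detection k,
    # prediction l) pair until no pair above the cutoff remains.
    pairs = []
    used_k, used_l = set(), set()
    candidates = sorted(
        ((mat_iou[k][l], k, l)
         for k in range(len(mat_iou))
         for l in range(len(mat_iou[0]))),
        reverse=True)
    for score, k, l in candidates:
        if score >= iou_min and k not in used_k and l not in used_l:
            pairs.append((k, l))
            used_k.add(k)
            used_l.add(l)
    return pairs
```

Detections left unmatched would start new motion track pipelines; predictions left unmatched mark pipelines that may have ended.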
2.11 The background segmentation submodule determines whether a video frame can still be received from the camera; if so, let i = i + 1 and go to 2.2. If not, i equals N, meaning that N video frames have been obtained from the camera; the background segmentation submodule obtains the background picture set Background consisting of N color background pictures, Background = {B_i, 1 ≤ i ≤ N}, and sends Background to the average background submodule of the video compression synthesis module; the track association submodule obtains the motion track pipeline set T of the foreground objects in the N video frames and sends T to the objective function construction submodule of the distribution planning module; then the third step is executed.
Thirdly, the distribution planning module assigns labels to the motion track pipeline set T of the foreground objects received from the target tracking module to obtain the label set L = {f_p, 1 ≤ p ≤ M, 1 ≤ f_p ≤ SN} of T in the compression result, where f_p is the label of the p-th motion track pipeline. T and L are sent to the acquire-time-span submodule.
3.1 The objective function construction submodule constructs the energy minimization function E of equation (1):
Where M is the number of motion track pipelines, SN is the number of pictures into which the video is to be compressed, and f_p is the label of the p-th motion track pipeline (at initialization, each motion track pipeline has its own label, i.e. f_p ∈ {1, 2, ..., M}).
The first term is the collision loss function, 1 ≤ p, q ≤ M, where b_p is the bounding box of the p-th motion track pipeline, b_q is the bounding box of the q-th motion track pipeline, and b_p ∩ b_q denotes the overlapping area of the p-th and the q-th motion tracks.
The second term is the similarity loss function, whose indicator requires that 2 conditions be satisfied: f_p ≠ f_q, meaning that motion track pipeline p and motion track pipeline q are assigned to different pictures; and that the start time of motion track pipeline p is after the end time of motion track pipeline q; Λ denotes that these two requirements hold simultaneously. One function computes the center position of the object given its bounding box; a second measures the Euclidean distance between the two object center points; a third computes the color histogram of the object given its bounding box; and a fourth measures the appearance similarity between the two object color histograms.
The last term is the label loss function, with the hyperparameter h_l as the weight of the loss; h_l is set to 1000. δ_l(f) is used to determine whether label l is used by some motion track pipeline p; specifically, δ_l(f) is 1 if label l is used and 0 if it is not.
3.2 The optimization solution submodule optimizes E by the QPBO (quadratic pseudo-Boolean optimization) method to obtain the optimal solution L, i.e. each motion track pipeline in T is assigned a label f_p, meaning that the p-th motion track pipeline should be placed into the f_p-th picture of the compression result. The optimization solution submodule sends L and T to the video compression synthesis module.
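To make the energy concrete, the toy sketch below evaluates a simplified E (a collision term charged only when two pipelines share a label, plus h_l per distinct label used; the similarity term is omitted for brevity) and minimizes it by exhaustive search. Both the simplified form of E and the brute-force search are illustrative assumptions; the patent's solver is QPBO, which scales far beyond the handful of pipelines tried here.

```python
from itertools import product

def total_energy(labels, overlap, h_l=1000):
    # Simplified E: pairwise overlap cost when two pipelines p, q get the
    # same label, plus the label loss h_l for each distinct label used.
    M = len(labels)
    e = sum(overlap[p][q]
            for p in range(M) for q in range(p + 1, M)
            if labels[p] == labels[q])
    e += h_l * len(set(labels))
    return e

def minimize_brute_force(overlap, sn, h_l=1000):
    # Exhaustive stand-in for QPBO; feasible only for a few pipelines.
    M = len(overlap)
    best = min(product(range(1, sn + 1), repeat=M),
               key=lambda labels: total_energy(labels, overlap, h_l))
    return list(best)
```

With three pipelines where only the first two overlap heavily, the minimizer separates those two into different result pictures and lets the third share a picture, since opening a third label would only add label loss.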
Fourthly, the video compression synthesis module receives T and L from the optimization solution submodule, receives the background picture set Background of the N video frames from the background subtraction module, and performs synthesis compression to obtain the video compression result picture set. The specific method comprises the following steps:
4.1 The acquire-time-span submodule receives T and L from the optimization solution submodule; for any T_p ∈ T, it obtains the time span of T_p; according to T and L, it obtains the time span set T_span of the compression result pictures, T_span = {T_span,1, ..., T_span,q, ..., T_span,SN}, and passes T_span to the average background submodule. Specifically,
4.1.1 Let q = 1;
4.1.2 Let i2 = 1;
4.1.3 Determine whether the label of the i2-th motion track pipeline in T is q; if so, put the i2-th motion track pipeline T_i2 into the sub-motion-track-pipeline set TS_q with label q (i2 being the serial number in T of a motion track pipeline whose label is q), and put the time span of T_i2 into the motion track pipeline time span set T_q,span;
4.1.4 Determine whether i2 is less than M; if so, let i2 = i2 + 1 and go to 4.1.3; otherwise, go to 4.1.5;
4.1.5 Based on T_q,span, obtain the union T_span,q of the time spans of the q-th compression result picture, 1 ≤ q ≤ SN, 1 ≤ z ≤ N1_q, where N1_q denotes the number of motion track pipelines in the q-th compression result picture; put T_span,q into T_span;
4.1.6 Determine whether q is less than SN; if so, let q = q + 1 and go to 4.1.2; otherwise, go to 4.2.
4.2 The average background submodule obtains T_span from the acquire-time-span submodule and Background from the background segmentation submodule; based on T_span and Background, it obtains the background picture set of the compression result pictures and passes it, together with T and L, to the pasting submodule, where the q-th element of the set is the background picture of the q-th compression result picture. Specifically,
4.2.1 Let q = 1;
4.2.2 Obtain from T_span the union T_span,q of the motion track pipeline time spans appearing in the q-th compression result picture;
4.2.3 According to the time information in T_span,q, obtain the set of corresponding frames from Background;
4.2.4 Average the pixels at the same position over this set of frames to obtain the background picture of the q-th compression result picture, and put it into the background picture set;
4.2.5 Determine whether q is less than SN; if so, let q = q + 1 and go to 4.2.2; otherwise, go to 4.3.
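Step 4.2.4 is a pixel-wise mean over the background frames whose time stamps fall in T_span,q. A minimal NumPy sketch (the function name is illustrative):

```python
import numpy as np

def average_background(frames):
    # Pixel-wise mean of the background frames selected by T_span,q;
    # the result is the q-th compression-result background picture.
    return np.mean(np.stack(frames, axis=0), axis=0).astype(np.uint8)
```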
4.3 The pasting submodule obtains the background picture set, T and L from the average background submodule, and pastes the motion track pipelines in T, according to the labels in L, to the corresponding positions of the corresponding background pictures, the positions being determined by the bounding box information in T, obtaining the preliminary pasting result set S_corse; S_corse, T and L are passed to the linear interpolation submodule. Specifically,
4.3.1 Let the variable q = 1;
4.3.2 Obtain the background picture of the q-th compression result picture;
4.3.3 Let the variable r = 1;
4.3.4 Obtain the r-th motion track pipeline T_r from TS_q; let the number of two-tuples in T_r be N_rt, i.e. the r-th motion track pipeline T_r contains N_rt bounding boxes;
4.3.5 Let the variable u = 1;
4.3.6 Obtain the u-th two-tuple in T_r;
4.3.7 Set the pixels of the region of the background picture corresponding to the bounding box of the u-th two-tuple to zero;
4.3.8 Find, in the video frame of the original video indicated by the u-th two-tuple, the region of its bounding box, and paste it into the zeroed region of the background picture;
4.3.9 Determine whether u is less than N_rt; if so, let u = u + 1 and go to 4.3.6; otherwise, go to 4.3.10;
4.3.10 Determine whether r is less than N_q; if so, let r = r + 1 and go to 4.3.4; otherwise, go to 4.3.11;
4.3.11 Put the pasted background picture into S_corse; determine whether q is less than SN; if so, let q = q + 1 and go to 4.3.2; otherwise, S_corse containing SN fully pasted background pictures is obtained; go to 4.4.
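Steps 4.3.7 and 4.3.8 amount to zeroing a box region of the background and copying the same region out of the original video frame. A minimal sketch (the function name is illustrative):

```python
import numpy as np

def paste(background, source_frame, box):
    # box is (x, y, w, h): zero the region (step 4.3.7), then copy the
    # same region from the original video frame (step 4.3.8).
    x, y, w, h = box
    out = background.copy()
    out[y:y + h, x:x + w] = 0
    out[y:y + h, x:x + w] = source_frame[y:y + h, x:x + w]
    return out
```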
4.4 The linear interpolation submodule obtains S_corse, T and L from the pasting submodule; based on the labels in L and the position information in T, it performs linear interpolation on the edge positions of the pasted foreground pictures in each pasted background picture of S_corse, to soften the edges, obtaining the final compressed picture set S_fine, and then saves S_fine to the local folder. Specifically,
4.4.1 Let q = 1;
4.4.2 As in 4.3.2, obtain the background picture of the q-th compression result picture;
4.4.3 Let r = 1;
4.4.4 Obtain T_r from the sub-motion-track-pipeline set TS_q with label q;
4.4.5 Let u = 1;
4.4.6 Obtain the u-th two-tuple in T_r; let its bounding box have M_u edge pixels;
4.4.7 Let v = 1;
4.4.8 Interpolate the pasting result of the u-th two-tuple of motion track pipeline T_r in S_corse; specifically, take the v-th pixel point p_v on the bounding box, obtain the pixel point p_v,-1 adjacent to p_v inside the bounding box and the pixel point p_v,+1 adjacent to p_v outside the bounding box, and let the v-th pixel point p_v = (p_v,-1 + p_v,+1)/2;
4.4.9 Determine whether v is less than M_u; if so, let v = v + 1 and go to 4.4.8; otherwise, go to 4.4.10;
4.4.10 Determine whether u is less than N_rt; if so, let u = u + 1 and go to 4.4.6; otherwise, go to 4.4.11;
4.4.11 Determine whether r is less than N_q; if so, let r = r + 1 and go to 4.4.4; otherwise, go to 4.4.12;
4.4.12 Save the interpolation result of the q-th picture of S_corse into S_fine; determine whether q is smaller than SN; if so, let q = q + 1 and go to 4.4.2; otherwise, interpolation is finished and S_fine containing the SN picture interpolation results (the SN compressed result pictures) is obtained.
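The interpolation rule of step 4.4.8, p_v = (p_v,-1 + p_v,+1)/2, replaces each border pixel with the mean of its neighbour just inside and just outside the box. The sketch below applies it to the top edge of a box only (the other three edges are analogous); it assumes a grayscale picture and a box not touching the image border.

```python
import numpy as np

def soften_top_edge(picture, box):
    # For every pixel p_v on the top edge of box = (x, y, w, h), set
    # p_v = (p_{v,-1} + p_{v,+1}) / 2, where p_{v,-1} is the neighbour
    # inside the box and p_{v,+1} the neighbour outside it.
    x, y, w, h = box
    out = picture.astype(float).copy()
    for px in range(x, x + w):
        inside = float(picture[y + 1, px])   # neighbour inside the box
        outside = float(picture[y - 1, px])  # neighbour outside the box
        out[y, px] = (inside + outside) / 2
    return out
```

On a hard 0/100 boundary this produces a 50-valued transition row, which is the edge-softening effect the patent describes.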

Claims (12)

1. A static background video self-adaptive compression method based on background subtraction, characterized by comprising the following steps:
The first step is to construct a static background video compression system, which consists of a background subtraction module, a target tracking module and a video compression module, the video compression module consisting of a distribution planning module and a video compression synthesis module;
The background subtraction module is connected with the original video, the target tracking module and the video compression synthesis module; it performs background subtraction, frame by frame, on an original video containing N video frames to obtain the bounding box information and background picture of the foreground objects in each video frame, and sends the bounding box information of the foreground objects in the N video frames to the target tracking module; after the N video frames are processed, the background pictures of the N video frames are sent to the video compression synthesis module;
The background subtraction module consists of a background segmentation submodule, a binarization submodule, an erosion submodule, a dilation submodule and a bounding box acquisition submodule; the background segmentation submodule is connected with the original video, the video compression synthesis module and the binarization submodule, segments the i-th video frame read from the original video to obtain the foreground picture F_i and background picture B_i, sends F_i to the binarization submodule, and saves B_i into the background set Background; after the N video frames are processed, Background is sent to the video compression synthesis module, Background = {B_1, ..., B_i, ..., B_N}, 1 ≤ i ≤ N;
The binarization submodule is connected with the background segmentation submodule and the erosion submodule, performs binarization on each pixel value of the F_i received from the background segmentation submodule to obtain the binarized foreground picture F_ib, and sends F_ib to the erosion submodule;
The erosion submodule is connected with the binarization submodule and the dilation submodule, applies erosion to the F_ib obtained from the binarization submodule to obtain the eroded binarized foreground picture F_ibe, and sends F_ibe to the dilation submodule;
The dilation submodule is connected with the erosion submodule and the bounding box acquisition submodule, applies dilation to the F_ibe obtained from the erosion submodule to obtain the eroded-and-dilated binarized foreground picture F_ibed, and sends F_ibed to the bounding box acquisition submodule;
The bounding box acquisition submodule is connected with the dilation submodule and the target tracking module, obtains from the F_ibed obtained from the dilation submodule the bounding box set Bboxes_i of the N_i foreground objects in the eroded-and-dilated binarized foreground picture, and sends Bboxes_i to the compute-IOU submodule of the target tracking module;
The target tracking module is connected with the background subtraction module and the distribution planning module, receives from the background subtraction module the bounding box set Bboxes_i of the N_i foreground objects in the i-th of the N video frames, obtains the motion track pipelines of the N_i foreground objects in the i-th frame with a multi-target tracking algorithm, obtains the motion track pipeline set T of the foreground objects after the N video frames are processed, and sends T to the distribution planning module;
The target tracking module consists of a Kalman filtering prediction submodule, a calculation IOU submodule and a track association submodule;
The Kalman filtering prediction submodule is connected with the bounding box acquisition submodule, the compute-IOU submodule and the track association submodule; it performs Kalman filtering on the motion track pipeline set of the (i-1)-th video frame obtained from the track association submodule to predict the possible positions of the current motion track pipelines in the i-th frame; p denotes the number of a motion track pipeline, and the motion track pipeline of the p-th foreground object records the start time of the p-th pipeline, the end time of the p-th pipeline, the time of the j-th occurrence of the p-th pipeline, the video frame f_j of the j-th occurrence, and the bounding box of the j-th occurrence, the bounding box being described by its top-left vertex coordinates, its width and its height; accordingly, the predicted track pipeline bounding box set Bboxes_i,pred of the i-th frame is predicted and sent to the compute-IOU submodule;
The compute-IOU submodule is connected with the bounding box acquisition submodule, the Kalman filtering prediction submodule and the track association submodule; it performs IOU calculation between the bounding boxes in the Bboxes_i obtained from the bounding box acquisition submodule and the bounding boxes in the Bboxes_i,pred obtained from the Kalman filtering prediction submodule, obtaining the IOU matrix Mat_IOU, and sends Mat_IOU to the track association submodule;
The track association submodule is connected with the compute-IOU submodule, the Kalman filtering prediction submodule and the distribution planning module; from the Mat_IOU obtained from the compute-IOU submodule it performs association to obtain the motion track pipeline set of the i-th video frame, and sends it to the Kalman filtering prediction submodule; after the N video frames are processed, the motion track pipeline set T of the foreground objects in the N video frames is obtained; let M denote the number of motion track pipelines in the N video frames; T is sent to the distribution planning module;
The video compression module consists of a distribution planning module and a video compression synthesis module;
The distribution planning module is connected with the target tracking module and the video compression synthesis module, receives T from the target tracking module, constructs a target function E for T, performs optimization solution on E to obtain a label set L of a foreground object motion track pipeline in N video frames, and sends T and L to the video compression synthesis module;
The distribution planning module consists of an objective function construction submodule and an optimization solution submodule; the objective function construction submodule is connected with the track association submodule and the optimization solution submodule, constructs an energy minimization function E for the T received from the track association submodule, and transmits E to the optimization solution submodule; the optimization solution submodule is connected with the objective function construction submodule and the video compression synthesis module, and obtains L and SN by optimally solving E, L being the label set of the motion track pipeline set T, L = {f_p, 1 ≤ p ≤ M, 1 ≤ f_p ≤ SN}, SN being the number of pictures into which the original video is to be compressed, and f_p being the label of the p-th motion track pipeline, i.e. the optimization solution considers that the p-th motion track pipeline should be placed into the f_p-th picture of the compression result; T and L are sent to the acquire-time-span submodule;
The video compression synthesis module is connected with the original video, the background segmentation submodule and the distribution planning module; it receives T and L from the distribution planning module, receives Background from the background segmentation submodule, obtains the pixel values of the foreground motion tracks in T from the original video, and pastes them onto the background pictures of Background to obtain the video compression result;
The video compression synthesis module consists of an acquisition time span submodule, an average background submodule, a pasting submodule and a linear interpolation submodule;
The acquire-time-span submodule is connected with the optimization solution submodule of the distribution planning module and the average background submodule; it receives T and L from the optimization solution submodule, and for any T_p ∈ T obtains the time span of T_p; according to the label set L, it calculates, from the time spans of the N1_q motion track pipelines appearing in the q-th compression result picture, the time span T_span,q of the q-th compression result picture, where N1_q denotes the number of motion track pipelines in the q-th compression result picture; the compression result picture time span set T_span formed by the time spans of the SN compression result pictures is passed to the average background submodule, T_span = {T_span,1, ..., T_span,q, ..., T_span,SN};
The average background submodule is connected with the original video, the acquire-time-span submodule and the background segmentation submodule; it obtains T_span from the acquire-time-span submodule and Background from the background segmentation submodule; according to the T_span,q in T_span, it obtains the set of background pictures within that time period, averages the pixels at the same position to obtain the background picture of the q-th compression result picture, and passes the background picture set composed for the SN compression result pictures, together with T and L, to the pasting submodule;
The pasting submodule is connected with the average background submodule and the linear interpolation submodule; it obtains the background picture set and T and L from the average background submodule, pastes T to the corresponding positions of the background pictures according to L, obtains the preliminary pasting result set S_corse, and then sends S_corse, T and L to the linear interpolation submodule;
The linear interpolation submodule is connected with the pasting submodule, obtains S_corse, T and L from the pasting submodule, and performs linear interpolation on the edge positions of the pasting results to soften the edges, obtaining the final compressed picture set S_fine;
Secondly, the background subtraction module reads in video frames of an original video frame by frame, performs background subtraction on the video frames to obtain foreground target external frames and background pictures of N video frames, and the target tracking module tracks the foreground target external frames of the N video frames to obtain a motion track pipeline of foreground targets of the N video frames, wherein the specific method comprises the following steps:
2.1 Let the variable i equal 1, and let the temporary motion track pipeline set be empty;
2.2 The background segmentation submodule adopts the pixel-based adaptive segmenter (PBAS) algorithm to segment the foreground and background of the i-th video frame read from the original video, obtaining the foreground picture F_i and background picture B_i of the i-th video frame; F_i is sent to the binarization submodule and B_i is saved into Background;
2.3 The binarization submodule performs binarization on the F_i received from the background segmentation submodule to obtain the binarized foreground picture F_ib of F_i, and sends F_ib to the erosion submodule; each pixel in F_ib is 0 or 1, where 0 indicates that the corresponding position is background and 1 indicates that it is foreground;
2.4 The erosion submodule applies erosion to the F_ib received from the binarization submodule to obtain the eroded binarized foreground picture F_ibe, and sends F_ibe to the dilation submodule;
2.5 The dilation submodule receives F_ibe from the erosion submodule and processes it by dilation to obtain the eroded-and-dilated binarized foreground picture F_ibed, and sends F_ibed to the bounding box acquisition submodule;
2.6 The bounding box acquisition submodule receives F_ibed from the dilation submodule and detects from F_ibed the bounding box set Bboxes_i of the N_i foreground objects in the i-th video frame, and sends Bboxes_i to the compute-IOU submodule, where N_i denotes the number of foreground objects in the i-th video frame, and the bounding box of the j1-th foreground object is described by its top-left vertex coordinates, its width and its height;
2.7 The Kalman filtering prediction submodule obtains the motion track pipeline set of the (i-1)-th frame from the track association submodule; if the set is empty, the predicted bounding box set Bboxes_i,pred is set empty and 2.8 is executed; if the set is not empty, the i-th frame predicted track pipeline bounding box set Bboxes_i,pred is obtained by Kalman filtering prediction and 2.8 is executed;
2.8 Send Bboxes_i,pred to the compute-IOU submodule;
2.9 The compute-IOU submodule performs IOU (intersection-over-union) calculation between the bounding boxes in the Bboxes_i obtained from the bounding box acquisition submodule and the bounding boxes in the Bboxes_i,pred obtained from the Kalman filtering prediction submodule, obtaining the IOU matrix Mat_IOU, where the element in row k and column l of Mat_IOU is the IOU of the k-th box of Bboxes_i and the l-th box of Bboxes_i,pred, 1 ≤ k ≤ N_i, 1 ≤ l ≤ N_i,pred, the number of bounding boxes in Bboxes_i being N_i and the number in Bboxes_i,pred being N_i,pred; Mat_IOU is sent to the track association submodule;
2.10 The track association submodule adopts the Hungarian algorithm to associate the foreground object bounding boxes in Bboxes_i and Bboxes_i,pred based on the intersection-over-union matrix Mat_IOU, obtaining a set Tupple_i of M_i associated tuples; based on the bounding box information in Tupple_i, the motion track pipeline set of the i-th frame is generated and sent to the Kalman filtering prediction submodule;
2.11 The background segmentation submodule determines whether a video frame can still be received from the camera; if so, let i = i + 1 and go to 2.2; if not, the background segmentation submodule obtains the background picture set Background consisting of N color background pictures, Background = {B_i, 1 ≤ i ≤ N}, and sends Background to the average background submodule of the video compression synthesis module; the track association submodule obtains the motion track pipeline set T of the foreground objects in the N video frames and sends T to the objective function construction submodule of the distribution planning module; then the third step is executed;
Thirdly, the distribution planning module assigns labels to the motion track pipeline set T of the foreground objects received from the target tracking module to obtain the label set L = {f_p, 1 ≤ p ≤ M, 1 ≤ f_p ≤ SN} of T in the compression result, f_p being the label of the p-th motion track pipeline; T and L are sent to the acquire-time-span submodule;
3.1 The objective function construction submodule constructs the energy minimization function E shown in equation (1):
Where M is the number of motion track pipelines, SN is the number of pictures into which the video is to be compressed, and f_p is the label of the p-th motion track pipeline, f_p ∈ {1, 2, ..., M};
The first term is the collision loss function, 1 ≤ p, q ≤ M, where b_p is the bounding box of the p-th motion track pipeline, b_q is the bounding box of the q-th motion track pipeline, and b_p ∩ b_q denotes the overlapping area of the p-th and the q-th motion tracks;
The second term is the similarity loss function, whose indicator requires that 2 conditions be satisfied: f_p ≠ f_q, meaning that motion track pipeline p and motion track pipeline q are assigned to different pictures; and that the start time of motion track pipeline p is after the end time of motion track pipeline q, Λ denoting that both requirements hold simultaneously; one function computes the center position of the object given its bounding box, a second measures the Euclidean distance between the two object center points, a third computes the color histogram of the object given its bounding box, and a fourth measures the appearance similarity between the two object color histograms;
The last term is the label loss function, with the hyperparameter h_l as the weight of the loss; δ_l(f) is used to determine whether label l is used by some motion track pipeline p, being 1 if label l is used and 0 if it is not;
3.2 The optimization solution submodule optimizes E by QPBO to obtain the optimal solution L, i.e. each motion track pipeline in T is assigned a label f_p, meaning that the p-th motion track pipeline should be placed into the f_p-th picture of the compression result; the optimization solution submodule sends L and T to the video compression synthesis module;
Fourthly, the video compression synthesis module receives T and L from the optimization solution submodule, receives the background picture set Background of the N video frames from the background subtraction module, and performs synthesis compression to obtain the video compression result picture set; the specific method comprises the following steps:
4.1 The time-span acquisition submodule receives T and L from the optimization solution submodule; for any T_p ∈ T it obtains the time span of T_p, and from T and L it obtains the compression-result picture time span set T_span = {T_span,1, ..., T_span,q, ..., T_span,SN}, where T_span,q is the time span of the q-th compression result picture and N1_q denotes the number of motion-trajectory tubes in the q-th compression result picture; T_span is then passed to the average background submodule;
4.2 The average background submodule obtains T_span from the time-span acquisition submodule and B_background from the background segmentation submodule; based on T_span and B_background it obtains the background picture set of the compression result pictures, whose q-th element is the background picture of the q-th compression result picture, and sends this set together with T and L to the paste submodule;
4.3 The paste submodule obtains the background picture set, T and L from the average background submodule, and pastes each motion-trajectory tube in T, according to its label in L, onto the corresponding position of the corresponding background picture, the position being determined by the bounding box information in T, obtaining the preliminary paste result set S_coarse; it then sends S_coarse, T and L to the linear interpolation submodule;
4.4 The linear interpolation submodule obtains S_coarse, T and L from the paste submodule and, based on the labels in L and the position information in T, performs linear interpolation at the edge positions of the foreground pictures in each pasted background picture of S_coarse, obtaining the final compressed picture set S_fine containing SN interpolated pictures.
2. The adaptive compression method for static background video based on background subtraction as claimed in claim 1, wherein the binarization processing method in step 2.3 is a threshold method in opencv computer vision processing method library.
3. The background subtraction-based static background video adaptive compression method according to claim 1, wherein the erosion method in step 2.4 adopts the erode method in the opencv computer vision processing method library.
4. the adaptive compression method for static background video based on background subtraction as claimed in claim 1, wherein the expansion method in step 2.5 adopts a dilate method in opencv computer vision processing method library.
5. The adaptive compression method for static background video based on background subtraction as claimed in claim 1, wherein the detection method adopted by the bounding box acquisition submodule in step 2.6 when detecting the bounding box set of the foreground targets is the findContours method in the opencv computer vision processing method library.
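Claims 2 to 5 name standard OpenCV operations (threshold, erode, dilate, findContours). As a dependency-light illustration of the first three, here is a minimal NumPy sketch of binarization and 3x3 morphology; the function names and the fixed 3x3 structuring element are assumptions for illustration, not the opencv API itself:

```python
import numpy as np

def threshold(img, t):
    # Binarize: pixel > t -> 1, else 0 (cf. cv2.threshold).
    return (img > t).astype(np.uint8)

def dilate(mask):
    # 3x3 dilation: a pixel is set if any 8-neighbour (or itself) is set
    # (cf. cv2.dilate with a 3x3 kernel).
    padded = np.pad(mask, 1)
    out = np.zeros_like(mask)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            out |= padded[1 + dy:1 + dy + mask.shape[0],
                          1 + dx:1 + dx + mask.shape[1]]
    return out

def erode(mask):
    # 3x3 erosion by duality: erode(A) = NOT dilate(NOT A) (cf. cv2.erode).
    return 1 - dilate(1 - mask)
```

In the claimed pipeline, erosion removes small noisy foreground specks and the subsequent dilation restores the remaining foreground blobs before contour extraction.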
6. The adaptive compression method for static background video based on background subtraction as claimed in claim 1, wherein the Kalman filtering prediction submodule in step 2.7 obtains the i-th frame predicted trajectory tube bounding box set Bboxes_i,pred by Kalman filtering prediction; the specific method comprises the following steps:
2.7.1 let p = 1;
2.7.2 obtain from T the bounding box information of the p-th motion-trajectory tube T_p in the (i-1)-th frame;
2.7.3 predict by Kalman filtering its possible position in the i-th frame, and put the predicted bounding box into the i-th frame predicted trajectory tube bounding box set Bboxes_i,pred;
2.7.4 judge whether p is less than temp_(i-1); if so, let p = p + 1 and go to 2.7.2; otherwise, end.
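The prediction in steps 2.7.2 and 2.7.3 can be sketched with a constant-velocity Kalman model for a box centre. The state layout, transition matrix and process noise below are illustrative assumptions, not the patent's exact filter:

```python
import numpy as np

# State x = [cx, cy, vx, vy]: box centre plus per-frame velocity.
F = np.array([[1., 0., 1., 0.],   # cx' = cx + vx (one frame step)
              [0., 1., 0., 1.],   # cy' = cy + vy
              [0., 0., 1., 0.],
              [0., 0., 0., 1.]])
Q = np.eye(4) * 1e-2              # assumed process noise covariance

def kalman_predict(x, P):
    """Kalman predict step: propagate state and covariance to frame i."""
    x_pred = F @ x
    P_pred = F @ P @ F.T + Q
    return x_pred, P_pred
```

In step 2.7.3 the predicted centre, together with the tube's last known box size, would yield the predicted bounding box that is placed into Bboxes_i,pred.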
7. The adaptive compression method for static background video based on background subtraction as claimed in claim 1, wherein the specific method for the IOU calculation submodule in step 2.9 to calculate the intersection-over-union (IoU) is:
2.9.1 let k = 1;
2.9.2 obtain the k-th bounding box of Bboxes_i;
2.9.3 let l = 1;
2.9.4 obtain the l-th bounding box of Bboxes_i,pred;
2.9.5 calculate the ratio of the overlap area of the k-th circumscribed rectangle of Bboxes_i and the l-th circumscribed rectangle of Bboxes_i,pred to the sum of the areas of the two circumscribed rectangles, and put the result in row k, column l of Mat_IoU;
2.9.6 judge whether l is less than N_i,pred; if so, let l = l + 1 and go to 2.9.4; otherwise, go to 2.9.7;
2.9.7 judge whether k is less than N_i; if so, let k = k + 1 and go to 2.9.2; otherwise, end.
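Claim 7's double loop fills a detection-versus-prediction similarity matrix. The extracted text is ambiguous about the denominator (sum of the two areas versus their union); the sketch below uses the conventional IoU with the union, and the function names are illustrative assumptions:

```python
def iou(a, b):
    # Boxes as (x1, y1, x2, y2); conventional intersection-over-union.
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    inter = max(0, iw) * max(0, ih)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def iou_matrix(detected, predicted):
    # Mat_IoU: row k, column l = IoU of detected box k and predicted box l.
    return [[iou(d, p) for p in predicted] for d in detected]
```

A high entry in row k, column l then supports matching detection k of frame i to the tube whose Kalman prediction produced box l.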
8. The adaptive compression method for static background video based on background subtraction as claimed in claim 1, wherein h_l in formula (1) in the third step is set to 1000.
9. The adaptive compression method for static background video based on background subtraction as claimed in claim 1, wherein the specific method for the time-span acquisition submodule in step 4.1 to obtain the compression-result picture time span set T_span from T and L is:
4.1.1 let q = 1;
4.1.2 let i2 = 1;
4.1.3 judge whether the label of the i2-th motion-trajectory tube in T is q; if so, put the i2-th motion-trajectory tube T_i2 into the sub-tube set TS_q with label q, i2 being the serial number in T of a motion-trajectory tube whose label in L is q, and put the time span of T_i2 into the motion-trajectory tube time span set T_q,span;
4.1.4 judge whether i2 is less than M; if so, let i2 = i2 + 1 and go to 4.1.3; otherwise, go to 4.1.5;
4.1.5 from T_q,span obtain the union T_span,q of the time spans for the q-th compression result picture, where N1_q denotes the number of motion-trajectory tubes in the q-th compression result picture, and put T_span,q into T_span;
4.1.6 judge whether q is less than SN; if so, let q = q + 1 and go to 4.1.2; otherwise, end.
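The grouping-and-union logic of claim 9 can be sketched as follows; representing a tube's time span as a (start_frame, end_frame) pair and a picture's span as a sorted set of frame indices is an illustrative assumption:

```python
def picture_time_spans(tubes, labels, sn):
    # tubes: list of (start_frame, end_frame) spans; labels[i] in 1..sn is the
    # result-picture index assigned to tube i. Returns, per result picture,
    # the union of the frame ranges of its tubes (T_span,q) as sorted indices.
    spans = [set() for _ in range(sn)]
    for (start, end), label in zip(tubes, labels):
        spans[label - 1].update(range(start, end + 1))
    return [sorted(s) for s in spans]
```

The returned per-picture frame sets are exactly what the average background submodule consumes in step 4.2.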
10. The adaptive compression method for static background video based on background subtraction as claimed in claim 1, wherein the specific method for the average background submodule in step 4.2 to obtain the background picture set of the compression result pictures from T_span and B_background is:
4.2.1 let the variable q = 1;
4.2.2 obtain from T_span the union T_span,q of the motion-trajectory tube time spans appearing in the q-th compression result picture;
4.2.3 according to the time information in T_span,q, obtain the corresponding set of frames from B_background;
4.2.4 average the pixels at the same positions over these frames to obtain the background picture of the q-th compression result picture, and put it into the background picture set;
4.2.5 judge whether q is less than SN; if so, let q = q + 1 and go to 4.2.2; otherwise, end.
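Step 4.2.4's pixel-wise averaging over the frames in a picture's time span is a one-liner with NumPy arrays; the function name is an illustrative assumption:

```python
import numpy as np

def average_background(background_frames, frame_indices):
    # Pixel-wise mean, at identical positions, over the background frames
    # whose indices fall in the picture's time span (step 4.2.4).
    stack = np.stack([background_frames[i] for i in frame_indices])
    return stack.mean(axis=0)
```

Averaging over the span rather than picking a single frame smooths out residual foreground traces and illumination flicker in the synthesized background.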
11. The adaptive compression method for static background video based on background subtraction as claimed in claim 1, wherein the specific method for the paste submodule in step 4.3 to paste the motion-trajectory tubes in T onto the corresponding positions of the background pictures according to the labels in L and obtain the preliminary paste result set S_coarse is:
4.3.1 let the variable q = 1;
4.3.2 obtain the background picture of the q-th compression result picture;
4.3.3 let the variable r = 1;
4.3.4 obtain the r-th motion-trajectory tube T_r from the sub-tube set TS_q with label q; let the number of 2-tuples in T_r be N_rt, i.e. the r-th motion-trajectory tube T_r contains N_rt bounding boxes; i2 is the serial number in T of a motion-trajectory tube whose label in L is q, and the number of motion-trajectory tubes in TS_q is N_q;
4.3.5 let the variable u = 1;
4.3.6 obtain the u-th 2-tuple in T_r;
4.3.7 set to zero the pixels of the region of the background picture corresponding to the bounding box in the u-th 2-tuple;
4.3.8 paste into that region the corresponding region of the frame of the original video indicated by the u-th 2-tuple;
4.3.9 judge whether u is less than N_rt; if so, let u = u + 1 and go to 4.3.6; otherwise, go to 4.3.10;
4.3.10 judge whether r is less than N_q; if so, let r = r + 1 and go to 4.3.4; otherwise, go to 4.3.11;
4.3.11 put the fully pasted background picture into S_coarse; judge whether q is less than SN; if so, let q = q + 1 and go to 4.3.2; otherwise, S_coarse containing SN fully pasted background pictures is obtained.
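Steps 4.3.6 to 4.3.8 can be sketched as a region copy with NumPy slicing; the tube representation as (frame_index, box) 2-tuples and the function name are illustrative assumptions:

```python
import numpy as np

def paste_tube(background, video_frames, tube):
    # tube: list of (frame_index, (x1, y1, x2, y2)) 2-tuples. For each box,
    # zero the background region (step 4.3.7), then copy the same region from
    # the original video frame (step 4.3.8).
    out = background.copy()
    for frame_idx, (x1, y1, x2, y2) in tube:
        out[y1:y2, x1:x2] = 0
        out[y1:y2, x1:x2] = video_frames[frame_idx][y1:y2, x1:x2]
    return out
```

The explicit zeroing before the copy is redundant for a full-rectangle paste but mirrors the claimed two-step procedure, which would matter if only a foreground mask inside the box were copied.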
12. The adaptive compression method for static background video based on background subtraction as claimed in claim 1, wherein the linear interpolation submodule in step 4.4, based on the labels in L and the position information in T, performs linear interpolation at the edge positions of the foreground pictures in each pasted background picture of S_coarse from the paste submodule to obtain the final compressed picture set S_fine; the specific method comprises the following steps:
4.4.1 let q = 1;
4.4.2 obtain the background picture of the q-th compression result picture;
4.4.3 let r = 1;
4.4.4 obtain T_r from the sub-tube set TS_q with label q;
4.4.5 let u = 1;
4.4.6 obtain the u-th 2-tuple in T_r; let the number of edge pixels of its bounding box be M_u;
4.4.7 let v = 1;
4.4.8 interpolate the result of pasting the u-th 2-tuple of motion-trajectory tube T_r in S_coarse, as follows: obtain the v-th pixel point p_v on the bounding box; obtain the single pixel point p_(v,-1) inside the bounding box adjacent to p_v and the single pixel point p_(v,+1) outside the bounding box adjacent to p_v; let p_v = (p_(v,-1) + p_(v,+1))/2;
4.4.9 judge whether v is less than M_u; if so, let v = v + 1 and go to 4.4.8; otherwise, go to 4.4.10;
4.4.10 judge whether u is less than N_rt; if so, let u = u + 1 and go to 4.4.6; otherwise, go to 4.4.11;
4.4.11 judge whether r is less than N_q; if so, let r = r + 1 and go to 4.4.4; otherwise, go to 4.4.12;
4.4.12 store the interpolation result of the q-th picture of S_coarse into S_fine; judge whether q is less than SN; if so, let q = q + 1 and go to 4.4.2; otherwise, the final set S_fine containing SN interpolated compression result pictures is obtained.
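Step 4.4.8's edge smoothing replaces each edge pixel with the mean of its inside and outside neighbours. A minimal sketch for the two vertical edges of a pasted box (top and bottom edges are analogous; the function name and box format are illustrative assumptions):

```python
import numpy as np

def blend_box_edge(img, box):
    # For each pixel on the pasted box's left and right edges, take the mean
    # of the adjacent pixel just inside and just outside the box (step 4.4.8).
    x1, y1, x2, y2 = box
    out = img.astype(float).copy()
    for y in range(y1, y2):
        out[y, x1] = (img[y, x1 + 1] + img[y, x1 - 1]) / 2      # left edge
        out[y, x2 - 1] = (img[y, x2 - 2] + img[y, x2]) / 2      # right edge
    return out
```

This simple averaging softens the seam between the pasted foreground and the averaged background so the compression result pictures do not show hard rectangular borders.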
CN201910905961.XA 2019-09-24 2019-09-24 Static background video self-adaptive compression method based on background subtraction Active CN110572665B (en)


Publications (2)

Publication Number Publication Date
CN110572665A true CN110572665A (en) 2019-12-13
CN110572665B CN110572665B (en) 2021-07-23

Family

ID=68782295


Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2004054268A1 (en) * 2002-12-10 2004-06-24 Kyung-Soo Rim Method and apparatus for compressing and decompressing moving picture data
CN1946144A (en) * 2006-11-01 2007-04-11 李博航 Real time video image transmission technology
US20070098274A1 (en) * 2005-10-28 2007-05-03 Honeywell International Inc. System and method for processing compressed video data
US7219364B2 (en) * 2000-11-22 2007-05-15 International Business Machines Corporation System and method for selectable semantic codec pairs for very low data-rate video transmission
CN101916447A (en) * 2010-07-29 2010-12-15 江苏大学 Robust motion target detecting and tracking image processing system
CN105025360A (en) * 2015-07-17 2015-11-04 江西洪都航空工业集团有限责任公司 Improved fast video summarization method and system


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
An Lei: "Design of a video surveillance system based on the frame-difference method and background subtraction", China Master's Theses Full-text Database (electronic journal) *
Han Le, et al.: "Background and foreground recovery and separation of compressed video based on PTV-TV tensor modeling", Journal of South China University of Technology *



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant