CN112616023A - Multi-camera video target tracking method in complex environment - Google Patents


Info

Publication number
CN112616023A
CN112616023A (application CN202011536697.6A)
Authority
CN
China
Prior art keywords
target
tracking
image
frame
camera
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011536697.6A
Other languages
Chinese (zh)
Inventor
刘文平
王斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jingmen Huiyijia Information Technology Co ltd
Original Assignee
Jingmen Huiyijia Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jingmen Huiyijia Information Technology Co ltd
Priority to CN202011536697.6A
Publication of CN112616023A
Legal status: Pending

Classifications

    • H04N23/661: Transmitting camera control signals through networks, e.g. control via the Internet (under H04N23/66, Remote control of cameras or camera parts)
    • H04N23/64: Computer-aided capture of images, e.g. transfer from script file into camera, check of taken image quality, advice or proposal for image composition or decision on when to take image
    • H04N23/695: Control of camera direction for changing a field of view, e.g. pan, tilt or based on tracking of objects
    • H04N7/181: Closed-circuit television [CCTV] systems for receiving images from a plurality of remote sources
    • G06T7/292: Image analysis; Analysis of motion; Multi-camera tracking
    • G06T2207/10016: Image acquisition modality: Video; Image sequence
    • G06T2207/20016: Hierarchical, coarse-to-fine, multiscale or multiresolution image processing; Pyramid transform
    • G06T2207/20024: Filtering details
    • G06T2207/20081: Training; Learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

Compared with the prior art, the multi-camera video target tracking method in a complex environment markedly improves the degree of intelligent, adaptive adjustment during tracking and withstands the classic difficulties of target tracking, such as illumination change, target occlusion and noise. This greatly increases the practical value of the algorithm in real projects, strengthens the method's ability to cope with complex environments, and improves its robustness. Tracking across multiple cameras proves feasible, accurate and efficient.

Description

Multi-camera video target tracking method in complex environment
Technical Field
The invention relates to a video target tracking method, in particular to a multi-camera video target tracking method in a complex environment, and belongs to the technical field of video target tracking.
Background
With the rapid development of electronic components, communication technology and multimedia technology, video image acquisition equipment has entered daily life in large numbers, and the demand for machine-based intelligent analysis of video images keeps growing; computer vision is the discipline that answers this demand. Computer vision is a young subject built on image processing, and its techniques have been successfully applied in industrial intelligence, intelligent robotics, medical image processing, video surveillance, human-computer interaction and other fields. Intelligent video surveillance is one of the main applications of computer vision: a monitoring mode in which a computer acquires relevant information from video images, analyses it intelligently and makes corresponding decisions. Since the decision process depends on extracting information from the video source, how to analyse the image becomes the key issue for this technology. In the traditional surveillance mode, security personnel must watch video feeds for long periods, which far exceeds what people can sustain, so the actual effect of video surveillance is poor. Improving the surveillance effect through autonomous machine recognition, and thereby reducing the burden on security personnel, is therefore both necessary and valuable.
Moving object tracking is one of the key technologies of video surveillance. Although a number of target tracking methods have been proposed in recent years, video target tracking remains very challenging and still faces many problems. First is the long-term tracking problem: the accuracy of prior-art video target tracking methods can be guaranteed only in the short term, and tracking becomes increasingly inaccurate as time passes. Second is the robustness problem: because of factors such as target pose, illumination change and occlusion, tracking drift or even loss of the tracked target occurs frequently. In practical video surveillance applications, more than one camera is usually deployed, yet prior-art video target tracking methods mainly address tracking under a single camera and rarely involve tracking across multiple cameras. Research and development of long-term target tracking under multiple cameras therefore has both research significance and huge market value.
The state of the art of the video compressed sensing tracking method, its deficiencies, and the problems to be solved by the invention are as follows:
As the demand for video target tracking grows and application environments become more complex and changeable, ever higher requirements are placed on the tracking method: it must run efficiently and track accurately, and it must also resist interference in complex, changing tracking environments. Even as an algorithm of comparatively high robustness, the video compressed sensing tracking method still has room for improvement in this respect.
The tracking target of the video compressed sensing tracking method is specified by the user, and the size of the tracking box drawn when the target is selected matters greatly to the algorithm, because the box size influences the method throughout. First, it determines the size of the target sampling frame: the method treats whatever lies inside the tracking box as the tracked target, extracts the compressed-sampling features within it, and computes the target's feature matrix from them, so the box size is a decisive factor in target sampling. Second, it governs the generation of positive and negative samples, which are drawn around the tracking box, so the box size affects their sampling range and accuracy. Third, it determines the range of the target search: when detecting the target, the detector does not gather candidate targets over the whole image, which would be computationally excessive, unfocused and unaffordable in time and space; instead, the video compressed sensing real-time tracking method searches around the target position of the previous frame, and the tracking box size determines the extent of that search.
However, the prior-art video compressed sensing tracking method has no mechanism for intelligently adjusting the size of the target tracking box: the box does not change as the target's size changes, and when the target's size changes significantly, problems arise during tracking:
First, the change in the target cannot be reflected. The motion of an object generally cannot be predicted; a good tracker should correctly reflect not only the target's trajectory but also changes in its size. The tracking box of the video compressed sensing tracking method is fixed and cannot reflect size changes promptly and accurately.
Second, extra noise is introduced that harms subsequent tracking. When the tracked target's size changes significantly, the samples inside the tracking box no longer reflect the real target. When the target grows, the box covers only a small part of it, so only a local region is sampled and tracked, and the result remains within an acceptable range. But when the target shrinks, the box inevitably contains information other than the target; in the extreme case where the target occupies only a small fraction of the box, the tracker treats the target itself as noise and takes the noise filling most of the box as the target for subsequent tracking.
Given the importance of the tracking box size to the algorithm, the improvement of the invention enables the compressed sensing method to perceive changes in target size and adjust the target box dynamically, improving the noise resistance and accuracy of the video compressed sensing tracking method.
The state of the art of multi-camera video tracking, its deficiencies, and the problems to be solved by the invention are as follows:
In the few decades of development of video surveillance systems, intelligent video surveillance has brought sweeping change, and many mature applications already permeate daily life, greatly saving labour: vehicle type recognition, license plate recognition and traffic flow statistics in the transport industry; ATM monitoring and perimeter security in the financial field; smoke recognition and intelligent detection of open-fire hazards in forest fire prevention; and so on. However, most of these mature applications involve only single-camera processing and do not involve data transmission and sharing among multiple cameras. For large-scale multi-camera surveillance systems such as community security and safe-city projects, the prior art still relies on manpower for recognition and tracking, with little machine assistance and a low degree of intelligence.
Because the field of view of a single camera is very limited, tracking in a single-camera environment cannot continue once the target moves beyond the camera's field of view. Current large-scale video surveillance systems usually deploy a large number of cameras to monitor a wide area, and a single-camera tracking system clearly cannot meet the demands of increasingly complex intelligent surveillance.
Most prior-art video surveillance systems still rely on manual monitoring, which is not only passive but also inefficient; through human negligence, crises often go undetected, the best opportunity for remedy is missed, and the surveillance system cannot fulfil its function. Developing an intelligent multi-camera target tracking system is therefore of great importance.
Disclosure of Invention
Aiming at the defects of the prior art, the invention addresses the problems that most prior-art video surveillance systems remain in a manual monitoring state, that this monitoring mode is passive and inefficient, that crises cannot be discovered in time owing to human negligence so that the best opportunity for remedy is missed, and that the surveillance system cannot fulfil its function. The intelligent multi-camera target tracking system involves data transmission and sharing among multiple cameras; for large-scale multi-camera surveillance applications such as community security and safe cities, it does not rely on manpower for recognition and tracking, offering greater machine assistance and a higher degree of intelligence. It also solves the problem that the field of view of a single camera is very limited and tracking cannot continue once the target moves outside it. In current large-scale video surveillance systems, the multi-camera video target tracking of the invention can meet the requirements of increasingly complex intelligent surveillance.
To achieve the above technical effects, the invention adopts the following technical scheme:
a multi-camera video target tracking method under a complex environment is improved based on a video compressed sensing real-time tracking method, the video compressed sensing real-time tracking method for intelligent adjustment processing is provided, a tracked target image pyramid is constructed, targets under multiple scales are sampled and machine learning is carried out, and then intelligent adjustment processing of the sizes of tracked target frames is realized by constructing multi-scale search frames; the invention provides a video compressed sensing real-time tracking system under multiple cameras on the basis of a video compressed sensing real-time tracking method of intelligent adjustment processing, and a target tracking system of the multiple cameras is completed by introducing a network communication module, a moving target detection module and a matching module on the basis of a single-camera tracking system;
the video compression sensing method architecture for intelligent adjustment processing comprises the following steps: the method comprises a target image pyramid, target sampling and learning based on the image pyramid, and video target detection, and the multi-camera video compressed sensing real-time tracking method comprises the following frames: designing a network topology structure of a camera, detecting a video moving target and accurately matching the moving target;
the multi-camera video compressed sensing real-time tracking method comprises the following steps: constructing a main process as a central control system of the whole network to bear the storage and forwarding of data and output the current position information of a tracking target, constructing a plurality of terminal processes as control processes of each camera device, bearing the background modeling of the acquired image, detecting and matching the moving target, and tracking the target in a visible range; establishing network communication between a main process and a terminal process, transmitting relevant characteristic parameters of a tracked target and current position information, and simulating a video compression perception real-time tracking system under multiple cameras, wherein the main process mainly receives and transmits messages through network links, uniformly manages multiple camera terminals, receives target tracking data and results sent by each camera terminal process, and forwards the data to other camera equipment which detects target motion, so that the target can be continuously tracked when appearing in the system; each camera terminal process constructs and updates a background model of the intelligent adjustment processing of the camera terminal process, a moving target in a monitoring range of the camera terminal process is detected, the camera which detects the movement of the target communicates related tracking data information with a main process to judge whether the moving target is the target to be tracked, once the target is confirmed, the target is continuously tracked and tracking information is sent to the main process, wherein the short-term target tracking method adopts a real-time tracking method of intelligent adjustment processing video compression sensing.
In the multi-camera video target tracking method in a complex environment, further, the target image pyramid: possible changes of the target are added to the samples for classification, and the target box is resized when the target's size changes significantly; the size change of the target is predicted by constructing an image pyramid, and the corresponding positive and negative samples of the target are learned. The method filters the original image by bilinear interpolation and constructs an image pyramid with small steps between levels to predict gradual target change.
The bilinear interpolation filter is: the colour value of an image pixel is obtained by weighting its 4 nearest neighbouring points. Let (x, y) be the coordinates of a point in the enlarged image and u(x, y) its grey value; for bilinear interpolation, the assigned grey value is expressed as:
u(x, y) = ex + fy + sxy + g
The 4 coefficients are determined from 4 equations written at the 4 points nearest to (x, y). The image pyramid is constructed by bilinear interpolation filtering, and the step between pyramid levels can be adjusted freely to fit the magnitude of change of the tracked target.
The multi-camera video target tracking method in a complex environment further comprises target sampling and learning based on the image pyramid: after the image pyramid is constructed, targets are sampled at each scale; the sampling process is based on compressed sensing theory and is extended to the whole image pyramid.
First the original target image is sampled. Samples are divided into positive and negative: sample boxes within radius c of the original tracking box are taken as positive samples, and sample boxes between radius c and radius b as negative samples. The compressed features of the positive and negative samples are collected as their representations for machine learning; each compressed feature is obtained by generating sub-boxes of random position, size and number within a single sample box and accumulating their grey values.
The video compressed sensing real-time tracking method describes the original image with a very sparse random measurement matrix. The specific sampling process is as follows: first, 2 to 3 small sampling boxes of random size and position are generated within the sample box; the grey values inside these random boxes are then accumulated to give one feature descriptor of the sample. This process is repeated m times, and the m-dimensional feature vector formed by the m descriptors is the compressed representation of the original target.
the invention not only obtains the random sampling matrix of the original image, but also expands the random sampling matrix, samples the original image under the multi-scale, constructs the expanded random sampling matrix, keeps the correlation degree between the original sampling value and the sampling value of other levels under the image pyramid, the size of the sampling frame of other levels is determined by the size proportion relation between the layer and the original image layer, and the definition is:
Figure BDA0002853275590000051
wherein DiThe representation being an original image layer, Di+nThe expression is the i + n layer image, w and h are the sizes of the sampling frames, m is the positive expression and is the up sampling layer, m is the negative expression and is the down sampling layer, the accumulation calculation of the gray value of the sampling frame is obtained by the calculation of the image integral graph, the image integral graph is the accumulation of the gray value of the image, specifically, the value of each pixel point is equal to the sum of all the pixel values before the point, and the expression is expressed by a formula as follows:
J(x, y) = Σ D(i, j), 0 ≤ i ≤ x, 0 ≤ j ≤ y
where J denotes the integral image and D the original image. Using the integral image, the pixel sum of any rectangular box is obtained rapidly: for the rectangle with corners E(x, y), F(x, y′), S(x′, y) and G(x′, y′), the sum of pixel values is expressed through the integral image as:
sum(EFSG) = J(x, y) + J(x′, y′) − J(x, y′) − J(x′, y)
An integral image is computed for each layer of the image pyramid and the corresponding sampling is carried out. After the target image at every scale has been sampled, the collected compressed features are classified, and a naive Bayes classifier is used to detect the target in the next frame.
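The integral image and the random sub-box features described above can be sketched as follows in pure Python. In the actual method the random sub-boxes are generated once per feature dimension and reused for every sample; this toy version approximates that with a seeded generator, and the function names are illustrative:

```python
import random

def integral_image(img):
    """J(x, y) = sum of D(i, j) over 0 <= i <= x, 0 <= j <= y."""
    h, w = len(img), len(img[0])
    J = [[0] * w for _ in range(h)]
    for y in range(h):
        row = 0
        for x in range(w):
            row += img[y][x]
            J[y][x] = row + (J[y - 1][x] if y > 0 else 0)
    return J

def rect_sum(J, x, y, x2, y2):
    """Pixel sum of the rectangle with opposite corners (x, y), (x2, y2),
    via the four-term integral-image identity."""
    s = J[y2][x2]
    if x > 0: s -= J[y2][x - 1]
    if y > 0: s -= J[y - 1][x2]
    if x > 0 and y > 0: s += J[y - 1][x - 1]
    return s

def compressed_features(img, m, rng):
    """m-dimensional compressed descriptor: each dimension accumulates the
    grey values of 2-3 randomly placed sub-boxes in the sample frame."""
    J = integral_image(img)
    h, w = len(img), len(img[0])
    feats = []
    for _ in range(m):
        v = 0
        for _ in range(rng.randint(2, 3)):
            x, y = rng.randrange(w), rng.randrange(h)
            v += rect_sum(J, x, y, rng.randrange(x, w), rng.randrange(y, h))
        feats.append(v)
    return feats
```

For example, `compressed_features([[1, 2], [3, 4]], 5, random.Random(0))` yields a 5-dimensional compressed representation of a 2x2 sample box.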
The multi-camera video target tracking method in a complex environment further comprises video target detection: after positive and negative sampling of the original target and construction of the classifier are completed, the next frame is extracted for target detection. Detection is a process of screening and classifying candidate boxes: the positions where the target may appear are found, sampled, and fed into the classifier to determine the target's current position.
Because an object's motion trajectory is continuous, the target is most likely to appear near its previous-frame position, so candidate boxes need only be generated around that position. The generation range of candidate boxes is governed by a parameter t, and the candidate set is defined as:
Cyc(x, y) = {(x, y) : ‖(x, y) − (x0, y0)‖ < t}
where (x0, y0) is the target position computed in the previous frame. The value of t determines the maximum target movement the detector can handle: a larger t is set when the target moves fast and a smaller t when it moves slowly. The size of the candidate boxes is the key to detecting changes in target size; candidate boxes of different sizes are generated according to the current actual size of the detected target, with the same scale ratios and positions used when the image pyramid was constructed during sampling, ensuring consistency between sampling and detection.
After the candidate boxes are generated, they are compressively sampled, the resulting feature vectors are fed into the naive Bayes classifier, and a classification score is computed for every feature vector. The vector with the highest score maximises the posterior probability, and its candidate box is taken as the true position of the target in the current frame. Positive and negative sample features are then collected around the current target to update the classifier for use in the next frame's tracking classification.
The process from constructing the target image pyramid, through positive and negative sampling and classification, to obtaining the true target position is the basic tracking cycle of the intelligently adjusted video compressed sensing method; repeating it until exit constitutes the complete target tracking process.
The multi-camera video target tracking method in a complex environment further comprises the intelligently adjusted video compressed sensing real-time tracking method, which consists of two parts: an initialisation part, which initialises the relevant data structures and parameters after the user selects the tracking target box, generates the sub-sampling boxes and constructs the classifier; and a tracking part, which repeatedly determines the current position of the target and updates the classifier.
The main functions include: 1) init(): the initialisation function, composed of a series of sub-operations: initialising data structures and parameters, generating sampling sub-windows, collecting positive and negative samples around the target selection box, and constructing the classifier; 2) HarrFeatur(): generates the random sub-feature boxes of the initial samples; m groups of sub-feature boxes are generated according to the set number m of repeated samplings per sample, and sub-feature boxes for the other pyramid levels are generated according to the ratio between each layer and the original image; 3) getNextFrame(): acquires the next frame from the camera; 4) sampleRect(a, b): generates sample boxes of different sizes in the range a to b around the tracking box; the size ratios of the sample boxes are determined by the ratios between pyramid levels, and the distinction between positive and negative samples is determined by the distance ranges a and b from the sample box to the tracking box; 5) getFeatureValue(): computes the sample value of each sample using the image integral image; the computed value is an m-dimensional vector whose every dimension is the accumulated grey value of a random sub-box generated within the sample box; this m-dimensional vector is the compressed sensing sample of the original sample, a reduced-dimension representation of the image data in the sample box; 6) classierUpdate(): classifies and learns the input positive and negative samples, obtaining the expectation and variance of the sample distribution in each dimension; inputting new positive and negative samples updates the classifier, with a learning-rate weight determining the proportions of new and old samples; 7) processFrame(): implements the tracking cycle of the intelligently adjusted video compressed sensing real-time tracking method: computing the feature values of tracking boxes at possible target positions, using the constructed classifier to select the candidate box with the highest posterior probability as the tracking box, and finally sampling new positive and negative samples around it and updating the classifier; 8) radioClassifier(): computes, from the input candidate boxes, the feature value of each box where the target of the current frame may lie, and selects the candidate box with the maximum classifier score as the actual target position of the frame.
In the multi-camera video target tracking method in a complex environment, further, the design of the camera network topology: during target tracking, if the target moves within a single camera, its appearance changes continuously and it can be tracked smoothly with the initial feature data; but when the target disappears from one camera and reappears in another, its appearance and pose generally change considerably. A corresponding network is therefore constructed so that the feature information of the tracked target can be transmitted among the cameras, ensuring timely matching and continuous tracking of the target.
Each computer of the multi-camera tracking system controls several camera terminals, equivalent to a centralised network structure. A larger surveillance system partitions the cameras by geographical arrangement, with the cameras of each area handled by one computer, forming a distributed or cooperative network structure; communication between terminals uses the TCP/IP protocol.
The multi-camera video target tracking method under the complex environment further comprises the following steps: constructing a target detection mechanism, wherein target matching is only needed when a moving target appears; moving target detection finds the moving object, which is then matched against the target being tracked to judge whether it is the tracked target;
the detection method of the moving target is based on the background subtraction method: the background is modeled, and when a moving object appears, the moving target is detected simply by differencing the current image with the modeled background, obtaining a relatively complete contour of the target; for gradual change of the background, an intelligently adjusted background model is adopted, with the mathematical expression:
current background = c × previous frame image + (1 - c) × previous background
Wherein c is the background updating parameter, whose value directly affects the updating speed and quality of the background. The process of detecting the target by the background subtraction method is to subtract the current image frame from the constructed background model and then binarize the difference image with a specified critical value to obtain the contour of the area where the corresponding moving target is located. When the difference image is processed with the critical value, the setting of that value is crucial to the quality of the acquired target contour and is adjusted according to the colour difference between the target and the background. When a moving object is first sensed by the camera, the target has not completely entered the shooting area and the acquired target features are incomplete; after the target has completely entered the shooting area, its contour information is extracted. After the contour of the moving target is acquired, the rectangular area where the moving object is located is obtained, but the target in the rectangular frame is not necessarily the tracking target, and the system needs to make a further matching judgment according to the target features left by other cameras.
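As an illustrative sketch of the update and difference steps above (greyscale frames as NumPy arrays; the update parameter c and the critical value are assumed example settings, not tuned values from the invention):

```python
import numpy as np

def update_background(background, frame, c=0.05):
    # current background = c * previous frame + (1 - c) * previous background
    return c * frame.astype(np.float64) + (1.0 - c) * background

def detect_foreground(background, frame, threshold=30.0):
    # Subtract the current frame from the background model, then binarize
    # the difference image with the chosen critical value.
    diff = np.abs(frame.astype(np.float64) - background)
    return (diff > threshold).astype(np.uint8)
```

A small c keeps the background stable against momentary changes; a larger c adapts faster to gradual illumination drift.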
The multi-camera video target tracking method under the complex environment further comprises the following steps of accurately matching moving targets: after a moving target is detected, it is matched to judge whether it is the target currently being tracked; a characteristic invariant to illumination, angle and scale is sought to detect whether it is the same target, while the operation speed meets the requirement of real-time calculation;
SURF characteristic points are found by searching for local extreme points of pixels in a constructed multi-scale space pyramid, and the image pyramid is constructed by adopting an approximate Hessian matrix determinant, wherein the Hessian matrix consists of the second-order partial derivatives of the filtered image; its mathematical expression is:

H(x, r) = | Gxx(x, r)  Gxy(x, r) |
          | Gxy(x, r)  Gyy(x, r) |
for a point x = (x, y) on the original image J, after Gaussian filtering it becomes:
K(x, r) = D(r) · J(x)
wherein D(r) is the Gaussian kernel, r is the Gaussian variance, and K(x, r) is the representation of the image at the corresponding resolution; the Hessian approximation of the filtered image is calculated as:
det(Happrox) = GxxGyy - (0.9Gxy)^2
after the image pyramid is obtained with the Hessian matrix, the subsequent feature point detection compares each pixel with the 26 points in its neighborhood; if the point is the maximum or minimum value in the neighborhood, it is taken as a candidate feature point. Sub-pixel feature points are then obtained by three-dimensional linear interpolation, a critical value is adopted to filter out the feature points with weak responses, and the remaining feature points with strong responses are the ones required during matching.
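The determinant response det(Happrox) = GxxGyy - (0.9Gxy)^2 can be illustrated as follows; finite differences stand in here for SURF's box-filter second derivatives, so this is a simplified sketch rather than the SURF implementation:

```python
import numpy as np

def hessian_det_approx(img):
    # Approximate the second derivatives by finite differences; SURF proper
    # computes them with box filters on the integral image, with the 0.9
    # weight compensating the box-filter approximation of Gxy.
    gy, gx = np.gradient(img.astype(np.float64))
    gxy, gxx = np.gradient(gx)   # d(gx)/dy, d(gx)/dx
    gyy, _ = np.gradient(gy)     # d(gy)/dy
    return gxx * gyy - (0.9 * gxy) ** 2
```

On a paraboloid image x^2 + y^2 the interior response is constant (Gxx = Gyy = 2, Gxy = 0, so det = 4), which makes the sketch easy to check.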
The multi-camera video target tracking method under the complex environment further comprises calculating the main direction of the feature points after the feature points are determined. The main direction of a SURF feature point is determined by the Haar wavelet responses in its neighborhood: the sums of the horizontal and vertical Haar wavelet responses of all points within a 60-degree sector of the neighborhood are counted, and the direction with the maximum sum is taken as the main direction of the feature point. Then a square frame is taken around the feature point, the side length of the frame being 20c, where c is the scale of the pyramid layer where the feature point is located. The frame is equally divided into 4 x 4 sub-regions, the Haar wavelet responses in the horizontal and vertical directions are calculated respectively, and a 64-dimensional vector is obtained, namely the descriptor of the feature point in the SURF algorithm;
for the initially selected tracking target, its SURF characteristics are stored; the SURF characteristics of a detected moving target are extracted and matched against them. If the matching succeeds, the target appearing in the other camera is determined to be the initially selected tracking target, and the target is resampled and continuously tracked with the intelligently adjusted video compressed sensing method; if the matching fails, the moving target is discarded and other moving targets continue to be judged;
after the tracking target is successfully matched, the current area of the target is used as an initial tracking frame, then the target is tracked under the input image sequence of the camera until the target disappears from the camera, the tracking result is fed back to the central system, and the central system outputs the position of the current target.
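Assuming finite differences as a stand-in for the Haar wavelet responses and omitting the orientation step, the 4 x 4 sub-region, 64-dimensional descriptor described above can be sketched as:

```python
import numpy as np

def surf_descriptor(patch):
    # patch: 20x20 window around the feature point, assumed already scaled
    # to the keypoint's pyramid layer and rotated to its main direction.
    p = patch.astype(np.float64)
    dx = np.zeros_like(p); dx[:, 1:] = p[:, 1:] - p[:, :-1]  # horizontal response
    dy = np.zeros_like(p); dy[1:, :] = p[1:, :] - p[:-1, :]  # vertical response
    desc = []
    for i in range(4):            # 4x4 sub-regions of 5x5 pixels each
        for j in range(4):
            cx = dx[5*i:5*i+5, 5*j:5*j+5]
            cy = dy[5*i:5*i+5, 5*j:5*j+5]
            desc += [cx.sum(), np.abs(cx).sum(), cy.sum(), np.abs(cy).sum()]
    v = np.array(desc)            # 4 values x 16 sub-regions = 64 dimensions
    n = np.linalg.norm(v)
    return v / n if n > 0 else v  # normalized for contrast invariance
```

Two keypoints can then be matched by the Euclidean distance between their 64-dimensional vectors.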
In the multi-camera video target tracking method under the complex environment, further, in the multi-camera video compressed sensing real-time tracking process, the main functions of the main process flow comprise: 1) init(): the program initialization function, mainly initializing the storage variables related to the tracking target and the socket variables; 2) listen() and accept(): the related functions adopted by the server side in socket programming: the socket() function constructs the socket, the bind() function binds the monitored port, the listen() function monitors link requests, and the accept() function receives a link request and constructs the network link; 3) CreateThread(): constructing a new thread; the main process sends a newly constructed socket link to the thread, and each thread carries and processes the messages sent by one camera terminal process; 4) CheckMsg(): encapsulating the recv() function and checking whether there is a new message; 5) Handle(): processing the information sent by the terminal system processes, receiving and storing the tracking target characteristic information sent by a terminal system, and controlling the detection state of each terminal system according to the position and coordinates of the current camera; 6) SendMsg(): sending the result information after Handle() processing to each terminal system process.
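A minimal sketch of the server side described above, with one thread per camera-terminal link; the host, acknowledgement format and message handling are illustrative assumptions, not the invention's protocol:

```python
import socket
import threading

def handle_client(conn):
    # One thread per camera-terminal link (cf. CreateThread/CheckMsg/Handle).
    with conn:
        while True:
            msg = conn.recv(4096)          # CheckMsg(): wait for a new message
            if not msg:
                break                      # terminal closed the link
            conn.sendall(b"ACK:" + msg)    # SendMsg(): reply after processing

def start_server(host="127.0.0.1"):
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind((host, 0))                    # bind(): attach the monitored port
    srv.listen()                           # listen(): monitor link requests
    def accept_loop():
        while True:
            try:
                conn, _ = srv.accept()     # accept(): construct the network link
            except OSError:
                return                     # listening socket closed
            threading.Thread(target=handle_client, args=(conn,),
                             daemon=True).start()
    threading.Thread(target=accept_loop, daemon=True).start()
    return srv, srv.getsockname()[1]
```

Each camera terminal would connect as a TCP client and exchange tracking-target feature messages over its own link.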
Compared with the prior art, the invention has the following contributions and innovation points:
firstly, compared with the prior art, the multi-camera video target tracking method under the complex environment provided by the invention has the advantages that the tracking intelligent adjustment processing degree is obviously improved, the method has good resistance to difficult problems in target tracking, such as illumination change, target shielding and noise, the application value of the algorithm in practical projects is greatly improved, the capability of the target tracking method for coping with the complex environment is improved, and the robustness of the method is improved. The multiple cameras have good tracking feasibility, high accuracy and high efficiency;
secondly, the multi-camera video target tracking method under the complex environment has a good long-term tracking effect: it ensures accuracy in the short term and keeps the tracking result accurate as the tracking time passes; the robustness is good, and tracking drift or even loss of the tracked target caused by factors such as target posture, illumination change and occlusion is avoided; since more than one camera is generally arranged in current practical video monitoring applications, the long-term target tracking method under multiple cameras has high operation efficiency, accurate tracking results and strong anti-interference capability in complex and changeable tracking environments;
thirdly, in the multi-camera video target tracking method under the complex environment provided by the invention, the video compressed sensing tracking method has a mechanism for intelligently adjusting the size of the target tracking frame: the tracking frame changes with the size of the target, and when the target size changes remarkably, the tracking process reflects that change. The motion of an object is usually unpredictable, and the tracker correctly reflects not only the motion track of the target but also the change of its size. This prevents extra noise from being introduced and affecting subsequent tracking results: when the size of the tracked target changes remarkably, the samples within the tracking frame still reflect the real target, and when the target grows or shrinks, the samples within the target frame contain no information other than the tracked target. This improvement enables the compressed sensing method to sense the size change of the target and dynamically adjust the size of the target frame, improving the anti-noise capability and accuracy of the video compressed sensing tracking method;
fourthly, the multi-camera video target tracking method under the complex environment solves the problems that most video monitoring systems in the prior art still stay in a manual monitoring state, the monitoring mode is passive and low in efficiency, a crisis cannot be found timely due to manual negligence, the best opportunity for remediation is missed, and the functions of the video monitoring systems cannot be exerted. The intelligent target tracking system under the multiple cameras relates to data transmission and sharing among the multiple cameras, and for the application of a large-scale multiple-camera monitoring system of a residential security and safe city, the system does not depend on manpower to perform identification and tracking, and is more in machine assistance and high in intelligent degree. The problem that the observation range of a single camera is very limited, and when the moving range of the target exceeds the visual field range of the camera, the tracking cannot be continued is solved. In the current large-scale video monitoring system, the multi-camera video target tracking of the invention can meet the requirements of increasingly complex intelligent monitoring systems.
Drawings
Fig. 1 is a flow chart of a video compressed sensing real-time tracking method for intelligent adjustment processing.
Fig. 2 is a main flow chart of a central system of a multi-camera video compression-sensing real-time tracking method.
Fig. 3 is a flow chart of a camera terminal of a multi-camera video compressed sensing real-time tracking method.
Detailed Description
The following describes the technical solution of the multi-camera video target tracking method in a complex environment with reference to the accompanying drawings, so that those skilled in the art can better understand the present invention and can implement the present invention.
With the wide application of high-performance computers and cost-effective cameras, people have higher and higher requirements on intelligent video processing. Most of video target tracking methods in the prior art are ideal for environmental assumption, and have no good resistance to difficult problems in target tracking, such as illumination change, target occlusion and noise, so that the application value of the algorithms in practical projects is greatly reduced. The capability of the target tracking method for coping with complex environment is improved, and the improvement of the robustness of the method is the key of video target tracking.
The invention is improved based on a video compressed sensing real-time tracking method, provides a video compressed sensing real-time tracking method for intelligent adjustment processing, samples targets under multiple scales by constructing a tracking target image pyramid, performs machine learning, and then realizes intelligent adjustment processing of the size of a tracking target frame by constructing a multi-scale search frame. The intelligent adjustment processing video compressed sensing real-time tracking method is subjected to simulation experiment and analysis, and the experimental result shows that the intelligent adjustment processing degree of the intelligent adjustment processing video compressed sensing real-time tracking method is remarkably improved compared with the original algorithm. The invention provides a video compressed sensing real-time tracking system under multiple cameras on the basis of a video compressed sensing real-time tracking method of intelligent adjustment processing, and a target tracking system of the multiple cameras is completed by introducing a network communication module, a moving target detection module and a matching module on the basis of a single-camera tracking system. And finally, carrying out simulation experiment and analysis on the designed multi-camera video compression perception real-time tracking system, wherein the experiment result shows that when the tracking target moves from one camera to another or a plurality of cameras, the system can detect the tracking target and continuously track the target, and the feasibility and the high efficiency of the method are proved.
Intelligent adjustment processing video compressed sensing real-time tracking method
The video real-time tracking method in the prior art has high tracking accuracy when the interference of the surrounding environment is small and the target motion condition is single, but always needs to face the randomness of the target motion and various interferences brought by the environment in the real environment. When the size of the target changes or the target is partially or completely shielded, the tracking effect of the real-time tracking method in the prior art is not ideal. Therefore, the invention provides an intelligent adjustment processing video compressed sensing real-time tracking method, which can intelligently adjust the size of a tracking frame to adapt to the size change of a target when the target approaches or leaves, thereby avoiding introducing a large amount of errors in the tracking process and reducing the influence caused by environmental factors. The invention describes the technical scheme and the implementation process of the video compressed sensing real-time tracking method for intelligent adjustment processing in detail, and improves the performance and the advantages of the algorithm through simulation experiment analysis.
Video compression sensing method architecture for intelligent adjustment processing
The tracking frame of the video compressed sensing real-time tracking method changes with the size of the target: the detector senses the change of the target size, and the possible size changes of the target are learned during machine learning. The method is improved mainly by constructing a target image pyramid and performing multi-scale sampling and learning on the target; when searching for the target, search boxes of different scales are generated, and the search box with the highest posterior probability is taken as the position box of the current target, so that the target box is intelligently adjusted to change with the size of the target.
1. Target image pyramid
In the video real-time tracking method in the prior art, a target tracking frame is used as the center of the next frame search, the target tracking frame is searched and detected within a certain radius range, and a candidate frame with the maximum posterior probability is determined as the new position of the target. In this way, although the target is tracked, the gradual change process of the target is ignored, and the change of the size of the target cannot be perceived. Therefore, the method adds the possible change of the target into the samples for classification, adjusts the size of the target frame when the size of the target is changed remarkably, predicts the size change of the target by constructing an image pyramid, and learns the corresponding positive and negative samples of the target.
The image pyramid is an image set in which every image is obtained by transforming the original image with a certain sampling rule. Sampling is divided into up-sampling and down-sampling: the Gaussian pyramid obtains images of smaller size by down-sampling, while the Laplacian pyramid up-samples and reconstructs an image of larger size from the original image. In the sampling process of the Gaussian pyramid, the i-th layer of the pyramid is convolved with a Gaussian kernel, and the even rows and even columns of the i-th layer image are then deleted to obtain the (i + 1)-th layer image; obviously, by this acquisition mode, the (i + 1)-th layer image is necessarily one quarter of the i-th layer image. The Laplacian pyramid is equivalent to the inverse form of the Gaussian pyramid: blank rows and columns are inserted after each row and column of the (i + 1)-th layer image, and a filter is then adopted to estimate the missing pixel values by convolution to obtain the i-th layer image, the new image constructed by the Laplacian pyramid being four times the original. This four-fold change of the Gaussian and Laplacian pyramids is too large relative to the change amplitude of the target in the actual tracking process.
The change amplitude of the pyramid corresponds to the change rate of the target size and should be controlled within a small range, so the Gaussian pyramid and the Laplacian pyramid cannot be adopted directly. Therefore, the method filters the original image by bilinear interpolation and constructs an image pyramid with a small change amplitude to predict the target change.
The bilinear interpolation filter is as follows: the colour value of an image pixel is obtained by weighting its 4 nearest neighbouring points. Let (x, y) be the coordinates of a point in the enlarged image and u(x, y) its grey value; for bilinear interpolation, the assigned grey value is expressed as:
u(x,y)=ex+fy+sxy+g
the 4 coefficients are determined by the 4 equations written from the 4 nearest points of (x, y). An image pyramid is constructed by bilinear interpolation filtering, and the change amplitude between pyramid levels can be adjusted freely to fit the change amplitude of the tracking target. Specifically, the step between levels is set as a fixed value of m pixels, which is suitable for slowly changing targets such as moving pedestrians. Since the change of the target is generally gradual, the number of constructed image pyramid layers should not be too large: excessive sampling and learning not only occupies a large amount of computing resources and memory, but also affects the final classification precision.
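The interpolation and the fixed-step pyramid can be sketched as follows; the step size and layer count are illustrative assumptions:

```python
import numpy as np

def bilinear(img, x, y):
    # u(x, y) = e*x + f*y + s*x*y + g, fitted from the 4 nearest pixels.
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1 = min(x0 + 1, img.shape[1] - 1)
    y1 = min(y0 + 1, img.shape[0] - 1)
    dx, dy = x - x0, y - y0
    return ((1 - dx) * (1 - dy) * img[y0, x0] + dx * (1 - dy) * img[y0, x1]
            + (1 - dx) * dy * img[y1, x0] + dx * dy * img[y1, x1])

def resize(img, w, h):
    # Resample img to (h, w) by bilinear interpolation.
    H, W = img.shape
    out = np.empty((h, w))
    for j in range(h):
        for i in range(w):
            out[j, i] = bilinear(img, i * (W - 1) / max(w - 1, 1),
                                 j * (H - 1) / max(h - 1, 1))
    return out

def build_pyramid(img, step=2, layers=2):
    # Adjacent layers differ by a fixed step of a few pixels rather than by
    # the four-fold factor of the Gaussian/Laplacian pyramids.
    H, W = img.shape
    down = [resize(img, W - step * k, H - step * k) for k in range(layers, 0, -1)]
    up = [resize(img, W + step * k, H + step * k) for k in range(1, layers + 1)]
    return down + [img] + up
```

With step=2 and layers=2, a 10x10 target yields layers of 6, 8, 10, 12 and 14 pixels per side, matching the gradual size change of a tracked object.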
2. Image pyramid based target sampling and learning
After an image pyramid is constructed, sampling is carried out on targets under all scales, and the sampling process is based on a compressed sensing theory and is expanded into the whole image pyramid.
Firstly, sampling an original target image, wherein sampling samples are divided into a positive sample and a negative sample, a sample frame within the radius c range of an original tracking frame is used as a positive sample, a sample frame between the range c and the range b is used as a negative sample for sampling, compression characteristics of the obtained positive sample and the obtained negative sample are collected and used as sample representations of the positive sample and the negative sample for machine learning, and the collection of the compression characteristics is obtained by accumulating gray values after subframe frames with random positions, sizes and numbers are generated in a single sample frame.
The video compressed sensing real-time tracking method adopts a very sparse random measurement matrix to describe an original image, and the specific sampling process is as follows: firstly, generating 2 to 3 small sampling frames in a sample frame, wherein the sizes and the positions of the small sampling frames are randomly generated, then accumulating the gray values in the random sampling frames, obtaining a value which is a feature descriptor of the sample, repeating the sampling process of the target for more than m times, and enabling m-dimensional feature vectors formed by the m feature descriptors to be compressed and expressed of the original target.
The method of the present invention improves by not only acquiring a randomly sampled matrix of the original image, but also extending it. And sampling the original image under the multi-scale condition to construct an expanded random sampling matrix. In order to keep the correlation between the original sample value and the sample values at other levels of the image pyramid as much as possible, the size of the sample frame at other scales is determined by the size proportional relationship between the layer and the original image layer, and is defined as:
wi+n / wi = hi+n / hi = size(Di+n) / size(Di)

wherein Di denotes the original image layer, Di+n the (i + n)-th layer image, and w and h the sizes of the sampling frames; n positive denotes an up-sampling layer and n negative a down-sampling layer. The accumulation of the grey values of a sampling frame is obtained from the image integral graph, which is the accumulation of the image grey values; specifically, the value of each pixel point equals the sum of all pixel values before that point, expressed by the formula:
J(x,y)=sum(D(i,j)),0≤i≤x,0≤j≤y
wherein J represents the integral image and D the original image. The sum of the pixel values of a rectangular frame is rapidly obtained with the integral image: for the rectangular frame formed by the points E(x, y), F(x, y'), S(x', y) and G(x', y'), the sum of the pixel values is expressed through the integral image as:
sum = J(x, y) + J(x', y') - J(x, y') - J(x', y)
the main advantage of using the integral map is that the sum of pixel values of rectangular frames of any size and position in the image can be rapidly obtained only by performing operation once on the image. For a compressed sensing method which needs to calculate the sum of rectangular frame pixel values in a large amount, the integral map is very necessary. For the video compression sensing method of intelligent adjustment processing, an integral image is calculated for each layer of image of an image pyramid and is sampled correspondingly, after sampling of target images under various scales is completed, collected compression features are classified, a naive Bayes classifier is adopted by the classifier, and after classification is completed, targets of the next frame are detected.
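The integral image and the four-lookup rectangle sum can be sketched as follows (zero-based, inclusive coordinates assumed):

```python
import numpy as np

def integral_image(img):
    # J(x, y) = sum of all pixels at or above-left of (x, y).
    return img.astype(np.float64).cumsum(axis=0).cumsum(axis=1)

def rect_sum(J, x, y, w, h):
    # Sum of img[y:y+h, x:x+w] from at most four integral-image lookups,
    # regardless of the rectangle's size or position.
    s = J[y + h - 1, x + w - 1]
    if x > 0:
        s -= J[y + h - 1, x - 1]
    if y > 0:
        s -= J[y - 1, x + w - 1]
    if x > 0 and y > 0:
        s += J[y - 1, x - 1]
    return s
```

Computing the integral image once per pyramid layer makes every sub-box accumulation in the compressed sampling an O(1) operation.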
3. Video object detection
And after sampling positive and negative samples of the original target and constructing a classifier, extracting the next frame of image for target detection. The process of target detection is a process of screening and classifying candidate frames, and the possible existing positions of the target need to be found, sampled and substituted into the classifier for operation, and then the current position of the target is determined.
The generated position and size of the candidate frame determine whether the classifier can obtain the real position of the target during target detection, theoretically, the probability of hitting the real position of the target is higher when more candidate frames are generated for classification, but a large amount of calculation and storage space is needed, the running speed is reduced, and based on the characteristic that the motion track of the object is continuous, the probability of the target appearing around the position of the previous frame is very high, so that only the candidate frame needs to be generated around the position of the previous frame of the target, and the generation range of the candidate frame is determined by a parameter t and defined as:
Cyc(x, y) = {(x, y) | ||(x, y) - (x0, y0)|| < t}
wherein (x0, y0) is the target position calculated in the previous frame, and the size of the value t determines the maximum range of target movement the detector can cover: a larger t is set when the target moves fast, and a smaller t when it moves slowly. The size of the candidate frame is the key to detecting the change of the target size, and the number of candidate frames of different sizes generated determines which actual target size the detector can recognize; the sizes and number of the candidate frames are kept the same as the proportions and levels of the image pyramid constructed during sampling, ensuring the consistency of sampling and detection.
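The candidate generation range Cyc can be sketched as follows, assuming the Euclidean norm and an illustrative t:

```python
import numpy as np

def candidate_positions(x0, y0, t, width, height):
    # All pixel positions within distance t of the previous target position
    # (x0, y0), clipped to the image of the given width and height.
    xs, ys = np.meshgrid(np.arange(width), np.arange(height))
    mask = (xs - x0) ** 2 + (ys - y0) ** 2 < t ** 2
    return list(zip(xs[mask].tolist(), ys[mask].tolist()))
```

Each returned position would then be paired with every pyramid-derived frame size to form the full candidate set.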
After candidate frames are generated, the candidate frames are subjected to compression sampling, the feature vectors obtained by sampling are substituted into a naive Bayes classifier, classification score values of all features are obtained, the posterior probability of the vector with the highest score is the maximum, the candidate frame corresponding to the feature vector is taken as the real position of the current frame target, and then positive and negative sample features around the current target are collected to update the classifier for the next frame to be used in tracking classification.
The process from constructing the target image pyramid, through positive and negative sampling and classification of the target, to finally acquiring the real position of the target is the basic tracking step of the intelligently adjusted video compressed sensing method, and repeating this process until exit constitutes the complete target tracking process.
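The per-dimension Gaussian naive Bayes classifier with a running update, as commonly used in compressive tracking, can be sketched as follows; the learning weight value is illustrative:

```python
import numpy as np

class NaiveBayesClassifier:
    def __init__(self, m, lam=0.85):
        self.mu = {1: np.zeros(m), 0: np.zeros(m)}    # per-dimension means
        self.sigma = {1: np.ones(m), 0: np.ones(m)}   # per-dimension std devs
        self.lam = lam                                # weight of the old samples

    def update(self, feats, label):
        # Blend the statistics of the new positive (1) or negative (0)
        # samples into the old ones with learning weight lam.
        mu_n = feats.mean(axis=0)
        sig_n = feats.std(axis=0) + 1e-6
        lam, mu, sig = self.lam, self.mu[label], self.sigma[label]
        self.sigma[label] = np.sqrt(lam * sig ** 2 + (1 - lam) * sig_n ** 2
                                    + lam * (1 - lam) * (mu - mu_n) ** 2)
        self.mu[label] = lam * mu + (1 - lam) * mu_n

    def score(self, v):
        # Log-ratio of the Gaussian likelihoods; the candidate frame with
        # the highest score is taken as the target position.
        def logp(mu, s):
            return (-0.5 * np.log(2 * np.pi * s ** 2)
                    - (v - mu) ** 2 / (2 * s ** 2)).sum()
        return logp(self.mu[1], self.sigma[1]) - logp(self.mu[0], self.sigma[0])
```

After each frame, the compressed vectors of the new positive and negative samples around the chosen tracking frame are fed back through update().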
(II) method flow implementation
The intelligent adjustment processing video compressed sensing real-time tracking method mainly comprises two parts. One part is initialization: after the user selects the tracking target frame, the related data structures and parameters are initialized, the sampling sub-windows are generated and the classifier is constructed. The other part is tracking: the current position of the target is repeatedly judged and the classifier is updated. The specific algorithm flow chart is shown in figure 1.
The main functions include: 1) init(): the initialization function, composed of a series of sub-operations, mainly initializing data structures and parameters, generating the sampling sub-windows, collecting positive and negative samples around the target selection frame, and constructing the classifier; 2) HaarFeature(): generating the random sub-feature frames of the initial samples, namely m groups of sub-feature frames according to the set number m of repeated samplings per sample, and generating the sub-feature frames of the other pyramid levels according to the proportion between each image layer and the original image; 3) getNextFrame(): acquiring the next frame of the camera image; 4) sampleRect(a, b): generating sample frames of different sizes in the range a to b around the tracking frame, wherein the size ratio of the sample frames is determined by the ratio between pyramid levels, and positive and negative samples are distinguished by the distance ranges a and b from the sample frame to the tracking frame; 5) getFeatureValue(): calculating the sample value of each sample with the image integral graph; the calculated sample value is an m-dimensional vector, the value of each dimension being obtained by accumulating the grey values of the sub-frames randomly generated within the sample frame, and this m-dimensional vector is the compressed sensing sample of the original sample, a dimension-reduced representation of the image data in the sample frame; 6) classifierUpdate(): learning the input positive and negative samples to obtain the expectation and variance of the sample distribution on each dimension; new positive and negative samples are input to update the classifier, and a learning proportion weight determines the weights of the new and old samples during updating; 7) processFrame(): implementing the tracking process of the intelligently adjusted video compressed sensing real-time tracking method: calculating the characteristic values of the tracking frames at possible target positions, obtaining from the constructed classifier the candidate frame with the highest posterior probability as the tracking frame, and finally performing new positive and negative sampling and classifier updating around the tracking frame; 8) radioClassifier(): calculating the characteristic value of each input candidate frame of the current frame, and taking the candidate frame with the maximum classifier score as the actual position of the target in this frame.
Video compressed sensing real-time tracking method for two or more cameras
The target tracking necessarily considers the moving range of the target, and in the single-camera tracking system, the moving range of the target is limited to the visible range of a single camera, but the range is far from enough for practical application. The invention analyzes the defects of a single-camera tracking system, provides the necessity of designing a multi-camera tracking system, then contrasts and analyzes the difference between the two systems, the problems to be solved in the multi-camera tracking system and the corresponding methods, and finally analyzes and verifies the effectiveness of the tracking method under the multi-camera through simulation experiments.
(I) Framework of the multi-camera video compressed sensing real-time tracking method
The main difference between target tracking under multiple cameras and under a single camera is that the target shuttles among cameras, so sensing the target as it passes from one camera to another becomes the main problem a multi-camera tracking system must solve. The multi-camera tracking system is still based on the single-camera tracking system: while the target moves within one camera it is tracked with the single-target tracking method; when the target disappears from one camera and then appears in another, the system senses its appearance and judges, from the feature information left by the previous camera, whether the moving target is the one currently being tracked, so that tracking can continue. Therefore target detection, target matching and the transmission of the relevant feature data are the problems a multi-camera target tracking system must solve.
1. Network topology design of the cameras
During target tracking, if the target moves within only one camera, its appearance changes continuously and it can be tracked smoothly with the initial feature data. However, when the target disappears from one camera and then appears in another, its appearance and posture usually change greatly, mainly because the installation positions and shooting angles of the cameras differ considerably. If the current camera's monitoring system does not hold the feature information of the original target at that moment, it cannot judge whether the target appearing in its field of view is the one currently being tracked. A corresponding network therefore needs to be constructed so that the feature information of the tracked target can be transmitted among the cameras, ensuring timely matching and continuous tracking of the tracked target. Before a multi-camera target tracking system is constructed, a suitable network topology is selected from the following range: centralized network structure, distributed network structure, and cooperative network structure.
In the multi-camera tracking system, each computer controls several camera terminals, which corresponds to a centralized network structure. A larger monitoring system uses more camera terminals, and one computer is not sufficient to control all the devices; the cameras are therefore partitioned according to their geographical arrangement, the cameras in each area are handled by one computer, forming a distributed or cooperative network structure, and the TCP/IP protocol is used for communication between terminals. System data must be delivered reliably, and TCP/IP is adopted for network communication because TCP is more reliable for data transmission than UDP.
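The patent specifies TCP/IP for terminal-to-main-process communication but not a wire format. As a hedged sketch only, the snippet below shows a length-prefixed JSON framing layer that such a system might use, so that one logical message (e.g. tracking features or coordinates) can be recovered from the TCP byte stream; all function names are hypothetical.

```python
import json
import socket
import struct

def send_msg(sock, obj):
    """Send one JSON message with a 4-byte big-endian length prefix, so the
    receiver can recover message boundaries from the TCP byte stream."""
    payload = json.dumps(obj).encode("utf-8")
    sock.sendall(struct.pack(">I", len(payload)) + payload)

def recv_msg(sock):
    """Receive exactly one length-prefixed JSON message."""
    (length,) = struct.unpack(">I", _recv_exact(sock, 4))
    return json.loads(_recv_exact(sock, length).decode("utf-8"))

def _recv_exact(sock, n):
    """Read exactly n bytes; recv() may return fewer, so loop until done."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf += chunk
    return buf
```

A camera terminal could, for example, send {"cam": 3, "pos": [120, 45]} when it matches the target, and the main process would forward the feature data to the other terminals with the same framing.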
2. Video moving object detection
Before matching the tracking target, the questions of when to match and which region of the image to match must be addressed. A moving object is not present in a camera terminal at all times, so a target detection mechanism is constructed and target matching is performed only when a moving target appears. Moving target detection finds the moving target, which is then matched against the target being tracked to judge whether it is the tracking target.
The application scenario of the invention is mainly building corridors and residential streets, where the background is relatively fixed and the camera viewing angle is relatively fixed, so moving object detection is based on background subtraction, a motion detection method suitable only for a static camera. Because the scene shot by the camera is relatively fixed, the background is modeled; when a moving object appears it can be detected simply by differencing the current image with the modeled background, yielding a relatively complete contour of the object. To cope with gradual background changes, an intelligently adjusted background model is adopted, with the mathematical expression:
current background = c × previous frame image + (1 - c) × previous background
Here c is the background update parameter, whose value directly influences the update speed and quality of the background. Detecting the target by background subtraction consists of subtracting the constructed background model from the current image frame and then binarizing the difference image with a specified critical value to obtain the contour of the region where the moving target is located. The difference image obtained by directly subtracting the background from the foreground image is relatively noisy, and this noise affects the extraction of the tracked target's contour, so the difference image is thresholded; the chosen critical value is vital to the quality of the extracted contour and is adjusted according to the color difference between target and background. When the camera first senses a moving object, the target has not completely entered the shooting area and the acquired features are incomplete; once the target has fully entered, its contour information is extracted, and from the contour the rectangular region containing the moving object is obtained. However, the target in the rectangular frame is not necessarily the tracking target; the system must make a further matching judgment based on the target features passed on by other cameras.
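The background model and the differencing step above can be sketched directly from the update formula; this is a minimal numpy illustration (the update rate c and the critical value are illustrative, and morphological denoising and contour extraction are omitted):

```python
import numpy as np

def update_background(bg, frame, c=0.05):
    """current background = c * previous frame + (1 - c) * previous background."""
    return c * frame.astype(np.float64) + (1.0 - c) * bg

def motion_mask(bg, frame, critical=25):
    """Background subtraction: absolute difference with the modeled background,
    then binarization at a critical value; 1 marks moving-object pixels."""
    diff = np.abs(frame.astype(np.float64) - bg)
    return (diff > critical).astype(np.uint8)
```

In a real terminal these would run per input frame, with the mask cleaned up and its connected regions turned into candidate rectangles.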
3. Precise matching of moving objects
After a moving target is detected, it is matched to judge whether it is the target currently being tracked. Since the compressed features extracted in the video compressed sensing real-time tracking method are not invariant to illumination, angle or scale, they cannot be used to match the target. A feature with illumination, angle and scale invariance is needed to decide whether two detections are the same target, and its computation speed must meet the real-time requirement.
SURF feature points are local extrema of pixels found by constructing a multi-scale space pyramid; the image pyramid is constructed using an approximation of the Hessian matrix determinant. The Hessian matrix consists of a function and its partial derivatives, with the mathematical expression:
H(J(x, y)) = | ∂²J/∂x²   ∂²J/∂x∂y |
             | ∂²J/∂x∂y  ∂²J/∂y²  |
For a point J(x, y) on the original image, after Gaussian filtering it becomes:
K(x,r)=D(r)·J(x,r)
where D(r) is the Gaussian kernel, r is the Gaussian variance, and K(x, r) is the representation of the image at different resolutions. The Hessian approximation of the filtered image is calculated as:
det(Happrox) = GxxGyy - (0.9Gxy)²
After the image pyramid is obtained with the Hessian matrix, feature point detection compares the value of each pixel with the 26 points in its scale-space neighborhood; if the point is the maximum or minimum within the neighborhood, it is kept as a candidate feature point. Sub-pixel feature points are then obtained by three-dimensional linear interpolation, a critical value is applied to filter out the feature points with weak responses, and the remaining strong feature points are the ones used during matching.
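As a hedged illustration of the response map behind this detection step: real SURF obtains Gxx, Gyy and Gxy from box filters over the integral image and compares against 26 scale-space neighbors, whereas the sketch below uses plain finite differences and only the 3×3 spatial neighborhood, purely to show the det(Happrox) = GxxGyy - (0.9Gxy)² criterion.

```python
import numpy as np

def hessian_response(img, weight=0.9):
    """Per-pixel det(Happrox) = Gxx*Gyy - (weight*Gxy)^2. Real SURF computes
    the second derivatives with box filters; finite differences are used
    here purely for illustration."""
    gy, gx = np.gradient(img.astype(np.float64))
    gyy, _ = np.gradient(gy)
    gxy, gxx = np.gradient(gx)
    return gxx * gyy - (weight * gxy) ** 2

def local_maxima(resp, critical=0.0):
    """Keep points whose response exceeds the critical value and is the
    maximum of the 3x3 spatial neighborhood (the full SURF test also
    compares with the adjacent scales, 26 neighbors in total)."""
    peaks = []
    for y in range(1, resp.shape[0] - 1):
        for x in range(1, resp.shape[1] - 1):
            patch = resp[y - 1:y + 2, x - 1:x + 2]
            if resp[y, x] > critical and resp[y, x] == patch.max():
                peaks.append((x, y))
    return peaks
```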
After the feature points are determined, their principal directions are calculated. The principal direction of a SURF feature point is determined by the Haar wavelet responses in its neighborhood: the sums of the horizontal and vertical Haar wavelet responses of all points within a 60-degree sector of the neighborhood are counted, and the direction of the maximum sum is taken as the principal direction. A square frame is then taken around the feature point, with side length 20c, where c is the scale of the pyramid layer containing the feature point; the frame is divided equally into 4 × 4 sub-regions, the horizontal and vertical Haar wavelet features of each sub-region are calculated, and a 64-dimensional vector is obtained, which is the descriptor of the feature point in the SURF algorithm.
The SURF features of the initially selected tracking target are saved, and the SURF features of each detected moving target are extracted and matched against them. If the match succeeds, the target appearing in the other camera is confirmed to be the initial tracking target, which is resampled and tracked continuously with the improved video compressed sensing real-time tracking method; otherwise the moving target is discarded and the judgment of other moving targets continues.
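The matching decision can be illustrated with a generic nearest-neighbor ratio test over descriptor vectors (a common matching criterion, though the patent does not specify one); the descriptors below are arbitrary numpy arrays standing in for SURF output, and the match-count threshold is an assumption.

```python
import numpy as np

def match_descriptors(desc_a, desc_b, ratio=0.7):
    """Match each descriptor in desc_a to its nearest neighbor in desc_b,
    accepting only matches whose nearest distance is clearly smaller than
    the second-nearest (the classic ratio test)."""
    matches = []
    for i, d in enumerate(desc_a):
        dists = np.linalg.norm(desc_b - d, axis=1)
        nearest, second = np.argsort(dists)[:2]
        if dists[nearest] < ratio * dists[second]:
            matches.append((i, int(nearest)))
    return matches

def is_same_target(desc_tracked, desc_moving, min_matches=8):
    """Decide whether a detected moving object is the tracked target by
    counting accepted descriptor matches (threshold is illustrative)."""
    return len(match_descriptors(desc_tracked, desc_moving)) >= min_matches
```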
In this method, feature matching is performed only after a moving target has been detected through background modeling, instead of searching for the target on every camera in real time. Feature points need not be computed for the whole image, features are not repeatedly computed and matched on cameras without moving targets, and feature extraction and matching are performed only on a small region of interest, saving a large amount of computing resources and time and making the system run more stably.
After the tracking target is successfully matched, the target's current region is taken as the initial tracking frame, and the target is then tracked over the camera's input image sequence until it disappears from that camera; the tracking result is fed back to the central system, which outputs the current position of the target.
(II) Flow of the multi-camera video compressed sensing real-time tracking method
The invention constructs a main process as the central control system of the whole network, which bears the storage and forwarding of data and outputs the current position information of the tracking target, and constructs several terminal processes as the control processes of the camera devices, which bear the background modeling of the collected images, the detection and matching of moving targets, and the tracking of the target within the visible range. Network communication is established between the main process and the terminal processes to transmit the relevant feature parameters of the tracking target and its current position information. In the simulated multi-camera video compressed sensing real-time tracking system, the main process mainly receives and forwards messages over the network links, manages the camera terminals uniformly, receives the target tracking data and results sent by each camera terminal process, and forwards the data to the other camera devices for detecting target motion, so that the target can be tracked continuously wherever it appears in the system. Each camera terminal process constructs and updates its own intelligently adjusted background model and detects moving targets within its monitoring range; the camera that detects target motion exchanges the relevant tracking data with the main process to judge whether the moving target is the target to be tracked, and once the target is confirmed, it is tracked continuously and the tracking information is sent to the main process, where the short-term target tracking method is the intelligently adjusted video compressed sensing real-time tracking method.
The multi-camera video compressed sensing real-time tracking method comprises a main process flow and a camera terminal flow.
1. The main process flow is shown in fig. 2. The main functions of the main process flow include:
1) Init(): the program initialization function, which mainly initializes the storage variables related to the tracking target and the socket variables;
2) listen() and accept(): the related server-side functions of socket programming: socket() constructs a socket, bind() binds the monitored port, listen() listens for link requests, accept() accepts a link request, and the network link is constructed;
3) CreateThread(): constructs a new thread; the main process hands each newly constructed socket link to a thread, and each thread bears and processes the messages sent by one camera terminal process;
4) CheckMsg(): encapsulates the recv() function and checks whether there is a new message;
5) Handle(): processes the messages sent by the terminal system processes, receives and stores the tracking target feature information they send, and controls the detection state of each terminal system according to the position and coordinates of the current camera;
6) SendMsg(): sends the result information processed by Handle() to each terminal system process.
2. The camera terminal flow is shown in fig. 3. A terminal system process mainly comprises four parts: a background modeling module, a moving target detection and matching module, a target tracking module, and a communication module. When the process starts, the current network condition and camera state are checked first to confirm that the main process can be connected and the camera works normally; if a problem occurs, the process exits directly. After confirming that there is no error, the background modeling module is started, a background model is constructed from the image sequence input by the camera, and the process waits for messages from the main process. If the main process sends the features of a tracking target, the system enters the motion detection module to check whether a moving target is present in the current foreground image; if so, the moving target is matched against the SURF features of the tracking target sent by the main process. A counter is set so that target detection and matching are repeated until the number of detections reaches the set maximum, after which the detection and matching module is exited and background updating continues. Messages from the main process are handled while detection is in progress: if the main process reports that the target has been detected in another camera, the process likewise switches back to the background modeling module; if the tracking target is matched successfully, a message is sent to the main system and the target tracking module is entered to track the target until it is lost.
The main functions of the camera terminal flow include:
1) UpdateBkMat(): updates the foreground image into the background model according to the set update proportion;
2) CheckMoving(): searches for moving targets in the foreground image; the foreground image is differenced with the constructed background image, and the difference image is thresholded and denoised to obtain several moving target regions; regions with small areas are discarded, and the remaining regions are taken as the current positions of moving targets in the camera;
3) CheckTarget(): judges whether a moving target in the foreground is the tracking target set by the current system; the moving targets detected in the foreground image are matched against the SURF features of the tracking target sent by the main process to judge whether any of them is the target currently tracked by the system;
4) CheckMsg(): processes the messages sent by the main system, including receiving the SURF features of the tracking target sent by the main process, waking up the detection module of each terminal system on a target-lost message, and deciding whether the terminal system should continue detecting moving targets in the foreground image on a target-appeared message;
5) SendMsg(): sends messages to the main system process to report whether this terminal has detected the tracking target and the current coordinates of the tracked target;
6) CSTRrack(): tracks the target within the current single-camera system; this function takes the target position successfully matched by CheckTarget() as the tracking frame and performs short-term tracking with the intelligently adjusted video compressed sensing real-time tracking method.
The invention discloses a video compressed sensing real-time tracking method. Based on the compressed sensing theory, it analyzes the target modeling mode of the video compressed sensing real-time tracking method and its problems, chiefly that the tracking frame cannot be adjusted intelligently, and proposes an intelligently adjusted video compressed sensing real-time tracking method that solves the key technology of intelligently adjusting the size of the tracking frame, comprising: the image pyramid, multi-scale image sampling and learning, the selection of the target matching frame, the flow and steps of the improved method, and a simulation experiment whose results demonstrate the effectiveness of the intelligently adjusted video compressed sensing real-time tracking method. To meet the new requirements of video target tracking, target tracking under a single camera is extended to multiple cameras: on the basis of the intelligently adjusted video compressed sensing real-time tracking method, combined with the technical keys of moving target detection and network topology selection, a multi-camera video compressed sensing real-time tracking system is proposed, its detailed design scheme and the flow and steps of each part are given, and a simulation experiment shows that the multi-camera video compressed sensing real-time tracking method is practical, feasible, accurate and efficient.

Claims (10)

1. The multi-camera video target tracking method under the complex environment is characterized in that the method is improved on the basis of a video compressed sensing real-time tracking method: an intelligently adjusted video compressed sensing real-time tracking method is provided, an image pyramid of the tracked target is constructed, targets at multiple scales are sampled and machine learning is carried out, and intelligent adjustment of the size of the tracking target frame is realized by constructing multi-scale search frames; on the basis of the intelligently adjusted video compressed sensing real-time tracking method, a multi-camera video compressed sensing real-time tracking system is provided, and the multi-camera target tracking system is completed by introducing a network communication module, a moving target detection module and a matching module on the basis of the single-camera tracking system;
the architecture of the intelligently adjusted video compressed sensing method comprises: the target image pyramid, target sampling and learning based on the image pyramid, and video target detection; the framework of the multi-camera video compressed sensing real-time tracking method comprises: the network topology design of the cameras, video moving target detection, and accurate matching of moving targets;
the multi-camera video compressed sensing real-time tracking method comprises the following steps: a main process is constructed as the central control system of the whole network, bearing the storage and forwarding of data and outputting the current position information of the tracking target, and several terminal processes are constructed as the control processes of the camera devices, bearing the background modeling of the acquired images, the detection and matching of moving targets, and the tracking of the target within the visible range; network communication is established between the main process and the terminal processes to transmit the relevant feature parameters of the tracked target and its current position information, and a multi-camera video compressed sensing real-time tracking system is simulated, in which the main process mainly receives and forwards messages over the network links, manages the camera terminals uniformly, receives the target tracking data and results sent by each camera terminal process, and forwards the data to the other camera devices for detecting target motion, so that the target can be tracked continuously wherever it appears in the system; each camera terminal process constructs and updates its own intelligently adjusted background model and detects moving targets within its monitoring range; the camera that detects target motion exchanges the relevant tracking data with the main process to judge whether the moving target is the target to be tracked, and once the target is confirmed, it is tracked continuously and the tracking information is sent to the main process, where the short-term target tracking method is the intelligently adjusted video compressed sensing real-time tracking method.
2. The multi-camera video target tracking method in the complex environment according to claim 1, wherein the target image pyramid: the possible changes of the target are added into the samples for classification, and the size of the target frame is adjusted when the size of the target changes remarkably; the size change of the target is predicted by constructing an image pyramid and learning the corresponding positive and negative samples of the target; the method filters the original image by bilinear interpolation and constructs an image pyramid whose variation amplitude between levels is small in order to predict target changes;
the bilinear interpolation filter is: the color value of an image pixel is obtained by weighting the 4 nearest neighboring points; let (x, y) be the coordinates of a point in the enlarged image and u(x, y) its gray value; for bilinear interpolation the assigned gray value is expressed as:
u(x,y)=ex+fy+sxy+g
the 4 coefficients are determined from the 4 equations written with the 4 nearest points of (x, y); an image pyramid is constructed by bilinear interpolation filtering, and the variation amplitude between pyramid levels is freely adjusted so as to fit the variation amplitude of the tracking target.
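The bilinear interpolation u(x, y) = ex + fy + sxy + g of this claim can be sketched as follows: weighting the 4 nearest pixels is algebraically equivalent to solving for the four coefficients from the four neighbors. The resizing routine is a deliberately simple, unoptimized illustration of building one pyramid level.

```python
import numpy as np

def bilinear(img, x, y):
    """Gray value at non-integer (x, y) as the weighted mean of the 4 nearest
    pixels; equivalent to fitting u(x, y) = e*x + f*y + s*x*y + g locally."""
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1 = min(x0 + 1, img.shape[1] - 1)
    y1 = min(y0 + 1, img.shape[0] - 1)
    a, b = x - x0, y - y0
    return ((1 - a) * (1 - b) * img[y0, x0] + a * (1 - b) * img[y0, x1]
            + (1 - a) * b * img[y1, x0] + a * b * img[y1, x1])

def resize_bilinear(img, scale):
    """Build one pyramid level by bilinear resampling of the original image."""
    h = max(1, int(round(img.shape[0] * scale)))
    w = max(1, int(round(img.shape[1] * scale)))
    ys = np.linspace(0, img.shape[0] - 1, h)
    xs = np.linspace(0, img.shape[1] - 1, w)
    out = np.empty((h, w))
    for i, yy in enumerate(ys):
        for j, xx in enumerate(xs):
            out[i, j] = bilinear(img, xx, yy)
    return out
```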
3. The multi-camera video target tracking method in a complex environment according to claim 1, wherein the target sampling and learning based on the image pyramid is as follows: after the image pyramid is constructed, targets are sampled at each scale; the sampling process is based on the compressed sensing theory and is extended to the whole image pyramid;
firstly, the original target image is sampled; the samples are divided into positive and negative samples, sample frames within radius c of the original tracking frame being taken as positive samples and sample frames between radius c and radius b being taken as negative samples; the compression features of the positive and negative samples are collected as the sample representations used for machine learning, and each compression feature is obtained by accumulating the gray values of sub-frames of random position, size and number generated within a single sample frame;
the video compressed sensing real-time tracking method uses a very sparse random measurement matrix to describe the original image, and the specific sampling process is as follows: firstly, 2 to 3 small sampling frames are generated in the sample frame, their sizes and positions being random; the gray values within these random sampling frames are then accumulated to obtain one feature descriptor of the sample; this sampling process is repeated m times for the target, and the m-dimensional feature vector formed by the m feature descriptors is the compressed representation of the original target;
the invention not only obtains the random sampling matrix of the original image but also expands it: the image is sampled at multiple scales and an expanded random sampling matrix is constructed which keeps the correlation between the original sampling values and the sampling values of the other levels of the image pyramid; the size of the sampling frame at another level is determined by the size proportion between that layer and the original image layer, defined as:
(w_{i+m}, h_{i+m}) = (w_i, h_i) × (size of D_{i+m}) / (size of D_i)
where D_i denotes the original image layer and D_{i+m} the (i+m)-th layer image, w and h are the sizes of the sampling frames, a positive m denotes an up-sampling layer and a negative m a down-sampling layer; the accumulated gray value of a sampling frame is calculated from the image integral graph, which is the accumulation of the image gray values: the value of each pixel equals the sum of all pixel values before that point, expressed by the formula:
J(x,y)=sum(D(i,j)),0≤i≤x,0≤j≤y
where J denotes the integral image and D the original image; the sum of the pixel values of a rectangular frame is obtained rapidly from the integral image: for the rectangular frame formed by the points E(x, y), F(x, y'), S(x', y) and G(x', y'), the sum of pixel values is expressed with the integral image as:
sum(G) = JE(x, y) + JG(x', y') - JF(x, y') - JS(x', y)
and respectively calculating an integral image for each layer of image of the image pyramid and carrying out corresponding sampling, classifying the acquired compression features after the target image under each scale is sampled, and detecting the target of the next frame after the classification by adopting a naive Bayes classifier.
4. The multi-camera video target tracking method in the complex environment according to claim 1, wherein the video target detection: after the positive and negative sampling of the original target and the construction of the classifier are completed, the next image frame is extracted for target detection; the target detection process is a process of screening and classifying candidate frames: the possible positions of the target are found and sampled, the samples are substituted into the classifier for calculation, and the current position of the target is then determined;
based on the characteristic that the motion trajectory of the object is continuous, the probability that the object appears near its previous-frame position is high, so candidate frames need to be generated only around the target's previous-frame position; the generation range is determined by a parameter t, and the candidate set is defined as:
Cyc(x, y) = {(x, y) : ||(x, y) - (x0, y0)|| < t}
where (x0, y0) is the target position calculated in the previous frame; the value of t determines the maximum range of target movement that can be detected: a larger t is set when the target moves quickly and a smaller t when it moves slowly; the size of the candidate frames is the key to detecting changes in the target's size, and the number of candidate frames generated at different sizes determines the detectable actual size of the target; the sizes and number of the candidate frames follow the same proportions and positions as when the image pyramid was constructed during sampling, which guarantees the consistency of sampling and detection;
after the candidate frames are generated, they are compressively sampled; the sampled feature vectors are substituted into the naive Bayes classifier and the classification score of each feature is computed; the highest-scoring vector maximizes the posterior probability, and the candidate frame corresponding to that feature vector is taken as the real position of the target in the current frame; positive and negative sample features are then collected around the current target to update the classifier for the tracking classification of the next frame;
the process from constructing the target image pyramid, through the positive and negative sampling and classification of the target, to finally obtaining the real position of the target is the basic tracking flow of the intelligently adjusted video compressed sensing method; repeating this process until exit constitutes the complete target tracking process.
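The candidate-frame generation of this claim can be sketched directly from the definition above: positions within distance t of the previous-frame position, replicated at a few sizes mirroring the pyramid level ratios. The scale set and step are illustrative assumptions.

```python
import numpy as np

def candidate_boxes(x0, y0, w, h, t, scales=(0.95, 1.0, 1.05), step=2):
    """Candidate frames (x, y, w, h) whose centers lie within distance t of
    the previous target position (x0, y0), at a few candidate sizes."""
    boxes = []
    for s in scales:
        sw, sh = int(round(w * s)), int(round(h * s))
        for dy in range(-t, t + 1, step):
            for dx in range(-t, t + 1, step):
                if np.hypot(dx, dy) < t:        # ||(x, y) - (x0, y0)|| < t
                    boxes.append((x0 + dx, y0 + dy, sw, sh))
    return boxes
```

A faster target would call for a larger t (more positions to score), a slower one for a smaller t, exactly as the claim describes.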
5. The multi-camera video target tracking method under the complex environment according to claim 1, characterized in that the intelligently adjusted video compressed sensing real-time tracking method mainly comprises two parts: an initialization part, which initializes the related data structures and parameters, generates the sampling sub-windows and constructs the classifier after the user selects the tracking target frame, and a tracking part, which repeatedly judges the current position of the target and updates the classifier;
the main functions include:
1) init(): the initialization function, composed of a series of sub-operations; it mainly initializes the data structures and parameters, generates the sampling-frame sub-windows, collects positive and negative samples around the target selection frame, and constructs the classifier;
2) HarrFeatur(): generates the random sub-feature frames of the initial samples; according to the set number m of repeated samplings per sample, m groups of sub-feature frames are generated, and the sub-feature frames of the other pyramid levels are generated according to the scale ratio between each pyramid layer and the original image;
3) getNextFrame(): acquires the next image frame from the camera;
4) sampleRect(a, b): generates sample frames of different sizes within the range a to b around the tracking frame; the size ratio of the sample frames is determined by the ratio between pyramid levels, and whether a sample is positive or negative is determined by the distance range a, b from the sample frame to the tracking frame;
5) getFeatureValue(): computes the feature value of each sample using the image integral map; the computed value is an m-dimensional vector, each dimension being the accumulated grey-level feature values of the random sub-frames generated inside the sample frame; this m-dimensional vector is a compressed-sensing sampling of the original sample, i.e. a dimension-reduced representation of the image data in the sample frame;
6) classierUpdate(): performs classification learning on the input positive and negative samples to obtain the expectation and variance of the sample distribution in each dimension; new positive and negative samples are input to update the classifier, and a learning-rate weight determines the relative weights of the old and new samples during the update;
7) processFrame(): implements the tracking flow of the intelligently adjusted video compressed-sensing real-time tracking method: it computes the feature values of the tracking frames at the possible target positions, uses the constructed classifier to take the candidate frame with the highest posterior probability as the new tracking frame, and finally resamples positive and negative samples around the tracking frame and updates the classifier;
8) radioClassifier(): computes, from the input candidate frames, the feature value of the frame where the possible target of the current frame is located, selects the candidate frame with the maximum classifier score, and takes it as the actual target position in this frame.
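The loop sketched by these functions — rectangle-sum features computed from an integral image, compressed into an m-dimensional vector, and classified by a naive Bayes model whose per-dimension mean and variance are blended with a learning-rate weight — can be illustrated as follows. This is a minimal sketch, not the patent's implementation; all names (`integral_image`, `NaiveBayesClassifier`, the old-sample weight `lam`) are chosen here for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def integral_image(img):
    # Summed-area table: any rectangle sum becomes an O(1) lookup.
    return img.cumsum(0).cumsum(1)

def rect_sum(ii, y0, x0, y1, x1):
    # Inclusive rectangle sum [y0..y1, x0..x1] from the integral image.
    total = ii[y1, x1]
    if y0 > 0: total -= ii[y0 - 1, x1]
    if x0 > 0: total -= ii[y1, x0 - 1]
    if y0 > 0 and x0 > 0: total += ii[y0 - 1, x0 - 1]
    return total

def make_feature_rects(m, k, box_h, box_w):
    # m feature dimensions, each the sum of k random sub-rectangles
    # inside the sample box (the random sub-feature frames).
    feats = []
    for _ in range(m):
        rects = []
        for _ in range(k):
            y0 = rng.integers(0, box_h - 2); x0 = rng.integers(0, box_w - 2)
            y1 = rng.integers(y0 + 1, box_h); x1 = rng.integers(x0 + 1, box_w)
            rects.append((y0, x0, y1, x1))
        feats.append(rects)
    return feats

def feature_vector(ii, top, left, feats):
    # Compressed m-dimensional representation of the box at (top, left).
    v = np.empty(len(feats))
    for i, rects in enumerate(feats):
        v[i] = sum(rect_sum(ii, top + y0, left + x0, top + y1, left + x1)
                   for (y0, x0, y1, x1) in rects)
    return v

class NaiveBayesClassifier:
    # Gaussian class-conditional model per dimension; the learning-rate
    # weight lam controls how much the old statistics are retained.
    def __init__(self, m, lam=0.85):
        self.mu = np.zeros((2, m)); self.sig = np.ones((2, m)); self.lam = lam
    def update(self, X, label):
        mu_new, sig_new = X.mean(0), X.std(0) + 1e-6
        l = self.lam
        self.sig[label] = np.sqrt(l * self.sig[label] ** 2
                                  + (1 - l) * sig_new ** 2
                                  + l * (1 - l) * (self.mu[label] - mu_new) ** 2)
        self.mu[label] = l * self.mu[label] + (1 - l) * mu_new
    def score(self, v):
        # Log-likelihood ratio of the positive vs negative class.
        logp = lambda c: (-0.5 * ((v - self.mu[c]) / self.sig[c]) ** 2
                          - np.log(self.sig[c])).sum()
        return logp(1) - logp(0)
```

Tracking a frame then amounts to evaluating `score` over candidate boxes near the previous position and keeping the maximum, followed by a fresh `update` with positive samples near the winner and negative samples farther away.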
6. The multi-camera video target tracking method in a complex environment according to claim 1, wherein the network topology of the cameras is designed as follows: during target tracking, if the target moves within the view of a single camera, its appearance changes continuously and it can be tracked smoothly using the initial feature data; however, when the target disappears from one camera and reappears in another, its appearance and pose generally change greatly, so a corresponding network is constructed through which the feature information of the tracked target can be transmitted among the cameras, ensuring timely matching and continuous tracking of the target;
each computer of the multi-camera tracking system controls several camera terminals, which amounts to a centralized network structure; a larger monitoring system partitions the cameras according to their geographical arrangement, with the cameras in each area handled by one computer, forming a distributed, cooperative network structure, and the terminals communicate with each other over TCP/IP.
7. The multi-camera video target tracking method in a complex environment according to claim 1, wherein the video moving-target detection comprises: constructing a target detection mechanism so that target matching is only needed when a moving target appears; when a moving target is detected, it is matched against the target currently being tracked to judge whether it is the tracked target;
the moving-target detection is based on background subtraction: the background is modelled, and when a moving object appears it can be detected simply by differencing the current image with the modelled background, yielding a relatively complete contour of the target; to cope with gradual background changes, an intelligently adjusted background model is adopted, whose mathematical expression is:
current background = c × previous frame image + (1 − c) × previous background
wherein c is the background update parameter, whose value directly affects the update speed and quality of the background. Detecting the target by background subtraction consists of subtracting the constructed background model from the current image frame and then binarizing the difference image with a specified threshold to obtain the contour of the region where the moving target is located. The setting of this threshold is crucial to the quality of the extracted contour and is adjusted according to the colour difference between the target and the background. When a moving object is first sensed by a camera it has not yet completely entered the shooting area, so the acquired target features are incomplete; once the target has completely entered the shooting area its contour information is extracted, and the rectangular region containing the moving object is obtained. The target inside this rectangle, however, is not necessarily the tracked target, and the system must perform further matching judgment using the target features handed over by the other cameras.
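The running-average background model and the thresholded difference described above admit a very small sketch; the function names, toy image, and threshold value are illustrative choices, not part of the claim:

```python
import numpy as np

def update_background(background, frame, c=0.05):
    # Running-average model: current background = c*frame + (1-c)*previous background.
    return c * frame + (1 - c) * background

def detect_foreground(background, frame, threshold=30):
    # Background subtraction: binarize the absolute difference with a threshold
    # to obtain the contour region of the moving target.
    diff = np.abs(frame.astype(np.int16) - background.astype(np.int16))
    return (diff > threshold).astype(np.uint8)

# Toy sequence: a static grey scene, then a bright object enters.
bg = np.full((8, 8), 50.0)
frame = np.full((8, 8), 50.0)
frame[2:5, 2:5] = 200          # the moving object
mask = detect_foreground(bg, frame)
```

A small c makes the background absorb slow lighting changes while leaving fast-moving objects in the foreground mask.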
8. The multi-camera video target tracking method in a complex environment according to claim 1, wherein the moving target is precisely matched as follows: after a moving target is detected, it is matched to judge whether it is the target currently being tracked; a feature with illumination, angle and scale invariance is sought to decide whether it is the same target, with an operation speed that meets the requirement of real-time computation;
SURF feature points are found by constructing a multi-scale pyramid and searching for local extrema of the pixels, the image pyramid being built with an approximate Hessian matrix determinant; the Hessian matrix consists of the function and its partial derivatives, with the mathematical expression:
H(x, r) = | Gxx(x, r)  Gxy(x, r) |
          | Gxy(x, r)  Gyy(x, r) |
For a point J(x, y) on the original image, after Gaussian filtering it becomes:
K(x,r)=D(r)·J(x,r)
wherein D(r) is the Gaussian kernel, r is the Gaussian variance, and K(x, r) is the representation of the image at different resolutions; the Hessian approximation of the filtered image is calculated as:
det(H_approx) = Gxx·Gyy − (0.9·Gxy)²
After the image pyramid is obtained with the Hessian matrix, the subsequent feature-point detection compares each pixel with the 26 points in its 3 × 3 × 3 neighbourhood across the adjacent scales; if the point is the maximum or minimum within this neighbourhood it is kept as a candidate feature point; sub-pixel feature points are then obtained by three-dimensional linear interpolation, a threshold is used to filter out the feature points with weak responses, and the remaining strong feature points are the ones required for matching.
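Assuming Gaussian second derivatives as a stand-in for SURF's box filters, the approximate Hessian determinant and the 26-neighbour extremum test can be sketched as follows (the scale set and strength threshold are arbitrary choices for the sketch, and sub-pixel interpolation is omitted; border scale layers are included in the window for simplicity):

```python
import numpy as np
from scipy.ndimage import gaussian_filter, maximum_filter

def hessian_response(img, sigma):
    # Approximate Hessian determinant: det(H) = Gxx*Gyy - (0.9*Gxy)^2,
    # with second-order Gaussian derivatives standing in for box filters.
    Gxx = gaussian_filter(img, sigma, order=(0, 2))
    Gyy = gaussian_filter(img, sigma, order=(2, 0))
    Gxy = gaussian_filter(img, sigma, order=(1, 1))
    return Gxx * Gyy - (0.9 * Gxy) ** 2

def detect_keypoints(img, sigmas=(1.2, 1.6, 2.0), rel_thresh=0.5):
    # Stack responses over scales; a candidate must equal the maximum of
    # its 3x3x3 (scale, y, x) window, i.e. be at least as large as all 26
    # neighbours, and pass a relative strength threshold.
    stack = np.stack([hessian_response(img, s) for s in sigmas])
    is_max = maximum_filter(stack, size=3) == stack
    strong = stack > rel_thresh * stack.max()
    _, ys, xs = np.nonzero(is_max & strong)
    return list(zip(ys, xs))
```

On a single blurred blob this detector fires at the blob centre; in a full SURF pipeline the surviving extrema would then be refined to sub-pixel precision and described.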
9. The multi-camera video target tracking method in a complex environment according to claim 8, wherein after the feature points are determined their main directions are computed; the main direction of a SURF feature point is determined by the Haar wavelet features in its neighbourhood: the sums of the horizontal and vertical Haar wavelet responses of all points within a 60-degree sector of the neighbourhood are counted, and the direction of the maximum sum is taken as the main direction of the feature point; a square frame of side length 20c is then taken around the feature point, c being the scale of the pyramid layer in which the feature point lies, and the frame is divided equally into 4 × 4 sub-regions; the Haar wavelet features in the horizontal and vertical directions are computed for each sub-region, yielding a 64-dimensional vector, which is the descriptor of the feature point in the SURF algorithm;
for the initially selected tracking target its SURF features are stored; the SURF features of a detected moving target are extracted and matched against them; if the matching succeeds, the target appearing in the other camera is the initially selected tracking target, which is resampled and continuously tracked with the intelligently adjusted video compressed-sensing method; if the matching fails, the moving target is discarded and the other moving targets are judged in turn;
after the tracking target is matched successfully, its current region is used as the initial tracking frame, and the target is then tracked in the input image sequence of that camera until it disappears from the camera; the tracking result is fed back to the central system, which outputs the current position of the target.
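Once the 64-dimensional descriptors are extracted, the matching decision sketched above can be illustrated as nearest-neighbour search with a distance-ratio test; this is a generic matching rule for illustration, not necessarily the patent's exact criterion:

```python
import numpy as np

def match_descriptors(desc_a, desc_b, ratio=0.7):
    # Nearest-neighbour matching with a distance-ratio test: accept a match
    # only when the best candidate in desc_b is clearly closer than the
    # second best, which suppresses ambiguous correspondences.
    matches = []
    for i, d in enumerate(desc_a):
        dists = np.linalg.norm(desc_b - d, axis=1)
        order = np.argsort(dists)
        best, second = order[0], order[1]
        if dists[best] < ratio * dists[second]:
            matches.append((i, best))
    return matches
```

A detected moving target would then be accepted as the tracked target when the number of surviving matches against the stored descriptors exceeds a chosen count.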
10. The multi-camera video target tracking method in a complex environment according to claim 1, wherein in the flow of the multi-camera video compressed-sensing real-time tracking method the main functions of the main process flow comprise:
1) init(): the program initialization function, mainly initializing the storage variables related to the tracking target and the socket variables;
2) listen() and accept(): the server side uses the related socket-programming functions: socket() constructs the socket, bind() binds the monitored port, listen() listens for link requests, and accept() receives a link request and establishes the network link;
3) CreateThread(): constructs a new thread; the main process hands each newly accepted socket link to such a thread, and each thread receives and processes the messages sent by one camera-terminal process;
4) CheckMsg(): encapsulates the recv() function and checks whether a new message has arrived;
5) Handle(): processes the information sent by a terminal-system process, receiving and storing the tracking-target feature information sent by the terminal system, and controls the detection state of each terminal system according to the position and coordinates of the current camera;
6) SendMsg(): sends the result information produced by Handle() to each terminal-system process.
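The server-side flow of this claim (socket/bind/listen/accept, one thread per terminal link, message handling and reply) can be sketched with Python's standard socket and threading modules; all names besides the claim's function names, the ack format, and the message content are hypothetical:

```python
import socket
import threading

def handle(conn, addr, store, lock):
    # Per-terminal thread: receive feature messages, store them, echo an
    # acknowledgement (stands in for CheckMsg() + Handle() + SendMsg()).
    with conn:
        while True:
            data = conn.recv(4096)          # poll for a new message
            if not data:
                break                       # terminal closed the link
            with lock:
                store.append((addr, data.decode()))
            conn.sendall(b"ACK " + data)    # reply to the terminal process

def serve(host="127.0.0.1", port=0, store=None):
    # init() + socket()/bind()/listen(): build the listening socket.
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind((host, port))
    srv.listen()
    lock = threading.Lock()

    def loop():
        while True:
            try:
                conn, addr = srv.accept()   # accept(): take a terminal link
            except OSError:
                return                      # listening socket closed
            # CreateThread(): one thread per camera-terminal connection.
            threading.Thread(target=handle, args=(conn, addr, store, lock),
                             daemon=True).start()

    threading.Thread(target=loop, daemon=True).start()
    return srv
```

A terminal process would connect with an ordinary TCP client, send its tracking-target feature message, and read back the acknowledgement.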
CN202011536697.6A 2020-12-22 2020-12-22 Multi-camera video target tracking method in complex environment Pending CN112616023A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011536697.6A CN112616023A (en) 2020-12-22 2020-12-22 Multi-camera video target tracking method in complex environment


Publications (1)

Publication Number Publication Date
CN112616023A (en) 2021-04-06

Family

ID=75244617

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011536697.6A Pending CN112616023A (en) 2020-12-22 2020-12-22 Multi-camera video target tracking method in complex environment

Country Status (1)

Country Link
CN (1) CN112616023A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114900672A (en) * 2022-07-14 2022-08-12 杭州舜立光电科技有限公司 Zooming tracking method
CN115294534A (en) * 2022-10-10 2022-11-04 广东电网有限责任公司中山供电局 Multi-target detection and tracking device based on field operation surveillance video
CN116112780A (en) * 2022-05-25 2023-05-12 荣耀终端有限公司 Video recording method and related device
CN117354468A (en) * 2023-12-04 2024-01-05 南京海汇装备科技有限公司 Intelligent state sensing system and method based on big data

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102663743A (en) * 2012-03-23 2012-09-12 西安电子科技大学 Multi-camera cooperative character tracking method in complex scene
CN103400371A (en) * 2013-07-09 2013-11-20 河海大学 Multi-camera synergistic monitoring equipment and method
CN103632382A (en) * 2013-12-19 2014-03-12 中国矿业大学(北京) Compressive sensing-based real-time multi-scale target tracking method
US20150358538A1 (en) * 2014-06-09 2015-12-10 Arecont Vision, Llc Omnidirectional User Configurable Multi-Camera Housing
CN105894020A (en) * 2016-03-30 2016-08-24 重庆大学 Specific target candidate box generating method based on gauss model
CN107567632A (en) * 2015-05-12 2018-01-09 高通股份有限公司 Critical point detection with trackability measurement result
CN108133487A (en) * 2017-12-04 2018-06-08 王连圭 The trans-regional single human body attitude target detection extracting method of video
CN108876816A (en) * 2018-05-31 2018-11-23 西安电子科技大学 Method for tracking target based on adaptive targets response
CN109033963A (en) * 2018-06-22 2018-12-18 王连圭 The trans-regional human motion posture target identification method of multiple-camera video
CN109410278A (en) * 2017-08-15 2019-03-01 杭州海康威视数字技术股份有限公司 A kind of object localization method, apparatus and system
CN109902631A (en) * 2019-03-01 2019-06-18 北京视甄智能科技有限公司 A kind of fast face detecting method based on image pyramid
CN111062971A (en) * 2019-12-13 2020-04-24 深圳龙岗智能视听研究院 Cross-camera mud head vehicle tracking method based on deep learning multi-mode



Similar Documents

Publication Publication Date Title
CN109344725B (en) Multi-pedestrian online tracking method based on space-time attention mechanism
CN109800689B (en) Target tracking method based on space-time feature fusion learning
CN112616023A (en) Multi-camera video target tracking method in complex environment
CN111274916B (en) Face recognition method and face recognition device
Rodríguez et al. People detection and stereoscopic analysis using MAS
CN105069472B (en) A kind of vehicle checking method adaptive based on convolutional neural networks
Rodríguez et al. Agents and computer vision for processing stereoscopic images
CN112785628B (en) Track prediction method and system based on panoramic view angle detection tracking
CN116343330A (en) Abnormal behavior identification method for infrared-visible light image fusion
CN112926522B (en) Behavior recognition method based on skeleton gesture and space-time diagram convolution network
CN112036381B (en) Visual tracking method, video monitoring method and terminal equipment
CN111626152B (en) Space-time line-of-sight direction estimation prototype design method based on Few-shot
Cao et al. Learning spatial-temporal representation for smoke vehicle detection
Kadim et al. Deep-learning based single object tracker for night surveillance.
CN112733584A (en) Intelligent alarm method and device for communication optical cable
CN106372650B (en) A kind of compression tracking based on motion prediction
Pajares et al. Fuzzy cognitive maps applied to computer vision tasks
CN114663835A (en) Pedestrian tracking method, system, equipment and storage medium
ELBAŞI et al. Control charts approach for scenario recognition in video sequences
Itano et al. Human actions recognition in video scenes from multiple camera viewpoints
Qu et al. An intelligent vehicle image segmentation and quality assessment model
Benhamida et al. Theater Aid System for the Visually Impaired Through Transfer Learning of Spatio-Temporal Graph Convolution Networks
Yin et al. Flue gas layer feature segmentation based on multi-channel pixel adaptive
Lee et al. Reet: Region-enhanced transformer for person re-identification
Kadim et al. Training configuration analysis of a convolutional neural network object tracker for night surveillance application

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination