CN112307897A - Pet tracking method based on local feature recognition and adjacent frame matching in community monitoring scene

Pet tracking method based on local feature recognition and adjacent frame matching in community monitoring scene

Info

Publication number
CN112307897A
CN112307897A
Authority
CN
China
Prior art keywords
frame
pet
matching
tracking
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202011029862.9A
Other languages
Chinese (zh)
Inventor
孙浩云
张卫山
尹广楹
张大千
徐亮
管洪清
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qingdao Sui Zhi Information Technologies Co ltd
Original Assignee
Qingdao Sui Zhi Information Technologies Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qingdao Sui Zhi Information Technologies Co ltd filed Critical Qingdao Sui Zhi Information Technologies Co ltd
Priority to CN202011029862.9A
Publication of CN112307897A
Legal status: Withdrawn

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/20 Movements or behaviour, e.g. gesture recognition
    • G06V 40/23 Recognition of whole body movements, e.g. for sport training
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/40 Scenes; Scene-specific elements in video content
    • G06V 20/41 Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G06V 20/42 Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items of sport video content
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/40 Scenes; Scene-specific elements in video content
    • G06V 20/46 Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/50 Context or environment of the image
    • G06V 20/52 Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V 2201/07 Target detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Software Systems (AREA)
  • Evolutionary Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to the technical fields of image retrieval, artificial intelligence and deep learning, and discloses a pet tracking method based on local feature recognition and adjacent frame matching in a community monitoring scene. The method first builds a multi-layer Adaboost (ML-Adaboost) classifier from labeled community monitoring images, then uses it to identify each pet individual and the corresponding head, trunk and limb parts in every frame. The same targets are matched between adjacent frames, and Kalman filtering predicts the positions of features whose matching failed, so that the tracking algorithm can associate pet trajectories across consecutive frames. The invention can still track a pet individual in its approximate direction when some of its features are occluded, and its frame-by-frame matching is faster and more accurate than matching by texture-feature similarity.

Description

Pet tracking method based on local feature recognition and adjacent frame matching in community monitoring scene
Technical Field
The invention relates to the technical fields of image retrieval, artificial intelligence and deep learning, and in particular to a pet tracking method based on local feature recognition and adjacent frame matching in a community monitoring scene.
Background
Target tracking is the follow-up stage of target detection: its main purpose is to identify a specific individual across multiple consecutive images and follow its motion trajectory. Target tracking is now widely used across industries, covering not only pedestrians but also other objects such as vehicles. For a tracking technique the type of the tracked target is fairly generic; accuracy and efficiency instead depend on the recognition and matching measures adopted during tracking, that is, on how the same target is matched across multiple frames so that its motion trajectory can be described.
Generally speaking, if the same target appears in every frame of a multi-frame sequence, recognizing and tracking it is simple: the target only needs to be detected in each frame and associated between frames. In practice, however, tracking is affected by environmental change; illumination variation or occlusion can make the target impossible to identify, and the tracker used by the tracking algorithm must then infer the target's probable position from the frames preceding its disappearance. Tracking also depends on matching the target across frames. The simplest matching scheme analyzes the target's pixel texture features and compares the similarity of the targets detected in adjacent frames, but this requires a similarity standard, i.e. a threshold above which the targets in different frames are considered the same, and matching multiple targets over many frames this way incurs a large computational overhead.
Meanwhile, the detection precision of target features affects the effectiveness of the tracking process, and a good tracking algorithm should keep tracking the target to some extent even when part of its features is occluded. Addressing these problems of existing target tracking algorithms, and taking pet tracking in a community scene as its example, the invention provides a pet tracking method based on local feature recognition and adjacent frame matching in a community monitoring scene. The method crops local features inside a rectangular selection, which raises the precision of frame-by-frame matching during tracking and keeps the target trackable when some of its features are invisible. To overcome the shortcomings of matching methods based mainly on texture-value similarity, the Hungarian assignment algorithm is adopted to match the whole target and its corresponding local features between frames. For a frame where matching fails because of environmental change or occlusion (hereinafter an abnormal frame), a Kalman filter predicts and calibrates the target's probable position from the position information in the frame preceding the abnormal frame, thereby correcting the tracking process. Target disappearance is measured by an abnormal-frame count: once a target's count reaches a certain threshold, the target is considered to have left the camera's field of view.
Disclosure of Invention
To remedy the defects of existing target tracking technology analyzed above, the invention aims to provide a pet tracking method based on local feature recognition and adjacent frame matching in a community monitoring scene. The method is mainly applied, in a complex community scene, to tracking the pets kept by community residents so as to record their motion trajectory information, which can help resolve cases of residents' missing pets.
In order to solve the technical problems, the technical scheme of the invention is as follows:
a pet tracking method based on local feature recognition and adjacent frame matching in a community monitoring scene comprises the following steps: step 1: collecting images under community monitoring, labeling individual pets, and establishing an Adaboost classifier for the individual pets according to HOG characteristics;
step 2: labeling limbs, the head and the trunk of the pet, and establishing Adaboost classifiers for the three parts according to HOG characteristics;
and step 3: integrating the classifiers in the steps 1 and 2 to form a Multi-layer Adaboost classifier (ML-Adaboost, hereinafter referred to as ML-Adaboost) for the individual pet;
and 4, step 4: extracting and collecting partial target information from continuous frame images, and continuously iterating the position information of the individual pet and local characteristics thereof in each frame image to the Kalman filter according to the front and back sequence so as to continuously correct parameters in the filter prediction function. The established Kalman filter can predict the position of a target possibly appearing in the next frame according to the target information of the current frame;
and 5: starting to track the target in the continuous frames, detecting individual pets and local characteristics thereof by using ML-Adaboost, and adding a new target appearing in the first frame into a tracking list;
step 6: and (3) sequentially iterating the continuous frame images, selecting two adjacent images before and after each iteration, and establishing a tracking relation between the same target and local characteristics thereof by using a Hungarian algorithm. If an abnormal frame exists in the middle, a frame which is closest to the abnormal frame and is successfully matched is used as a reference, a Kalman filter is used for predicting the position of a target which is possibly in the abnormal frame, and a relation is established between the abnormal frame and the frame;
and 7: for the new continuous frames, repeating the steps 4-6, continuously updating the Kalman filter and the tracking information, and judging whether the pet exceeds the monitoring range according to the count of the matching failure frames so as to update the tracking list;
Further, in step 1, the Adaboost classifier is built on HOG features because they retain good invariance even when the image undergoes geometric or illumination change, so that pet individuals can be detected effectively in the complex environment of the community.
Preferably, in step 2, Adaboost classifiers are established for the head, limbs and trunk of the pet respectively, with HOG features as reference, and in step 3 the classifier for the pet individual from step 1 and the classifiers for the local pet features from step 2 are combined into ML-Adaboost. The ML-Adaboost classifier first detects pet individuals and then, on the basis of each individual detection result, detects that pet's local features; ML-Adaboost is built so that local features are identified effectively and each identified local feature is attributed to the correct pet individual.
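A minimal sketch may make this whole-then-parts arrangement concrete. It assumes scikit-learn's AdaBoostClassifier and scikit-image's hog as stand-ins for the patent's classifiers; the candidate-window scanner and non-maximum suppression are omitted, and all function and parameter names are illustrative rather than taken from the patent.

```python
import numpy as np
from skimage.feature import hog
from skimage.transform import resize
from sklearn.ensemble import AdaBoostClassifier

def hog_vector(patch, shape=(64, 64)):
    """Fixed-length HOG descriptor of a grayscale patch; HOG keeps good
    invariance under geometric and illumination change."""
    return hog(resize(patch, shape), orientations=9,
               pixels_per_cell=(8, 8), cells_per_block=(2, 2))

def train_stage(patches, labels):
    """Fit one Adaboost stage on HOG features of labeled patches (1 = positive)."""
    X = np.stack([hog_vector(p) for p in patches])
    return AdaBoostClassifier(n_estimators=100).fit(X, labels)

def crop(image, box):
    x, y, w, h = box
    return image[y:y + h, x:x + w]

class MLAdaboost:
    """Stage 1 detects whole pets; stage 2 searches parts only inside a
    confirmed pet box, tying every part to the correct individual."""
    def __init__(self, pet_clf, part_clfs):
        self.pet_clf = pet_clf        # whole-pet classifier
        self.part_clfs = part_clfs    # e.g. {'head': clf, 'trunk': clf, 'limbs': clf}

    def detect(self, frame, pet_boxes, part_boxes):
        detections = []
        for box in pet_boxes:         # candidate windows from a scanner (omitted)
            patch = crop(frame, box)
            if self.pet_clf.predict([hog_vector(patch)])[0] != 1:
                continue              # stage 1 rejected: skip the part search
            # keep the best-scoring sub-window per part, inside the pet box only
            parts = {name: max(part_boxes, key=lambda b: clf.decision_function(
                         [hog_vector(crop(patch, b))])[0])
                     for name, clf in self.part_clfs.items()}
            detections.append((box, parts))
        return detections
```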
Preferably, in step 4, a Kalman filter is established from the available position information of pet individuals and their local features in the continuous frame images. The filter works as follows: in predicting the optimal value for the next frame, the predicted value of the state variable is first computed through the error covariance matrix, the prediction is then corrected using the optimal observed value of the current frame and the Kalman gain, and the optimal value of the variable is finally obtained. In predicting pet positions, the tracked objects are the pet individual and its local features, and a state variable is created for each, with the center of the corresponding detection box as the tracking point. A state variable is defined as (px(k,i), py(k,i), vx(k,i), vy(k,i)), the dimensions being the coordinates and velocity components along the prescribed directions, where k is the frame index and i ranges over 1-4, denoting the detected pet part: the whole pet, the head, the limbs and the trunk. For the limbs, the state variable is further divided at the next level into four sub-variables (px(k,3,j), py(k,3,j), vx(k,3,j), vy(k,3,j)) describing the state of the pet's j-th limb. Since several state variables belong to the same pet within the same frame, the Kalman filters are trained in parallel. The trained filters can then predict the positions of the pet's whole-body and local features from the current frame information.
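The state bookkeeping this implies can be sketched as follows; the part and limb names are illustrative, and only the layout (one four-dimensional state per tracking point, initialized at the detection-box center with zero velocity) comes from the description.

```python
import numpy as np

PARTS = ["whole", "head", "limbs", "trunk"]                        # i = 1..4
LIMBS = ["left_front", "right_front", "left_rear", "right_rear"]  # j = 1..4

def initial_state(cx, cy):
    # position = detection-box center; velocities start at zero
    return np.array([cx, cy, 0.0, 0.0])

def pet_states(centers):
    """centers: {'whole': (x, y), 'head': (x, y), 'trunk': (x, y),
                 'limbs': {limb_name: (x, y)}} -> one state per tracking point,
    each driving its own Kalman filter, run in parallel for the same pet."""
    states = {p: initial_state(*centers[p]) for p in PARTS if p != "limbs"}
    states["limbs"] = {j: initial_state(*centers["limbs"][j]) for j in LIMBS}
    return states
```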
Preferably, in steps 5 and 6, the first stage of target tracking is recognition: pet individuals and their corresponding local features are identified with the established ML-Adaboost classifier. Based on the recognition result of the first frame, the recognized pets are added to the tracking list D1 = {d1, d2, …, dm}, where each pet object is d = {e1, e2, e31, e32, e33, e34, e4}, the elements being the positions of the pet's whole body, head, four limbs (left front, right front, left rear, right rear) and trunk. Since a pet individual moves only a short distance between frames, matching pets across frames can be viewed as establishing a one-to-one mapping between the tracking list Dk of frame k and the tracking list Dk+1 of frame k+1 that minimizes the total distance over all successfully matched pairs, the distance of a pair being the distance between its two elements. Adjacent-frame tracking thereby becomes an assignment problem, which is typically solved with the Hungarian algorithm. Because a pet's features have a positional correspondence, once the mapping between pet individuals is established the Hungarian algorithm is applied again, per individual, to match the pet's local features in the current frame to the same pet's local features in the next frame. Some objects may fail to be tracked: for such an object a Kalman filter predicts and updates its position in the next frame, and its abnormal frames (frames where matching failed) are counted; if the count exceeds a prescribed threshold the object is deleted, the tracked object having moved out of the video range. The choice of threshold matters: too large and the same target is missed; too small and a temporarily occluded object is re-detected as new, so the abnormal-frame threshold usually has to be tuned by repeated experiments. An unmatched object may also be a newly appearing target, in which case it is added to the tracking list. Compared with matching the texture-feature similarity of recognized objects across frames, the position-based Hungarian algorithm is more accurate and efficient: the former depends on the choice of similarity standard, and comparing the texture similarity of all objects between two frames costs so much time that the real-time requirements of target tracking cannot be met.
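A minimal sketch of the adjacent-frame assignment step: build a cost matrix of center distances between the tracking lists Dk and Dk+1 and solve it with the Hungarian algorithm. scipy's linear_sum_assignment is the stand-in solver here, and max_dist is an assumed gating threshold, not a value from the patent.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_adjacent_frames(centers_k, centers_k1, max_dist=80.0):
    """centers_k, centers_k1: non-empty (n, 2) and (m, 2) arrays of pet centers.
    Returns matched index pairs plus the unmatched indices of each frame."""
    cost = np.linalg.norm(centers_k[:, None, :] - centers_k1[None, :, :], axis=2)
    rows, cols = linear_sum_assignment(cost)   # minimizes the total distance
    matches = [(r, c) for r, c in zip(rows, cols) if cost[r, c] <= max_dist]
    matched_k = {r for r, _ in matches}
    matched_k1 = {c for _, c in matches}
    lost = [r for r in range(len(centers_k)) if r not in matched_k]   # Kalman predicts these
    new = [c for c in range(len(centers_k1)) if c not in matched_k1]  # new tracking targets
    return matches, lost, new
```

The same routine can then be reused per matched pet, with the part centers of frame k on one side and those of frame k+1 on the other, to bind the local features between frames.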
With this technical scheme, the pet tracking method based on local feature recognition and adjacent frame matching in a community monitoring scene has the following beneficial effects: unlike tracking algorithms based on a single feature, it can still track a pet individual in its approximate direction when some of its features are occluded, and the Hungarian assignment algorithm used in frame-by-frame matching is faster and more accurate than matching by texture-feature similarity.
Drawings
FIG. 1 is a general flow chart of a pet tracking method according to the present invention;
FIG. 2 is a diagram of a process for tracking a target pet in a plurality of images in accordance with the present invention;
FIG. 3 is a detailed flowchart of each cross-frame matching in the tracking process of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the drawings of the embodiments. Obviously, the described embodiments are only some, and not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art from the given embodiments without creative effort shall fall within the protection scope of the present invention.
As shown in FIGS. 1-3, the invention tracks the whole pet together with its corresponding local features, keeps tracking a pet object to a certain degree even when some of its features are occluded, and speeds up the frame-by-frame matching of pet objects and their local features by applying the Hungarian assignment algorithm during tracking. Step 1: collect community monitoring images, label the pet individuals, and establish an Adaboost classifier for pet individuals from HOG features. Step 2: label the limbs, head and trunk of each pet, and establish local-feature Adaboost classifiers for the three parts from HOG features. Step 3: integrate the classifiers of steps 1 and 2 into the multi-layer Adaboost classifier (ML-Adaboost) for pet individuals. Step 4: extract part of the target information from the continuous frames, and iterate the position information of each pet individual and its local features in every frame, in temporal order, into the Kalman filter so as to keep correcting the parameters of the filter's prediction function; the established filter can then predict, from the current frame's target information, where a target may appear in the next frame. Step 5: start tracking targets in the continuous frames, detect pet individuals and their local features with ML-Adaboost, and add the new targets appearing in the first frame to the tracking list.
Step 6: iterate over the continuous frames in order; in each iteration, take the two adjacent images and establish the tracking relation between the same target and its local features with the Hungarian algorithm; if an abnormal frame occurs, take the nearest successfully matched frame as reference, predict with the Kalman filter where the target may be in the abnormal frame, and link the abnormal frame to that frame. Step 7: repeat steps 4-6 for new continuous frames, keep updating the Kalman filter and the tracking information, and judge from the matching-failure frame count whether a pet has left the monitored range so as to update the tracking list.
It should be understood that the invention builds an Adaboost pet classifier from HOG features of the pet individuals, and likewise builds Adaboost local-feature classifiers from the pets' local characteristics. HOG features retain good invariance when the image undergoes geometric or illumination change, so the classifiers built on them can still recognize pet objects and their local features in a complex community environment. The classifier for pet individuals and the classifiers for pet local features are combined into ML-Adaboost, which first detects pet individuals and then detects each pet's local features on the basis of the individual detection result; ML-Adaboost is built so that local features are recognized effectively and each recognized local feature is attributed to the correct pet individual.
It should be understood that the invention uses a Kalman filter to predict, for a target whose matching failed, where the object may be in the next frame. For a matching-failure frame caused by environmental change or occlusion (hereinafter an abnormal frame), the Kalman filter predicts and calibrates the target's probable position from the position information of the frame preceding the abnormal frame, thereby correcting the tracking process. The filter is established from the available position information of pet individuals and their local features in the continuous frames, summarized as follows: in predicting the optimal value for the next frame, the predicted positions of the pet object and its local features are first computed through the error covariance matrix, the prediction is then corrected with the current frame's optimal observed value and the Kalman gain, and the optimal value of the variable is finally obtained.
It should be understood that the invention converts the frame-by-frame matching problem of the tracking process into an assignment problem and solves it with the Hungarian algorithm. Since a pet individual moves only a short distance between frames, matching pets across frames can be viewed as establishing a one-to-one mapping between the tracking list Dk of frame k and the tracking list Dk+1 of frame k+1 that minimizes the total distance over all successfully matched pairs, the distance of a pair being the distance between its two elements; adjacent-frame tracking thereby becomes an assignment problem, typically solved with the Hungarian algorithm. Compared with frame-by-frame matching based on texture-feature similarity, this is faster and more accurate: the former depends on the choice of similarity standard and requires pairwise comparison, so its computational complexity is relatively high, while the latter exploits the small per-frame displacement of the same object to match by position, avoiding the mistracking caused by unreliable texture-feature analysis.
It should be understood that, during tracking, the invention first uses the ML-Adaboost classifier to recognize, frame by frame, all pet individuals and their local features in each image, and then matches the objects between adjacent frames with the Hungarian algorithm to improve the real-time performance and accuracy of matching. For an object whose matching failed, the Kalman filter corrects its probable position, so that an object temporarily occluded or invisible remains tracked. In addition, a matching-failure frame count is kept for each tracked object; it is incremented on a matching failure and halved when the object is matched again, and once the count exceeds a certain threshold the object is considered to have left the monitored range. The threshold has to be tuned over multiple experiments: too large a value causes the same target to be missed, while too small a value causes a temporarily occluded object to be re-detected as new.
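This abnormal-frame bookkeeping reduces to a few lines; the threshold value below is an assumed placeholder, since the text says it must be tuned experimentally.

```python
def update_failure_count(fi, matched, threshold=30):
    """Returns (new_count, still_tracked); threshold=30 is an assumed value."""
    if matched:
        fi //= 2                     # re-acquired: decay the failure count
    else:
        fi += 1                      # abnormal frame: one more failure
    return fi, fi <= threshold       # False -> target left the monitored area
```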
For the pet tracking problem in a community scene, the invention provides a pet tracking method based on local feature recognition and adjacent frame matching. Unlike traditional tracking algorithms, it establishes the tracking mechanism over both the pet individual and its local features, so tracking can continue to a certain extent when some features of the target are occluded. To raise the real-time performance and accuracy of inter-frame target matching, the Hungarian algorithm converts the inter-frame matching problem into an assignment problem over target positions, which is more reliable than matching by texture-feature similarity. For matching failures, Kalman filtering corrects the target's probable position in the abnormal frame. The purpose of the method is to record pets' motion trajectories so as to provide information that helps when a community resident's pet goes missing.
FIG. 1 illustrates the operation of the pet tracking method. First, Adaboost classifiers are established for the pet individuals and for the head, trunk and four limb parts, and are composed in whole-then-local order into the ML-Adaboost classifier, with which the tracking algorithm can accurately recognize pet individuals and their corresponding local features in each frame. Over the recognized and labeled continuous frames, target matching relations are then established frame by frame from front to back, covering both the matching of pet individuals and the matching of their local features. Matching between two frames uses the Hungarian algorithm, converting the target-position correspondence problem into an assignment problem and solving it. If a target fails to match, its position in the frame is predicted with the established Kalman filter and its matching-failure count is incremented; once the count exceeds the set threshold, the target is considered to have left the monitored field of view. In addition, newly appearing targets are added to the tracking list for real-time tracking.
The Kalman filter is a recursive filter whose core idea is to predict the optimal value of the next frame from the optimal value of the current frame: the predicted value of the state variable is first computed through the error covariance matrix, the prediction is then corrected with the current frame's optimal observed value and the Kalman gain, and the optimal value of the variable is finally obtained. For this application scenario, the state variables of the pet target are redefined as (px(k,i), py(k,i), vx(k,i), vy(k,i)), the dimensions being the coordinates and velocity components along the prescribed directions, where k is the frame index and i ranges over 1-4, denoting the detected pet part: the whole pet, the head, the limbs and the trunk. For the limbs, the state variable is further divided at the next level into four sub-variables (px(k,3,j), py(k,3,j), vx(k,3,j), vy(k,3,j)) describing the state of the pet's j-th limb. Since several state variables belong to the same pet within the same frame, the Kalman filters are trained in parallel; the trained filters can then predict the positions of the pet's whole-body and local features from the current frame information. The filter is trained as follows:
the model is first initialized, the covariance matrices Q and R, and A, H are taken as identity matrices, and uk is assumed to be white gaussian noise with variance of 1. The initial value of the state variable is the calibration coordinate value of the pet detection part, and the speed of the x axis and the speed of the y axis are 0; then, calculating a state variable and an error covariance according to a formula of a Kalman filter, wherein the calculation of the two values is based on uk, an optimal observed value of the previous frame of image and the corrected error covariance; and calculating a Kalman gain value according to the calculated state variable and error covariance of the new frame, wherein the Kalman gain value mainly has the function of adjusting the influence of the error between an observed value (the actual image identification position) and a predicted value (the state variable calculated in the previous step) on prediction, the optimal observed value is calculated according to the Kalman gain value and the predicted value, the error covariance is also corrected according to the Kalman gain value and the error covariance calculated in the previous step, the obtained two values are the final results (namely the optimal observed value and the corrected error covariance) predicted by the Kalman gain, and the two values are used for calculating the predicted value and the error of the next frame of image.
FIG. 2 shows how a trajectory is established for a target pet over successive frames. The first stage of target tracking is recognition: pet individuals and their corresponding local features are recognized with the established ML-Adaboost classifier. Based on the recognition result of the first frame, the recognized pets are added to the tracking list D1 = {d1, d2, …, dm}, where each pet object is d = {e1, e2, e31, e32, e33, e34, e4}, the elements being the positions of the pet's whole body, head, four limbs (left front, right front, left rear, right rear) and trunk. Let Dk be the set of pet individuals in the current frame, Dk+1 the set in the next frame to be matched, and fi the number of frames in which pet i has failed to match. The tracking algorithm then proceeds as follows (the process is also described in FIG. 3, and a condensed sketch is given after the list):
1. For the given continuous frames, recognize and label the pets in all images with the ML-Adaboost classifier, and for each frame record the labeled pet objects and their local features into the corresponding tracking set D, where Dk holds the objects present in frame k. Then iterate the matching from the first frame.
2. If the current frame is not the last, take the pet object set Dk of the current frame and the pet object set Dk+1 of the next frame, and establish a one-to-one mapping between the two sets with the Hungarian algorithm such that the total distance of the mapping is minimal. If new frames have arrived, recognize and label their pets with the ML-Adaboost classifier as well. If the current frame is the last, end the tracking.
3. Examine the leftover objects of the two sets, i.e. the objects whose matching failed. For a matching-failure object in Dk, increment its fi; if fi exceeds the threshold, consider the object out of the monitored range and delete it, otherwise predict with Kalman filtering where the object may be in the next frame and write that position into Dk+1. For a matching-failure object in Dk+1, treat it as a new object, keep it for the next frame, and initialize its fi to 0. For a successfully matched object whose entry in Dk was a Kalman-filter prediction, halve its fi, since an object whose matching had failed has reappeared under the camera and been matched again.
4. Move to the next frame, take Dk+1 as the current object set Dk, and return to step 2.
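A condensed sketch of steps 1-4 under simplifying assumptions: detect() stands in for the ML-Adaboost recognizer and returns pet centers only, the Kalman prediction is reduced to a constant-velocity step, and max_dist and F_THRESHOLD are assumed tuning values rather than values given in the patent.

```python
import itertools
import numpy as np
from scipy.optimize import linear_sum_assignment

F_THRESHOLD = 30
_ids = itertools.count()

def track(frames, detect, max_dist=80.0):
    tracks = {}                              # id -> {'pos', 'vel', 'fi'}
    for frame in frames:
        dets = np.array(detect(frame))       # step 1: (m, 2) pet centers
        ids = list(tracks)
        if ids and len(dets):
            prev = np.array([tracks[i]['pos'] for i in ids])
            cost = np.linalg.norm(prev[:, None] - dets[None, :], axis=2)
            rows, cols = linear_sum_assignment(cost)        # step 2: assignment
            pairs = [(r, c) for r, c in zip(rows, cols) if cost[r, c] <= max_dist]
        else:
            pairs = []
        matched_t = {r for r, _ in pairs}
        matched_d = {c for _, c in pairs}
        for r, c in pairs:                   # matched: update state, halve fi
            t = tracks[ids[r]]
            t['vel'] = dets[c] - t['pos']; t['pos'] = dets[c]; t['fi'] //= 2
        for r in set(range(len(ids))) - matched_t:          # step 3: failures
            t = tracks[ids[r]]
            t['fi'] += 1
            if t['fi'] > F_THRESHOLD:
                del tracks[ids[r]]           # target has left the monitored area
            else:
                t['pos'] = t['pos'] + t['vel']   # predicted position for Dk+1
        for c in set(range(len(dets))) - matched_d:         # new targets, fi = 0
            tracks[next(_ids)] = {'pos': dets[c], 'vel': np.zeros(2), 'fi': 0}
        yield {i: t['pos'] for i, t in tracks.items()}      # step 4: next frame
```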
In summary, the method completes the tracking of pet objects and their local features over continuous frames. Unlike matching-and-tracking schemes based on texture-feature similarity, this tracking method for pets in a complex community scene matches the whole target and its local features jointly, frame by frame, and keeps tracking a pet effectively when some of its features are occluded. The Hungarian algorithm performs the frame-by-frame matching on the basis of target position, improving both efficiency and accuracy. For features whose matching fails under occlusion or abnormal illumination, Kalman filtering predicts their positions; its main role is to fill in the pet's position in the abnormal frame so that the tracking algorithm can associate the pet's trajectory across consecutive frames.
The above description covers only the preferred embodiments of the present invention and is not intended to limit it; any modifications, equivalent replacements, improvements and the like made within the spirit and principles of the present invention are intended to be included in its protection scope.

Claims (6)

1. A pet tracking method based on local feature recognition and adjacent frame matching in a community monitoring scene, characterized by comprising the following steps:
Step 1: collecting images under community monitoring, labeling the pet individuals, and establishing an Adaboost classifier for the pet individuals from HOG features;
Step 2: labeling the limbs, head and trunk of each pet, and establishing an Adaboost classifier for each of the three parts from HOG features;
Step 3: integrating the classifiers of steps 1 and 2 into a multi-level Adaboost classifier for the pet individuals;
Step 4: extracting part of the target information from the continuous frame images, and iterating the position information of each pet individual and its local features in every frame, in temporal order, into a Kalman filter so as to continuously correct the parameters of the filter's prediction function, the established Kalman filter being able to predict, from the target information of the current frame, where a target may appear in the next frame;
Step 5: starting to track the targets in the continuous frames, detecting pet individuals and their local features with ML-Adaboost, and adding the new targets appearing in the first frame to a tracking list;
Step 6: iterating over the continuous frames in order; in each iteration, taking the two adjacent images and establishing the tracking relation between the same target and its local features with the Hungarian algorithm; if an abnormal frame occurs, taking the nearest successfully matched frame as reference, predicting with the Kalman filter where the target may be in the abnormal frame, and linking the abnormal frame to that frame;
Step 7: repeating steps 4-6 for new continuous frames, continuously updating the Kalman filter and the tracking information, and judging from the matching-failure frame count whether a pet has left the monitored range so as to update the tracking list.
2. The pet tracking method based on local feature recognition and adjacent frame matching in a community monitoring scene according to claim 1, characterized in that in step 1 the method further comprises first detecting pet individuals with the Adaboost classifier and then detecting each pet's local features on the basis of the individual detection result.
3. The pet tracking method based on local feature recognition and adjacent frame matching in a community monitoring scene according to claim 1, characterized in that: for a target whose matching failed, a Kalman filter predicts where the target may be in the next frame; for a matching-failure frame caused by environmental change or occlusion (hereinafter an abnormal frame), the Kalman filter predicts and calibrates the target's probable position from the position information of the frame preceding the abnormal frame, thereby correcting the tracking process.
4. The pet tracking method based on local feature recognition and adjacent frame matching in a community monitoring scene according to claim 3, characterized in that: the Kalman filter is established from the available position information of pet individuals and their local features in the continuous frame images, summarized as follows: over the selected continuous frames, iteration starts from the first frame; in each iteration, the optimal value of the next frame (the probable positions of the object and its local features) is predicted from the optimal value of the current frame (the predicted or calibrated positions of the object and its local features); in predicting the next frame's optimal value, the predicted positions of the pet object and its local features are first computed through the error covariance matrix, the prediction is then corrected with the current frame's optimal observed value and the Kalman gain, and the optimal value is finally obtained.
5. The pet tracking method based on local feature recognition and adjacent frame matching in a community monitoring scene according to claim 1, characterized in that step 5 further comprises converting the frame-by-frame matching problem of the tracking process into an assignment problem and solving it with the Hungarian algorithm.
6. The pet tracking method based on local feature recognition and adjacent frame matching in a community monitoring scene according to claim 1, characterized in that: during tracking, the classifier of the pet individuals and the classifiers established for the pets' local features are combined into ML-Adaboost; the ML-Adaboost classifier first recognizes, frame by frame, all pet individuals and their local features in each image, and the Hungarian algorithm then matches the objects between adjacent frames to improve the real-time performance and accuracy of matching; for an object whose matching failed, the Kalman filter corrects its probable position, so that an object temporarily occluded or invisible remains tracked; in addition, a matching-failure frame count is kept for each tracked object, incremented on a matching failure and halved when the object is matched again, and once the count exceeds a certain threshold the object is considered to have left the monitored range; the threshold has to be tuned over multiple experiments, since too large a value causes the same target to be missed while too small a value causes a temporarily occluded object to be re-detected.
CN202011029862.9A 2020-09-27 2020-09-27 Pet tracking method based on local feature recognition and adjacent frame matching in community monitoring scene Withdrawn CN112307897A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011029862.9A CN112307897A (en) 2020-09-27 2020-09-27 Pet tracking method based on local feature recognition and adjacent frame matching in community monitoring scene

Publications (1)

Publication Number Publication Date
CN112307897A 2021-02-02

Family

ID=74488438

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011029862.9A Withdrawn CN112307897A (en) 2020-09-27 2020-09-27 Pet tracking method based on local feature recognition and adjacent frame matching in community monitoring scene

Country Status (1)

Country Link
CN (1) CN112307897A

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114463378A (en) * 2021-12-27 2022-05-10 浙江大华技术股份有限公司 Target tracking method, electronic device and storage medium
CN114463378B (en) * 2021-12-27 2023-02-24 浙江大华技术股份有限公司 Target tracking method, electronic device and storage medium
CN114898585A (en) * 2022-04-20 2022-08-12 清华大学 Intersection multi-view-angle-based vehicle track prediction planning method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication (application publication date: 20210202)