CN110570454B

CN110570454B - Method and device for detecting foreign matter invasion

Info

Publication number: CN110570454B
Application number: CN201910657176.7A
Authority: CN
Inventors: 刘鑫; 蔡恒; 张继勇; 庄浩
Original assignee: Huarui Xinzhi Baoding Technology Co ltd; Huarui Xinzhi Technology Beijing Co ltd
Current assignee: Huarui Xinzhi Baoding Technology Co., Ltd; Huarui Xinzhi Technology (Beijing) Co., Ltd
Priority date: 2019-07-19
Filing date: 2019-07-19
Publication date: 2022-03-22
Anticipated expiration: 2039-07-19
Also published as: CN110570454A

Abstract

The application discloses a method for detecting foreign matter invasion, which comprises the following steps: receiving a current frame visible light image sent by acquisition equipment; determining a boundary frame corresponding to the foreign matter in the current frame visible light image through a pre-trained recognition model, and receiving the infrared video sent by the acquisition equipment; determining the boundary frame of the foreign matter in the current frame infrared image according to the coordinates of the boundary frame of the foreign matter determined in the current frame visible image; and tracking the foreign matter in the infrared video according to a boundary frame of the foreign matter in the infrared image of the current frame and a pre-constructed filter. By combining visible light and infrared detection, the deviation between the visible light and the infrared image can be corrected under the condition of multiple factors. When the target tracking is carried out, the infrared image is adopted instead of the characteristic data of the visible light image, so that the influence of illumination change is reduced, and the robustness is improved.

Description

Method and device for detecting foreign matter invasion

Technical Field

The application relates to the field of computers, in particular to a method and a device for detecting foreign matter invasion.

Background

At present, more and more companies and enterprises pay more and more attention to the safety problem in the field. With the widespread use of cameras and other devices, foreign object intrusion detection systems are becoming more and more desirable. The system can find that the foreign matters with potential safety hazards enter the monitoring area in time, thereby immediately assigning professionals to clean the foreign matters, and maximally reducing the losses such as circuit short-circuit equipment faults caused by hanging the foreign matters and the like.

In the prior art, a foreign object intrusion detection system processes a visible light image, when the illumination condition of visible light changes violently, the target characteristic changes greatly, and target tracking based on the visible light image may fail. Meanwhile, when the target is tracked, the target needs to be tracked in a visible light image, so that the consumption of hardware resources is high, and the requirement on hardware performance is also high.

Disclosure of Invention

In order to solve the above problem, the present application provides a method for detecting foreign object intrusion, including: receiving a current frame visible light image sent by acquisition equipment; determining a boundary frame corresponding to the foreign matter in the current frame of the visible light image through a pre-trained recognition model, wherein the boundary frame is input as the visible light image and output as the foreign matter in the visible light image when the recognition model is trained; receiving an infrared video sent by the acquisition equipment; the current frame infrared image corresponds to the current frame visible light image in the infrared video; determining the boundary frame of the foreign matter in the current frame infrared image according to the coordinates of the boundary frame of the foreign matter determined in the current frame visible image; and tracking the foreign matter in the infrared video according to a boundary frame of the foreign matter in the infrared image of the current frame and a pre-constructed filter.

In one example, the method further includes determining that the foreign object is located before the boundary frame in the current frame infrared image according to the coordinates of the boundary frame of the foreign object determined in the current frame visible light image, and the method further includes: and carrying out image registration on the current frame infrared image and the current frame visible light image.

In one example, the image registration of the current frame infrared image and the current frame visible light image specifically includes: predetermining an image deviation value between the infrared image and the visible light image; the image deviation value comprises an image scaling, an image deviation distance and an image precision coefficient; and carrying out image registration on the current frame infrared image and the current frame visible light image according to the deviation value.

In one example, the predetermining an image deviation value between the infrared image and the visible light image specifically includes: acquiring corresponding images of the visible light image and the infrared image on a calibration plate; by the formula

Calculating to obtain the image scaling, wherein f₁For image scaling, n is the number of dots on the calibration plate, A_nIs the coordinate of the first center of a circle on the infrared image, A_n-1Is the coordinate of the second center of a circle on the infrared image, B_nCoordinates of a circle center corresponding to the first circle center on the visible light image, B_nThe coordinates of the circle center corresponding to the second circle on the visible light image are obtained; by the formula

Calculating to obtain an image offset distance, wherein x_diffFor image shift distance in the x-axis, Y_diffFor the image shift distance on the y-axis,

to the coordinates of the first pixel point on the infrared image,

to the coordinates of the second pixel point on the infrared image,

the coordinates of the pixel points corresponding to the first pixel points on the visible light image,

the coordinates of the pixel points corresponding to the second pixel points on the visible light image are obtained; by the formula

Calculating to obtain the image precisionCoefficient of which σ_xAs coefficient of image precision, d_xIs the base length, f₂For the focal length of the acquisition device,/_pixIs the pixel size, D_optimalFor an optimum distance, D_targetIs the target distance.

In one example, tracking the foreign object in the infrared video according to a bounding box of the foreign object in the current frame infrared image and a pre-configured filter specifically includes: extracting position features with the size of 2 times from the current frame infrared image according to the position of the foreign matter in the boundary frame of the current frame infrared image; calculating a plurality of predicted positions of the alien material in the next frame of infrared image by a pre-configured filter; and calculating and determining the largest predicted position in the plurality of predicted positions as the position of the foreign object in the next infrared image.

In one example, the filter is specifically:

wherein l is the characteristic dimension, h^lIs a filter in l dimension, t is the vector feature of t image blocks extracted from the foreign object, lambda is a positive term coefficient, d is a feature vector dimension, k is any dimension from 1 to d, F^l、F^kThe fourier transform of the eigenvectors f, respectively the l-th and k-th dimensions, G the discrete fourier transform of the filter response values G, G being obtained by a gaussian function,

are respectively G, F^kThe complex conjugate of (a) and (b),

model A for the t-th l-dimension, B_tIs the t-th B model.

In one example, before receiving the infrared video transmitted by the acquisition device, the method further comprises: determining the category of the foreign matters to be one of pre-stored designated categories through the identification model, wherein the category of the foreign matters is divided in advance; when the recognition model is trained, the boundary box of the foreign matter in the visible light image is input, and the category of the foreign matter is output.

In one example, the method further comprises: when the time length of the movement stop of the foreign matter exceeds the preset time length, or the movement track of the foreign matter exceeds the size range of the visible light video or the infrared video, the tracking of the foreign matter in the infrared video is stopped.

In one example, the method further comprises: receiving other frames of visible light images sent by an acquisition device, wherein the imaging time of the other frames of visible light images is later than that of the current frame of visible light images; determining the number of the foreign matters in the other frames of visible images, wherein the number of the foreign matters is larger than the number of the foreign matters when the tracking of the foreign matters in the infrared video is stopped; and re-receiving the infrared video sent by the acquisition equipment so as to track the foreign matters in the other frames of visible light images.

On the other hand, the embodiment of the present application further provides a device for detecting intrusion of a foreign object, including: the first receiving module is used for receiving the current frame visible light image sent by the acquisition equipment; the recognition module is used for determining a boundary frame corresponding to the foreign matter in the current frame visible light image through a pre-trained recognition model, wherein the boundary frame is input as the visible light image and output as the boundary frame of the foreign matter in the visible light image when the recognition model is trained; the second receiving module is used for receiving the infrared video sent by the acquisition equipment; the current frame infrared image corresponds to the current frame visible light image in the infrared video; the processing module is used for determining the boundary frame of the foreign matter in the current frame infrared image according to the coordinates of the boundary frame of the foreign matter determined in the current frame visible image; and the tracking module tracks the foreign object in the infrared video according to the boundary frame of the foreign object in the current frame infrared image and a pre-constructed filter.

The calibration mode provided by the application can bring the following beneficial effects:

by combining visible light and infrared detection, the deviation between the visible light and the infrared image can be corrected under the condition of multiple factors. When the target tracking is carried out, the infrared image is adopted instead of the characteristic data of the visible light image, so that the influence of illumination change is reduced, and the robustness is improved. The infrared target tracking is used for a long time, compared with visible light tracking, hardware resources are saved, and background processing speed is improved.

Drawings

The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:

FIG. 1 is a schematic flow chart illustrating a method for detecting intrusion of a foreign object according to an embodiment of the present disclosure;

FIG. 2 is a block diagram of an apparatus for detecting intrusion of a foreign object according to an embodiment of the present disclosure;

FIG. 3 is a general flowchart of a method for detecting intrusion of a foreign object according to an embodiment of the present application;

fig. 4 is a flowchart illustrating each frame of image of a method for detecting intrusion of a foreign object according to an embodiment of the present application.

Detailed Description

In order to make the objects, technical solutions and advantages of the present application more apparent, the technical solutions of the present application will be described in detail and completely with reference to the following specific embodiments of the present application and the accompanying drawings. It should be apparent that the described embodiments are only some of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.

The technical solutions provided by the embodiments of the present application are described in detail below with reference to the accompanying drawings.

As shown in fig. 1 and fig. 3, a method for detecting intrusion of a foreign object according to an embodiment of the present application includes:

s101, receiving a current frame visible light image sent by a collection device.

The acquisition equipment is placed at the corresponding position in advance, and images can be acquired in real time. The acquisition equipment can be camera equipment, and camera equipment includes visible light camera and infrared camera, can gather visible light video and infrared video respectively in real time. When the acquisition device acquires the visible light image or video, the visible light image or video can be sent to the server. Of course, the transmission may be performed to a device having a corresponding processing function in addition to the server, and for convenience of description, the transmission to the server will be explained as an example.

In general, a camera only needs to collect a visible light image or video and send the visible light image or video to a server. During sending, the collected visible light image can be sent to the server in real time, that is, the collected visible light video is sent to the server in real time, but the occupation of hardware resources is large. Therefore, the visible light image can be sent to the server at regular time, that is, one frame of visible light image is sent to the server in the visible light video collected in real time at intervals of a preset time. For example, the visible light image is transmitted every 3 seconds. For convenience of description, the frame of visible light image is referred to as a current frame of visible light image.

When the server receives the current frame visible light image, it may first perform pre-processing, convert the image data into RGB array using corresponding tools, and set the image size to 448 × 448, so as to facilitate the corresponding processing of the image.

And S102, determining a boundary frame corresponding to the foreign matter in the current frame of the visible light image through a pre-trained recognition model, wherein the boundary frame is input as the visible light image and output as the boundary frame of the foreign matter in the visible light image when the recognition model is trained.

After the server receives the current frame visible light image, a boundary frame corresponding to the foreign matter can be determined through a pre-trained recognition model, so that the foreign matter can be tracked subsequently. When the recognition model is trained, the input is a visible light image, and the output is a bounding box of a foreign object in the image.

In particular, since in general, many foreign objects to be detected are small objects, for exampleBirds, plastic bags, ribbons, etc., so that yolo models can be collected for training. The method comprises the steps of firstly constructing a yolo neural network, wherein a yolo classic structure Darknet-53 can be adopted, namely a convolutional neural network is used for extracting features, and a yolo feature layer is used for obtaining a predicted value. The basic structure is 53 convolutional layers and small, medium and large 3 characteristic layers. Then, each visible light image of the labeled visible light image training set is input into the convolutional layer, segmented into the grid size of n × n, for example, 13 × 13, image features are extracted for each grid, and iterative training is performed on the convolutional layer and the feature layer. In an iterative process, the loss function can be defined as:

where loss is the loss function and s is the number of grids, which can be taken to be 13. coordErr is the coordinate error, iouErr is the cross over unit (IOU) error, clsrr is the classification error.

Further, the loss function may be defined as:

wherein x and y are central coordinates of the bounding box, w and h are width and height of the bounding box, C is confidence coefficient, and p is the category of the foreign matter;

when in use

When the number is 1, it represents that foreign matters exist in the jth prediction boundary box in the ith grid area,

when 0, no foreign matter exists;

for the predicted value of the corresponding parameter, B is the number of the set predicted bounding boxes, and the value can be 5, lambda_coordFor set coordinate deviation weightCan be set to 5, lambda_noobjThe IOU error correction weight may be set to 0.5. In addition, the IOU represents the overlapping area ratio of the two bounding boxes.

After the position of the boundary frame of the foreign object is determined, the type of the foreign object can be determined in the light image according to the position, and only when the type of the foreign object is the designated type, the infrared video is acquired to track the foreign object. Only the foreign matters of the specified category are tracked, and the hardware resource can not be wasted. And by confirming the type of the foreign matters, a mechanism aiming at the invasion of the foreign matters of the corresponding type can be more effectively formulated. Wherein the categories are pre-assigned, including, for example, plastic bags, kites, umbrellas, clothing, birds, insects, and the like. The type of the foreign object may be determined by a recognition model, or may be determined by other corresponding programs or devices, and is not limited herein.

Specifically, when determining the type of the foreign object, the preprocessed visible light image may be first divided into 13 × 13 mesh sizes. And then judging the grid in which the central point of the boundary frame corresponding to the foreign matter is positioned, wherein the cell is responsible for detecting the foreign matter. P of 5 bounding boxes will be predicted per cell_nAnd confidence C of the bounding box, P when the bounding box does not contain foreign matter_n0 when foreign matter is contained, P_nIs 1. Where the accuracy of the bounding box is characterized using the intersection ratio of the predicted box and the actual box, i.e., IOU, the confidence C may be defined as C-P_nIOU. For the same foreign object, 5 bounding boxes are output, each bounding box comprises several parameters of x, y, w, h and c, and the specific meaning of each parameter is shown by referring to the formula. And finally, calculating the IOU of every two of the 5 boundary boxes, if the IOU exceeds nms, discarding the boundary box with the smaller confidence coefficient C, and finally reserving 1 boundary box as a model detection output result. Where nms is a non-maximum suppression value, and may be set to 0.3.

S103, receiving the infrared video sent by the acquisition equipment; and the current frame infrared image corresponds to the current frame visible light image in the infrared video.

After the existence of the foreign matter is determined in the current frame visible light image, an acquisition request can be actively sent to the acquisition equipment, and then the infrared video sent by the acquisition equipment is received. The infrared video comprises a frame of infrared image corresponding to the current frame of visible light image, wherein the correspondence means that the two images shoot similar pictures at the same time point and the same position. Due to the different focal lengths of the visible light camera and the infrared camera and the reason of a hardware system, the final imaging effect of the visible light camera and the infrared camera is slightly different even if the same object is shot, so that the visible light camera and the infrared camera can be called as corresponding. For convenience of description, the frame infrared image may be referred to as a current frame infrared image.

In order to ensure that the positions of the detected points on the infrared image are consistent with the positions of the points on the visible light image, and facilitate subsequent processing of the image, the pixels of the infrared image and the visible light image can be aligned, i.e., image registration can be performed. The image registration refers to a process of matching and superimposing two or more images acquired at different times and under different imaging devices or under different conditions, such as weather, illumination, camera shooting position and angle.

Specifically, three deviation values of the image scaling, the image offset distance, and the image precision coefficient need to be calculated, and image registration is performed according to the three deviation values.

Image scaling may be used to align image dimensions. During calculation, a camera can be used for shooting a calibration board, visible light images and infrared images are respectively obtained on the calibration board, the distance between the centers of two circle points can be calculated according to the ratio, and the specific formula is as follows:

wherein f is₁For image scaling, n is the number of dots on the calibration plate, A_nIs the coordinate of any center of a circle on the infrared image, A_n-1Is the coordinate of another circle center on the infrared image, B_nCoordinates of a center of a circle corresponding to any one of the centers of the circle on the visible light image, B_nAnd the coordinates of the circle center corresponding to the other circle center on the visible light image.

Since the infrared ray itself is shifted from the visible light at the time of imaging, the shift distance can be calculated and corrected. After the infrared image and the visible light image are scaled to be uniform in size through the image scaling, the pixel difference of the corresponding point can be calculated according to the coordinate position of the dot in the calibration board at the pixel coordinate position of the infrared image and the pixel coordinate position of the visible light image, and the image offset distance can be calculated, wherein the specific formula is as follows:

wherein x is_diffFor image shift distance in the x-axis, Y_diffFor the image shift distance on the y-axis,

to the coordinates of the first pixel point on the infrared image,

to the coordinates of the second pixel point on the infrared image,

and the coordinates of the pixel points corresponding to the second pixel points on the visible light image are obtained.

In addition, the accuracy coefficient can be calculated according to hardware parameters, and the specific formula is as follows:

wherein σ_xAs coefficient of image precision, d_xIs the base length, f₂For the focal length of the acquisition device,/_pixIs the pixel size, D_optimalFor optimum distance, typically take infinity, D_targetTo the eyesAnd (5) marking the distance.

And S104, determining the boundary frame of the foreign matter in the current frame infrared image according to the coordinates of the boundary frame of the foreign matter determined in the current frame visible image.

After the infrared video is received, a corresponding boundary frame can be determined in the current frame infrared image according to the coordinates of the boundary frame of the foreign matter determined in the current frame visible image, and the boundary frame is used for follow-up tracking of the foreign matter. Of course, this step can be made more convenient and faster after the infrared image is image registered with the light image.

And S105, tracking the foreign matter in the infrared video according to a boundary frame of the foreign matter in the current frame infrared image and a pre-constructed filter.

After the boundary frame corresponding to the foreign object is determined on the current frame infrared image, the foreign object can be tracked in the infrared video through a pre-constructed filter.

In particular, the tracker may use a correlation filter algorithm. Firstly, according to the position of the foreign matter in the boundary frame of the current frame infrared image, extracting the position characteristic of 2 times of the size on the current frame infrared image. Then, a plurality of predicted positions of the foreign object in the next frame of infrared image are calculated through a position correlation filter, and the largest predicted position is calculated and obtained to be used as the position of the foreign object in the next frame of infrared image. In addition, 33 different scale features can be extracted according to the target scale of the current frame infrared image, a plurality of prediction scales of foreign matters of the next frame infrared image are calculated through a scale correlation filter, and the maximum prediction scale is calculated to be used as the scale of the next frame target.

The method comprises the following steps of selecting sample characteristics of foreign matters in a current frame infrared image, wherein the sample characteristics comprise position characteristics with the size being 2 times and 33 different scale characteristics, and the scale selection rule is as follows:

p, R denotes the width and height of the foreign object in the infrared image of the current frame, a is a scale factor, and is set to 1.03, SThe number of scales is set to 33 here.

Constructing a correlation filter, wherein the specific formula is as follows:

wherein l is the characteristic dimension, h^lThe filter is in a dimension of l, t is the vector characteristics of t image blocks extracted from foreign matters, lambda is a positive coefficient and is used for eliminating the influence of zero frequency components in an F characteristic frequency spectrum, the denominator is 0 when solving is avoided, d is the dimension of the characteristic vector, k is any dimension from 1 to d, F is the frequency of the image blocks, and the image blocks are subjected to image processing^l、F^kThe fourier transform of the eigenvectors f, respectively the l-th and k-th dimensions, G the discrete fourier transform of the filter response values G, G being obtained by a gaussian function,

are respectively G, F^kComplex conjugation of (a); in addition, h is usually split into a numerator A and a denominator B for iterative updating,

model A for the t-th l-dimension, B_tIs the t-th B model.

And acquiring a position model and a scale model of the current frame infrared image by inputting the image characteristic f of the next frame infrared image, and substituting the position model and the scale model into a formula to obtain the position model and the scale model of the next frame infrared image.

Specifically, the formula for finding the position model and the scale model of the next frame of infrared image is as follows:

wherein, Z is a sample feature extracted from the previous frame of infrared image, the sample feature may be a position sample feature and/or a scale sample feature, and other parameters refer to the above formula.

In one embodiment, after the foreign object is tracked in the infrared video, the motion trajectory of the foreign object can be generated in the corresponding visible light image or visible light video according to the tracking result, so as to be displayed to the user through the output device, and at this time, the user can observe the motion trajectory of the foreign object intrusion more conveniently.

In one embodiment, as shown in fig. 4, when tracking the foreign object, if the time length of the movement stop of the foreign object exceeds a preset time length, for example, 1s, it may be determined that the foreign object stays in the shooting area, and the infrared tracking is stopped. And if the motion trail of the foreign matter exceeds the size range of the visible light video or the infrared video, judging that the foreign matter leaves, and stopping the infrared tracking. When the subsequent server receives the visible light image sent by the acquisition device again, if the foreign object is still detected, the foreign object may be the foreign object in the detection stop process, and if the foreign object is tracked again, the resource waste is caused. Therefore, when the server receives other frames of visible light images sent by the acquisition equipment, the number of the foreign matters in the images can be judged, if the number of the foreign matters is larger than that of the foreign matters when the tracking is stopped, new foreign matters exist, and at the moment, the infrared video sent by the acquisition equipment can be received again to track the foreign matters. The other frame of visible light image refers to a visible light image with imaging time later than that of the current frame of visible light image. In addition, in the infrared tracking state, when the background detects the foreign object in the visible light image, the interval time may be lengthened, for example, compared to the original visible light image sent by the acquisition device that is received every n seconds, the visible light image is received every n +5 seconds.

As shown in fig. 2, an embodiment of the present application further provides an apparatus for detecting intrusion of a foreign object, including:

the first receiving module 201 receives a current frame visible light image sent by the acquisition device;

the identification module 202 is configured to determine a bounding box corresponding to a foreign object in the current frame visible light image through a pre-trained identification model, where the boundary box corresponding to the foreign object is input as the visible light image and output as the bounding box of the foreign object in the visible light image when the identification model is trained;

the second receiving module 203 receives the infrared video sent by the acquisition device; the current frame infrared image corresponds to the current frame visible light image in the infrared video;

the processing module 204 is configured to determine a boundary frame of the foreign object in the current frame infrared image according to the coordinate of the boundary frame of the foreign object determined in the current frame visible light image;

and the tracking module 205 is used for tracking the foreign object in the infrared video according to a bounding box of the foreign object in the infrared image of the current frame and a pre-constructed filter.

The above description is merely one or more embodiments of the present disclosure and is not intended to limit the present disclosure. Various modifications and alterations to one or more embodiments of the present description will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement or the like made within the spirit and principle of one or more embodiments of the present specification should be included in the scope of the claims of the present specification.

Claims

1. A method of detecting foreign object intrusion, comprising:

receiving a current frame visible light image sent by acquisition equipment;

determining a boundary frame corresponding to the foreign matter in the current frame of the visible light image through a pre-trained recognition model, wherein the boundary frame is input as the visible light image and output as the foreign matter in the visible light image when the recognition model is trained;

receiving an infrared video sent by the acquisition equipment; the current frame infrared image corresponds to the current frame visible light image in the infrared video;

determining the boundary frame of the foreign matter in the current frame infrared image according to the coordinates of the boundary frame of the foreign matter determined in the current frame visible image;

tracking the foreign matter in the infrared video according to a boundary frame of the foreign matter in the infrared image of the current frame and a pre-constructed filter;

according to the coordinates of the boundary frame of the foreign matter determined in the current frame visible light image, determining that the foreign matter is in front of the boundary frame in the current frame infrared image, wherein the method further comprises the following steps:

carrying out image registration on the current frame infrared image and the current frame visible light image;

performing image registration on the current frame infrared image and the current frame visible light image, specifically including:

predetermining an image deviation value between the infrared image and the visible light image; the image deviation value comprises an image scaling, an image deviation distance and an image precision coefficient;

according to the deviation value, carrying out image registration on the current frame infrared image and the current frame visible light image;

predetermining an image deviation value between the infrared image and the visible light image, which specifically comprises the following steps:

acquiring corresponding images of the visible light image and the infrared image on a calibration plate;

by the formula

Calculating to obtain the image scaling, wherein f₁For image scaling, n is the number of dots on the calibration plate, A_nIs the coordinate of the first center of a circle on the infrared image, A_n-1Is the coordinate of the second center of a circle on the infrared image, B_nCoordinates of a circle center corresponding to the first circle center on the visible light image, B_nThe coordinates of the circle center corresponding to the second circle on the visible light image are obtained;

by the formula

to the coordinates of the first pixel point on the infrared image,

to the coordinates of the second pixel point on the infrared image,

the coordinates of the pixel points corresponding to the second pixel points on the visible light image are obtained;

by the formula

Calculating to obtain an image precision coefficient, wherein sigma_xAs coefficient of image precision, d_xIs the base length, f₂For the focal length of the acquisition device,/_pixIs the pixel size, D_optimalFor an optimum distance, D_targetIs the target distance.

2. The method according to claim 1, wherein tracking the alien material in the infrared video according to a bounding box of the alien material in the infrared image of the current frame and a pre-configured filter, specifically comprises:

extracting position features with the size of 2 times from the current frame infrared image according to the position of the foreign matter in the boundary frame of the current frame infrared image;

calculating a plurality of predicted positions of the alien material in the next frame of infrared image by a pre-configured filter;

and calculating and determining the largest predicted position in the plurality of predicted positions as the position of the foreign object in the next infrared image.

3. The method according to claim 2, characterized in that the filter is in particular:

are respectively G, F^kThe complex conjugate of (a) and (b),

model A for the t-th l-dimension, B_tIs the t-th B model.

4. The method of claim 1, wherein prior to receiving the infrared video transmitted by the capture device, the method further comprises:

determining the category of the foreign matters to be one of pre-stored designated categories through the identification model, wherein the category of the foreign matters is divided in advance; when the recognition model is trained, the boundary box of the foreign matter in the visible light image is input, and the category of the foreign matter is output.

5. The method of claim 1, further comprising:

when the time length of the movement stop of the foreign matter exceeds the preset time length, or the movement track of the foreign matter exceeds the size range of the visible light video or the infrared video, the tracking of the foreign matter in the infrared video is stopped.

6. The method of claim 5, further comprising:

receiving other frames of visible light images sent by an acquisition device, wherein the imaging time of the other frames of visible light images is later than that of the current frame of visible light images;

determining the number of the foreign matters in the other frames of visible images, wherein the number of the foreign matters is larger than the number of the foreign matters when the tracking of the foreign matters in the infrared video is stopped;

and re-receiving the infrared video sent by the acquisition equipment so as to track the foreign matters in the other frames of visible light images.

7. An apparatus for detecting intrusion of a foreign object, comprising:

the first receiving module is used for receiving the current frame visible light image sent by the acquisition equipment;

the recognition module is used for determining a boundary frame corresponding to the foreign matter in the current frame visible light image through a pre-trained recognition model, wherein the boundary frame is input as the visible light image and output as the boundary frame of the foreign matter in the visible light image when the recognition model is trained;

the second receiving module is used for receiving the infrared video sent by the acquisition equipment; the current frame infrared image corresponds to the current frame visible light image in the infrared video;

the processing module is used for determining the boundary frame of the foreign matter in the current frame infrared image according to the coordinates of the boundary frame of the foreign matter determined in the current frame visible image;

by the formula

by the formula

(n ∈ (1,33)) calculating to obtain an image offset distance, wherein x_diffFor image shift distance in the x-axis, Y_diffFor the image shift distance on the y-axis,

to the coordinates of the first pixel point on the infrared image,

to the coordinates of the second pixel point on the infrared image,

by the formula

Calculating to obtain an image precision coefficient, wherein sigma_xAs coefficient of image precision, d_xIs the base length, f₂For the focal length of the acquisition device,/_pixIs the pixel size, D_optimalFor an optimum distance, D_targetIs the target distance;

and the tracking module tracks the foreign object in the infrared video according to the boundary frame of the foreign object in the current frame infrared image and a pre-constructed filter.