CN115830515A - Video target behavior identification method based on spatial grid - Google Patents
- Publication number: CN115830515A (application CN202310047339.6A)
- Authority: CN (China)
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Abstract
The invention discloses a video target behavior identification method based on a spatial grid. The method comprises: establishing a data set that contains the types and states of targets; identifying the type and state of a target in a video frame through a target identification algorithm, and detecting moving targets in the video frame through a moving-target detection algorithm; and, based on spatial-grid positioning, analyzing the behaviors and actions of targets in the video frame through target detection and motion detection combined with the situation around the grid cells. The invention is aimed at scenarios of video target behavior identification based on spatial grids: the actions of a target are identified through target detection and moving-target detection, and the behavior of the target is identified through spatial-grid positioning combined with the situation around the grid cells.
Description
Technical Field
The invention relates to the fields of geographic space rasterization processing and situation perception, in particular to a video target behavior identification method based on a spatial grid.
Background
Video target behavior identification based on spatial grids is a useful method for studying battlefield target behavior. By analyzing the behavior of battlefield targets over a long period, the resulting data are more scientific, more objective, and of greater reference value. Although extensive research and innovation has been carried out on behavior analysis and identification methods, most existing methods are based on traditional evaluation approaches and fall short of practical requirements in accuracy, timeliness, and practicality.
For example, prior-art document CN111222487A discloses a video target behavior recognition method and an electronic device, where the method includes: acquiring a video to be identified, the video comprising image frames to be identified; acquiring one or more local target images through a target detection model; matching the obtained local target images through a target tracking model to obtain one or more target image sequences; scoring the quality of target behaviors in each target image sequence through a target-behavior quality-scoring model to obtain high-quality target image subsequences; and performing behavior recognition on the obtained high-quality subsequences through a behavior recognition model to obtain a behavior recognition result. That method performs behavior recognition only on the high-quality subsequences of the video target image sequence: on one hand, this eliminates the influence of low-quality target behaviors on the overall recognition result; on the other hand, recognition efficiency improves because only high-quality target behaviors are processed. However, spatial information is not processed, so when the method is applied to analyzing battlefield target behaviors the results deviate and cannot meet the requirements.
Therefore, there is a need to solve the above problems.
Disclosure of Invention
The purpose of the invention is as follows: the invention aims to provide a video target behavior recognition method based on a spatial grid.
The technical scheme is as follows: in order to achieve the above purpose, the invention discloses a video target behavior recognition method based on a spatial grid, which comprises the following steps:
(1) Establishing a data set, wherein the data set comprises the type and the state of a target;
(2) The type and the state of the target in the video frame are identified through a target identification algorithm,
(3) Detecting a moving object of the video frame by a moving object detection algorithm;
(4) Based on spatial grid positioning, the behavior and the action of the target in the video frame are analyzed through target detection and motion detection by combining the peripheral conditions of the grid.
The data set in step (1) contains the types and states of targets. When the data set is produced, it includes the types of targets to be identified; if a target is in a fighting posture, its label notes that the target is in a fighting state.
Preferably, the step (2) specifically comprises the following steps:
(2.1) detecting through a multi-scale feature map,
(2.2) classifying and regressing the features extracted by performing convolution calculations on feature maps of different sizes through a convolutional network;
and (2.3) training the network with prior boxes.
Furthermore, the specific steps of detection through multi-scale feature maps in step (2.1) are as follows: the neural network used for computation is divided into six layers of feature maps for image classification and regression. The feature maps differ in size: the map at the front of the network is larger and, as pooling layers are added, the maps become smaller toward the back; smaller targets are processed by larger-scale feature maps, and larger targets by smaller-scale feature maps.
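As a rough sketch of the pyramid just described — the level sizes below are illustrative (each pooling stage assumed to halve the map), not values taken from the patent:

```python
# Hypothetical sketch: six feature-map levels, each halved by a pooling stage.
# The input side length and the exact halving rule are assumptions.
def feature_map_sizes(input_side: int, levels: int = 6) -> list:
    sizes = []
    side = input_side
    for _ in range(levels):
        side = max(1, side // 2)   # each pooling stage halves the map side
        sizes.append(side)
    return sizes

# Large maps (front of the network) handle small targets;
# small maps (back of the network) handle large targets.
print(feature_map_sizes(300))
```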
Further, the specific steps of training the network with prior boxes in step (2.3) are as follows:
Boxes of different sizes and aspect ratios are set centered on the pixels of the feature map; each pixel is provided with several prior boxes of different sizes and aspect ratios so as to detect targets of different sizes and aspect ratios. The network model is trained with the prior box that best fits the detection target in the picture. The sizes of the prior boxes increase linearly, satisfying:

$$s_k = s_{\min} + \frac{s_{\max}-s_{\min}}{m-1}(k-1),\qquad k\in[1,m]$$

where m is the number of feature maps (here m = 5), $s_k$ represents the ratio of the size of the k-th prior box to the picture size, and $s_{\min}$ and $s_{\max}$ represent the minimum and maximum values of $s_k$;
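The linear scale rule can be sketched as below; the endpoint values s_min = 0.2 and s_max = 0.9 are the common SSD-style defaults assumed for illustration, not values stated in the patent:

```python
def prior_box_scales(m: int = 5, s_min: float = 0.2, s_max: float = 0.9) -> list:
    """Linearly increasing prior-box scales s_k for k = 1..m.
    s_min and s_max are assumed defaults, not patent-specified values."""
    return [s_min + (s_max - s_min) * (k - 1) / (m - 1) for k in range(1, m + 1)]

# Each scale is the ratio of a prior box's size to the picture size.
print(prior_box_scales())
```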
matching the generated prior boxes to real detection targets follows two criteria. The first is to find, in the feature map, the prior box with the greatest overlap with the real detection target in the picture; overlap is expressed by the IOU (intersection over union), and the prior box with the maximum IOU value is matched to the real detection target. The second criterion avoids too large a gap between the numbers of positive and negative samples: for the remaining prior boxes whose IOU is not the maximum, if the IOU with a real target exceeds the set threshold, the prior box is also considered matched to that real target. The final output of the network is the class confidence and position-coordinate information of the predicted targets, so the loss function is a weighted sum of the class-confidence error and the predicted-position error:

$$L(x,c,l,g)=\frac{1}{N}\bigl(L_{conf}(x,c)+\alpha L_{loc}(x,l,g)\bigr)$$

where:
N: represents the number of positive samples among the prior boxes;
$x_{ij}^{p}$: takes only the value 0 or 1; if it equals 1, the j-th real target in the picture is matched with the i-th prior box, and the type of the real target is p;
c: represents the confidence of the target category;
l: represents the predicted value for the real target;
g: represents the position information of the real target;
Pos: the positive-sample set;
Neg: the negative-sample set;
$(cx, cy, w, h)$: the center coordinates, width, and height of a box;
$l_{i}^{m}$: the predicted value of the i-th prior box for coordinate $m\in\{cx,cy,w,h\}$;
$\hat{g}_{j}^{m}$: the encoded position of the j-th real target for coordinate m, calculated from the real-target position g and the prior box d as:

$$\hat{g}_{j}^{cx}=\frac{g_{j}^{cx}-d_{i}^{cx}}{d_{i}^{w}},\quad \hat{g}_{j}^{cy}=\frac{g_{j}^{cy}-d_{i}^{cy}}{d_{i}^{h}},\quad \hat{g}_{j}^{w}=\log\frac{g_{j}^{w}}{d_{i}^{w}},\quad \hat{g}_{j}^{h}=\log\frac{g_{j}^{h}}{d_{i}^{h}}$$
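The IOU measure and the two matching rules can be sketched as follows; the box layout, the 0.5 threshold, and the function names are illustrative assumptions, not the patent's implementation:

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def match_priors(priors, truths, threshold=0.5):
    """Rule 1: the best-IOU prior is matched to each real target.
    Rule 2: any remaining prior whose IOU with some real target
    exceeds the threshold is also matched (balances pos/neg samples)."""
    matches = {}
    for j, g in enumerate(truths):
        best = max(range(len(priors)), key=lambda i: iou(priors[i], g))
        matches[best] = j
    for i, p in enumerate(priors):
        if i in matches:
            continue
        for j, g in enumerate(truths):
            if iou(p, g) > threshold:
                matches[i] = j
                break
    return matches
```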
Moreover, in the moving-object detection of step (3), the gray value of each pixel is represented by several Gaussian distributions, each Gaussian distribution function having a different weight. If a pixel in the current video frame conforms to the established Gaussian model it is considered background; otherwise it is considered foreground. The parameters of the Gaussian model are then updated, the different Gaussian distributions are sorted by priority, and the consistent Gaussian distributions are selected as the background model through a set threshold.
Further, in step (3), moving-object detection is performed on the video frame by a moving-object detection algorithm. K Gaussian distribution functions are selected to represent the gray value of each pixel in the image, and M of the K Gaussian distributions are selected as models describing the background. Different Gaussian distributions are given different weights $w_{i,t}$, where i indexes the distributions, so $i \le K$. Appropriate weights and a threshold are selected; when the weights satisfy the threshold, pixels conforming to those Gaussian distributions are regarded as background and the rest as foreground. Let the gray value of a pixel at time t be $X_t$ and express its probability density function as a combination of the K Gaussian distribution functions:

$$P(X_t)=\sum_{i=1}^{K} w_{i,t}\,\eta\bigl(X_t,\mu_{i,t},\Sigma_{i,t}\bigr)$$

where:
$w_{i,t}$ represents the i-th weight of the Gaussian mixture model at time t, and the sum of all weights is 1;
$\mu_{i,t}$ represents the mean of the pixel gray values of the i-th model of the Gaussian mixture at time t;
$\Sigma_{i,t}$ represents the covariance of the pixel gray values of the i-th model of the Gaussian mixture at time t.
The K Gaussian distribution functions are arranged in descending order, and then the first M Gaussian distributions are selected as background according to a preset threshold. When a new image is processed, the pixels of the image are compared and matched against the established Gaussian mixture model; if a pixel satisfies, for the i-th Gaussian distribution in the model,

$$\bigl|X_t-\mu_{i,t-1}\bigr|\le 2.5\,\sigma_{i,t-1}$$

where $\mu_{i,t-1}$ and $\sigma_{i,t-1}$ are the mean and standard deviation of the i-th model at time t − 1, then the point is considered to match the i-th Gaussian distribution, and the successfully matched function is updated with the following parameters:

$$w_{i,t}=(1-\alpha)\,w_{i,t-1}+\alpha$$
$$\mu_{i,t}=(1-\rho)\,\mu_{i,t-1}+\rho X_t$$
$$\sigma_{i,t}^{2}=(1-\rho)\,\sigma_{i,t-1}^{2}+\rho\bigl(X_t-\mu_{i,t}\bigr)^{2}$$

where $\alpha$ (0 ≤ $\alpha$ ≤ 1) represents the learning rate — the larger the value, the more frequently the background in the video is updated — and $\rho$ is the parameter learning rate derived from $\alpha$; $w_{i,t}$ represents the i-th weight of the Gaussian mixture model at time t, with all weights summing to 1; $\mu_{i,t}$ represents the mean and $\sigma_{i,t}^{2}$ the variance of the pixel gray values of the i-th model at time t.
If a pixel does not match the i-th Gaussian distribution function, the parameters of that function need not change; only the corresponding weight is updated:

$$w_{i,t}=(1-\alpha)\,w_{i,t-1}$$

If a pixel matches none of its corresponding Gaussian distribution functions, the pixel is judged foreground, and the Gaussian model with the smallest weight among the established models is replaced; the mean of the new replacement Gaussian is the gray value of the current pixel. The weights of the updated background model are normalized and the Gaussian distribution functions are sorted in descending order of the ratio of weight to standard deviation; the foreground is then screened according to the set threshold T: the first M Gaussian distributions meeting the condition are set as background and the rest as foreground.
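The per-pixel mixture update described above can be sketched in simplified form; the constants (K = 3, α = 0.05, the 2.5σ match test, the initial variance) are conventional choices assumed here, not patent-specified values, and the second learning rate ρ is collapsed to α for brevity:

```python
import math

class PixelMixture:
    """Per-pixel mixture of K Gaussians (Stauffer-Grimson-style sketch).
    All numeric constants are illustrative assumptions."""
    def __init__(self, k=3, alpha=0.05, init_var=36.0):
        self.k, self.alpha = k, alpha
        self.w = [1.0 / k] * k          # weights, summing to 1
        self.mu = [0.0] * k             # per-Gaussian means
        self.var = [init_var] * k       # per-Gaussian variances

    def update(self, x):
        """Update the mixture with gray value x; True if x matched a Gaussian."""
        for i in range(self.k):
            if abs(x - self.mu[i]) <= 2.5 * math.sqrt(self.var[i]):
                rho = self.alpha  # simplification: rho taken equal to alpha
                self.w = [(1 - self.alpha) * w for w in self.w]  # decay all weights
                self.w[i] += self.alpha                          # boost the match
                self.mu[i] = (1 - rho) * self.mu[i] + rho * x
                self.var[i] = (1 - rho) * self.var[i] + rho * (x - self.mu[i]) ** 2
                return True
        # No match: foreground pixel; replace the lowest-weight Gaussian
        # with one centred on the current gray value, then renormalize.
        j = min(range(self.k), key=lambda i: self.w[i])
        self.mu[j], self.var[j] = x, 36.0
        s = sum(self.w)
        self.w = [w / s for w in self.w]
        return False
```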
Preferably, in step (4), the combat zone corresponding to one high-scale map sheet consists of R × C low-scale map sheets; the combat zone and combat basic zones are divided accordingly, i.e., one combat zone contains R × C combat basic zones, and the boundary lines of the combat basic zones are connected to form attack-and-defense lines and cooperation lines. According to the positions of the combat zone and the combat basic zone to which a target belongs, the behavior of the target is judged by combining the sea, land, and air environment and an analysis of the three-dimensional environment around the grid cell. If a target is detected at the same position in the previous frame and the current frame and no moving target is detected at that position, the detected target is static; if a target is detected in the previous frame and a target is detected near the same position in the current frame while a moving target is detected in that area, the detected target is moving; if a target detected in a normal state in the previous frame is detected in a fighting state at the same position in the current frame, and a moving target is detected at that position, the detected target is fighting. If a target is static or moving in our area, it is identified as an intrusion; if a target is fighting in our area, it is identified as an attack.
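The frame-to-frame decision rules can be sketched as a small rule function; the state labels and the `in_our_area` flag are illustrative names introduced here, not identifiers from the patent:

```python
def classify_behavior(prev_detected, curr_detected, motion_at_pos,
                      curr_state="normal", in_our_area=False):
    """Return (action, behavior) following the rules above.
    action: static / moving / fighting; behavior: intrusion / attack."""
    action = None
    if prev_detected and curr_detected and not motion_at_pos:
        action = "static"          # same position, no motion detected
    elif prev_detected and curr_detected and motion_at_pos:
        # motion present: fighting if the detector reports a fighting state
        action = "fighting" if curr_state == "fighting" else "moving"
    behavior = None
    if in_our_area:
        if action in ("static", "moving"):
            behavior = "intrusion"
        elif action == "fighting":
            behavior = "attack"
    return action, behavior
```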
Beneficial effects: compared with the prior art, the invention has the following notable advantages: it combines the spatial grid with target recognition, target detection, and motion-detection algorithms to analyze battlefield target behavior, so the results are more accurate and obtained faster, better suiting the needs of the modern battlefield.
Drawings
FIG. 1 is a schematic flow diagram of the present invention;
FIG. 2 is a schematic diagram of multi-scale feature map detection according to the present invention;
FIG. 3 is a schematic diagram of action recognition in the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail with reference to the accompanying drawings and specific embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and do not limit the invention.
As shown in fig. 1, the invention relates to a video target behavior recognition method based on spatial grid, which comprises the following steps:
(1) Establishing a data set, where the data set contains the types and states of targets. When the data set is produced, it includes the types of targets to be identified; if a target is in a fighting posture, its label notes that the target is in a fighting state;
(2) The method for identifying the type and the state of the target in the video frame through the target identification algorithm specifically comprises the following steps:
(2.1) Detection through multi-scale feature maps: as shown in fig. 2, the neural network used for computation is divided into six layers of feature maps for image classification and regression. The feature maps differ in size: the map at the front of the network is larger and, as pooling layers are added, the maps become smaller toward the back; smaller targets are processed by larger-scale feature maps and larger targets by smaller-scale feature maps;
(2.2) Directly performing convolution calculations on feature maps of different sizes through a convolutional network to extract features, then classifying and regressing those features. When a general neural network detects targets, a convolutional network usually extracts the picture's features, which are then fed into a fully connected network for classification or regression; the invention instead applies the convolutional network directly to feature maps of different sizes and uses the extracted features for classification and regression, as shown in fig. 2;
(2.3) The network is trained with prior boxes. Boxes of different sizes and aspect ratios are set centered on the pixels of the feature map; each pixel is provided with several prior boxes of different sizes and aspect ratios so as to detect targets of different sizes and aspect ratios. The network model is trained with the prior box that best fits the detection target in the picture. The sizes of the prior boxes increase linearly, satisfying:

$$s_k = s_{\min} + \frac{s_{\max}-s_{\min}}{m-1}(k-1),\qquad k\in[1,m]$$

where m is the number of feature maps (here m = 5), $s_k$ represents the ratio of the size of the k-th prior box to the picture size, and $s_{\min}$ and $s_{\max}$ represent the minimum and maximum values of $s_k$;
matching the generated prior boxes to real detection targets follows two criteria. The first is to find, in the feature map, the prior box with the greatest overlap with the real detection target in the picture; overlap is expressed by the IOU (intersection over union), and the prior box with the maximum IOU value is matched to the real detection target. The second criterion avoids too large a gap between the numbers of positive and negative samples: for the remaining prior boxes whose IOU is not the maximum, if the IOU with a real target exceeds the set threshold, the prior box is also considered matched to that real target. The final output of the network is the class confidence and position-coordinate information of the predicted targets, so the loss function is a weighted sum of the class-confidence error and the predicted-position error:

$$L(x,c,l,g)=\frac{1}{N}\bigl(L_{conf}(x,c)+\alpha L_{loc}(x,l,g)\bigr)$$

where:
N: represents the number of positive samples among the prior boxes;
$x_{ij}^{p}$: takes only the value 0 or 1; if it equals 1, the j-th real target in the picture is matched with the i-th prior box, and the type of the real target is p;
c: represents the confidence of the target category;
l: represents the predicted value for the real target;
g: represents the position information of the real target;
Pos: the positive-sample set;
Neg: the negative-sample set;
$(cx, cy, w, h)$: the center coordinates, width, and height of a box;
$l_{i}^{m}$: the predicted value of the i-th prior box for coordinate $m\in\{cx,cy,w,h\}$;
$\hat{g}_{j}^{m}$: the encoded position of the j-th real target for coordinate m, calculated from the real-target position g and the prior box d as:

$$\hat{g}_{j}^{cx}=\frac{g_{j}^{cx}-d_{i}^{cx}}{d_{i}^{w}},\quad \hat{g}_{j}^{cy}=\frac{g_{j}^{cy}-d_{i}^{cy}}{d_{i}^{h}},\quad \hat{g}_{j}^{w}=\log\frac{g_{j}^{w}}{d_{i}^{w}},\quad \hat{g}_{j}^{h}=\log\frac{g_{j}^{h}}{d_{i}^{h}}$$
(3) Performing moving-object detection on the video frame through a moving-object detection algorithm. In moving-object detection, the gray value of each pixel is represented by several Gaussian distributions, each Gaussian distribution function having a different weight. If a pixel in the current video frame conforms to the established Gaussian model it is considered background; otherwise it is considered foreground. The parameters of the Gaussian model are then updated, the different Gaussian distributions are sorted by priority, and the consistent Gaussian distributions are selected as the background model through a set threshold;
moving-object detection is performed on the video frame through the moving-object detection algorithm: K Gaussian distribution functions are selected to represent the gray value of each pixel in the image, and M of the K Gaussian distributions are selected as models describing the background. Different Gaussian distributions are given different weights $w_{i,t}$, where i indexes the distributions, so $i \le K$. Appropriate weights and a threshold are selected; when the weights satisfy the threshold, pixels conforming to those Gaussian distributions are regarded as background and the rest as foreground. Let the gray value of a pixel at time t be $X_t$ and express its probability density function as a combination of the K Gaussian distribution functions:

$$P(X_t)=\sum_{i=1}^{K} w_{i,t}\,\eta\bigl(X_t,\mu_{i,t},\Sigma_{i,t}\bigr)$$

where:
$w_{i,t}$ represents the i-th weight of the Gaussian mixture model at time t, and the sum of all weights is 1;
$\mu_{i,t}$ represents the mean of the pixel gray values of the i-th model of the Gaussian mixture at time t;
$\Sigma_{i,t}$ represents the covariance of the pixel gray values of the i-th model of the Gaussian mixture at time t.
The K Gaussian distribution functions are arranged in descending order, and then the first M Gaussian distributions are selected as background according to a preset threshold. When a new image is processed, the pixels of the image are compared and matched against the established Gaussian mixture model; if a pixel satisfies, for the i-th Gaussian distribution in the model,

$$\bigl|X_t-\mu_{i,t-1}\bigr|\le 2.5\,\sigma_{i,t-1}$$

where $\mu_{i,t-1}$ and $\sigma_{i,t-1}$ are the mean and standard deviation of the i-th model at time t − 1, then the point is considered to match the i-th Gaussian distribution, and the successfully matched function is updated with the following parameters:

$$w_{i,t}=(1-\alpha)\,w_{i,t-1}+\alpha$$
$$\mu_{i,t}=(1-\rho)\,\mu_{i,t-1}+\rho X_t$$
$$\sigma_{i,t}^{2}=(1-\rho)\,\sigma_{i,t-1}^{2}+\rho\bigl(X_t-\mu_{i,t}\bigr)^{2}$$

where $\alpha$ (0 ≤ $\alpha$ ≤ 1) represents the learning rate — the larger the value, the more frequently the background in the video is updated — and $\rho$ is the parameter learning rate derived from $\alpha$; $w_{i,t}$ represents the i-th weight of the Gaussian mixture model at time t, with all weights summing to 1; $\mu_{i,t}$ represents the mean and $\sigma_{i,t}^{2}$ the variance of the pixel gray values of the i-th model at time t.
If a pixel does not match the i-th Gaussian distribution function, the parameters of that function need not change; only the corresponding weight is updated:

$$w_{i,t}=(1-\alpha)\,w_{i,t-1}$$

If a pixel matches none of its corresponding Gaussian distribution functions, the pixel is judged foreground, and the Gaussian model with the smallest weight among the established models is replaced; the mean of the new replacement Gaussian is the gray value of the current pixel. The weights of the updated background model are normalized and the Gaussian distribution functions are sorted in descending order of the ratio of weight to standard deviation; the foreground is then screened according to the set threshold T: the first M Gaussian distributions meeting the condition are set as background and the rest as foreground.
(4) Based on spatial grid positioning, combining the peripheral conditions of grids, and analyzing the behavior and the action of a target in a video frame through target detection and motion detection;
the method comprises the following steps of 1, forming a combat region corresponding to the range of 100 ten thousand single maps by 144-frame 1; according to the positions of the combat zone and the combat basic zone to which the target belongs, the behavior of the target is judged by combining the sea, land and air environment and the surrounding three-dimensional environment analysis of the grid; the schematic diagram of motion recognition is shown in fig. 3, if an object is detected at the same position in the previous frame and the current frame and no moving object is detected at the same position, the detected object is in a static state; if the target is detected in the previous frame and the target is detected near the same position in the current frame, and the moving target is detected in the area at the same time, the detected target is in a moving state; if the target is detected to be in a normal state in the previous frame, the target is detected to be in a fighting state in the same position of the current frame, and the moving target is detected to be in the same position, the detected target is in the fighting state; if the target is static or moving in the area, the target is identified as an intrusion behavior; if the target fights in the area, the target is identified as an attack.
Claims (8)
1. A video target behavior identification method based on spatial grids is characterized by comprising the following steps:
(1) Establishing a data set, wherein the data set comprises the type and the state of a target;
(2) The type and the state of the target in the video frame are identified through a target identification algorithm,
(3) Detecting a moving object of the video frame by a moving object detection algorithm;
(4) Based on spatial grid positioning, the behavior and the action of the target in the video frame are analyzed through target detection and motion detection in combination with the peripheral situation of the grid.
2. The method for identifying the behavior of a video target based on a spatial grid as claimed in claim 1, wherein: the data set in step (1) contains the types and states of targets; when the data set is produced, it includes the types of targets to be identified, and if a target is in a fighting posture, its label notes that the target is in a fighting state.
3. The method of claim 2, wherein the video object behavior recognition based on spatial grid is characterized in that: the step (2) specifically comprises the following steps:
(2.1) detecting through a multi-scale feature map,
(2.2) classifying and regressing the features extracted by performing convolution calculations on feature maps of different sizes through a convolutional network;
and (2.3) training the network with prior boxes.
4. The method according to claim 3, characterized in that: the specific steps of detection through multi-scale feature maps in step (2.1) are as follows: the neural network used for computation is divided into six layers of feature maps for image classification and regression; the feature maps differ in size: the map at the front of the network is larger and, as pooling layers are added, the maps become smaller toward the back; larger-scale feature maps process smaller targets, and smaller-scale feature maps process larger targets.
5. The method according to claim 4, wherein the video target behavior recognition method based on the spatial grid is characterized in that: the specific steps of the network training in the step (2.3) adopting the prior frame are as follows:
boxes of different sizes and aspect ratios are set centered on the pixels of the feature map; each pixel is provided with several prior boxes of different sizes and aspect ratios so as to detect targets of different sizes and aspect ratios; the network model is trained with the prior box that best fits the detection target in the picture; the sizes of the prior boxes increase linearly, satisfying:

$$s_k = s_{\min} + \frac{s_{\max}-s_{\min}}{m-1}(k-1),\qquad k\in[1,m]$$

where m is the number of feature maps (here m = 5), $s_k$ represents the ratio of the size of the k-th prior box to the picture size, and $s_{\min}$ and $s_{\max}$ represent the minimum and maximum values of $s_k$;
matching the generated prior boxes to real detection targets follows two criteria: the first is to find, in the feature map, the prior box with the greatest overlap with the real detection target in the picture, expressing overlap by the IOU (intersection over union), and to match the prior box with the maximum IOU value to the real detection target; the second criterion avoids too large a gap between the numbers of positive and negative samples: for the remaining prior boxes whose IOU is not the maximum, if the IOU with a real target exceeds the set threshold, the prior box is also considered matched to that real target; the final output of the network is the class confidence and position-coordinate information of the predicted targets, so the loss function is a weighted sum of the class-confidence error and the predicted-position error:

$$L(x,c,l,g)=\frac{1}{N}\bigl(L_{conf}(x,c)+\alpha L_{loc}(x,l,g)\bigr)$$

where:
N: represents the number of positive samples among the prior boxes;
$x_{ij}^{p}$: takes only the value 0 or 1; if it equals 1, the j-th real target in the picture is matched with the i-th prior box, and the type of the real target is p;
c: represents the confidence of the target category;
l: represents the predicted value for the real target;
g: represents the position information of the real target;
Pos: the positive-sample set;
Neg: the negative-sample set;
$(cx, cy, w, h)$: the center coordinates, width, and height of a box;
$l_{i}^{m}$: the predicted value of the i-th prior box for coordinate $m\in\{cx,cy,w,h\}$;
$\hat{g}_{j}^{m}$: the encoded position of the j-th real target for coordinate m, calculated from the real-target position g and the prior box d as:

$$\hat{g}_{j}^{cx}=\frac{g_{j}^{cx}-d_{i}^{cx}}{d_{i}^{w}},\quad \hat{g}_{j}^{cy}=\frac{g_{j}^{cy}-d_{i}^{cy}}{d_{i}^{h}},\quad \hat{g}_{j}^{w}=\log\frac{g_{j}^{w}}{d_{i}^{w}},\quad \hat{g}_{j}^{h}=\log\frac{g_{j}^{h}}{d_{i}^{h}}$$
6. The video target behavior recognition method based on the spatial grid according to claim 5, characterized in that: in the moving object detection of step (3), the gray value of each pixel point is represented by a plurality of Gaussian distributions, each Gaussian distribution function having a different weight; if a pixel in the current video frame conforms to the established Gaussian model, the pixel is regarded as background, otherwise as foreground; the parameters of the Gaussian model are then updated, the different Gaussian distributions are sorted by priority, and the matching Gaussian distributions are selected as the background model according to a set threshold.
7. The video target behavior recognition method based on the spatial grid according to claim 6, characterized in that: in step (3), moving object detection is performed on the video frame by a moving object detection algorithm; K Gaussian distribution functions are selected to represent the gray value of each pixel point in the image, and M of the K Gaussian distributions are selected as the models describing the background; different Gaussian distributions are given different weights $\omega_{i,t}$, where i indexes the Gaussian distributions, so $i \leq K$; suitable weights and a threshold are selected, and when the weights satisfy the threshold, the pixels conforming to those Gaussian distributions are regarded as background and the remaining pixels as foreground; let the gray value of a pixel at time t be $X_t$; its probability density function is expressed as a combination of K Gaussian distribution functions:

$P(X_t) = \sum_{i=1}^{K} \omega_{i,t} \cdot \eta(X_t, \mu_{i,t}, \Sigma_{i,t})$

wherein:
$\omega_{i,t}$: represents the i-th weight of the Gaussian mixture model at time t, the sum of all the weights being 1;
$\mu_{i,t}$: represents the mean of the pixel gray values of the i-th model of the Gaussian mixture model at time t;
$\Sigma_{i,t}$: represents the covariance of the pixel gray values of the i-th model of the Gaussian mixture model at time t, and $\eta$ is the Gaussian probability density function;
the K Gaussian distribution functions are arranged in descending order, and the first M Gaussian distributions are then selected as the background according to a preset threshold; when a new image is processed, the pixel points on the image are compared and matched against the established Gaussian mixture model; if a certain pixel point satisfies the i-th Gaussian distribution of the Gaussian mixture model, i.e.:

$\left| X_t - \mu_{i,t-1} \right| \leq 2.5\,\sigma_{i,t-1}$

wherein $\mu_{i,t-1}$ represents the mean and $\sigma_{i,t-1}$ the standard deviation of the pixel gray values of the i-th model of the Gaussian mixture model at time t-1;
then the point is considered to match the i-th Gaussian distribution, and the successfully matched function is updated with the following parameters:

$\omega_{i,t} = (1 - \alpha)\,\omega_{i,t-1} + \alpha$
$\mu_{i,t} = (1 - \rho)\,\mu_{i,t-1} + \rho X_t$
$\sigma_{i,t}^{2} = (1 - \rho)\,\sigma_{i,t-1}^{2} + \rho\,(X_t - \mu_{i,t})^{2}$

wherein $\alpha$ ($0 \leq \alpha \leq 1$) represents the learning rate: the larger its value, the more frequently the background in the video is updated; $\rho$ is the parameter update rate;
$\mu_{i,t}$: represents the mean of the pixel gray values of the i-th model of the Gaussian mixture model at time t;
$\sigma_{i,t}^{2}$: represents the variance of the pixel gray values of the i-th model of the Gaussian mixture model at time t;
if a pixel does not match a Gaussian distribution function, the parameters of that Gaussian distribution function need not be changed and only the corresponding weight is updated, the corresponding formula being:

$\omega_{i,t} = (1 - \alpha)\,\omega_{i,t-1}$
if a pixel does not match any of its corresponding Gaussian distribution functions, the pixel is judged to be foreground, and the Gaussian model with the minimum weight among the established models is replaced; the mean of the replacing new Gaussian function is the gray value of the current pixel; the weights of the updated background model are normalized and the Gaussian distribution functions are sorted in descending order of weight $\omega_{i,t}$; the foreground is then screened according to the set threshold T: the first M Gaussian distributions satisfying the condition are set as the background, and the remaining Gaussian distributions as the foreground.
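A minimal per-pixel sketch of this mixture-of-Gaussians background model (Stauffer-Grimson style) follows. The values alpha = 0.05, T = 0.7, the 2.5-sigma match test, and the initial variance are illustrative assumptions; the claims leave the exact thresholds unspecified, and the full model would set rho from the Gaussian density rather than equal to alpha:

```python
# Per-pixel mixture-of-Gaussians background model sketch.
# alpha, t (threshold T), var0, and rho = alpha are assumed values.
class PixelGMM:
    def __init__(self, k=3, alpha=0.05, var0=36.0, t=0.7):
        self.alpha = alpha  # learning rate: larger -> faster background update
        self.t = t          # weight threshold T selecting the background models
        self.var0 = var0    # variance given to a freshly replaced Gaussian
        # each mixture component: [weight, mean, variance]
        self.g = [[1.0 / k, 0.0, var0] for _ in range(k)]

    def update(self, x):
        """Update the mixture with gray value x; return True if x is background."""
        matched = None
        for gauss in self.g:
            if abs(x - gauss[1]) <= 2.5 * gauss[2] ** 0.5:  # 2.5-sigma match test
                matched = gauss
                break
        if matched is None:
            # no component matches: replace the one with the smallest weight,
            # centering the new Gaussian on the current pixel value
            weakest = min(self.g, key=lambda m: m[0])
            weakest[:] = [0.05, float(x), self.var0]
        for gauss in self.g:
            if gauss is matched:
                rho = self.alpha  # simplified update rate (rho = alpha * eta in full model)
                gauss[0] = (1 - self.alpha) * gauss[0] + self.alpha
                gauss[1] = (1 - rho) * gauss[1] + rho * x
                gauss[2] = (1 - rho) * gauss[2] + rho * (x - gauss[1]) ** 2
            else:
                gauss[0] *= 1 - self.alpha  # unmatched: only the weight decays
        total = sum(m[0] for m in self.g)   # normalize the weights
        for m in self.g:
            m[0] /= total
        # sort in descending order of weight; the first M components whose
        # cumulative weight exceeds T form the background model
        self.g.sort(key=lambda m: m[0], reverse=True)
        cum, background = 0.0, []
        for m in self.g:
            background.append(m)
            cum += m[0]
            if cum > self.t:
                break
        return matched is not None and any(m is matched for m in background)
```

Feeding a stable gray value makes its Gaussian dominate the mixture and be classified as background, while a sudden different value falls outside every 2.5-sigma band and is reported as foreground.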
8. The video target behavior recognition method based on the spatial grid according to claim 7, characterized in that: the combat zone corresponding to the high-scale single-map range in step (4) consists of R × C low-scale single maps, and is divided into the combat zone and basic combat zones, i.e. one combat zone comprises R × C basic combat zones, the boundary lines of the basic combat zones being connected to form attack-defense lines and coordination lines; according to the positions of the combat zone and the basic combat zone to which the target belongs, the behavior of the target is judged in combination with the sea, land, and air environment and an analysis of the three-dimensional environment surrounding the grid; if a target is detected at the same position in both the previous frame and the current frame and no moving target is detected at that position, the detected target is in a static state; if a target is detected in the previous frame, a target is detected near the same position in the current frame, and a moving target is simultaneously detected in that area, the detected target is in a moving state; if the target is detected in a normal state in the previous frame, detected in a fighting state at the same position in the current frame, and a moving target is detected at that position, the detected target is in a fighting state; if the target is static or moving within the area, its behavior is identified as intrusion; if the target is fighting within the area, its behavior is identified as attack.
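The per-zone decision rules above can be sketched as follows; the function name, signature, and string labels are illustrative, not part of the claimed method:

```python
# Sketch of the step (4) behavior decision for one basic combat zone.
# The boolean inputs and the "fighting" state label mirror the text;
# the interface itself is an illustrative assumption.
def classify_behavior(prev_detected, curr_detected, moving_detected,
                      curr_state="normal"):
    """Decide the behavior for one basic combat zone from two frames.

    prev_detected / curr_detected: a target was detected at (or near) the
        same position in the previous / current frame
    moving_detected: the moving-object detector also fired in that area
    curr_state: the state label detected for the target in the current frame
    """
    if not (prev_detected and curr_detected):
        return "none"  # no target persisting in the zone across frames
    if curr_state == "fighting" and moving_detected:
        return "attack"  # fighting state in the area -> attack behavior
    # a static (no motion) or moving target persisting in the area -> intrusion
    return "intrusion"
```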
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310047339.6A CN115830515B (en) | 2023-01-31 | 2023-01-31 | Video target behavior recognition method based on space grid |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115830515A true CN115830515A (en) | 2023-03-21 |
CN115830515B CN115830515B (en) | 2023-05-02 |
Family
ID=85520637
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310047339.6A Active CN115830515B (en) | 2023-01-31 | 2023-01-31 | Video target behavior recognition method based on space grid |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115830515B (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103258332A (en) * | 2013-05-24 | 2013-08-21 | 浙江工商大学 | Moving object detection method resisting illumination variation |
CN111477034A (en) * | 2020-03-16 | 2020-07-31 | 中国电子科技集团公司第二十八研究所 | Large-scale airspace use plan conflict detection and release method based on grid model |
CN112070035A (en) * | 2020-09-11 | 2020-12-11 | 联通物联网有限责任公司 | Target tracking method and device based on video stream and storage medium |
CN115098993A (en) * | 2022-05-16 | 2022-09-23 | 南京航空航天大学 | Unmanned aerial vehicle conflict detection method and device for airspace digital grid and storage medium |
CN115493591A (en) * | 2022-06-13 | 2022-12-20 | 中国人民解放军海军航空大学 | Multi-route planning method |
CN115578668A (en) * | 2022-09-15 | 2023-01-06 | 浙江大华技术股份有限公司 | Target behavior recognition method, electronic device, and storage medium |
Non-Patent Citations (2)
Title |
---|
机器学习算法那些事: "Object Detection: SSD Principles and Implementation" *
YANG CHAOYU: "Research on Object Detection, Tracking and Feature Classification Based on Computer Vision" *
Also Published As
Publication number | Publication date |
---|---|
CN115830515B (en) | 2023-05-02 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||