CN113393495B - High-altitude parabolic track identification method based on reinforcement learning - Google Patents
- Publication number
- CN113393495B CN113393495B CN202110685692.8A CN202110685692A CN113393495B CN 113393495 B CN113393495 B CN 113393495B CN 202110685692 A CN202110685692 A CN 202110685692A CN 113393495 B CN113393495 B CN 113393495B
- Authority
- CN
- China
- Prior art keywords
- altitude parabolic
- image
- model
- reinforcement learning
- altitude
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Links
- 238000000034 method Methods 0.000 title claims abstract description 55
- 230000002787 reinforcement Effects 0.000 title claims abstract description 48
- 238000012549 training Methods 0.000 claims abstract description 51
- 238000013500 data storage Methods 0.000 claims abstract description 32
- 238000003860 storage Methods 0.000 claims abstract description 18
- 238000007781 pre-processing Methods 0.000 claims abstract description 14
- 230000009471 action Effects 0.000 claims description 95
- 238000004088 simulation Methods 0.000 claims description 28
- 230000008569 process Effects 0.000 claims description 18
- 238000001514 detection method Methods 0.000 claims description 14
- 238000001914 filtration Methods 0.000 claims description 13
- 238000009826 distribution Methods 0.000 claims description 9
- 230000009466 transformation Effects 0.000 claims description 7
- 230000008030 elimination Effects 0.000 claims description 6
- 238000003379 elimination reaction Methods 0.000 claims description 6
- 238000007500 overflow downdraw method Methods 0.000 claims description 5
- 230000008859 change Effects 0.000 claims description 4
- 239000003795 chemical substances by application Substances 0.000 description 10
- 230000006870 function Effects 0.000 description 10
- 230000033001 locomotion Effects 0.000 description 9
- 238000004422 calculation algorithm Methods 0.000 description 8
- 230000006399 behavior Effects 0.000 description 7
- 238000012545 processing Methods 0.000 description 7
- 239000000523 sample Substances 0.000 description 7
- 230000000694 effects Effects 0.000 description 4
- 230000003993 interaction Effects 0.000 description 4
- 239000000463 material Substances 0.000 description 4
- 238000005457 optimization Methods 0.000 description 4
- 230000009286 beneficial effect Effects 0.000 description 3
- 230000000875 corresponding effect Effects 0.000 description 3
- 230000006378 damage Effects 0.000 description 3
- 238000010586 diagram Methods 0.000 description 3
- 238000005070 sampling Methods 0.000 description 3
- 230000007704 transition Effects 0.000 description 3
- 230000008901 benefit Effects 0.000 description 2
- 238000010276 construction Methods 0.000 description 2
- 230000002542 deteriorative effect Effects 0.000 description 2
- 230000007774 longterm Effects 0.000 description 2
- 238000012544 monitoring process Methods 0.000 description 2
- 238000010521 absorption reaction Methods 0.000 description 1
- 230000003044 adaptive effect Effects 0.000 description 1
- 230000004075 alteration Effects 0.000 description 1
- 238000013473 artificial intelligence Methods 0.000 description 1
- 238000013528 artificial neural network Methods 0.000 description 1
- 230000005540 biological transmission Effects 0.000 description 1
- 238000004364 calculation method Methods 0.000 description 1
- 239000003086 colorant Substances 0.000 description 1
- 238000013527 convolutional neural network Methods 0.000 description 1
- 238000005520 cutting process Methods 0.000 description 1
- 230000007613 environmental effect Effects 0.000 description 1
- 230000004927 fusion Effects 0.000 description 1
- 230000002452 interceptive effect Effects 0.000 description 1
- 238000004519 manufacturing process Methods 0.000 description 1
- 230000003278 mimic effect Effects 0.000 description 1
- 238000012986 modification Methods 0.000 description 1
- 230000004048 modification Effects 0.000 description 1
- 230000003287 optical effect Effects 0.000 description 1
- 239000013307 optical fiber Substances 0.000 description 1
- 230000003252 repetitive effect Effects 0.000 description 1
- 230000035945 sensitivity Effects 0.000 description 1
- 230000003068 static effect Effects 0.000 description 1
- 238000006467 substitution reaction Methods 0.000 description 1
- 230000001360 synchronised effect Effects 0.000 description 1
- 238000012360 testing method Methods 0.000 description 1
- 238000012546 transfer Methods 0.000 description 1
- 238000013519 translation Methods 0.000 description 1
- 230000000007 visual effect Effects 0.000 description 1
Images
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/246—Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
- G06T7/251—Analysis of motion using feature-based methods, e.g. the tracking of corners or segments involving models
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/02—Affine transformations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/50—Image enhancement or restoration using two or more images, e.g. averaging or subtraction
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/70—Denoising; Smoothing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/277—Analysis of motion involving stochastic approaches, e.g. using Kalman filters
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20024—Filtering details
- G06T2207/20032—Median filtering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20172—Image enhancement details
- G06T2207/20182—Noise reduction or smoothing in the temporal domain; Spatio-temporal filtering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30241—Trajectory
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- General Engineering & Computer Science (AREA)
- Evolutionary Computation (AREA)
- Computing Systems (AREA)
- Biomedical Technology (AREA)
- General Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- Biophysics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Molecular Biology (AREA)
- Multimedia (AREA)
- Health & Medical Sciences (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a high-altitude parabolic track identification method based on reinforcement learning. The method comprises the following steps: acquiring a high-altitude parabolic track image of a monitored window area through an image sensor; preprocessing the high-altitude parabolic track image to obtain preprocessed image information; judging whether the image sensor is shielded or not according to the preprocessed image information; when the image sensor is judged not to be shielded, inputting the preprocessed image information into a processor, acquiring a pre-training target model after reinforcement learning by the processor, and performing high-altitude parabolic recognition on the preprocessed image information through the pre-training target model to obtain high-altitude parabolic recognition result information; and the processor stores the high-altitude parabolic recognition result information into a data storage unit, a cloud server and a storage so as to train and update the pre-training target model. According to the method, the high-altitude parabolic track is identified through the reinforcement learning model, and the identification accuracy is improved.
Description
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a high-altitude parabolic track identification method based on reinforcement learning.
Background
As economies develop and urban populations grow denser, everyday life is exposed to various uncertainties and risks. High-altitude throwing of objects has been called "the pain hanging over cities": once an object is thrown it cannot be recalled or stopped, and within a very short time it can cause great damage to public safety. In recent years, civil and criminal cases concerning high-altitude throwing have been increasing, and reports of injuries caused by falling objects have appeared one after another in newspapers across the country, prompting public calls for strict regulation of such behavior to safeguard the "safety overhead" of the people. Against this background, the Supreme People's Court published its opinions on properly adjudicating high-altitude throwing and falling-object cases in accordance with the law, under which throwing that endangers public safety may be punished as endangering public safety by dangerous means even where no actual harm results.
Traditional reinforcement learning is typically formalized as a Markov Decision Process (MDP). An MDP comprises a set of states S and a set of actions A; state transitions are governed by the transition probability P, the reward R, and a discount factor γ. The probability P reflects the relationship between state transitions and rewards, which depend only on the state and action of the current time step. Reinforcement learning defines an environment in which an agent (a software/hardware system) takes actions so as to maximize cumulative reward. The basis for the agent's optimal behavior is the Bellman equation, a method widely used to solve practical optimization problems. Standard reinforcement learning works well when all reachable states can be enumerated and stored in computer RAM (random access memory). However, when the number of states in the environment exceeds the capacity of modern computers, the standard tabular approach becomes ineffective; moreover, in real environments the agent must handle continuous states, continuous variables, and continuous control (actions). The well-defined tabular Q-table of standard reinforcement learning is therefore replaced by a deep neural network — a Q-network — that maps environment states to agent actions. The network architecture, the choice of hyper-parameters, and the learning of the Q-network weights are all completed in the training phase. DQN (Deep Q-Network) allows agents to explore unstructured environments and accumulate knowledge that, over time, can mimic human behavior. The invention uses the DQN algorithm to solve this problem of continuous (non-discrete) states, continuous variables, and continuous control in a high-altitude parabolic trajectory recognition system.
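As a hedged illustration of the background above (not the patent's system), the tabular Q-learning update that DQN generalises can be sketched on a toy 1-D chain MDP; DQN replaces the Q-table below with a neural network once the state space is too large to store. All states, rewards and hyper-parameters here are made up for the example:

```python
import random

# Toy 1-D chain MDP: states 0..4, actions move left (-1) or right (+1);
# reaching state 4 is terminal and yields reward 1. The loop applies the
# Bellman update that a Q-network approximates when states are continuous.
N_STATES = 5
ACTIONS = (-1, +1)
GAMMA, ALPHA, EPSILON = 0.9, 0.5, 0.1

Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def step(state, action):
    nxt = max(0, min(N_STATES - 1, state + action))
    reward = 1.0 if nxt == N_STATES - 1 else 0.0
    return nxt, reward, nxt == N_STATES - 1

random.seed(0)
for _ in range(500):                          # training episodes
    s, done = 0, False
    while not done:
        # epsilon-greedy exploration
        a = random.choice(ACTIONS) if random.random() < EPSILON \
            else max(ACTIONS, key=lambda act: Q[(s, act)])
        s2, r, done = step(s, a)
        best_next = max(Q[(s2, act)] for act in ACTIONS)
        target = r if done else r + GAMMA * best_next   # Bellman target
        Q[(s, a)] += ALPHA * (target - Q[(s, a)])
        s = s2

greedy_path = [max(ACTIONS, key=lambda act: Q[(s, act)])
               for s in range(N_STATES - 1)]
print(greedy_path)   # the learned greedy policy moves right toward the reward
```

After training, the greedy policy chooses +1 in every non-terminal state, which is the shortest path to the reward.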
At present, high-altitude parabolic trajectory prediction patents already exist on the market: "High-altitude parabolic detection method, device and storage medium" (patent No. CN111931599A) and "High-altitude parabolic radar-vision fusion monitoring and early-warning system" (application No. CN201922207460.2). The former computes the motion state of an object through image-processing algorithms to realize prediction, while the latter monitors high-altitude parabolic trajectories with a radar system. Few approaches on the market, therefore, analyze and predict the high-altitude parabolic trajectory from the perspective of an intelligent prediction algorithm.
Disclosure of Invention
The invention aims to provide a high-altitude parabolic track identification method based on reinforcement learning so as to accurately identify a high-altitude parabolic track.
In order to achieve the purpose, the invention is realized by the following technical scheme:
a high-altitude parabolic track identification method based on reinforcement learning comprises the following steps:
s1, acquiring a high-altitude parabolic track image of the monitored window area through an image sensor;
s2, preprocessing the high-altitude parabolic track image to obtain preprocessed image information;
s3, judging whether the image sensor is blocked according to the preprocessed image information;
s4, when the image sensor is judged not to be shielded, the preprocessed image information is input to a processor, the processor obtains a pre-training target model after reinforcement learning, and high-altitude parabolic recognition is carried out on the preprocessed image information through the pre-training target model to obtain high-altitude parabolic recognition result information;
and S5, the processor stores the high altitude parabolic recognition result information into a data storage unit, a cloud server and a storage to train and update the pre-training target model.
Optionally, the S2 includes:
s2.1, converting the high-altitude parabolic image collected by the image sensor into a low-dimensional gray image;
s2.2, carrying out affine transformation on the gray level image;
s2.3, carrying out noise elimination on the gray image after affine transformation in a spatial filtering and time domain filtering mode;
and S2.4, acquiring a target detection frame of the moving object in each frame of image after noise elimination by adopting a background difference and inter-frame difference fusion method, and predicting the target detection frame of the moving object in the next frame of image according to the target detection frame in the previous frame of image through Kalman filtering to obtain the preprocessed image information.
Optionally, the S3 includes:
s3.1, acquiring pixel values and distribution characteristics in the preprocessed image information;
s3.2, judging whether the image sensor is shielded or not according to the size and the distribution characteristics of the pixel values in the preprocessed image information;
and S3.3, when the image sensor is judged to be shielded, storing the preprocessed image information into the cloud server and the storage.
Optionally, after the S2 and before the S3, the method further comprises: and storing the preprocessed image information into the cloud server and a storage.
Optionally, the step of obtaining the pre-trained target model in S4 includes:
s4.1, initializing an action model and a target model before pre-training;
s4.2, establishing a simulation environment, and transmitting the optimal action parameters to the simulation environment by the action model;
s4.3, the simulation environment simulates according to the optimal action parameters to obtain simulated action parameters, and stores the simulated action parameters to the data storage unit;
s4.4, the action model acquires the simulated action parameters from the data storage unit so as to train and update the action model;
and S4.5, copying the latest simulated action parameters to the target model after the action model is trained for C times to train and update the target model to obtain the pre-trained target model, wherein C is an integer greater than or equal to 2.
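The control flow of steps S4.1–S4.5 can be sketched as follows. This is an illustrative assumption of the loop's structure, not the patent's implementation: the class names, the stand-in "training" arithmetic, and the replay contents are all hypothetical, with only the every-C-updates parameter copy taken from the text:

```python
import copy

# Sketch of S4.1–S4.5: the action model trains on simulated action parameters
# stored in the data storage unit, and after every C training updates its
# latest parameters are copied to the target model.
C = 4

class Model:
    def __init__(self):
        self.params = [0.0]

    def train_step(self, batch):
        # stand-in for one gradient update on a sampled batch
        self.params[0] += 0.1 * len(batch)

def pretrain(num_steps=10):
    action_model, target_model = Model(), Model()       # S4.1: initialise both
    data_storage = []                                   # S4.3: stored transitions
    sync_points = []
    for t in range(1, num_steps + 1):
        data_storage.append(("s", "a", "r", "s_next"))  # simulated transition
        action_model.train_step(data_storage[-2:])      # S4.4: train action model
        if t % C == 0:                                  # S4.5: copy every C updates
            target_model.params = copy.deepcopy(action_model.params)
            sync_points.append(t)
    return sync_points

print(pretrain())   # target model refreshed at steps 4 and 8
```

Copying only every C steps keeps the regression target fixed between syncs, which is the stability property S4.5 relies on.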
Optionally, the optimal action parameters in S4.2 include: the high-altitude parabolic track image, the high-altitude parabolic predicted track and the target model parameters.
Optionally, the simulated operation parameters in S4.3 include: the high altitude parabolic track image of the current state, the current high altitude parabolic predicted track, the current reward obtaining and the high altitude parabolic track image of the next state.
Optionally, the step of establishing a simulation environment in S4.2 includes:
s4.2.1, acquiring physical characteristics, dynamic characteristics and surrounding environment characteristics of a high-altitude parabolic moving object;
s4.2.2, analyzing the physical characteristics, dynamic characteristics and surrounding environment characteristics of the moving object of the high-altitude parabola according to the air resistance and wind speed variables of the environment of the high-altitude parabola to establish the simulation environment.
Optionally, the action model and the target model continuously obtain high-altitude parabolic track prediction error information in an updating process, so as to change a prediction strategy according to the error information and an error value of an adjacent frame high-altitude parabolic track image.
Optionally, the S5 further includes: and comparing the high-altitude parabolic recognition result information with an actual high-altitude parabolic track to obtain actual prediction error information, and feeding back the actual prediction error information to the data storage unit.
The invention has at least one of the following beneficial effects:
the method starts from the angle of an intelligent prediction algorithm and the idea of predicting the high-altitude parabolic track, and the high-altitude parabolic track is recognized through a reinforcement learning model, so that the recognition accuracy rate is improved. In the high-altitude parabolic track recognition method based on reinforcement learning, the processor acquires a pre-training target model after reinforcement learning, so that high-altitude parabolic recognition is performed on pre-processing image information through the pre-training target model, the pre-training target model does not need to train a data set labeled manually and can improve high-altitude parabolic track prediction accuracy, and the processor stores high-altitude parabolic recognition result information into a data storage unit, a cloud server and a storage, so that the pre-training target model is trained and updated, the high-altitude parabolic track prediction accuracy can be further improved, the data storage unit can improve the data utilization rate, samples participating in network training can meet the requirement of independent and same distribution, and the training stability is improved.
Furthermore, in the high-altitude parabolic track recognition method based on reinforcement learning provided by the invention, after every C updates of the action model the latest simulated action parameters are copied to the target model to train and update it, which ensures the stability of model training. The action model and the target model also continuously obtain trajectory prediction error information during updating, so that the prediction strategy is adjusted according to this error information and the error value between adjacent frames of the trajectory image, effectively improving prediction accuracy.
Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
Fig. 1 is a flowchart of a high-altitude parabolic trajectory recognition method based on reinforcement learning according to this embodiment;
fig. 2 is a specific working schematic diagram of the high-altitude parabolic trajectory identification method based on reinforcement learning according to the present embodiment;
fig. 3 is a schematic diagram of a reinforcement learning model architecture provided in this embodiment.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are illustrative and intended to be illustrative of the invention and are not to be construed as limiting the invention.
The invention relates to a high-altitude parabolic track identification method based on reinforcement learning. The method is applied to a high-altitude parabolic track recognition system based on reinforcement learning. The system mainly comprises a simulation environment module, a data storage unit, an action model, a target model, a DQN error function module, an image acquisition module, a preprocessing module, an image storage module, a shielding prediction module, a cloud server and a memory module.
The high-altitude parabolic trajectory recognition method based on reinforcement learning of the present embodiment is described below with reference to the drawings.
Referring to fig. 1, the high-altitude parabolic trajectory identification method based on reinforcement learning according to the present embodiment includes the following steps:
and S1, acquiring a high-altitude parabolic track image of the monitored window area through an image sensor.
Specifically, an image sensor is installed at a suitable position relative to the monitored window to collect its image information. To reduce monitoring blind spots as much as possible, several image sensors at different angles are arranged for the same window, lowering the probability that a person throwing an object maliciously evades the camera.
And S2, preprocessing the high-altitude parabolic track image to obtain preprocessed image information.
Wherein the S2 includes:
s2.1, converting the high-altitude parabolic image collected by the image sensor into a low-dimensional gray image;
s2.2, carrying out affine transformation on the gray level image;
s2.3, carrying out noise elimination on the gray image after affine transformation in a spatial filtering and time domain filtering mode;
and S2.4, acquiring a target detection frame of the moving object in each frame of image after noise elimination by adopting a background difference and inter-frame difference fusion method, and predicting the target detection frame of the moving object in the next frame of image according to the target detection frame in the previous frame of image through Kalman filtering to obtain the preprocessed image information.
Specifically, as shown with reference to FIG. 2, the collected data information is passed to the preprocessing module. The color image collected by the image sensor is converted into a low-dimensional grayscale image; the converted image still retains the main information while reducing the data-processing burden. Affine transformation of the image is then carried out — scaling, stretching, rotation and translation — to produce image information suitable for prediction with the training model. Salt-and-pepper noise and Gaussian noise are eliminated through spatial filtering and temporal filtering, after which the target detection frame of the moving object in each frame is obtained by a method fusing background difference and inter-frame difference. The background difference method better preserves the whole foreground of the target, while the frame difference method has high detection sensitivity; retaining the background-difference foreground within a window around the frame-difference foreground therefore detects the moving object's bounding box more completely. Finally, Kalman filtering predicts the target frame of the moving object in the next image from the detection frame in the previous image, yielding the preprocessed image information.
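The Kalman prediction step of S2.4 can be sketched for a single bounding-box coordinate under a constant-velocity model. This is a minimal illustration, not the patent's filter: a real tracker would filter x, y, width and height jointly, and the noise values q and r here are illustrative assumptions:

```python
# 1-D constant-velocity Kalman filter: state is (position, velocity),
# measurements are positions (H = [1, 0]); returns the predicted position
# one time step after the last measurement, as used to carry a detection
# box forward to the next frame.

def kalman_predict_next(measurements, q=1e-3, r=1.0):
    x, v = float(measurements[0]), 0.0      # state: position and velocity
    p = [[1.0, 0.0], [0.0, 1.0]]            # state covariance
    for z in measurements[1:]:
        # predict: x' = x + v (unit time step), P' = F P F^T + Q
        x = x + v
        p = [[p[0][0] + p[0][1] + p[1][0] + p[1][1] + q, p[0][1] + p[1][1]],
             [p[1][0] + p[1][1], p[1][1] + q]]
        # update with the measured position z
        innovation = z - x
        s = p[0][0] + r
        k0, k1 = p[0][0] / s, p[1][0] / s   # Kalman gain
        x, v = x + k0 * innovation, v + k1 * innovation
        p = [[(1 - k0) * p[0][0], (1 - k0) * p[0][1]],
             [p[1][0] - k1 * p[0][0], p[1][1] - k1 * p[0][1]]]
    return x + v                            # predicted position in the next frame

# box centres moving right by 2 px/frame; the next centre should be near 20
centers = [10.0, 12.0, 14.0, 16.0, 18.0]
nxt = kalman_predict_next(centers)
```

After a few noise-free measurements the filter's velocity estimate approaches 2 px/frame, so the prediction approaches the true next centre.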
As one example, information collected by the image sensor is passed to a pre-processing module, which performs image pre-processing on the collected image information. The native size of the collected image is 210 × 160, with 128 colors per pixel, which is converted to a grayscale image of 84 × 84 dimensions. The transformed image still retains the main information while reducing the burden of data processing.
It should be noted that, since the trajectory of the high-altitude parabola is continuous, the Agent can obtain only 1 frame of information from the environment at each moment, and such a static image can hardly represent the dynamic motion of the thrown object. For this reason, the recognition algorithm collects the most recent N frames preceding the current time and combines them as the model input. By gathering state information over a span of time, the reinforcement learning model can learn a more accurate action value.
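The N-frame stacking described above can be sketched as a rolling window over the preprocessed frames; the whole window forms one model input, so the state carries motion information a single static frame lacks. The padding of the first observation is an illustrative assumption:

```python
from collections import deque

# Keep the last N frames; each push returns the stacked state. N = 4 is an
# example value, not one specified by the patent.
N = 4

class FrameStack:
    def __init__(self, n=N):
        self.frames = deque(maxlen=n)

    def push(self, frame):
        if not self.frames:
            # first observation: pad the window by repeating the frame
            self.frames.extend([frame] * self.frames.maxlen)
        else:
            self.frames.append(frame)
        return list(self.frames)   # the stacked state fed to the model

stack = FrameStack()
state = None
for t in range(6):                 # frames arrive one per time step
    state = stack.push(f"frame{t}")
print(state)                       # the four most recent frames
```

Because `deque(maxlen=N)` discards the oldest entry automatically, the state is always exactly N frames long.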
And S3, judging whether the image sensor is blocked according to the preprocessed image information.
Wherein the S3 includes:
s3.1, acquiring pixel values and distribution characteristics in the preprocessed image information;
s3.2, judging whether the image sensor is shielded or not according to the size and the distribution characteristics of the pixel values in the preprocessed image information;
and S3.3, when the image sensor is judged to be shielded, storing the preprocessed image information into the cloud server and the storage.
Specifically, referring to fig. 2, after the trajectory information of the high-altitude parabola has been processed by the preprocessing module, occlusion prediction is required: whether the image sensor is blocked is judged from the magnitude and distribution characteristics of the pixel values of the preprocessed image information, and the prediction result can be transmitted to the cloud server and the storage. The preprocessed image information is also transmitted to the image storage module, providing a historical basis and experience for subsequent similar recognition tasks.
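A hypothetical occlusion check in the spirit of S3.1–S3.2 might test whether the grey-level mean is very low (lens covered) or the pixel variance has collapsed (a uniform blockage fills the view). The thresholds and statistics below are illustrative assumptions, not values from the patent:

```python
# Flag the sensor as occluded when the frame is very dark or nearly uniform.

def is_blocked(gray_pixels, dark_thresh=20.0, var_thresh=15.0):
    n = len(gray_pixels)
    mean = sum(gray_pixels) / n
    var = sum((p - mean) ** 2 for p in gray_pixels) / n
    return mean < dark_thresh or var < var_thresh

normal_frame = [10 * (i % 25) for i in range(100)]   # varied grey levels
covered_frame = [5] * 100                            # dark, uniform frame
print(is_blocked(normal_frame), is_blocked(covered_frame))   # False True
```

A production system would likely combine such per-frame statistics over several frames before declaring the sensor blocked.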
After the S2 and before the S3, the method further comprises: and storing the preprocessed image information into the cloud server and a storage.
And S4, when the image sensor is judged not to be shielded, inputting the preprocessed image information into a processor, acquiring a pre-training target model after reinforcement learning by the processor, and performing high-altitude parabolic recognition on the preprocessed image information through the pre-training target model to obtain high-altitude parabolic recognition result information.
Specifically, when the image sensor is judged not to be occluded, the preprocessed image information obtained by the preprocessing module is transmitted to the processor for data processing, and the main task — predicting and recognizing the trajectory of the high-altitude falling object — is realized through data exchange with the established pre-trained target model. The actually occurring trajectory is judged and predicted using the prediction experience already accumulated by the pre-trained target model.
The step of obtaining the pre-training target model in S4 includes:
s4.1, initializing an action model and a target model before pre-training;
s4.2, establishing a simulation environment, and transmitting the optimal action parameters to the simulation environment by the action model;
s4.3, the simulation environment simulates according to the optimal action parameters to obtain simulated action parameters, and stores the simulated action parameters to the data storage unit;
s4.4, the action model acquires the simulated action parameters from the data storage unit so as to train and update the action model;
and S4.5, copying the latest simulated action parameters to the target model after the action model is trained for C times to train and update the target model to obtain the pre-trained target model, wherein C is an integer greater than or equal to 2.
Specifically, when the action model and the target model are initialized before pre-training, the parameters to be optimized need to be extracted from the model. Here s denotes the high-altitude parabolic track image to be identified, a denotes the predicted high-altitude parabolic track, r denotes the accuracy of the prediction result, i.e., the reward obtained, t denotes the t-th time step, G denotes the accumulated reward, γ denotes the decay factor of the reward, and k indexes the accumulated reward from the k-th step onward. The value function Q is defined as:
Q(s, a) = E[G_t | S_t = s, A_t = a], with G_t = Σ_{k=0}^{∞} γ^k r_{t+k+1} (1)
the state and action at step t are:
S_t = f(H_t), A_t = h(S_t) (2)
the loss function is defined as, where θ represents the model parameters:
L(θ) = E[(TargetQ − Q(s, a; θ))²] (3)
the objective function is:
TargetQ = r + γ max_{a'} Q(s', a'; θ⁻) (4)
the target model evaluates the value objective function as follows:
y_j = r_{j+1} + γ max_{a'} Q(s_{j+1}, a'; θ⁻) (5)
wherein θ⁻ is a parameter of the Target Network and j represents the state index. Further expanding the formula:
y_j = r_{j+1} + γ Q(s_{j+1}, argmax_{a'} Q(s_{j+1}, a'; θ⁻); θ⁻) (6)
the updating method is:
θ_{t+1} = θ_t + α [y_j − Q(s_j, a_j; θ_t)] ∇_θ Q(s_j, a_j; θ_t) (7)
and the subsequent training process continuously interacts with the action model and feeds back to the action model:
a_t = argmax_a Q(φ(s_t), a; θ) (8)
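The Target Network evaluation in formula (6) — select the best next action with θ⁻, then evaluate it with θ⁻ — can be checked numerically with a small sketch (the Q-table values below are hypothetical):

```python
GAMMA = 0.9

def target_value(q_target, s_next, r_next, actions):
    """y_j = r_{j+1} + gamma * Q(s_{j+1}, argmax_{a'} Q(s_{j+1}, a'; theta^-); theta^-)
    per formula (6), with q_target standing in for the Target Network."""
    a_star = max(actions, key=lambda a: q_target[(s_next, a)])  # argmax_{a'}
    return r_next + GAMMA * q_target[(s_next, a_star)]

# Hypothetical target-network values for one next state and two candidate actions.
q_target = {("s1", "a0"): 0.5, ("s1", "a1"): 0.8}
y = target_value(q_target, "s1", 1.0, ["a0", "a1"])   # 1.0 + 0.9 * 0.8 = 1.72
```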
the structure of the model mainly adoptsThe output of the model is a vector of length | A |, each value in the vector representing a value estimate for the corresponding action. Thus, only one calculation is needed to find the value of all actions, and the time for evaluating the value is the same no matter how many actions exist.
The step of establishing the simulation environment in S4.2 includes:
s4.2.1, acquiring physical characteristics, dynamic characteristics and surrounding environment characteristics of a high-altitude parabolic moving object;
s4.2.2, analyzing the physical characteristics, dynamic characteristics and surrounding environment characteristics of the moving object of the high-altitude parabola according to the air resistance and wind speed variables of the environment of the high-altitude parabola to establish the simulation environment.
Specifically, a virtual environment can be constructed according to the real-world characteristics of high-altitude parabolic events, providing material for model training. A motion trajectory model is established mainly from the physical characteristics of the object's motion in a high-altitude parabola and its dynamic characteristics in combination with the surrounding environment, and the simulation environment is established taking variables such as air resistance and wind speed into account. The simulation can therefore match the real high-altitude parabolic scene as closely as possible and provide the most accurate material for training the model.
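A minimal sketch of such a trajectory model: projectile motion with quadratic air drag measured relative to the wind, integrated with Euler steps (the drag coefficient, wind speed, and release height below are illustrative assumptions, not values from the patent):

```python
import math

def simulate_fall(h0, dt=0.01, g=9.81, k=0.05, wind=2.0):
    """Simulate a falling object with air resistance proportional to the
    squared velocity relative to the wind; returns (x, y) trajectory points."""
    x, y = 0.0, h0
    vx, vy = 0.0, 0.0
    traj = [(x, y)]
    while y > 0.0:
        rvx = vx - wind                      # horizontal velocity relative to the air
        speed = math.hypot(rvx, vy)
        ax = -k * speed * rvx                # drag opposes relative motion
        ay = -g - k * speed * vy
        vx += ax * dt
        vy += ay * dt
        x += vx * dt
        y += vy * dt
        traj.append((x, y))
    return traj

traj = simulate_fall(30.0)   # object released from a 30 m window
```

With a positive wind speed the object drifts horizontally while it falls, which is exactly the kind of deviation from an ideal parabola the simulation environment is meant to capture.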
When the action model interacts with the simulation environment, the action model transmits the optimal action argmax_a Q(s, a; θ) to the simulation environment, where s is the high-altitude parabolic track image, a is the predicted high-altitude parabolic track, and θ is the target model parameter.
And S4.3, the simulation environment simulates according to the optimal action parameters to obtain simulated action parameters, and stores the simulated action parameters to the data storage unit.
The simulated action parameters in S4.3 include: the high altitude parabolic track image of the current state, the current high altitude parabolic predicted track, the current reward obtaining and the high altitude parabolic track image of the next state.
Specifically, the simulation environment transmits the current state s to the action model, and stores the current state s, the current action a, the currently obtained reward r, and the next state s' in the data storage unit.
It should be noted that during training the recognition algorithm can start its decisions from a random scene. If decisions always started from the same fixed scene, the Agent would repeatedly decide on the same frames, which is clearly not conducive to exploring more frames for learning. To enhance exploration without degrading the model, the Agent performs random actions for a short period at the start, so that different scene samples are obtained to the greatest extent.
And S4.4, the action model acquires the simulated action parameters from the data storage unit so as to train and update the action model.
Specifically, the action model acquires (s, a, r, s') data from the data storage unit and updates the model. The data storage unit stores sample data information, simulation prediction information and results, and is set to hold 1,000,000 samples, so samples spanning a long period can be retained. When the value function is trained, a certain number of samples are drawn from the unit and training proceeds according to the information recorded in those samples. In general, the data storage unit covers both the process of collecting samples and that of sampling them. Collected samples are stored in the structure in chronological order; if the data storage unit is already full, a new sample overwrites the oldest one. The action model acquires information from the data storage unit, realizing information transfer and adaptive updating with the simulation environment.
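The data storage unit thus behaves like an experience-replay buffer: fixed capacity, oldest sample overwritten, uniform random sampling. A sketch with a small illustrative capacity in place of the 1,000,000 described:

```python
import random
from collections import deque

class ReplayBuffer:
    """Stores (s, a, r, s') tuples in time order; when full, the oldest
    sample is discarded (deque with maxlen). Sampling is uniform."""
    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)

    def push(self, s, a, r, s_next):
        self.buffer.append((s, a, r, s_next))

    def sample(self, batch_size):
        return random.sample(list(self.buffer), batch_size)

buf = ReplayBuffer(capacity=5)
for t in range(8):                 # 8 pushes into capacity 5:
    buf.push(t, 0, 0.0, t + 1)     # samples 0..2 are overwritten by 3..7
batch = buf.sample(3)
```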
And S4.5, copying the latest simulated action parameters to the target model after the action model is trained for C times to train and update the target model to obtain the pre-trained target model, wherein C is an integer greater than or equal to 2.
Specifically, the action model copies its parameters to the target model every C updates. If only the latest sample were taken each time, the algorithm would resemble online learning; instead, the data storage unit uniformly and randomly samples a batch from the cache for learning, because the sequence obtained by interaction is correlated in the time dimension. The learned value function should represent the expectation of long-term return under the current state and action; however, the sequence obtained in each interaction represents only one sampled trajectory under that state and action, not all possible trajectories, so the estimate deviates somewhat from the expectation. This gap accumulates as the interaction grows longer, and the model becomes prone to large fluctuations. After uniform sampling is adopted, each training batch usually comes from multiple interaction sequences, so the fluctuation of any single sequence is greatly reduced and training is much more stable. Meanwhile, one sample can be used in multiple training passes, improving sample utilization. Therefore, copying the model parameters to the target model once every C updates reduces data instability and improves data utilization.
Optionally, the action model and the target model continuously obtain high-altitude parabolic track prediction error information in an updating process, so as to change a prediction strategy according to the error information and an error value of an adjacent frame high-altitude parabolic track image.
Specifically, the action model and the target model receive information from the DQN error module during the update process. Both models continuously draw on the error information from the DQN error module while updating, change their optimization strategy according to the error value and the error value of the adjacent frame image, and update further so as to improve prediction accuracy. The DQN error module also stores the reward information r in the data storage unit to provide data support for subsequent random, repeated training and target updates.
As an example, the obtaining step of the pre-training target model may specifically include:
when training begins, the action model and the target model use the same parameters, and in the training process, the action model is responsible for interacting with the simulation environment to obtain an interaction sample. In the Learning process, the target value obtained by Q-Learning is calculated from the target model, and then compared with the estimated value of the motion model to obtain the target value and update the motion model. And each time the training completes a certain number of iterations, the parameters of the action model are synchronized to the target model, so that the next stage of learning can be carried out. By using the target model, the model that calculates the value of the target will be fixed over a period of time so that the model can mitigate the volatility of the model.
The S5 further includes: and comparing the high-altitude parabolic recognition result information with an actual high-altitude parabolic track to obtain actual prediction error information, and feeding back the actual prediction error information to the data storage unit.
Specifically, the system can judge whether an error is generated between the high-altitude parabolic recognition result information and the actual high-altitude parabolic track, and feeds back a comparison result to the data storage unit, so that an actual prediction effect is provided for a subsequent model training module, and further optimization and upgrading of the deep reinforcement learning system are promoted.
To make the construction process of the reinforcement learning model clear to those skilled in the art, the construction of the reinforcement learning model is described in detail below.
FIG. 3 is a schematic diagram of the architecture of the reinforcement learning model. The output of the reinforcement learning model in this embodiment is a vector of length |A|; each value in the vector represents a value estimate for the corresponding action.
The main body of the model adopts a structure of four layers of convolutional neural networks: s represents the image of the high-altitude parabolic track to be identified, a represents the predicted track of the high-altitude parabolic track, and r represents the accuracy of the prediction result. For four convolutional layers:
the number of channels output for stride by the convolution kernel of the first layer convolutional layer is 32, and then the ReLU nonlinear layer is applied. The convolution kernel of the second convolutional layer is stride output with the number of channels of 64, after which the ReLU nonlinear layer is applied. The convolution kernel of the third convolutional layer is stride output with the number of channels of 64, and then the ReLU nonlinear layer is applied. The fourth layer is a fully connected layer, with output dimension 512, after which the ReLU non-linear layer is applied. And finally, obtaining value estimation of the corresponding action by the full connection layer.
The reinforcement learning model in this embodiment adopts an ε-greedy strategy: at first it generates actions at random with probability 100%, and this probability decays continuously as training proceeds, eventually to 10%. That is, the current optimal strategy is executed with 90% probability. In this way, a strategy dominated by exploration gradually becomes one dominated by exploitation, combining the two well.
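The annealing of the exploration probability from 100% to 10% can be sketched as follows (the decay horizon of 1000 steps is an illustrative assumption):

```python
import random

EPS_START, EPS_END, DECAY_STEPS = 1.0, 0.1, 1000   # anneal 100% -> 10%

def epsilon(step):
    """Linearly decay the exploration probability from 1.0 to 0.1."""
    frac = min(step / DECAY_STEPS, 1.0)
    return EPS_START + frac * (EPS_END - EPS_START)

def select_action(q_values, step):
    """epsilon-greedy: random action with probability epsilon, else the greedy one."""
    if random.random() < epsilon(step):
        return random.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

Early steps are almost entirely exploratory; late steps execute the greedy action about 90% of the time, as described above.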
It should be noted that, when the reinforcement learning model undergoes simulation training, if each test starts from the same scene, the Agent always makes decisions on the same frames, which is clearly not conducive to exploring more frames for learning. To enhance exploration without degrading the model, the Agent can be set to perform random actions for a short period at the beginning, so that different scene samples are obtained to the greatest extent.
When processing the frame images collected by the image sensor, adjacent frames are highly similar, so the same action can generally be taken for very similar frames. Judgment of a certain number of frames is therefore skipped, which reduces the space-time complexity of the algorithm and avoids repeated processing of redundant data.
Meanwhile, because the reward values have a large variance, the score needs to be compressed into a range the model handles well so that the reinforcement learning model can better fit long-term return; the return obtained in each round is compressed to between -1 and 1.
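This compression amounts to simple clipping of each round's return:

```python
def clip_reward(r, low=-1.0, high=1.0):
    """Compress the raw return into [-1, 1] to reduce its variance."""
    return max(low, min(high, r))
```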
In summary, as shown in fig. 2, the main steps of the high-altitude parabolic trajectory identification method based on reinforcement learning can be divided into two stages:
In the model training phase: the action model and the target model are initialized; the action model interacts with the simulation environment, transmitting the optimal action argmax_a Q(s, a; θ) to it, while the simulation environment transmits the current state s to the action model and stores the current state s, the current action a, the currently obtained reward r and the next state s' in the data storage unit; the action model acquires (s, a, r, s') data from the data storage unit and updates itself; for every C updates of the action model, the model parameters are copied to the target model; the action model and the target model receive information from the DQN error module during updating; and the DQN error module stores the reward information r in the data storage unit.
In the model application phase: the image acquisition module collects image information in a real scene and transmits it to the preprocessing module; the data preprocessing module performs image cropping and median filtering to extract the key regions of the images; after preprocessing, the images are stored in the cloud server and the storage through the image storage module; the preprocessed image is transmitted to the shielding prediction module, which judges whether the camera is blocked in practical application and transmits the result to the cloud server and the storage; the processor evaluates the preprocessed image information with the model trained in the model training phase, predicts the high-altitude parabolic track, and transmits the related result to the data storage unit to further train and update the target model.
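The preprocessing described above (median filtering to suppress noise, frame differencing to extract the moving object) can be sketched with plain Python lists standing in for image arrays (`median_filter_1d` and `frame_diff` are hypothetical helpers, not the patent's implementation):

```python
def median_filter_1d(row, k=3):
    """Median filter over a sliding window (edges kept as-is); removes impulse noise."""
    half = k // 2
    out = list(row)
    for i in range(half, len(row) - half):
        out[i] = sorted(row[i - half:i + half + 1])[half]
    return out

def frame_diff(prev, cur, thresh=20):
    """Inter-frame difference: 1 where a pixel changed by more than thresh."""
    return [[1 if abs(c - p) > thresh else 0 for c, p in zip(cr, pr)]
            for cr, pr in zip(cur, prev)]

prev = [[10, 10, 10], [10, 10, 10]]
cur  = [[10, 10, 10], [10, 200, 10]]   # a bright moving object appears
mask = frame_diff(prev, cur)           # nonzero entry marks the moving region
```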
In the high-altitude parabolic track recognition method based on reinforcement learning, the processor acquires a pre-trained target model after reinforcement learning and performs high-altitude parabolic recognition on the preprocessed image information through this model. The pre-trained target model does not require a manually labeled training data set and can improve the accuracy of high-altitude parabolic track prediction. The processor stores the high-altitude parabolic recognition result information in the data storage unit, the cloud server and the storage, so the pre-trained target model is trained and updated and prediction accuracy is further improved; the data storage unit also improves data utilization, enables the samples participating in network training to satisfy the independent and identically distributed requirement, and improves training stability.
Furthermore, in the high-altitude parabolic track recognition method based on reinforcement learning provided by the invention, after the action model is updated every C times, the latest simulated action parameters are copied to the target model to train and update the target model, so that the stability of model training is ensured, and the action model and the target model continuously acquire high-altitude parabolic track prediction error information in the updating process, so that the prediction strategy is changed according to the error information and the error value of the adjacent frame high-altitude parabolic track image, and the high-altitude parabolic track prediction accuracy can be effectively improved.
It should be noted that the logic and/or steps represented in the flowcharts or otherwise described herein, such as an ordered listing of executable instructions that can be considered to implement logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). Additionally, the computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via for instance optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
In the description herein, references to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
Furthermore, the terms "first", "second" and "first" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "a plurality" means at least two, e.g., two, three, etc., unless specifically limited otherwise.
Although embodiments of the present invention have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present invention, and that variations, modifications, substitutions and alterations can be made to the above embodiments by those of ordinary skill in the art within the scope of the present invention.
Claims (8)
1. A high-altitude parabolic track identification method based on reinforcement learning is characterized by comprising the following steps:
s1, acquiring a high-altitude parabolic track image of the monitored window area through an image sensor;
s2, preprocessing the high-altitude parabolic track image to obtain preprocessed image information;
s3, judging whether the image sensor is blocked according to the preprocessed image information;
s4, when the image sensor is judged not to be shielded, the preprocessed image information is input to a processor, the processor obtains a pre-training target model after reinforcement learning, and high-altitude parabolic recognition is carried out on the preprocessed image information through the pre-training target model to obtain high-altitude parabolic recognition result information;
s5, the processor stores the high altitude parabolic recognition result information into a data storage unit, a cloud server and a storage to train and update the pre-training target model;
the step of obtaining the pre-training target model in S4 includes:
s4.1, initializing an action model and a target model before pre-training;
s4.2, establishing a simulation environment, and transmitting the optimal action parameters to the simulation environment by the action model;
s4.3, the simulation environment simulates according to the optimal action parameters to obtain simulated action parameters, and stores the simulated action parameters to the data storage unit;
s4.4, the action model acquires the simulated action parameters from the data storage unit so as to train and update the action model;
s4.5, copying the latest simulated action parameters to the target model after the action model is trained for C times to train and update the target model to obtain the pre-trained target model, wherein C is an integer greater than or equal to 2;
the step of establishing the simulation environment in S4.2 includes:
s4.2.1, acquiring physical characteristics, dynamic characteristics and surrounding environment characteristics of a high-altitude parabolic moving object;
s4.2.2, analyzing the physical characteristics, dynamic characteristics and surrounding environment characteristics of the moving object of the high-altitude parabola according to the air resistance and wind speed variables of the environment of the high-altitude parabola to establish the simulation environment.
2. The reinforcement learning-based high-altitude parabolic trajectory recognition method according to claim 1, wherein the S2 includes:
s2.1, converting the high-altitude parabolic image collected by the image sensor into a low-dimensional gray image;
s2.2, carrying out affine transformation on the gray level image;
s2.3, carrying out noise elimination on the gray image after affine transformation in a spatial filtering and time domain filtering mode;
and S2.4, acquiring a target detection frame of the moving object in each frame of image after noise elimination by adopting a background difference and inter-frame difference fusion method, and predicting the target detection frame of the moving object in the next frame of image according to the target detection frame in the previous frame of image through Kalman filtering to obtain the preprocessed image information.
3. The reinforcement learning-based high-altitude parabolic trajectory recognition method according to claim 1, wherein the S3 includes:
s3.1, acquiring pixel values and distribution characteristics in the preprocessed image information;
s3.2, judging whether the image sensor is shielded or not according to the size and the distribution characteristics of the pixel values in the preprocessed image information;
and S3.3, when the image sensor is judged to be shielded, storing the preprocessed image information into the cloud server and the storage.
4. The reinforcement learning-based high-altitude parabolic trajectory recognition method according to claim 1, wherein after the S2 and before the S3, the method further comprises: and storing the preprocessed image information into the cloud server and a storage.
5. The reinforcement learning-based high-altitude parabolic track recognition method according to claim 1, wherein the optimal action parameters in S4.2 include: the high-altitude parabolic track image, the high-altitude parabolic predicted track and the target model parameters.
6. The reinforcement learning-based high-altitude parabolic track recognition method according to claim 1, wherein the simulated action parameters in S4.3 include: the high altitude parabolic track image of the current state, the current high altitude parabolic predicted track, the current reward obtaining and the high altitude parabolic track image of the next state.
7. The reinforcement learning-based high-altitude parabolic track recognition method as claimed in claim 1, wherein the action model and the target model continuously obtain high-altitude parabolic track prediction error information in an updating process, so as to change a prediction strategy according to the error information and an error value of an adjacent frame high-altitude parabolic track image.
8. The reinforcement learning-based high-altitude parabolic trajectory recognition method according to claim 1, wherein the S5 further includes: and comparing the high-altitude parabolic recognition result information with an actual high-altitude parabolic track to obtain actual prediction error information, and feeding back the actual prediction error information to the data storage unit.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110685692.8A CN113393495B (en) | 2021-06-21 | 2021-06-21 | High-altitude parabolic track identification method based on reinforcement learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113393495A CN113393495A (en) | 2021-09-14 |
CN113393495B true CN113393495B (en) | 2022-02-01 |
Family
ID=77623201
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110685692.8A Active CN113393495B (en) | 2021-06-21 | 2021-06-21 | High-altitude parabolic track identification method based on reinforcement learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113393495B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116597340B (en) * | 2023-04-12 | 2023-10-10 | 深圳市明源云科技有限公司 | High altitude parabolic position prediction method, electronic device and readable storage medium |
CN116977931A (en) * | 2023-07-31 | 2023-10-31 | 深圳市星河智善科技有限公司 | High-altitude parabolic identification method based on deep learning |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10175697B1 (en) * | 2017-12-21 | 2019-01-08 | Luminar Technologies, Inc. | Object identification and labeling tool for training autonomous vehicle controllers |
CN110084414A (en) * | 2019-04-18 | 2019-08-02 | 成都蓉奥科技有限公司 | A kind of blank pipe anti-collision method based on the study of K secondary control deeply |
CN111415389A (en) * | 2020-03-18 | 2020-07-14 | 清华大学 | Label-free six-dimensional object posture prediction method and device based on reinforcement learning |
CN111618847A (en) * | 2020-04-22 | 2020-09-04 | 南通大学 | Mechanical arm autonomous grabbing method based on deep reinforcement learning and dynamic motion elements |
CN112257557A (en) * | 2020-10-20 | 2021-01-22 | 中国电子科技集团公司第五十八研究所 | High-altitude parabolic detection and identification method and system based on machine vision |
CN112269390A (en) * | 2020-10-15 | 2021-01-26 | 北京理工大学 | Small celestial body surface fixed-point attachment trajectory planning method considering bounce |
CN112818599A (en) * | 2021-01-29 | 2021-05-18 | 四川大学 | Air control method based on reinforcement learning and four-dimensional track |
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9679258B2 (en) * | 2013-10-08 | 2017-06-13 | Google Inc. | Methods and apparatus for reinforcement learning |
US10204097B2 (en) * | 2016-08-16 | 2019-02-12 | Microsoft Technology Licensing, Llc | Efficient dialogue policy learning |
US11295174B2 (en) * | 2018-11-05 | 2022-04-05 | Royal Bank Of Canada | Opponent modeling with asynchronous methods in deep RL |
KR20200080396A (en) * | 2018-12-18 | 2020-07-07 | 삼성전자주식회사 | Autonomous driving method and apparatus thereof |
CN109521774B (en) * | 2018-12-27 | 2023-04-07 | 南京芊玥机器人科技有限公司 | Spraying robot track optimization method based on reinforcement learning |
CN110458281B (en) * | 2019-08-02 | 2021-09-03 | 中科新松有限公司 | Method and system for predicting deep reinforcement learning rotation speed of table tennis robot |
CN111263332A (en) * | 2020-03-02 | 2020-06-09 | 湖北工业大学 | Unmanned aerial vehicle track and power joint optimization method based on deep reinforcement learning |
Non-Patent Citations (3)
Title |
---|
"Playing Atari with Deep Reinforcement Learning";Volodymyr Mnih et al;《arXiv》;20131219;全文 * |
"基于深度强化学习的机械臂抓捕控制研究";黄伟伟;《中国优秀硕士学位论文全文数据库 工程科技Ⅱ辑》;20210115;全文 * |
"深度强化学习综述";刘全等;《计算机学报》;20180131;第41卷(第1期);全文 * |
Also Published As
Publication number | Publication date |
---|---|
CN113393495A (en) | 2021-09-14 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN113392935B (en) | Multi-agent deep reinforcement learning strategy optimization method based on attention mechanism | |
JP6877630B2 (en) | How and system to detect actions | |
CN109344725B (en) | Multi-pedestrian online tracking method based on space-time attention mechanism | |
CN108222749B (en) | Intelligent automatic door control method based on image analysis | |
CN113393495B (en) | High-altitude parabolic track identification method based on reinforcement learning | |
CN111178183B (en) | Face detection method and related device | |
Leibfried et al. | A deep learning approach for joint video frame and reward prediction in atari games | |
US20140143183A1 (en) | Hierarchical model for human activity recognition | |
Gao et al. | Object tracking using firefly algorithm | |
CN112037263B (en) | Surgical tool tracking system based on convolutional neural network and long-term and short-term memory network | |
CN110413838A (en) | A kind of unsupervised video frequency abstract model and its method for building up | |
CN114241511B (en) | Weak supervision pedestrian detection method, system, medium, equipment and processing terminal | |
CN110009060A (en) | A kind of robustness long-term follow method based on correlation filtering and target detection | |
CN110287829A (en) | A kind of video face identification method of combination depth Q study and attention model | |
CN112184767A (en) | Method, device, equipment and storage medium for tracking moving object track | |
CN111626198A (en) | Pedestrian motion detection method based on Body Pix in automatic driving scene | |
CN109544584B (en) | Method and system for realizing inspection image stabilization precision measurement | |
CN108898221B (en) | Joint learning method of characteristics and strategies based on state characteristics and subsequent characteristics | |
CN111160170B (en) | Self-learning human behavior recognition and anomaly detection method | |
CN111833375B (en) | Method and system for tracking animal group track | |
CN112418149A (en) | Abnormal behavior detection method based on deep convolutional neural network | |
CN113033582B (en) | Model training method, feature extraction method and device | |
KR102563346B1 (en) | System for monitoring of structural and method ithereof | |
CN115331162A (en) | Cross-scale infrared pedestrian detection method, system, medium, equipment and terminal | |
CN114913098A (en) | Image processing hyper-parameter optimization method, system, device and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |