CN115063676A - Ship target classification method based on AIS data - Google Patents


Info

Publication number
CN115063676A
Authority
CN
China
Prior art keywords: model, layer, AIS data, ship, ship target
Prior art date
Legal status: Pending (assumed; not a legal conclusion)
Application number
CN202210594360.3A
Other languages
Chinese (zh)
Inventor
王宇君
郭健
李可欣
李宗明
缪坤
陈辉
徐立
Current Assignee: Information Engineering University of PLA Strategic Support Force
Original Assignee: Information Engineering University of PLA Strategic Support Force
Application filed by Information Engineering University of PLA Strategic Support Force filed Critical Information Engineering University of PLA Strategic Support Force
Priority to CN202210594360.3A
Publication of CN115063676A

Classifications

    • G06V20/10 Scenes; scene-specific elements — terrestrial scenes
    • G06N3/084 Neural networks; learning methods — backpropagation, e.g. using gradient descent
    • G06V10/40 Image or video recognition — extraction of image or video features
    • G06V10/765 Image or video recognition using machine learning — classification using rules for partitioning the feature space
    • G06V10/82 Image or video recognition using machine learning — using neural networks


Abstract

The invention relates to a ship target classification method based on AIS data, belonging to the technical field of trajectory classification. The method constructs a first and a second input feature vector for ship target classification; both are simple, can be obtained directly from AIS data, and jointly capture spatio-temporal features, so human intervention and complex feature engineering are avoided. A ship target classification model based on a combined CNN-BiGRU model then feeds the first and second input feature vectors into the CNN model and the BiGRU model respectively, and the combined model classifies and identifies ship targets using the combined spatio-temporal features.

Description

Ship target classification method based on AIS data
Technical Field
The invention relates to a ship target classification method based on AIS data, and belongs to the technical field of track classification methods.
Background
Trajectory classification has long been an important research topic in traffic engineering and transport geography. Rapid advances in technologies such as the mobile internet and satellite positioning, together with the wide adoption of the 21st Century Maritime Silk Road initiative, have accelerated the growth of global maritime transport, and ship trajectory data are increasing daily; the open AIS ship data transmission system is an important source of maritime ship trajectory data. Classifying ship targets from trajectory data makes it possible to analyze the motion characteristics and patterns of different ships, helps to mine the inner relations among them, and has important application value for identifying abnormal ships, supporting shipping analysis and ship scheduling decisions, and promoting intelligent maritime traffic. Current research mainly extracts multi-dimensional features from ship track segments by hand and builds a single model on them to mine the shallow spatial information in the features for classification. However, sea areas are vast, ship types and structures are increasingly diverse, and ship classification involves both the spatial and the temporal dependence of ship target motion, making it a fairly complex nonlinear modeling problem. How to simplify such a complex ship target classification task and construct an effective classification model is therefore one of the main challenges in this field.
Early research classified trajectories mainly by manually extracting features and applying conventional machine learning methods, such as decision tree models (DT), support vector machines (SVM), and tree-based ensemble models. Several studies have examined ship target classification based on AIS data in depth: classifying fishing-vessel operating modes by extracting speed and course features and applying a BP neural network; classifying cargo ships and fishing ships by extracting 17-dimensional ship motion features and using a logistic regression model; classifying ships with a Random Forest (RF) model after adding geographic features and ship size as auxiliary features on top of multi-dimensional motion features; and classifying multiple ship types by extracting 119-dimensional motion features and applying an XGBoost model. On the one hand, these methods depend on tediously hand-built feature spaces, and the classification results are easily influenced by subjective cognition; on the other hand, traditional machine learning considers only spatial dependence, ignores the influence of temporal features on ship target classification, and, because of its shallow structure, cannot model the complex nonlinear relations in AIS data, so it has clear limitations.
Against the background of geospatial artificial intelligence (GeoAI), technological advances in the AI field have brought new opportunities to geospatial research. Deep learning methods need no complex feature extraction and fit nonlinear problems well, and in recent years they have increasingly been used for trajectory classification. Convolutional neural networks (CNN) have been used to mine the spatial features of AIS data to classify fishing boats, passenger ships, cargo ships and oil tankers; recurrent neural networks (RNN) have been used to extract the temporal features of AIS data to classify high-speed craft, oil tankers, passenger ships, sailing ships and fishing boats. These methods consider spatial and temporal dependence separately, but the models are disjoint, i.e. spatial and temporal features are not learned jointly. One approach constructs a distribution feature vector and a time-series feature vector for a 1-D convolutional neural network and a Long Short-Term Memory network (LSTM) respectively, and then fuses their ship classification results by weighted voting.
Disclosure of Invention
The invention aims to provide a ship target classification method based on AIS data, so as to solve the low classification accuracy of existing AIS-based ship target classification methods.
The invention provides a ship target classification method based on AIS data for solving the technical problems, which comprises the following steps:
1) acquiring AIS data, preprocessing the AIS data, extracting the speed, the course, the bow direction and the acceleration from the AIS data to construct a first input feature vector for learning the relation and the overall features between the parts of each track segment; extracting the speed, the course, the bow direction and the time interval as second input feature vectors;
2) constructing a ship target classification model, wherein the ship target classification model comprises a CNN model, a BiGRU model, a fusion layer and a full connection layer; the CNN model is used for processing the first input feature vector to obtain high-level features representing spatial information; the BiGRU model is used for processing the second input feature vector to obtain high-level features representing time sequence information; the fusion layer is used for fusing and summarizing the obtained high-level features representing the spatial information and the high-level features representing the time sequence information; and the full connection layer calculates the distribution probability of each type of ship target according to the fused and summarized features so as to realize the classification of the ship targets.
The method first constructs a first and a second input feature vector for ship target classification; both are simple, can be obtained directly from the AIS data, and jointly cover spatio-temporal features, so manual intervention and complex feature engineering are avoided. A ship target classification model based on the combined CNN-BiGRU model then feeds the first and second input feature vectors into the CNN model and the BiGRU model respectively, and the combined model classifies and identifies ship targets using the combined spatio-temporal features.
Further, the extraction process of the first input feature vector in step 1) is as follows:
segmenting the acquired AIS data to obtain a track segment corresponding to each ship;
and acquiring the navigational speed, the heading and the bow direction of each track point in the track segment according to the set track segment length, and calculating the acceleration of the corresponding track point according to the speed and the time interval of the adjacent track points.
In order to facilitate later CNN model training, the lengths of all track segments are limited to be fixed, and meanwhile, the acquired first input feature vector can learn the relation and the overall features between the local parts of the track segments.
The extraction process of the second input feature vector in the step 1) is as follows:
segmenting the acquired AIS data to obtain a track segment corresponding to each ship;
and acquiring the navigational speed, the course and the ship heading direction of each track point in the track segment according to the set track segment length, and acquiring the time interval of adjacent track points.
In order to facilitate later BiGRU model training, the lengths of all track segments are limited to a fixed size; meanwhile, the obtained second input feature vector helps the subsequent model mine the temporal patterns and characteristics of different ships' motion information, and strengthens the model's learning of the correlation between time and changes in motion information.
Further, when the AIS data is preprocessed, the AIS data is cleaned, which includes deleting track point records with missing key fields, repeated timestamps, or values exceeding a set range.
According to the invention, the AIS data is cleaned, so that the problem of low classification precision caused by noise of the AIS data is avoided, and an accurate and reliable data source is provided for the classification of subsequent ship targets.
Further, the BiGRU model includes two gating cycle units in opposite directions.
According to the invention, two gated recurrent units in opposite directions are stacked, so that long-term dependence characteristics of the trajectory data on both the past and the future can be captured, alleviating the vanishing- and exploding-gradient problems that RNNs face when processing long-term dependencies in sequence data.
Further, the gated recurrent unit comprises an input layer, a hidden layer and an output layer, wherein the hidden layer comprises an update gate and a reset gate, which jointly determine whether historical information is retained and passed on.
Further, the CNN model comprises an input layer, convolution layers and pooling layers, wherein the input layer is used for acquiring the first input feature vector; there are several convolution layers, the output of each convolution layer is activated by a ReLU function before being used as the input of the next layer, and a pooling layer is arranged after each convolution layer.
In the CNN model of the invention, ReLU activation after each convolution layer enables rapid convergence and improves the network's nonlinear feature learning, while a pooling layer after each convolution layer reduces the amount of computation and prevents overfitting.
Further, the CNN model further includes a Dropout layer, and the Dropout layer is disposed after the last pooling layer.
The invention can relieve the over-fitting problem of the model by adding the Dropout layer after the last pooling layer, and meanwhile, in order to take account of the precision and the complexity, the Dropout layer is only added after the last pooling layer.
Further, when training the ship target classification model, the CNN model is trained first; once CNN training is finished, the CNN model and the BiGRU model are trained jointly.
The invention trains the CNN model alone before training the whole model, improving training efficiency and shortening training time.
Drawings
FIG. 1 is a flow chart of the AIS data based ship target classification method of the present invention;
FIG. 2 is a schematic diagram of a first input feature vector structure constructed according to the present invention;
FIG. 3 is a diagram of a BiGRU model structure in the ship target classification model according to the present invention;
FIG. 4 is a schematic structural diagram of a ship target classification model constructed by the invention;
FIG. 5 is a schematic view of a visualization of selected experimental data in a simulation experiment according to the present invention;
FIG. 6 is an overall structure diagram of a classification model of a ship target selected in a simulation experiment according to the present invention;
FIG. 7-a is a schematic diagram of the variation of the loss value of the training and testing of models of different iteration times in the simulation experiment of the present invention;
FIG. 7-b is a schematic diagram of the accuracy of training and testing of models with different iteration numbers in the simulation experiment of the present invention.
Detailed Description
The following further describes embodiments of the present invention with reference to the drawings.
As shown in fig. 1, the AIS data obtained first is preprocessed, and the speed, heading, bow direction and acceleration are extracted from the preprocessed AIS data to construct a first input feature vector for learning the connection and integral features between the local parts of each track segment; extracting the speed, the course, the bow direction and the time interval as second input feature vectors; and then constructing a ship target classification model, wherein the ship target classification model comprises an optimal Convolutional Neural Network (CNN) model, a BiGRU model, a fusion layer and a full connection layer, and is shown in figure 4. Standardizing the first input feature vector, inputting the first input feature vector into an optimal Convolutional Neural Network (CNN) model, and standardizing the second feature vector, and inputting the second feature vector into a BiGRU model; respectively mining spatial and temporal features contained in the track sequence data by using the two models; fusing the high-level features learned by the two models through a fusion (Merge) layer; and finally, the gathered features are delivered to a full connection layer with an activation function of softmax to calculate the distribution probability of each type of ship target so as to realize the classification of the ship targets.
1. Acquire AIS data and preprocess it.
AIS data is a sequence of samples carrying time, position and other information. Each record of the original AIS data comprises 16 fields, mainly covering ship dynamic information, ship static information and ship voyage information. As shown in table 1, the following fields of each record are used in this experiment for ship target classification.
TABLE 1
(table content reproduced as images in the original publication)
Because of improper operation of the equipment by crew, equipment failures, limited signal transmission range and similar causes, AIS data is noisy, so it must be cleaned: track point records with missing key fields, repeated timestamps, or values outside the normal range are deleted, providing an accurate and reliable data source for ship target classification. On this basis, the AIS data is segmented to obtain voyage track segments containing ship motion information: first, the track points belonging to each ship are grouped by MMSI; then, to bring the classification problem closer to the real situation, besides track points in the under-way state, track points in the working states "engaged in fishing" and "restricted manoeuvrability" (which generally means the manoeuvring capability of the operating ship is limited) are also considered, and track segments composed of such consecutive points are extracted; finally, track segments containing at least 30 data points are retained so that each segment carries enough information.
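The cleaning and segmentation steps above can be sketched as follows; this is a minimal Python illustration, and the record field names (`mmsi`, `ts`, `lat`, `lon`, `sog`, `cog`, `heading`) are assumptions, not the patent's actual field names:

```python
from itertools import groupby

# Hypothetical key fields; the real AIS schema has 16 fields per record.
KEY_FIELDS = ("mmsi", "ts", "lat", "lon", "sog", "cog", "heading")

def clean(records):
    """Drop records with missing key fields, duplicate timestamps
    (per ship), or positions outside the valid range."""
    seen = set()
    out = []
    for r in records:
        if any(r.get(k) is None for k in KEY_FIELDS):
            continue  # missing key field
        if not (-90 <= r["lat"] <= 90 and -180 <= r["lon"] <= 180):
            continue  # position outside the normal range
        key = (r["mmsi"], r["ts"])
        if key in seen:
            continue  # time repetition for the same ship
        seen.add(key)
        out.append(r)
    return out

def segment_by_mmsi(records, min_len=30):
    """Group cleaned points into per-ship track segments and keep
    only segments with at least `min_len` points."""
    records = sorted(records, key=lambda r: (r["mmsi"], r["ts"]))
    segments = []
    for _, pts in groupby(records, key=lambda r: r["mmsi"]):
        seg = list(pts)
        if len(seg) >= min_len:
            segments.append(seg)
    return segments
```

A real pipeline would additionally split a ship's points into separate voyages by navigation status, as described above; this sketch only shows the cleaning and minimum-length filtering.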
2. Construct the input feature vectors.
The preprocessed AIS data contains 3 items of basic spatio-temporal information (time, longitude, latitude) and 3 items of motion information (speed, course, heading). The basic spatio-temporal information cannot directly express the ship target type, and a classification model cannot establish an effective mapping from it to the ship target. Therefore, before the model classifies ship targets, input feature vectors must be constructed to convert the data into a form the classification model can easily use. The features extracted from a track segment can be divided into track point features, track segment features, and the attributes carried by the track data itself. To avoid extracting excessive statistics, increasing the amount of manual computation, and relying too much on personal experience, and to fully exploit the deep learning model's ability to learn basic and deep features of ship track segments on its own, the invention combines track point features derived from the basic spatio-temporal information of the AIS data with its 3 items of motion information to construct the CNN input feature vector (i.e. the first input feature vector) and the BiGRU input feature vector (i.e. the second input feature vector) respectively.
The method selects 4 effective attributes — speed, course, heading and acceleration — to construct the CNN input feature vector, which is used to learn the relations between the local parts of each track segment and its overall characteristics. Speed, course and heading are motion information already present in the AIS data, so only the acceleration of each track point needs to be calculated from the basic spatio-temporal information, as follows:
a_n = (V_n − V_{n−1}) / t_n,  V_n = d_n / t_n    (1)

where a_n is the acceleration of the n-th track point in m/s²; V_n is the speed of the track point in m/s; t_n is the time interval between adjacent track points in s; and d_n is the distance between adjacent track points, obtained from the Haversine spherical distance formula, with r the earth radius, taken as 6371.393 km:

d_n = 2r · arcsin( √( sin²(Δφ′/2) + cos φ′_{n−1} · cos φ′_n · sin²(Δλ′/2) ) )

Δφ′ = |φ′_n − φ′_{n−1}|,  Δλ′ = |λ′_n − λ′_{n−1}|

where (φ′_n, λ′_n) and (φ′_{n−1}, λ′_{n−1}) are the positions of track points n and n−1, in rad.
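A small Python sketch of the acceleration computation and the Haversine distance it relies on, assuming time in seconds and positions in degrees (converted to radians internally); the function names are illustrative:

```python
import math

R_EARTH_KM = 6371.393  # earth radius used in the patent

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two track points (degrees in, km out)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * R_EARTH_KM * math.asin(math.sqrt(a))

def accelerations(track):
    """Per-point acceleration (m/s^2): speed from consecutive distances,
    then a finite difference over the time interval. `track` is a list of
    (t_seconds, lat, lon) tuples; the first two points get no value."""
    speeds = []
    for (t0, la0, lo0), (t1, la1, lo1) in zip(track, track[1:]):
        d_m = haversine_km(la0, lo0, la1, lo1) * 1000.0
        speeds.append(d_m / (t1 - t0))
    return [(v1 - v0) / (track[i + 2][0] - track[i + 1][0])
            for i, (v0, v1) in enumerate(zip(speeds, speeds[1:]))]
```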
The input feature vector of the CNN model has 3 dimensions: height, width and depth (channels). The input layer holds a set of independent samples D; each sample D_i represents one ship track segment and comprises 4 channels: speed, acceleration, course and heading. As shown in fig. 2, each channel has shape 1 × L, where L is the length of the track segment, i.e. the number of AIS track points in it, so the input vector has shape 1 × L × 4. To facilitate later CNN model training, the input vectors must have a uniform size, so the lengths of all track segments are limited to a fixed size L: longer segments are split and shorter segments are padded with zero values.
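The fixed-length 1 × L × 4 input construction (truncate long segments, zero-pad short ones) might look like this NumPy sketch; the helper name is hypothetical:

```python
import numpy as np

def to_cnn_input(segment, L):
    """Build the 1 x L x 4 CNN input from a track segment given as an
    (n, 4) array of [speed, acceleration, course, heading] rows:
    longer segments are truncated, shorter ones zero-padded."""
    seg = np.asarray(segment, dtype=float)[:L]     # truncate to L points
    pad = np.zeros((L - len(seg), 4))              # zero-value padding
    return np.concatenate([seg, pad])[None, :, :]  # shape (1, L, 4)
```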
The BiGRU model is formed by stacking 2 gated recurrent units (GRUs) in opposite directions, in order to capture long-term dependence of the trajectory data on both the past and the future. The invention therefore uses 4 attributes — speed, course, heading and time interval — to construct the BiGRU time-series feature vector. Speed, course and heading change over time, which helps the model mine the temporal patterns and characteristics of different ships' motion information; on this basis, the time interval is added to strengthen the model's learning of the correlation between time and changes in motion information.
The BiGRU input feature vector has 2 dimensions: features and time steps. The number of time steps is the temporal length of the input and generally depends on the length of the ship track segment; to simplify computation, all segment lengths are limited to the fixed size L. T is the set of independent samples of the BiGRU input layer; the input feature vector T_i formed from a single ship track segment sample is the matrix of the speed, course, heading and time-interval series over the L time steps, as in equation (2).
3. Construct the ship classification model.
The ship classification model constructed by the invention adopts the CNN-BiGRU model shown in fig. 4. Its body consists of two parts: (1) a CNN extracts the spatial features in the AIS data to characterize the spatial dependence of the track segments; (2) a BiGRU captures the timing features in the AIS data to characterize their temporal dependence. The CNN-BiGRU model combines an optimal convolutional neural network with a bidirectional gated recurrent unit; the structures of the CNN, the BiGRU and the combined CNN-BiGRU model are introduced in turn below.
1) CNN model
The CNN model adopted by the invention comprises an input layer, convolution layers, pooling layers, fully connected layers and an output layer. The parameters of each layer are configured as follows:
an input layer: put the sample set D into the input layer, use CNN network to every D i And (5) extracting the spatial features.
Convolution layers: the invention adopts convolution kernels of shape 1 × 3 × C, where C is the number of channels of the layer's input vector. The 3-dimensional output shape of each convolution layer is controlled by 3 hyperparameters: the number of convolution kernels (filters), which determines the depth of the output shape; the stride S by which the kernel steps through the input vector; and zero padding, which controls the output size. To keep the input and output shapes of each layer the same, S = 1 and zero padding are used, and the number of kernels is adjusted according to the actual situation. To achieve rapid convergence and improve the network's nonlinear feature learning, the output of each convolution layer is activated by a ReLU function before being used as the input of the next layer.
A pooling layer: to reduce the amount of computation and prevent overfitting, the present invention periodically inserts a Max Pooling layer (Max Pooling) with a convolution kernel shape of 1 × 2 and S ═ 1 between each convolution layer. The output after passing through the maximum pooling layer is shown in equation (3).
Figure BDA0003667165420000082
In the formula (I), the compound is shown in the specification,
Figure BDA0003667165420000083
using c for the l-th layer l The convolution kernel is used for convolution and pooling to obtain output, l is the depth of the CNN model, and is in the range of {1,2,3, … }, c l ∈{1,2,3,…,C l },C l Is the number of convolution kernels; pool (. cndot.) is a pooling operation; x is the number of l,a Inputting a vector for the ith feature of the ith layer;
Figure BDA0003667165420000092
performing convolution operation; an activation function of the relu convolution kernel;
Figure BDA0003667165420000093
and
Figure BDA0003667165420000094
respectively the first layer and the second layer l Weight values and bias vectors for each convolution kernel.
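A NumPy illustration of one conv → ReLU → pool stage in the spirit of equation (3); kernel size 3, stride 1 and "same" zero padding follow the text, while the non-overlapping pool of size 2 is a simplifying assumption:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def conv1d_same(x, w, b):
    """'Same' 1-D convolution: x is (L, C_in), w is (k, C_in, C_out),
    b is (C_out,); stride 1 and zero padding keep the length L."""
    k = w.shape[0]
    pad = k // 2
    xp = np.pad(x, ((pad, pad), (0, 0)))
    return np.stack([np.tensordot(xp[i:i + k], w, axes=([0, 1], [0, 1])) + b
                     for i in range(x.shape[0])])

def maxpool1d(x, size=2):
    """Non-overlapping max pooling along the time axis."""
    L = (x.shape[0] // size) * size
    return x[:L].reshape(-1, size, x.shape[1]).max(axis=1)

def conv_block(x, w, b):
    """One conv -> ReLU -> max-pool stage, as in equation (3)."""
    return maxpool1d(relu(conv1d_same(x, w, b)))
```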
Fully connected layers (FC): each neuron of an FC layer is connected to all neurons of the previous layer, and the flattened input is mapped by element-wise multiply-accumulate operations. As shown in equation (4), all FC layers except the last extract features; the last FC layer performs the classification task on the final high-level features, and the softmax activation function ensures the number of output neurons equals the number of class labels, producing a probability distribution over the ship types:

y_CNN = softmax( ω_CNN · O_l + b_CNN )    (4)

where O_l is the high-level feature obtained by flattening the output of layer l − 1; y_CNN is the probability distribution vector over all ship types output by the CNN; softmax is the activation function of the final FC layer; ω_CNN and b_CNN are the weight values and bias vector of that layer.
To alleviate model overfitting, a Dropout layer can be added after the pooling layers and the feature-extracting fully connected layers; Dropout layers can be added or removed according to the actual situation and are generally placed after a pooling layer.
2) BiGRU model
A GRU can only predict the output at the next moment from the timing information of past moments, but the output at the current moment may be related not only to the past state but also to the future state; a bidirectional structure provides the output layer with complete "context" about every point in the input sequence. The invention therefore extracts the timing features in the AIS data with a bidirectional gated recurrent unit. The structure of the BiGRU designed by the invention is shown in fig. 3, with each layer configured as follows:
an input layer: putting the sample set T into an input layer, and utilizing a BiGRU network to perform on each T i And (5) extracting time sequence characteristics.
A BiGRU layer: for each time t, the input is simultaneously provided to two GRUs in opposite directions, and the output is jointly determined by the two unidirectional GRUs; as shown in the right diagram of fig. 3, the GRU is composed of an input layer, a hidden layer and an output layer, wherein the hidden layer includes an Update Gate (Update Gate) and a Reset Gate (Reset Gate), and 2 gates together determine whether history information can be retained and transferred. Time sequence characteristic input matrix T i The memory information H of the corresponding hidden layer is:
H=(H 1 ,H 2 ,…,H t ,…,H L ) (5)
in the formula, H 1 ~H L The memory information obtained by the GRU neural network in the 1 st to L time intervals respectively. At time t, from the current input X t And hidden output of forward state at time t-1
Figure BDA0003667165420000106
The outputs of the reset gate and the update gate in the GRU neural network can be calculated to be R respectively t And Z t The formula is as follows:
R t =σ(X t W xr +H t-1 W hr +b r ) (6)
Z t =σ(X t W xz +H t-1 W hz +b z ) (7)
wherein σ is an activation function; w xr And W xz Selecting weights for the reset gate and the update gate; b r And b z The offset vectors are selected for the reset gate and the update gate, respectively. Based on R t And Z t Candidate hidden states may be computed
Figure BDA0003667165420000107
And current hidden state forward output
Figure BDA0003667165420000108
The formula is as follows:
Figure BDA0003667165420000101
Figure BDA0003667165420000102
wherein, tanh is an activation function; w is a group of xh And b h For selectively memorizing the current input X t The selected weights and bias vectors; multiplication of corresponding elements in the operation matrix;
Figure BDA0003667165420000103
showing that the selective memory of the important information of the current node is performed, (1-Z) t )⊙H t-1 Indicating that the otherwise hidden state of unimportant information is selectively forgotten.
Similarly, from the current input X_t and the backward hidden output H_{t+1}^← of time t+1, the backward output H_t^← of the current hidden state is obtained.
And finally, splicing the forward output and the backward output of each time step into the final output of the BiGRU layer.
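The gate equations (6)-(9) above can be sketched as a single NumPy time step. This is an illustrative re-implementation for clarity, not the patented Keras model; the weight names mirror the symbols in the formulas, and the recurrent weight W_hh in the candidate state follows the standard GRU formulation:

```python
import numpy as np

def gru_step(x_t, h_prev, W_xr, W_hr, b_r, W_xz, W_hz, b_z, W_xh, W_hh, b_h):
    """One GRU time step following Eqs. (6)-(9): reset gate, update gate,
    candidate hidden state, and the new hidden state."""
    sigmoid = lambda a: 1.0 / (1.0 + np.exp(-a))
    r_t = sigmoid(x_t @ W_xr + h_prev @ W_hr + b_r)             # reset gate, Eq. (6)
    z_t = sigmoid(x_t @ W_xz + h_prev @ W_hz + b_z)             # update gate, Eq. (7)
    h_cand = np.tanh(x_t @ W_xh + (r_t * h_prev) @ W_hh + b_h)  # candidate state, Eq. (8)
    h_t = z_t * h_cand + (1.0 - z_t) * h_prev                   # keep important new info, forget the rest, Eq. (9)
    return h_t
```

A BiGRU runs one such recurrence forward and one backward over the sequence and concatenates the two outputs at each step, as described above.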
Dropout layer: to reduce the overfitting problem, the invention adds a Dropout layer for regularization, which randomly drops input and output connections of the neural network with probability p = 0.5.
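The Dropout regularization described here can be sketched as follows; this is a generic "inverted dropout" illustration, and the rescaling by 1/(1-p) is an implementation convention of common frameworks rather than something stated in the text:

```python
import numpy as np

def dropout(x, p=0.5, training=True, rng=None):
    """Inverted dropout: zero each activation with probability p during
    training and rescale the survivors by 1/(1-p); identity at inference."""
    if not training or p == 0.0:
        return x
    rng = rng or np.random.default_rng()
    mask = rng.random(x.shape) >= p   # keep each unit with probability 1-p
    return x * mask / (1.0 - p)
```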
Fully connected layer: an FC layer is added after the BiGRU layer to further extract features, and a final FC layer performs classification, outputting the probability distribution of the current temporal feature vector over each ship class.
In order to avoid the problem that a single model cannot learn the spatial and temporal characteristics of AIS data simultaneously, the invention proposes, on the basis of the above models, a CNN-BiGRU combined model structure; as shown in FIG. 4, the optimal convolutional neural network and the bidirectional gated recurrent unit serve as the two branches of the combined model. When combining them, the last fully connected layer of the CNN model and the last fully connected layer of the BiGRU model are removed; on this basis, a fusion (Merge) layer and a fully connected layer are added.
The specific processing steps are as follows: first, the feature vectors D_i and T_i are standardized and used as the input data of the CNN and the BiGRU, respectively; then, the 2 models mine the spatial and temporal features contained in the trajectory sequence data; next, the high-level features learned by the 2 models are fused through the Merge layer; finally, the fused features are passed to a fully connected layer with softmax activation to compute the distribution probability of each ship target class.
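The fusion and classification steps can be illustrated with a minimal NumPy sketch. Concatenation is assumed as the Merge operation (the text does not specify the fusion operator), and the weight matrix W and bias b stand in for the final softmax fully connected layer:

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over the last axis."""
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def fuse_and_classify(cnn_feat, bigru_feat, W, b):
    """Merge layer + softmax FC: concatenate the spatial features from the
    CNN branch with the temporal features from the BiGRU branch, then map
    the fused vector to a probability distribution over the ship classes."""
    fused = np.concatenate([cnn_feat, bigru_feat], axis=-1)  # Merge layer
    return softmax(fused @ W + b)                            # softmax FC layer
```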
When the CNN-BiGRU combined model is trained, the CNN model may be trained first, and then the overall training may be performed.
Experimental verification
To further illustrate the classification effect of the AIS-data-based ship target classification method, the method is verified experimentally on real AIS data.
1) Acquiring experimental data and preprocessing
The experiment takes the local ocean between 126°-138° west longitude and 10°-85° north latitude, off the west coast of North America, as the study area. The experimental data come from the full-year 2015 AIS data of the National Oceanic and Atmospheric Administration (NOAA); part of the data is visualized in FIG. 5. The region mainly contains cargo ships, fishing vessels, passenger ships and tugboats, so the ships are divided into these 4 types.
The raw AIS data comprise 31805155 trajectory point records, from which 192226 ship trajectories are obtained after preprocessing; cargo ships, fishing vessels, passenger ships and tugboats account for 33.08%, 19.80%, 20.15% and 26.97% of the total, respectively. Then, with a fixed length L = 200, a CNN input feature vector of shape 1 × 200 × 4 consisting of the 4 channels speed, course, heading and acceleration, and a BiGRU input feature vector of shape 200 × 4 consisting of the 4 channels speed, course, heading and time interval, are constructed. To accelerate the convergence of deep learning model training, mean-variance normalization is applied to the 2 input feature vectors. Finally, the sample set consisting of the 2 input feature vectors is divided into a training set and a test set at a ratio of 7:3, used respectively for building and training the classifier and for verifying and evaluating the classification effect.
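The mean-variance normalization and the 7:3 split described above can be sketched as follows; applying the z-score per channel and shuffling with a fixed seed are assumptions about details the text leaves open:

```python
import numpy as np

def standardize(x):
    """Mean-variance (z-score) normalization, computed per channel over a
    batch of samples shaped (n_samples, length, channels)."""
    mu = x.mean(axis=(0, 1), keepdims=True)
    sigma = x.std(axis=(0, 1), keepdims=True)
    return (x - mu) / (sigma + 1e-8)

def train_test_split(samples, labels, ratio=0.7, seed=42):
    """Shuffle and split the sample set into training and test sets (7:3)."""
    idx = np.random.default_rng(seed).permutation(len(samples))
    cut = int(len(samples) * ratio)
    return (samples[idx[:cut]], labels[idx[:cut]],
            samples[idx[cut:]], labels[idx[cut:]])
```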
The experimental environment is Ubuntu 20.04 with an Intel i9-10900K CPU and an RTX 3090 GPU, based on the Keras deep learning library with TensorFlow as the back end. To obtain the optimal CNN and then build the CNN-BiGRU model, the parameters need to be optimized through model training so as to minimize the loss function value. This experiment uses categorical cross-entropy as the loss function to compute the error of the output layer. Adam is well suited to large data sets and parameter optimization and is widely applied in deep learning methods; therefore, during back propagation the model parameters are updated with the default Adam optimizer: learning rate 0.001, β1 = 0.9, β2 = 0.999, ε = 10^-8. The batch size is set to 64, and to avoid overfitting, early stopping is used during training to determine the optimal number of iterations (epochs).
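The early-stopping rule can be sketched as a small helper; patience=10 and min_delta=0.001 mirror the "accuracy gain below 0.1% for 10 consecutive epochs" criterion reported for the combined model, and the class itself is an illustration rather than the Keras callback actually used:

```python
class EarlyStopping:
    """Stop training once the monitored accuracy has improved by less than
    min_delta for `patience` consecutive epochs."""

    def __init__(self, patience=10, min_delta=0.001):
        self.patience, self.min_delta = patience, min_delta
        self.best, self.wait = -float("inf"), 0

    def step(self, acc):
        """Record one epoch's accuracy; return True when training should stop."""
        if acc > self.best + self.min_delta:
            self.best, self.wait = acc, 0   # meaningful improvement: reset counter
        else:
            self.wait += 1                  # stagnant epoch
        return self.wait >= self.patience
```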
2) Selecting evaluation criteria
The classification effect of the deep learning model is comprehensively evaluated with the accuracy, precision, recall, f-score and confusion matrix. The accuracy (A) is the ratio of the number of correctly classified samples to the total number of test set samples; this index most intuitively evaluates the classification performance of the model. In the multi-classification task, matching the test set labels against the predictions yields the following 4 kinds of outcomes: true positives, true negatives, false positives and false negatives. The precision (P), viewed from the classification results, is the ratio of actual positive samples among those predicted positive by the model; the recall (R), viewed from the samples, is the ratio of correctly identified samples among all positive samples; the f-score (F) is the harmonic mean of precision and recall, combining the two into one evaluation index with equal weight. The confusion matrix visualizes the above classification effect as an n × n matrix: each column represents a predicted class, with the column total giving the number of samples predicted for that class, and each row represents an actual class, with the row total giving the actual number of samples of that class. The indices are calculated as
A = (T_positive + T_negative) / (T_positive + T_negative + F_positive + F_negative)
P = T_positive / (T_positive + F_positive)
R = T_positive / (T_positive + F_negative)
F = 2PR / (P + R)

where T_positive is the number of true positive samples; F_positive the number of false positive samples; T_negative the number of true negative samples; and F_negative the number of false negative samples.
3) CNN training results
To find an optimal CNN model suitable for ship target classification while avoiding complex and tedious hyper-parameter optimization, the experiment gradually increases the number of network layers and convolution kernels according to the classification effect of the model. In addition, considering that neural networks with large numbers of weights and complex input-output relationships are prone to overfitting, and that Dropout is the most practical and widely applied regularization method against overfitting in CNNs, Dropout layers are added during training in an attempt to construct the optimal CNN model.
The various structural configurations of the CNN and the corresponding training results are shown in Table 2. From model A to model C, the number of convolutional layers increases from 2 to 6 while the number of convolution kernels grows to capture deeper features, improving the accuracy by 0.77%. Model D adds a fully connected layer, which causes overfitting, so the accuracy on the test set drops instead of rising. To evaluate the effect of max pooling, model E adds a max pooling layer after each group of convolutional layers; its test accuracy improves over model D, but overfitting persists. To relieve the overfitting, model F adds a Dropout layer after each max pooling layer and after the fully connected layer used for feature extraction; its test accuracy falls below that of model E and underfitting appears, because too many Dropout layers oversimplify the model and the loss of a large number of features increases the classification error. The Dropout layers must therefore be placed so as to balance overfitting and underfitting: model G removes the Dropout layers after the first and second groups of convolutional layers and achieves the highest test accuracy of 78.74%. To further verify that model G is the optimal CNN, models H and I add, respectively, a group of convolutional layers and a fully connected layer on top of model G; although deeper, they do not improve the test accuracy but increase model complexity and computational cost, lengthening training. Considering classification accuracy and efficiency together, the optimal CNN structure is model G.
TABLE 2
[Table 2, showing the structural configurations of models A-I and the corresponding training results, is rendered as an image in the original publication.]
4) CNN-BiGRU training results
Combining the optimal CNN model with the BiGRU model yields the CNN-BiGRU model structure shown in FIG. 6. To obtain the optimal number of iterations, epoch is set to 80 and the model performance is computed on the training set and the test set at every iteration; training stops, and the optimal number of iterations is obtained, once the accuracy has increased by less than 0.1% for 10 consecutive epochs. As shown in FIGS. 7-a and 7-b, after about 19 training rounds the test accuracy stabilizes around 79%-80% and the test loss stays around 0.51-0.52; the highest test accuracy of 80.6% and the minimum loss value of 0.507 are reached at epoch 34. In the 10 epochs after the model reaches its peak accuracy, the test accuracy remains almost unchanged and shows no notable decline as the iterations increase, indicating that the CNN-BiGRU model does not overfit and fits the ship targets well by mining the spatio-temporal features contained in the input vectors.
To evaluate the performance of the CNN-BiGRU model, Table 3 lists its confusion matrix, classification precision, recall and f-score. Overall, the predictions for all ship types are distributed along the diagonal of the confusion matrix and every classification index exceeds 62%, so the combined model can identify ship targets essentially accurately. Locally, cargo ships have the largest sample volume, fixed routes and mostly constant-speed sailing, and their classification precision and recall reach 94.4% and 94.1%. Passenger ships behave similarly to cargo ships, and the model reaches a classification precision of 86.6% for them; however, their sample volume is the smallest, so the combined model cannot fully learn their motion patterns and the recall is only 73.7%. Compared with the previous 2 types, tugboats and fishing vessels are highly maneuverable, and the invention considers that trajectory points recorded while these vessels are working make classification harder; tugboats and fishing vessels are therefore confused most often, with fishing vessels lowest at 66.6% precision and 62.5% recall, and tugboats, which have more samples, at 71.1% precision and 76.3% recall. These results show that the classification performance of the CNN-BiGRU combined model correlates strongly with the number of sample instances; adding more instances could be considered for identifying passenger ships and for distinguishing tugboats from fishing vessels.
TABLE 3
[Table 3, showing the confusion matrix and the per-class precision, recall and f-score of the CNN-BiGRU model, is rendered as an image in the original publication.]
5) Comparative experiment
To further evaluate the feasibility and effectiveness of the CNN-BiGRU model adopted in the invention, a set of comparative experiments was constructed using the same training and test trajectory segments as the CNN-BiGRU model. On one hand, 4 machine learning methods commonly used for multi-classification tasks were selected for comparison: K-nearest neighbors (KNN), SVM, DT and RF; training and evaluation of all these models is based on the scikit-learn machine learning library. Because the machine learning methods require manually extracted trajectory-segment features as input, and in order to fully cover the motion characteristics of ships, the invention extracts 5 statistics of each trajectory segment as the feature space, drawn from the 4 feature types speed, acceleration, course and distance: maximum speed, average speed, maximum acceleration, average course change and total sailed distance. The optimal parameters of each classifier were found on the training set with grid search and 5-fold cross-validation, ensuring that the trained model is optimal and fits the data well. The most important parameters of the 4 machine learning models are the number of neighbors n_neighbors for KNN, the penalty coefficient C for the SVM, the maximum depth max_depth for DT and the number of decision trees n_estimators for RF. Finally, the tuned machine learning models were evaluated on the test set. On the other hand, deep learning models were selected for comparison: the single models include LSTM, the CNN that learns the spatial trajectory features in the combined model, and the BiGRU that learns the temporal trajectory features; as a combined deep learning model, a CNN-LSTM model was constructed.
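The 5 hand-crafted statistics fed to the machine learning baselines can be sketched as follows; treating the per-step speed difference as the acceleration and summing per-step distances are assumptions about details the text does not specify:

```python
import numpy as np

def segment_features(speed, course, distance_step):
    """The 5 statistics used as the machine learning feature space for one
    trajectory segment: max speed, mean speed, max acceleration, mean course
    change and total sailed distance. Inputs are per-point arrays."""
    accel = np.diff(speed)                   # per-step speed change as a proxy for acceleration
    course_change = np.abs(np.diff(course))  # absolute heading change between points
    return np.array([speed.max(),
                     speed.mean(),
                     np.abs(accel).max(),
                     course_change.mean(),
                     distance_step.sum()])
```

A feature matrix built row-by-row from such vectors would then be passed to scikit-learn classifiers such as KNN, SVM, DT and RF.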
The deep learning models are fed the same feature vectors as the combined model; the optimal number of iterations is determined on the training set with early stopping to obtain the optimal model, which is then evaluated on the test set.
Table 4 contains the optimal parameters of each model within the specified search range and the classification performance on 4 indices in total: the accuracy A and the weighted average precision P̄, weighted average recall R̄ and weighted average f-score F̄. Overall, the CNN-BiGRU adopted by the invention outperforms the other 6 models on all 4 evaluation indices, with an accuracy nearly 15% higher than the others. Locally, the classification effect of the deep learning methods is superior to that of the machine learning methods: the LSTM model, the worst performer among the deep learning methods, is still 13.3% more accurate than the RF model, the best performer among the machine learning methods. The classification effect of the combined model is superior to that of a single deep learning model: compared with CNN, which learns only the spatial features of AIS data, and BiGRU and LSTM, which learn only the temporal features, the combined model learns spatial and temporal features simultaneously to aid ship target classification, with an accuracy 1.9% above the average accuracy of the 3 single deep learning models, effectively improving classification precision. Finally, the classification effect of CNN-BiGRU is better than that of CNN-LSTM, because the BiGRU in the combined model, unlike the LSTM, mines the AIS trajectory sequence bidirectionally, improving the classification effect of the combined model.
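The weighted averages P̄, R̄ and F̄ reported in Table 4 can be computed as follows; weighting each class by its number of test samples (its support) is the standard definition assumed here:

```python
import numpy as np

def weighted_average(per_class_metric, support):
    """Support-weighted average of a per-class metric (precision, recall or
    f-score): each class is weighted by its share of test samples."""
    support = np.asarray(support, dtype=float)
    metric = np.asarray(per_class_metric, dtype=float)
    return float(np.sum(metric * support) / support.sum())
```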
The above analysis shows that the CNN-BiGRU model adopted by the invention, with its ability to extract high-level features through multiple layers of non-linear processing units, is superior to traditional machine learning algorithms that depend on manual feature engineering, and that combining the CNN and BiGRU models effectively mines spatial and temporal features from AIS data simultaneously, further improving the classification precision of ship targets.
The method first cleans the AIS data and divides the ship trajectories into fixed-length samples; then, according to the characteristics of each model, it constructs a 4-channel input feature vector for the CNN and another for the BiGRU; next, it trains an optimal CNN model and combines the optimal CNN with the BiGRU to obtain the CNN-BiGRU model; finally, it trains the CNN-BiGRU model on the 2 input feature vectors and classifies the ship targets.
Experimental results show that the CNN-BiGRU combined model constructed by the method effectively achieves accurate classification and identification of different ship targets, with a particularly good classification effect on cargo ships. Compared with traditional machine learning methods including KNN, SVM, DT and RF, on one hand only simple input feature vectors need to be built from the AIS data, avoiding manual intervention and complex feature engineering; on the other hand, the deep learning method can self-learn and extract the high-level features of the ship motion patterns contained in the AIS data, so the classification effect of the CNN-BiGRU model on ship targets is superior to that of the machine learning methods. Compared with the deep learning methods including the optimal CNN, BiGRU, LSTM and CNN-LSTM, the CNN-BiGRU combined model classifies and identifies ship targets using spatio-temporal features jointly, effectively improving the precision of ship target classification.

Claims (9)

1. A ship target classification method based on AIS data is characterized by comprising the following steps:
1) acquiring AIS data, preprocessing the AIS data, extracting the speed, the course, the bow direction and the acceleration from the AIS data to construct a first input feature vector for learning the connection and integral features among the local parts of each track segment; extracting the speed, the course, the bow direction and the time interval as second input feature vectors;
2) constructing a ship target classification model, wherein the ship target classification model comprises a CNN model, a BiGRU model, a fusion layer and a full connection layer; the CNN model is used for processing the first input feature vector to obtain high-level features representing spatial information; the BiGRU model is used for processing the second input feature vector to obtain high-level features representing time sequence information; the fusion layer is used for fusing and summarizing the obtained high-level features representing the spatial information and the high-level features representing the time sequence information; and the full connection layer calculates the distribution probability of each type of ship target according to the fused and summarized features so as to realize the classification of the ship targets.
2. The AIS data-based ship target classification method according to claim 1, wherein the extraction process of the first input feature vector in the step 1) is as follows:
segmenting the acquired AIS data to obtain a track segment corresponding to each ship;
and acquiring the navigational speed, the heading and the bow direction of each track point in the track segment according to the set track segment length, and calculating the acceleration of the corresponding track point according to the speed and the time interval of the adjacent track points.
3. The AIS data-based ship target classification method according to claim 1, wherein the extraction process of the second input feature vector in the step 1) is as follows:
segmenting the acquired AIS data to obtain a track segment corresponding to each ship;
and acquiring the navigational speed, the course and the ship heading direction of each track point in the track segment according to the set track segment length, and acquiring the time interval of adjacent track points.
4. The AIS data-based ship target classification method according to claim 2 or 3, characterized in that when the AIS data is preprocessed, the AIS data is cleaned, and the cleaning comprises deleting missing key fields, time repetition and track point records beyond a set range.
5. The AIS data based vessel object classification method according to claim 1 wherein the BiGRU model includes two gated loop units in opposite directions.
6. The AIS data based ship target classification method according to claim 5 wherein the gated loop unit includes an input layer, a hidden layer and an output layer, the hidden layer including an update gate and a reset gate, the update gate and the reset gate being used together to determine whether historical information can be retained and transferred.
7. The AIS data-based ship target classification method according to claim 1, wherein the CNN model comprises an input layer, a convolutional layer and a pooling layer, the input layer is used for obtaining a first input feature vector; the convolution layers are several, and the pooling layer is arranged behind each convolution layer.
8. The AIS data-based ship target classification method according to claim 7, wherein the CNN model further includes a Dropout layer, and the Dropout layer is disposed after the last pooling layer.
9. The AIS data-based ship target classification method according to claim 1, wherein the ship target classification model is trained by first training a CNN model, and then integrally training the CNN model and the BiGRU model when the CNN model training is finished.
CN202210594360.3A 2022-05-27 2022-05-27 Ship target classification method based on AIS data Pending CN115063676A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210594360.3A CN115063676A (en) 2022-05-27 2022-05-27 Ship target classification method based on AIS data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210594360.3A CN115063676A (en) 2022-05-27 2022-05-27 Ship target classification method based on AIS data

Publications (1)

Publication Number Publication Date
CN115063676A true CN115063676A (en) 2022-09-16

Family

ID=83197693

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210594360.3A Pending CN115063676A (en) 2022-05-27 2022-05-27 Ship target classification method based on AIS data

Country Status (1)

Country Link
CN (1) CN115063676A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115345257A (en) * 2022-09-22 2022-11-15 中山大学 Flight trajectory classification model training method, classification method, device and storage medium
CN115841004A (en) * 2023-02-24 2023-03-24 北京科技大学 Strip steel hot rolling process mechanical property soft measurement method and device based on multidimensional data
CN116150618A (en) * 2023-02-02 2023-05-23 中国水产科学研究院东海水产研究所 Fishing boat operation type identification method based on deep learning neural network
CN116738324A (en) * 2023-08-11 2023-09-12 太极计算机股份有限公司 Model training method and identification method for single-towing operation behavior of fishing boat
CN117713912A (en) * 2024-02-05 2024-03-15 成都大公博创信息技术有限公司 CVCNN-BiGRU-based star link terminal signal identification method and device
CN117893919A (en) * 2024-01-12 2024-04-16 西南交通大学 Method for generating high-space-time resolution leaf area index product in cloudy and foggy region

Similar Documents

Publication Publication Date Title
CN115063676A (en) Ship target classification method based on AIS data
Lou et al. Prediction of ocean wave height suitable for ship autopilot
CN112906858A (en) Real-time prediction method for ship motion trail
CN111382686B (en) Lane line detection method based on semi-supervised generation confrontation network
CN113780395A (en) Mass high-dimensional AIS trajectory data clustering method
CN111931602A (en) Multi-stream segmented network human body action identification method and system based on attention mechanism
CN113297936B (en) Volleyball group behavior identification method based on local graph convolution network
Groba et al. Integrating forecasting in metaheuristic methods to solve dynamic routing problems: Evidence from the logistic processes of tuna vessels
CN116110022B (en) Lightweight traffic sign detection method and system based on response knowledge distillation
CN116563680B (en) Remote sensing image feature fusion method based on Gaussian mixture model and electronic equipment
CN115512152A (en) Ship track classification method and system combining CNN (CNN) neural network and LSTM neural network
CN114626598A (en) Multi-modal trajectory prediction method based on semantic environment modeling
CN114882293A (en) Random forest and ship target classification method based on AIS data feature optimization
CN114942951A (en) Fishing vessel fishing behavior analysis method based on AIS data
Du et al. Autonomous landing scene recognition based on transfer learning for drones
CN117636183A (en) Small sample remote sensing image classification method based on self-supervision pre-training
CN116341612A (en) AUV drift track prediction method based on ABiLSTM-QSOA network
Gunawan et al. Long Short-Term Memory Approach for Predicting Air Temperature In Indonesia
EP3965021B1 (en) A method of using clustering-based regularization in training a deep neural network to classify images
CN114358247A (en) Intelligent agent behavior interpretation method based on causal relationship inference
CN112687294A (en) Vehicle-mounted noise identification method
CN112015894A (en) Text single classification method and system based on deep learning
Chen et al. A bidirectional context-aware and multi-scale fusion hybrid network for short-term traffic flow prediction
CN117784615B (en) Fire control system fault prediction method based on IMPA-RF
Smitha et al. Efficient moving vehicle detection for intelligent traffic surveillance system using optimal probabilistic neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination