CN113569921A - Ship classification and identification method and device based on GNN - Google Patents
- Publication number
- CN113569921A (CN202110766734.0A)
- Authority
- CN
- China
- Prior art keywords
- matrix
- gnn
- data
- track
- ship
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The invention provides a GNN-based ship classification and identification method and device. The method comprises the following steps: extracting features of ship AIS data and constructing a sample total set, wherein the sample total set is a three-dimensional matrix; converting the sample total set into graph structure data and dividing it into a training set and a test set; training a GNN network model with the training set, inputting the features of ship AIS data of all samples to be tested in the test set into the trained GNN network model to test the validity of the GNN network, and classifying ships to be classified using the GNN network that passes the test, wherein the GNN network model is a GNN neural network model with two graph convolution layers. By using ship trajectories, the scheme of the invention can effectively extract spatial features for machine learning and can improve the accuracy of ship trajectory classification and identification.
Description
Technical Field
The invention relates to the field of pattern recognition, in particular to a ship classification recognition method and device based on GNN.
Background
Ship classification is widely used in both military and civilian settings, for example the detection of illegal vessels by the relevant authorities, guarding against maritime terrorism, and combating smuggling. At present, research on ship type classification at home and abroad relies mainly on traditional radar identification and optical identification, both of which have limitations. Optical identification depends on video monitoring equipment, has a limited field of view and a short operating range, is easily affected by weather such as rain and fog, and is severely limited under maritime conditions such as high humidity and low cloud. Radar identification is less affected by the environment, but it can detect targets without being able to tell them apart, and it is prone to co-frequency interference clutter in a complex electromagnetic environment. Ship classification and identification based on AIS (Automatic Identification System) data is little affected by weather, can identify the state of a ship automatically in all weather conditions, offers high data acquisition accuracy, and also provides static data such as the voyage number and ship attributes, so AIS is of great value for ship classification and identification. AIS data is, however, large in volume and wide in coverage, which poses challenges for classification and identification.
Traditional approaches to ship classification mainly comprise clustering algorithms based on the distance between track points, machine learning algorithms applied to manually extracted features, and neural network classification methods. In recent years, deep neural networks have gradually become a research focus, displacing methods that combine manual feature extraction with machine learning. The neural networks currently used to classify ship trajectories are mainly traditional recurrent and convolutional neural networks, such as CNN, MCDCNN, 1DCNN and RNN. The data processed by a convolutional neural network is in matrix form: a matrix formed by arranging samples, which is a Euclidean structure. If the features of the samples are regarded as vertices, the vertices in a traditional neural network are independent, and the connections between vertices are not used. A recurrent neural network is modeled on a time sequence; its drawbacks are that the sample features are insufficient and the connections between the features of different samples cannot be exploited, so feature learning is incomplete and the classification results are unsatisfactory.
Disclosure of Invention
In order to solve the technical problems, the invention provides a ship classification and identification method and device based on GNN, and the method and device are used for solving the technical problems that in the prior art, data relation is not utilized, and the classification result is not ideal enough.
According to a first aspect of the present invention, there is provided a GNN-based vessel classification identification method, the method comprising the steps of:
step S101: extracting the characteristics of ship AIS data, and constructing a sample total set, wherein the sample total set is a three-dimensional matrix; converting the sample collection into graph structure data, and dividing the sample collection into a training set and a testing set;
step S102: training a GNN network model by a training set, inputting the characteristics of ship AIS data of all samples to be tested in a test set into the trained GNN network model to test the validity of the GNN network, and classifying ships to be classified by using the GNN network passing the test;
each trajectory is taken as one sample, and the sample total set is a three-dimensional matrix; the first dimension of the three-dimensional matrix indexes the trajectories of the AIS data, $S = \{S_1, \dots, S_i, \dots, S_{num}\}$; the second dimension is the number N of track points on trajectory $S_i$; the third dimension is the attributes of each track point, comprising IMO, h, v, t-stamp, lat and lon, where IMO is the ship IMO code, h is the ship heading feature, v is the speed, t-stamp is the timestamp, lat is the latitude of the track point and lon is its longitude;
the sample total set is converted into graph structure data G(V, Edge), where V is the set of vertices and Edge is the set of edges connecting the vertices; the graph structure data takes the heading feature h of the track points as the vertex feature to construct a vertex feature matrix M; the weights of the edges connecting the vertices are calculated from the speed feature v to construct an adjacency matrix B;
the GNN network model is a GNN neural network model with two graph convolution layers.
According to a second aspect of the present invention, there is provided a GNN-based vessel classification identifying apparatus, the apparatus comprising:
a feature acquisition module: configured to extract features of ship AIS data and construct a sample total set, wherein the sample total set is a three-dimensional matrix; and to convert the sample total set into graph structure data and divide it into a training set and a test set;
a classification module: configured to train a GNN network model with the training set, input the features of ship AIS data of all samples to be tested in the test set into the trained GNN network model to test the validity of the GNN network, and classify ships to be classified using the GNN network that passes the test;
each trajectory is taken as one sample, and the sample total set is a three-dimensional matrix; the first dimension of the three-dimensional matrix indexes the trajectories of the AIS data, $S = \{S_1, \dots, S_i, \dots, S_{num}\}$; the second dimension is the number N of track points on trajectory $S_i$; the third dimension is the attributes of each track point, comprising IMO, h, v, t-stamp, lat and lon, where IMO is the ship IMO code, h is the ship heading feature, v is the speed, t-stamp is the timestamp, lat is the latitude of the track point and lon is its longitude;
the sample total set is converted into graph structure data G(V, Edge), where V is the set of vertices and Edge is the set of edges connecting the vertices; the graph structure data takes the heading feature h of the track points as the vertex feature to construct a vertex feature matrix M; the weights of the edges connecting the vertices are calculated from the speed feature v to construct an adjacency matrix B;
the GNN network model is a GNN neural network model with two graph convolution layers.
According to a third aspect of the present invention, there is provided a GNN-based vessel classification recognition system, comprising:
a processor for executing a plurality of instructions;
a memory to store a plurality of instructions;
wherein the instructions are configured to be stored by the memory and loaded and executed by the processor to perform the GNN based vessel classification identification method as described above.
According to a fourth aspect of the present invention, there is provided a computer readable storage medium having a plurality of instructions stored therein; the plurality of instructions for loading and executing by the processor the GNN based vessel classification identification method as described above.
According to the scheme of the invention: a traditional neural network can only process regularly arranged Euclidean structures and cannot effectively use the feature relations between different vertices, whereas ship trajectories have spatio-temporal characteristics and their sample features belong to irregularly arranged non-Euclidean structures. The method uses the spatio-temporal features contained in ship trajectories, such as position, distance and speed features, to extract the connection relations between track points and to build a topological association network, so that spatial features can be effectively extracted for machine learning. First, a mapping is established between track point data and the vertices and edges of a graph structure; key vertex features are extracted, and weights are assigned to the edges to build an adjacency matrix; the track point data is thereby converted into a graph data structure, which is input to the GNN for training. This ship type classification and identification method overcomes the above drawbacks and can improve the accuracy of ship trajectory classification and identification.
The foregoing description is only an overview of the technical solutions of the present invention, and in order to make the technical solutions of the present invention more clearly understood and to implement them in accordance with the contents of the description, the following detailed description is given with reference to the preferred embodiments of the present invention and the accompanying drawings.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and, together with the description, serve to explain the principles of the invention. In the drawings:
FIG. 1 is a flow chart of the GNN-based ship classification identification method according to the present invention;
FIG. 2 is a graph data diagram of one embodiment of the present invention;
FIG. 3 is a schematic structural diagram of a GNN network model according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of GNN network model training according to an embodiment of the present invention;
FIG. 5 is a block diagram of a GNN-based ship classification recognition apparatus according to an embodiment of the present invention.
Detailed Description
First, a GNN-based ship classification recognition method according to an embodiment of the present invention will be described with reference to fig. 1. As shown in fig. 1, the method comprises the steps of:
step S101: extracting the characteristics of ship AIS data, and constructing a sample total set, wherein the sample total set is a three-dimensional matrix; converting the sample collection into graph structure data, and dividing the sample collection into a training set and a testing set;
step S102: training a GNN network model by a training set, inputting the characteristics of ship AIS data of all samples to be tested in a test set into the trained GNN network model to test the validity of the GNN network, and classifying ships to be classified by using the GNN network passing the test;
each trajectory is taken as one sample, and the sample total set is a three-dimensional matrix; the first dimension of the three-dimensional matrix indexes the trajectories of the AIS data, $S = \{S_1, \dots, S_i, \dots, S_{num}\}$; the second dimension is the number N of track points on trajectory $S_i$; the third dimension is the attributes of each track point, comprising IMO, h, v, t-stamp, lat and lon, where IMO is the ship IMO code, h is the ship heading feature, v is the speed, t-stamp is the timestamp, lat is the latitude of the track point and lon is its longitude;
the sample total set is converted into graph structure data G(V, Edge), where V is the set of vertices and Edge is the set of edges connecting the vertices; the graph structure data takes the heading feature h of the track points as the vertex feature to construct a vertex feature matrix M; the weights of the edges connecting the vertices are calculated from the speed feature v to construct an adjacency matrix B;
the GNN network model is a GNN neural network model with two graph convolution layers.
Graph Neural Networks (GNNs) are deep learning models for processing graph data; they can effectively exploit feature connections among different samples and are widely used in fields such as social networks, knowledge graphs and molecular chemistry.
As shown in FIG. 2, a graph consists of edges and vertices, where edges are denoted by e and vertices by v. Each vertex in the graph carries its own features; the vertex features can be represented by a matrix M of dimension X × Y, and the edges, which represent the relations between vertices, form a matrix B of dimension X × X called the adjacency matrix. M and B are the inputs of the graph neural network model.
Before the step S101, a step S100 is included;
the step S100: determining the proportion of a training set and a test set in a sample total set;
in this embodiment, the ratio of data in the training set to data in the test set is 8:2; the training set is used to train the network model, and the test set is input into the trained network to classify the data to be tested.
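The 8:2 division described above can be sketched as follows. This is a minimal illustration only: the function name `split_samples`, the shuffling step and the fixed random seed are assumptions made for the sketch, since the text states the ratio but not how samples are assigned to each subset.

```python
import random

def split_samples(sample_ids, train_ratio=0.8, seed=0):
    """Shuffle sample indices and split them into training and test subsets.

    Shuffling and the seed are assumptions of this sketch; the source only
    fixes the 8:2 ratio.
    """
    ids = list(sample_ids)
    rng = random.Random(seed)      # fixed seed so the split is reproducible
    rng.shuffle(ids)
    cut = int(round(train_ratio * len(ids)))
    return ids[:cut], ids[cut:]

# Ten hypothetical sample indices: eight go to training, two to testing.
train_ids, test_ids = split_samples(range(10))
```

Splitting indices rather than the data itself keeps the sketch independent of how the three-dimensional sample matrix is stored.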
The step S101: extracting the characteristics of ship AIS data, and constructing a sample total set, wherein the sample total set is a three-dimensional matrix; converting the sample collection into graph structure data, and dividing the sample collection into a training set and a testing set, including:
AIS data sharing the same ship IMO code contains at least one trajectory $S_i$; each trajectory $S_i$ has a plurality of track points arranged in timestamp order. From each trajectory $S_i$, N consecutive track points that satisfy a preset rule are acquired; these N consecutive track points form a trajectory $S_i'$. The attributes of each track point of $S_i'$ are extracted, namely IMO, h, v, t-stamp, lat and lon, where IMO is the ship IMO code, h is the ship heading, v is the speed, t-stamp is the timestamp, lat is the latitude of the track point and lon is its longitude, and 1 ≤ i ≤ Num, with Num the total number of trajectories of the same IMO. The extracted AIS features form a three-dimensional matrix $M_{Num \times N \times 6}$. Features are extracted for all samples to construct the sample total set, which is then divided into a training set and a test set.
In this embodiment, the preset rule is as follows: a time interval threshold is preset, and the time interval between every two adjacent track points among the N consecutive track points must be smaller than this threshold.
Because port AIS data is collected at irregular intervals, some adjacent track points are a few seconds apart while others are several minutes or even ten minutes apart. The longer the acquisition interval, the worse the sample quality, so a suitable time interval threshold TT (time threshold), as small as possible, must be chosen to ensure data reliability. The number N of track points in a data segment determines the classification accuracy: the larger N is, the higher the recognition accuracy, so N should be chosen as large as is practical to ensure the effectiveness of the samples. In general, a larger time interval threshold yields more track points per sample segment, so sample reliability and sample effectiveness conflict with each other, and a balance must be found that gives as large an N as possible while keeping TT small. After repeated experimental comparison, this embodiment uses TT = 20 (in seconds) and N = 160, which meets the requirement.
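The selection of N consecutive track points under the threshold TT, and the stacking of the resulting segments into a Num × N × 6 matrix, might look like the sketch below. The function names, the column order (IMO, h, v, t-stamp, lat, lon) and the choice of taking the first qualifying run in each trajectory are assumptions for illustration; the text does not say which run is chosen when several qualify.

```python
import numpy as np

def first_valid_window(track, n_points=160, tt_seconds=20.0, ts_col=3):
    """Return the first run of n_points consecutive rows whose adjacent
    timestamps differ by less than tt_seconds, or None if there is none.

    track: array of shape (num_points, 6), already sorted by timestamp.
    The column order (IMO, h, v, t-stamp, lat, lon) is assumed.
    """
    track = np.asarray(track, dtype=float)
    if len(track) < n_points:
        return None
    if n_points == 1:
        return track[:1]
    gaps_ok = np.diff(track[:, ts_col]) < tt_seconds   # one flag per adjacent pair
    run = 0                                            # valid gaps ending at gap k
    for k, ok in enumerate(gaps_ok):
        run = run + 1 if ok else 0
        if run >= n_points - 1:                        # n_points - 1 gaps span n_points rows
            start = k - (n_points - 1) + 1
            return track[start:start + n_points]
    return None

def build_sample_tensor(tracks, n_points=160, tt_seconds=20.0):
    """Stack one qualifying window per trajectory into shape (Num, N, 6)."""
    windows = [first_valid_window(t, n_points, tt_seconds) for t in tracks]
    windows = [w for w in windows if w is not None]
    if not windows:
        return np.empty((0, n_points, 6))
    return np.stack(windows)
```

Trajectories with no qualifying run are simply skipped here, which is one possible policy rather than one stated in the text.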
The ship feature data is converted into topological graph data, with the features describing the vertices and edges from which the adjacency matrix is built. The effectiveness of ship trajectory classification can be improved only if suitable feature data is selected as input to the neural network. In this embodiment, the ship heading feature is used as the vertex feature, and the speed feature is used as the edge weight to construct the adjacency matrix.
In this embodiment, one trajectory is taken as one sample, and one sample is converted into one vertex of the graph structure.
Converting the sample collection into graph structure data, including:
step S1011: determining the receptive field of each vertex in the graph structure, comprising: computing the average haversine distance between every pair of trajectories,

$$\bar{d}(S_i, S_j) = \frac{1}{N} \sum_{n=1}^{N} d_n(S_i, S_j)$$

where $d_n(S_i, S_j)$ is the haversine distance between the n-th track point of trajectory $S_i$ and the n-th track point of trajectory $S_j$.
The haversine distances are sorted in ascending order and the smallest 5% are retained, which ensures a strong connection between the corresponding vertices. A relation strength threshold is set, for example the distance value lying exactly at the 5th percentile. A strong connection is represented by 1 and a weak connection by 0, giving a relation matrix R based on spatial-distance connection strength that determines the receptive field of each vertex. The relation matrix R has dimension X × X, where X is the number of samples, i.e., the number of vertices of the graph:

$$R(i,j) = \begin{cases} 1, & \bar{d}(S_i, S_j) \le Thr \\ 0, & \bar{d}(S_i, S_j) > Thr \end{cases}$$

where Thr is the relation strength threshold and $\bar{d}(S_i, S_j)$ is the average haversine distance between the samples.
In this embodiment, a small haversine distance indicates that the distance-based connection between two vertices is strong, and a large haversine distance indicates that it is weak.
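A sketch of how the relation matrix R could be computed from trajectory coordinates follows. The function names, the input layout (one row of N latitudes and longitudes per trajectory), the use of `np.quantile` to obtain the 5th-percentile threshold, and the zeroed diagonal are all assumptions of the sketch rather than details given in the text. At least two trajectories are required.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km; inputs in degrees, broadcastable arrays."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))

def relation_matrix(lat, lon, keep_fraction=0.05):
    """Build the 0/1 relation matrix R from per-trajectory coordinates.

    lat, lon: arrays of shape (X, N), one row per trajectory.
    Returns (R, thr), where thr is the distance threshold that was applied.
    """
    lat = np.asarray(lat, dtype=float)
    lon = np.asarray(lon, dtype=float)
    # Point-by-point distances for every trajectory pair, averaged over N.
    d = haversine_km(lat[:, None, :], lon[:, None, :],
                     lat[None, :, :], lon[None, :, :]).mean(axis=2)
    upper = np.triu_indices(len(d), k=1)          # each unordered pair once
    thr = np.quantile(d[upper], keep_fraction)    # e.g. the 5th percentile
    R = (d <= thr).astype(int)
    np.fill_diagonal(R, 0)                        # self-links left out (assumed)
    return R, thr
```

The all-pairs distance array has X × X × N entries, so for a large sample set the pairs would need to be processed in blocks.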
Step S1012: computing the average speed ave_v over all track points of each sample, and taking the two-norm of the difference between the average speeds of any two samples as the weight of the edge between them in the graph structure, giving a weight matrix E of dimension X × X:

$$E(i,j) = \lVert \bar{v}_i - \bar{v}_j \rVert_2$$

where $\bar{v}_i$ is the average speed of trajectory $S_i$ and $\bar{v}_j$ is the average speed of trajectory $S_j$.
Step S1013: constructing the adjacency matrix B from the weight matrix E:
the relation matrix R based on spatial-distance connection strength is multiplied element-wise with the edge weight matrix E to obtain an adjacency matrix B of dimension X × X,

$$B = R \odot E$$

and the adjacency matrix B is normalized:

$$B(i,j) \leftarrow \frac{B(i,j) - \min(B)}{\max(B) - \min(B)}$$

where min(B) is the minimum value in matrix B, max(B) is the maximum value in matrix B, and B(i, j) on the left-hand side is the entry of the normalized adjacency matrix;
step S1014: extracting the heading feature of the track points as the vertex feature, and constructing a vertex feature matrix M of dimension X × 1.
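Steps S1012 and S1013 can be illustrated together with a short NumPy sketch. The function name `build_adjacency` and the handling of the degenerate case where all entries of B are equal are assumptions; for scalar average speeds the two-norm of the difference reduces to an absolute value.

```python
import numpy as np

def build_adjacency(R, mean_speed):
    """Combine the relation matrix R with speed-difference edge weights.

    R: (X, X) 0/1 relation matrix; mean_speed: length-X average speeds.
    Returns the min-max normalized adjacency matrix B.
    """
    v = np.asarray(mean_speed, dtype=float)
    E = np.abs(v[:, None] - v[None, :])      # two-norm of a scalar difference
    B = np.asarray(R, dtype=float) * E       # element-wise product R * E
    span = B.max() - B.min()
    if span == 0:                            # all entries equal: nothing to scale
        return np.zeros_like(B)
    return (B - B.min()) / span

# Three hypothetical samples with average speeds 10, 12 and 16.
R_demo = np.array([[0, 1, 0],
                   [1, 0, 1],
                   [0, 1, 0]])
B_demo = build_adjacency(R_demo, [10.0, 12.0, 16.0])
```

In this example the raw weights are 2 and 4, so after normalization the two edges carry weights 0.5 and 1.0.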
The step S102: training a GNN network model with the training set, inputting the features of ship AIS data of all samples to be tested in the test set into the trained GNN network model to test the validity of the GNN network, and classifying ships to be classified using the GNN network that passes the test, includes the following:
the GNN network model is a GNN neural network model with two graph convolution layers;
the GNN network structure in this embodiment is shown in FIG. 3, where the dots represent vertices of the graph and carry different labels; the input graph structure data passes through the GNN network structure, which outputs classification results with the different labels.
In this embodiment, the input-output relationship of the first graph convolution layer is:

$$h_i = \sigma\Big(\sum_{n_j \in Neigh(n_i)} \hat{L}[i,j]\, W_{1\tau}\, h_j\Big)$$

where $h_j$ is the feature value of vertex j in the input data, $h_i$ is the feature value of vertex i in the output data, σ is the activation function, $n_j \in Neigh(n_i)$ denotes the receptive field of vertex i, $W_{1\tau}$ is the convolution kernel of the first graph convolution layer, and $\hat{L}$ is the normalized Laplacian matrix:

$$\hat{L}[i,j] = \tilde{A}^{-1/2}\, \tilde{B}_{ij}\, \tilde{A}^{1/2}$$

$$\tilde{B} = B + I, \qquad \tilde{A}_{ii} = \sum_j \tilde{B}_{ij}$$

where B is the normalized adjacency matrix, I is the identity matrix, and $\tilde{A}$ is the degree matrix of $\tilde{B}$.
The input-output relationship of the second graph convolution layer is:

$$h_i = \sigma\Big(\sum_{n_j \in Neigh(n_i)} \hat{L}[i,j]\, W_{2\tau}\, h_j\Big)$$

where $h_j$ is the feature value of vertex j in the input data, $h_i$ is the feature value of vertex i in the output data, σ is the activation function, $n_j \in Neigh(n_i)$ denotes the receptive field of vertex i, $W_{2\tau}$ is the convolution kernel of the second graph convolution layer, and $\hat{L}$, $\tilde{B}$ and $\tilde{A}$ are defined as for the first layer.
Here $\tilde{B}$ is the sum of the normalized adjacency matrix and the identity matrix; $\tilde{A}$ is the diagonal matrix formed from the element sums of $\tilde{B}$ for each vertex; $\tilde{A}_{ii}$ is the element in row i and column i of this diagonal matrix; $\tilde{A}^{-1/2}$ is the diagonal matrix raised to the power −1/2, and $\tilde{A}^{1/2}$ is the diagonal matrix raised to the power 1/2.
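A forward pass through two graph convolution layers can be sketched in NumPy as below. Several choices here are assumptions and not taken from the text: the sketch uses the widely used symmetric normalization, with the exponent −1/2 on both sides of the self-looped adjacency matrix; ReLU is used as the hidden activation and softmax as the output activation; and the hidden width and number of classes in the example are arbitrary. The adjacency matrix is assumed to have non-negative entries.

```python
import numpy as np

def normalize_adjacency(B):
    """Symmetric normalization D^{-1/2} (B + I) D^{-1/2} (assumed form)."""
    B_tilde = B + np.eye(len(B))            # add self-loops
    deg = B_tilde.sum(axis=1)               # diagonal of the degree matrix
    d_inv_sqrt = 1.0 / np.sqrt(deg)         # deg >= 1 for non-negative B
    return d_inv_sqrt[:, None] * B_tilde * d_inv_sqrt[None, :]

def relu(x):
    return np.maximum(x, 0.0)

def softmax(x):
    z = x - x.max(axis=1, keepdims=True)    # shift for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def gcn_forward(B, M, W1, W2):
    """Two graph convolution layers: class probabilities per vertex.

    B: (X, X) normalized adjacency; M: (X, F) vertex features;
    W1: (F, H) and W2: (H, C) are the two convolution kernels.
    """
    L_hat = normalize_adjacency(B)
    H1 = relu(L_hat @ M @ W1)               # first graph convolution layer
    return softmax(L_hat @ H1 @ W2)         # second layer plus class scores

# Hypothetical example: 3 vertices, 1 feature (heading), 4 hidden units, 2 classes.
rng = np.random.default_rng(0)
B_example = np.array([[0.0, 0.5, 0.0], [0.5, 0.0, 1.0], [0.0, 1.0, 0.0]])
M_example = np.array([[90.0], [95.0], [180.0]])
probs = gcn_forward(B_example, M_example,
                    rng.normal(size=(1, 4)), rng.normal(size=(4, 2)))
```

Each row of `probs` is a probability distribution over classes for one vertex, i.e., for one trajectory sample.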
As shown in fig. 4, the GNN network model training process includes:
step S301: acquiring a training set in a total sample set, selecting AIS sample data of all types of ships in the training set, and extracting the characteristics of the AIS data of each sample data;
in this embodiment, 80% of the total sample set is used as the training set, and 20% is used as the test set.
Wherein the extracting the characteristics of the AIS data of each sample data includes:
AIS data sharing the same ship IMO code contains at least one trajectory $S_i$; each trajectory $S_i$ has a plurality of track points arranged in timestamp order. From each trajectory $S_i$, N consecutive track points that satisfy a preset rule are acquired; these N consecutive track points form a trajectory $S_i'$. The attributes of each track point of $S_i'$ are extracted, namely IMO, h, v, t-stamp, lat and lon, where IMO is the ship IMO code, h is the ship heading, v is the speed, t-stamp is the timestamp, lat is the latitude of the track point and lon is its longitude, and 1 ≤ i ≤ Num, with Num the total number of trajectories of the same IMO. The extracted AIS features form a three-dimensional matrix $M_{Num \times N \times 6}$.
Step S302: inputting the characteristics of the AIS data of each sample data in the training set into the GNN network until the preset conditions for stopping training are met, and obtaining a trained GNN network;
step S303: inputting the features of the ship AIS data of all samples to be tested in the test set into the trained GNN network model for vertex classification, so as to test the validity of the GNN network, and classifying the ships to be classified using the GNN network that passes the test.
In this embodiment, training the neural network consists of learning the convolution kernels.
Further, before step S100, ship AIS data preprocessing is performed, including:
step S1: constructing a ship characteristic information table, comprising: constructing a ship feature database, extracting 6 fields of IMO, timestamp, bow direction, navigational speed, track point latitude and track point longitude from AIS data as values, taking the IMO of the ship as a primary key value, namely saving the track feature of each ship according to the IMO number, and arranging the track point data of each IMO according to the timestamp sequence;
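The ship feature information table of step S1 can be pictured as a mapping from IMO number to a timestamp-ordered list of track points. The sketch below uses an in-memory dictionary; the function name, the record field names and the choice of a dictionary rather than a database table are assumptions for illustration.

```python
from collections import defaultdict

def build_feature_table(records):
    """Group AIS records by IMO and sort each group by timestamp.

    records: iterable of dicts with the (assumed) keys
    imo, timestamp, heading, speed, lat, lon.
    Returns {imo: [(timestamp, heading, speed, lat, lon), ...]}.
    """
    table = defaultdict(list)
    for r in records:
        table[r["imo"]].append(
            (r["timestamp"], r["heading"], r["speed"], r["lat"], r["lon"])
        )
    for points in table.values():
        points.sort(key=lambda p: p[0])     # order track points by timestamp
    return dict(table)

# Hypothetical records; the IMO numbers and values are made up.
records = [
    {"imo": 9000001, "timestamp": 20, "heading": 91.0, "speed": 10.2, "lat": 30.01, "lon": 122.01},
    {"imo": 9000001, "timestamp": 10, "heading": 90.0, "speed": 10.0, "lat": 30.00, "lon": 122.00},
    {"imo": 9000002, "timestamp": 5, "heading": 180.0, "speed": 6.5, "lat": 29.90, "lon": 122.10},
]
feature_table = build_feature_table(records)
```

Keying the table by IMO mirrors the use of the IMO number as the primary key in the text.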
step S2: cleaning the ship AIS data, comprising:
identifying, through data analysis, dirty data that meets the data cleaning conditions and discarding it, wherein the dirty data comprises abnormal position data and redundant position data; abnormal position data means that, for two adjacent track points whose time interval is smaller than a first preset time interval, the distance between them is greater than a first preset distance threshold; redundant position data means that the feature attributes of two adjacent track points are completely identical.
The data analyzed and processed in this embodiment is ship trajectory data from a fixed sea area, and dirty data must be discarded to achieve a good classification result.
The judgment strategy is as follows:
1. For data with the same key, calculate the time interval and the distance between the (i + 1)-th track point and the i-th track point. To account for the curvature of the earth, the distance is calculated with the Haversine formula:
$$l = 2R \arcsin\left(\sqrt{\operatorname{hav}(x_{lat2} - x_{lat1}) + \cos(x_{lat1})\cos(x_{lat2})\operatorname{hav}(y_{lon2} - y_{lon1})}\right), \qquad \operatorname{hav}(\varphi) = \sin^{2}\left(\frac{\varphi}{2}\right)$$
The distance calculated with the Haversine formula is referred to as the Haversine distance for short.
Here l represents the distance between the two track points, and R represents the earth radius, generally taken as 6371 km; $x_{lat1}$ denotes the latitude of point $x_1$, $y_{lon1}$ denotes the longitude of point $y_1$, $x_{lat2}$ denotes the latitude of point $x_2$, $y_{lon2}$ denotes the longitude of point $y_2$, and φ is the input to the Haversine function.
If the Haversine distance between two track points within a short time interval is too large, the (i + 1)-th to the n-th track points of that IMO ship are abnormal data and need to be discarded, where n represents the number of all track points of the IMO ship in the same time window.
2. If the (i + 1) th track point is completely the same as the ith track point, the (i + 1) th track point is redundant data and needs to be discarded.
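The judgment strategy above can be sketched in Python as follows. The dictionary keys (`t`, `lat`, `lon`, `h`, `v`), the function names, and the default thresholds `max_gap_s` and `max_jump_km` are hypothetical; the patent only refers to a "first preset time interval" and a "first preset distance threshold" without giving values. The input is assumed to be the track points of one IMO, already sorted by timestamp.

```python
import math

EARTH_RADIUS_KM = 6371.0  # earth radius used in the description

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle (Haversine) distance in km between two points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))

def clean_track(points, max_gap_s=60.0, max_jump_km=5.0):
    """Drop redundant points and truncate the track at the first abnormal jump.

    points: list of dicts with keys t (seconds), lat, lon, h, v, sorted by t.
    Threshold defaults are illustrative assumptions, not values from the patent.
    """
    cleaned = []
    for p in points:
        if not cleaned:
            cleaned.append(p)
            continue
        prev = cleaned[-1]
        if p == prev:
            continue  # rule 2: identical to the previous point, redundant
        dt = p["t"] - prev["t"]
        dist = haversine_km(prev["lat"], prev["lon"], p["lat"], p["lon"])
        if dt < max_gap_s and dist > max_jump_km:
            break     # rule 1: discard points i+1 .. n of this track
        cleaned.append(p)
    return cleaned
```

Stopping at the first abnormal jump mirrors the statement that the (i + 1)-th to n-th track points are discarded; an implementation that only drops the single offending point would behave differently.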
An embodiment of the present invention further provides a GNN-based ship classification and identification apparatus. As shown in FIG. 5, the apparatus includes:
a feature acquisition module: configured to extract features of ship AIS data and construct a sample total set, wherein the sample total set is a three-dimensional matrix; and to convert the sample total set into graph structure data and divide the sample total set into a training set and a test set;
a classification module: configured to train a GNN network model with the training set, input the features of the ship AIS data of all samples to be tested in the test set into the trained GNN network model to test the validity of the GNN network, and classify the ships to be classified by using the GNN network that has passed the test;
each track is taken as a sample, and the sample total set is a three-dimensional matrix; the first dimension of the three-dimensional matrix is the number of tracks of the AIS data, $S = \{S_1, \dots, S_i, \dots, S_{num}\}$; the second dimension is the number N of track points on track $S_i$ of the AIS data; the third dimension is the attributes of each track point, including IMO, h, v, t-stamp, lat and lon, wherein IMO is the ship's IMO code, h is the ship's bow-direction feature, v is the speed, t-stamp is the timestamp, lat is the latitude of the track point, and lon is the longitude of the track point;
the sample total set is converted into graph structure data G (V, Edge), wherein V is a vertex and Edge is an edge connecting the vertices; the graph structure data takes the bow-direction feature h of the track points as the vertex feature to construct a vertex feature matrix M; the weights of the edges connecting the vertices are calculated according to the speed feature v to construct an adjacency matrix B;
the GNN network model is a GNN neural network model with two graph convolution layers.
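The conversion from the sample total set to graph structure data can be sketched in NumPy as below, following steps S1011 to S1014 of the method (thresholded relation matrix R, two-norm speed weights E, element-wise product B = R·E, min-max normalization, and an X × 1 vertex feature matrix M). The function name and argument layout are hypothetical. Two points are assumptions rather than statements of the patent: a distance exactly equal to the threshold is treated as a weak connection, and each track's bow-direction feature is reduced to a single value by a plain arithmetic mean, since the text does not say how N headings become one vertex feature.

```python
import numpy as np

def build_graph(avg_dist, ave_v, heading, thr):
    """Build R, E, normalized B and M for X track samples.

    avg_dist: (X, X) average Haversine distance between every pair of tracks
    ave_v:    (X,)   average speed of each track
    heading:  (X, N) bow-direction values of the N track points of each track
    thr:      relation strength threshold (Thr)
    """
    avg_dist = np.asarray(avg_dist, dtype=float)
    ave_v = np.asarray(ave_v, dtype=float)
    heading = np.asarray(heading, dtype=float)

    # Relation matrix: 1 = strong connection, 0 = weak connection.
    R = (avg_dist < thr).astype(float)

    # Edge weights: two-norm of the average speed difference (scalar case).
    E = np.abs(ave_v[:, None] - ave_v[None, :])

    # Adjacency matrix: element-wise product, then min-max normalization.
    B = R * E
    span = B.max() - B.min()
    B_norm = (B - B.min()) / span if span > 0 else np.zeros_like(B)

    # Vertex feature matrix of shape (X, 1); the mean is an assumption.
    M = heading.mean(axis=1, keepdims=True)
    return R, E, B_norm, M
```

Because headings are angles, an arithmetic mean is only a rough simplification; a circular mean may be more appropriate for tracks that cross the 0°/360° boundary.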
The embodiment of the invention further provides a GNN-based ship classification and identification system, which comprises:
a processor for executing a plurality of instructions;
a memory for storing a plurality of instructions;
wherein the instructions are stored by the memory and are loaded and executed by the processor to perform the GNN based vessel classification identification method as described above.
The embodiment of the invention further provides a computer readable storage medium, wherein a plurality of instructions are stored in the storage medium; the plurality of instructions are loaded and executed by a processor to perform the GNN based vessel classification identification method as described above.
It should be noted that the embodiments and features of the embodiments may be combined with each other without conflict.
In the embodiments provided in the present invention, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative: the division of the units is only a logical division, and other divisions are possible in actual implementation; a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional unit.
The integrated unit implemented in the form of a software functional unit may be stored in a computer readable storage medium. The software functional unit is stored in a storage medium and includes several instructions to enable a computer device (which may be a personal computer, a physical server, or a network cloud server, etc., on which a Windows or Windows Server operating system needs to be installed) to perform some steps of the method according to various embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
The above description is only a preferred embodiment of the present invention, and is not intended to limit the present invention in any way, and any simple modification, equivalent change and modification made to the above embodiment according to the technical spirit of the present invention are still within the scope of the technical solution of the present invention.
Claims (8)
1. A GNN-based ship classification identification method is characterized by comprising the following steps:
step S101: extracting the features of ship AIS data, and constructing a sample total set, wherein the sample total set is a three-dimensional matrix; converting the sample total set into graph structure data, and dividing the sample total set into a training set and a test set;
step S102: training a GNN network model with the training set, inputting the features of the ship AIS data of all samples to be tested in the test set into the trained GNN network model to test the validity of the GNN network, and classifying the ships to be classified by using the GNN network that has passed the test;
each track is taken as a sample, and the sample total set is a three-dimensional matrix; the first dimension of the three-dimensional matrix is the number of tracks of the AIS data, $S = \{S_1, \dots, S_i, \dots, S_{num}\}$; the second dimension is the number N of track points on track $S_i$ of the AIS data; the third dimension is the attributes of each track point, including IMO, h, v, t-stamp, lat and lon, wherein IMO is the ship's IMO code, h is the ship's bow-direction feature, v is the speed, t-stamp is the timestamp, lat is the latitude of the track point, and lon is the longitude of the track point;
the sample total set is converted into graph structure data G (V, Edge), wherein V is a vertex and Edge is an edge connecting the vertices; the graph structure data takes the bow-direction feature h of the track points as the vertex feature to construct a vertex feature matrix M; the weights of the edges connecting the vertices are calculated according to the speed feature v to construct an adjacency matrix B;
the GNN network model is a GNN neural network model with two graph convolution layers.
2. The GNN-based vessel classification identifying method according to claim 1, wherein the converting the sample collection into graph structure data comprises:
step S1011: determining the receptive field of the vertices in the graph structure, comprising:
calculating the average Haversine distance between samples:
$$\bar{l}_{ij} = \frac{1}{N}\sum_{n=1}^{N} l_{ij}^{n}$$
where $l_{ij}^{n}$ is the Haversine distance between the n-th track point of track $S_i$ and the n-th track point of track $S_j$;
setting a relationship strength threshold and sorting the Haversine distances from small to large, wherein a Haversine distance smaller than the relationship strength threshold indicates that the distance feature between the vertices is a strong connection relation, and a Haversine distance larger than the relationship strength threshold indicates that the distance feature between the vertices is a weak connection relation; representing the strong connection relation by 1 and the weak connection relation by 0, and constructing a relation matrix R based on the spatial-distance connection strength feature to determine the receptive field of a vertex; the dimension of the relation matrix R is X × X, X representing the number of samples, i.e. the number of vertices of the graph:
$$R(i,j) = \begin{cases} 1, & \bar{l}_{ij} < Thr \\ 0, & \bar{l}_{ij} > Thr \end{cases}$$
where Thr represents the relationship strength threshold, and $\bar{l}_{ij}$ is the average Haversine distance between the samples to be measured;
step S1012: calculating, from the average speed ave_v of all track points in each sample, the two-norm of the difference between the average speeds of any two samples, and taking the two-norm as the weight of the edge in the graph structure to obtain a weight matrix E, wherein the dimension of the weight matrix E is X × X:
$$E(i,j) = \left\| ave\_v_i - ave\_v_j \right\|_2$$
where $ave\_v_i$ is the average speed of track $S_i$, and $ave\_v_j$ is the average speed of track $S_j$;
step S1013: constructing an adjacency matrix B based on the weight matrix E:
point-multiplying (element by element) the relation matrix R, which is based on the spatial-distance connection strength feature, by the edge weight matrix E to obtain an adjacency matrix B with dimension X × X,
B=R·E
and normalizing the adjacency matrix B:
$$B(i,j) = \frac{B(i,j) - \min(B)}{\max(B) - \min(B)}$$
where min (B) is the minimum value in matrix B, max (B) is the maximum value in matrix B, and B (i, j) on the left-hand side is an element of the normalized adjacency matrix;
step S1014: extracting the bow-direction feature of the track points as the vertex feature, and constructing a vertex feature matrix M with dimension X × 1.
3. The GNN-based vessel classification and identification method according to claim 2, wherein before the step S101, the vessel AIS data is subjected to data cleaning, which includes:
discarding dirty data that meets the data cleaning conditions, wherein the dirty data comprises abnormal position data and redundant position data; abnormal position data means that the distance between two adjacent track points is greater than a first preset distance threshold while their time interval is smaller than a first preset time interval; redundant position data means that the feature attributes of two adjacent track points are completely the same.
4. A GNN-based vessel classification and identification apparatus, the apparatus comprising:
a feature acquisition module: configured to extract features of ship AIS data and construct a sample total set, wherein the sample total set is a three-dimensional matrix; and to convert the sample total set into graph structure data and divide the sample total set into a training set and a test set;
a classification module: configured to train a GNN network model with the training set, input the features of the ship AIS data of all samples to be tested in the test set into the trained GNN network model to test the validity of the GNN network, and classify the ships to be classified by using the GNN network that has passed the test;
each track is taken as a sample, and the sample total set is a three-dimensional matrix; the first dimension of the three-dimensional matrix is the number of tracks of the AIS data, $S = \{S_1, \dots, S_i, \dots, S_{num}\}$; the second dimension is the number N of track points on track $S_i$ of the AIS data; the third dimension is the attributes of each track point, including IMO, h, v, t-stamp, lat and lon, wherein IMO is the ship's IMO code, h is the ship's bow-direction feature, v is the speed, t-stamp is the timestamp, lat is the latitude of the track point, and lon is the longitude of the track point;
the sample total set is converted into graph structure data G (V, Edge), wherein V is a vertex and Edge is an edge connecting the vertices; the graph structure data takes the bow-direction feature h of the track points as the vertex feature to construct a vertex feature matrix M; the weights of the edges connecting the vertices are calculated according to the speed feature v to construct an adjacency matrix B;
the GNN network model is a GNN neural network model with two graph convolution layers.
5. The GNN-based vessel classification and identification apparatus according to claim 4, wherein the feature obtaining module comprises:
a receptive field determination submodule: configured to determine the receptive field of the vertices in the graph structure, comprising:
calculating the average Haversine distance between samples:
$$\bar{l}_{ij} = \frac{1}{N}\sum_{n=1}^{N} l_{ij}^{n}$$
where $l_{ij}^{n}$ is the Haversine distance between the n-th track point of track $S_i$ and the n-th track point of track $S_j$;
setting a relationship strength threshold and sorting the Haversine distances from small to large, wherein a Haversine distance smaller than the relationship strength threshold indicates that the distance feature between the vertices is a strong connection relation, and a Haversine distance larger than the relationship strength threshold indicates that the distance feature between the vertices is a weak connection relation; representing the strong connection relation by 1 and the weak connection relation by 0, and constructing a relation matrix R based on the spatial-distance connection strength feature to determine the receptive field of a vertex; the dimension of the relation matrix R is X × X, X representing the number of samples, i.e. the number of vertices of the graph:
$$R(i,j) = \begin{cases} 1, & \bar{l}_{ij} < Thr \\ 0, & \bar{l}_{ij} > Thr \end{cases}$$
where Thr represents the relationship strength threshold, and $\bar{l}_{ij}$ is the average Haversine distance between the samples to be measured;
a weight matrix acquisition submodule: configured to calculate, from the average speed ave_v of all track points in each sample, the two-norm of the difference between the average speeds of any two samples, and to take the two-norm as the weight of the edge in the graph structure to obtain a weight matrix E, wherein the dimension of the weight matrix E is X × X:
$$E(i,j) = \left\| ave\_v_i - ave\_v_j \right\|_2$$
where $ave\_v_i$ is the average speed of track $S_i$, and $ave\_v_j$ is the average speed of track $S_j$;
an adjacency matrix acquisition submodule: configured to construct an adjacency matrix B based on the weight matrix E:
point-multiplying (element by element) the relation matrix R, which is based on the spatial-distance connection strength feature, by the edge weight matrix E to obtain an adjacency matrix B with dimension X × X,
B=R·E
and normalizing the adjacency matrix B:
$$B(i,j) = \frac{B(i,j) - \min(B)}{\max(B) - \min(B)}$$
where min (B) is the minimum value in matrix B, max (B) is the maximum value in matrix B, and B (i, j) on the left-hand side is an element of the normalized adjacency matrix;
a track point extraction submodule: configured to extract the bow-direction feature of the track points as the vertex feature, and to construct a vertex feature matrix M with dimension X × 1.
6. A GNN-based vessel classification identifying apparatus according to claim 6, wherein said apparatus comprises:
the data cleaning module is configured to discard dirty data meeting data cleaning conditions, wherein the dirty data comprises abnormal position data and redundant position data, the abnormal position data refers to the condition that the distance difference between two adjacent track point data is larger than a first preset distance threshold when the time interval is smaller than a first preset time interval, and the redundant position data refers to the condition that the characteristic attributes of the two adjacent track point data are completely the same.
7. A GNN-based vessel classification recognition system comprising:
a processor for executing a plurality of instructions;
a memory to store a plurality of instructions;
wherein the plurality of instructions are stored by the memory and are loaded and executed by the processor to perform the GNN based vessel classification identifying method according to any of claims 1-3.
8. A computer-readable storage medium having stored therein a plurality of instructions; the plurality of instructions being loaded and executed by a processor to perform the GNN based vessel classification identification method according to any of claims 1-3.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110766734.0A CN113569921A (en) | 2021-07-07 | 2021-07-07 | Ship classification and identification method and device based on GNN |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113569921A (en) | 2021-10-29 |
Family
ID=78163928
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110766734.0A Pending CN113569921A (en) | 2021-07-07 | 2021-07-07 | Ship classification and identification method and device based on GNN |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113569921A (en) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114155491A (en) * | 2021-12-09 | 2022-03-08 | 杭州电子科技大学 | Ship behavior identification method and system based on AIS data |
CN114155491B (en) * | 2021-12-09 | 2024-04-23 | 杭州电子科技大学 | Ship behavior recognition method and system based on AIS data |
CN115730263A (en) * | 2022-11-28 | 2023-03-03 | 中国人民解放军91977部队 | Ship behavior pattern detection method and device |
CN115730263B (en) * | 2022-11-28 | 2023-08-22 | 中国人民解放军91977部队 | Ship behavior pattern detection method and device |
CN116776112A (en) * | 2023-08-25 | 2023-09-19 | 太极计算机股份有限公司 | Method and device for identifying double towing behaviors of fishing boat |
CN116776112B (en) * | 2023-08-25 | 2024-02-13 | 太极计算机股份有限公司 | Method and device for identifying double towing behaviors of fishing boat |
CN117935414A (en) * | 2024-01-23 | 2024-04-26 | 广州宇贤科技有限公司 | Traffic strategy analysis system based on image content big data identification |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN113569921A (en) | Ship classification and identification method and device based on GNN | |
CN111091105A (en) | Remote sensing image target detection method based on new frame regression loss function | |
CN110532859A (en) | Remote Sensing Target detection method based on depth evolution beta pruning convolution net | |
CN102436589B (en) | Complex object automatic recognition method based on multi-category primitive self-learning | |
CN112101278A (en) | Hotel point cloud classification method based on k nearest neighbor feature extraction and deep learning | |
CN105574550A (en) | Vehicle identification method and device | |
CN111950488B (en) | Improved Faster-RCNN remote sensing image target detection method | |
CN110018453A (en) | Intelligent type recognition methods based on aircraft track feature | |
CN101710422B (en) | Image segmentation method based on overall manifold prototype clustering algorithm and watershed algorithm | |
CN110610165A (en) | Ship behavior analysis method based on YOLO model | |
CN114332473B (en) | Object detection method, device, computer apparatus, storage medium, and program product | |
CN112489089B (en) | Airborne ground moving target identification and tracking method for micro fixed wing unmanned aerial vehicle | |
CN110008899B (en) | Method for extracting and classifying candidate targets of visible light remote sensing image | |
CN116469020A (en) | Unmanned aerial vehicle image target detection method based on multiscale and Gaussian Wasserstein distance | |
CN107292039B (en) | UUV bank patrolling profile construction method based on wavelet clustering | |
CN113487600A (en) | Characteristic enhancement scale self-adaptive sensing ship detection method | |
CN118097755A (en) | Intelligent face identity recognition method based on YOLO network | |
CN116958606A (en) | Image matching method and related device | |
CN106548195A (en) | A kind of object detection method based on modified model HOG ULBP feature operators | |
CN106951924B (en) | Seismic coherence body image fault automatic identification method and system based on AdaBoost algorithm | |
CN113297982A (en) | Target detection method for improving combination of KCF and DSST in aerial photography | |
CN111832463A (en) | Deep learning-based traffic sign detection method | |
CN114758119B (en) | Sea surface recovery target detection method based on eagle eye imitating vision and similar physical properties | |
CN115205693B (en) | Method for extracting enteromorpha in multi-feature integrated learning dual-polarization SAR image | |
CN115410102A (en) | SAR image airplane target detection method based on combined attention mechanism |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||