CN115100247A - Method for predicting ship track step by step based on AIS dynamic information - Google Patents

Method for predicting ship track step by step based on AIS dynamic information

Info

Publication number
CN115100247A
Authority
CN
China
Prior art keywords
track
ship
points
longitude
latitude
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210863446.1A
Other languages
Chinese (zh)
Inventor
马宝山
刘昊博
熊桐
高宗江
邓宇航
张少阳
丁嘉怡
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dalian Maritime University
Original Assignee
Dalian Maritime University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dalian Maritime University
Priority to CN202210863446.1A
Publication of CN115100247A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformations in the plane of the image
    • G06T3/40 Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4007 Scaling of whole images or parts thereof, e.g. expanding or contracting based on interpolation, e.g. bilinear interpolation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30241 Trajectory
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Business, Economics & Management (AREA)
  • Medical Informatics (AREA)
  • Economics (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Strategic Management (AREA)
  • Computing Systems (AREA)
  • Human Resources & Organizations (AREA)
  • Game Theory and Decision Science (AREA)
  • Development Economics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Navigation (AREA)

Abstract

The invention discloses a method for predicting a ship track step by step based on AIS dynamic information. The method comprises: obtaining all track point information generated while the ship is sailing and storing it in an AIS data set; preprocessing the AIS data set and aligning it in time; calculating the navigation distance of each track point; constructing training samples from the longitude, latitude, speed, course, ship heading, ship length and navigation distance of all track points; constructing an Xgboost model and training it on the training samples; predicting the position of the ship with the trained Xgboost model; training k groups of Xgboost models for k prediction time intervals to obtain k groups of ship positions; using the k groups of ship positions as skeleton points of the ship track; and obtaining the predicted ship track from the skeleton points by interpolation. Because the trained Xgboost models predict the subsequent track points of the ship from its last track point alone and assemble them into a complete track, the constraint that the input of an RNN-based model must be a time series is removed, and the accuracy of ship track prediction is improved.

Description

Method for predicting ship track step by step based on AIS dynamic information
Technical Field
The invention relates to the field of ship track prediction, in particular to a method for predicting a ship track step by step based on AIS dynamic information.
Background
Ocean shipping offers large cargo capacity, low transportation cost and global reach, so it has always been the main mode of transport for China's imports and exports; it accounts for about two thirds of the total transport volume in international trade, and its importance is self-evident. A large number of ships navigate busy waters such as the approaches to China's ports, and a ship by its nature cannot stop or turn abruptly. Ships are therefore highly prone to collision when navigating waters with high traffic density, which endangers the lives and property of those involved. Predicting a ship's motion track with machine learning methods supports risk assessment: the planned course and speed of a ship can be adjusted in time according to the likely tracks of nearby ships, so that collisions are avoided as far as possible.
AIS (Automatic Identification System) is an automatic tracking system carried on ships. With the mandatory installation of AIS equipment, information such as the speed, heading, destination and identification code of each ship can be obtained easily, and the track of a ship can be predicted by analyzing AIS data. However, because of the harsh communication environment at sea, raw AIS data may contain obviously incorrect fields and abnormal points, and entire data packets may even be lost, which pollutes the database. With data quality this uneven, the RNN-based methods that are popular for land vehicle trajectory prediction are not well suited to predicting ship tracks.
To solve this problem and obtain AIS data that are evenly spaced in time and corrected, the raw AIS data must be interpolated; an effective approach at present is to smooth the trajectory by cubic spline interpolation. However, if the AIS data of a ship track are still treated as a time series, the missing data must be filled in. Processing the data by interpolation or recovery inevitably introduces additional accumulated error into the track prediction system and reduces the prediction accuracy.
Existing research on ship track prediction falls roughly into two categories: modeling based on kinematic equations, and machine learning. To ensure prediction accuracy, the modeling approach must establish kinematic equations that suit complex environmental factors. This requires the operator to have sufficient prior knowledge of the area in which the ship travels, which is often difficult to satisfy in practice. The machine learning approach, by contrast, does not require kinematic equations to be established for each selected ship. Common practice is to build prediction models based on neural networks or on clustering. However, these methods either require strictly aligned AIS data as input or require large data sets in order to cluster similar tracks.
Disclosure of Invention
The invention provides a method for predicting a ship track step by step based on AIS dynamic information, which aims to overcome the above technical problems.
A method for predicting a ship track step by step based on AIS dynamic information comprises the following steps:
S1, acquiring, from the AIS, the track point information generated while the ship is sailing, wherein the track point information comprises IMO, MMSI, speed, course over ground, ship heading, longitude, latitude, time and ship length, and storing all the track point information generated during the voyage in an AIS data set;
S2, preprocessing the AIS data set and sorting the preprocessed AIS data set in time order, wherein the preprocessing comprises deleting abnormal records, and detecting and deleting abnormal track point information by means of an anomaly detection algorithm, and wherein deleting abnormal records comprises deleting data whose MMSI number has fewer than 9 digits or is all zeros, whose course is less than 0 degrees or greater than 360 degrees, or whose speed is less than 0 miles per hour or greater than 25 miles per hour;
S3, performing time sequence alignment on the sorted AIS data set, including decomposing the AIS data set based on time differences, interpolating the decomposed AIS data set, and removing redundant data from the AIS data set;
S4, gridding the longitude and latitude of all track point information in the AIS data set, acquiring the coupling point of each track point, representing each track point and its coupling point as a track point pair, and calculating the navigation distance of the track point from the track point pair;
S5, obtaining the navigation distance of all track points, constructing training samples from the longitude, latitude, speed, course, ship heading, ship length and navigation distance of all track points, and dividing the training samples into a training set and a test set,
constructing an Xgboost model, training by means of the training set a first Xgboost model whose output is the longitude variation and a second Xgboost model whose output is the latitude variation, and optimizing the two trained models by means of the test set, wherein the two models are each optimized by 5-fold cross validation: the test set is fed into the models, the evaluation indexes of the model predictions are calculated, and the model parameters are adjusted according to the evaluation indexes to obtain the two models with the best evaluation indexes, the evaluation indexes including but not limited to AUC, ACC, MSE and F1-score,
inputting a single track point into each of the two models to obtain the output longitude variation and latitude variation, and obtaining the longitude and latitude of the ship after one prediction time interval from the longitude and latitude of the track point and the output longitude variation and latitude variation;
S6, setting k different prediction time intervals, training a first Xgboost model and a second Xgboost model for each prediction time interval, inputting a single track point into each pair of models to obtain k groups of longitude variation and latitude variation, obtaining k groups of ship longitudes and latitudes from the longitude and latitude of the track point and the k groups of longitude variation and latitude variation, and taking the k groups of ship longitudes and latitudes as skeleton points of the ship track;
S7, calculating the slope and the slope deviation of the straight lines formed between adjacent groups of ship longitudes and latitudes; when the slope deviation is smaller than a fourth threshold, interpolating by polynomial interpolation, and otherwise interpolating by piecewise interpolation; and taking the skeleton points together with the interpolated longitudes and latitudes as the predicted ship track.
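Steps S5 and S6 can be illustrated with a minimal sketch. It assumes that each trained model is exposed as a plain callable mapping a feature list to a predicted variation; in practice such a callable might wrap a trained regressor. The function name `predict_skeleton` and the dictionary keys are hypothetical and not taken from the patent.

```python
def predict_skeleton(point, models):
    """Predict k skeleton points from a single track point.

    point  : dict with keys lng, lat, speed, course, heading, length, t
             (hypothetical field names; t in seconds).
    models : list of (interval_s, lng_model, lat_model); each model is a
             callable taking the feature list and returning a variation
             in degrees (stand-ins for the trained Xgboost models).
    """
    features = [point["lng"], point["lat"], point["speed"],
                point["course"], point["heading"], point["length"]]
    skeleton = []
    # One pair of models per prediction interval, visited in time order.
    for interval_s, lng_model, lat_model in sorted(models, key=lambda m: m[0]):
        skeleton.append({
            "t": point["t"] + interval_s,
            "lng": point["lng"] + lng_model(features),
            "lat": point["lat"] + lat_model(features),
        })
    return skeleton


# Usage with constant stub models for 5-minute and 10-minute intervals.
last_point = {"lng": 121.0, "lat": 38.0, "speed": 10.0,
              "course": 45.0, "heading": 44.0, "length": 180.0, "t": 0}
stub_models = [
    (600, lambda f: 0.02, lambda f: 0.01),
    (300, lambda f: 0.01, lambda f: 0.005),
]
skeleton_points = predict_skeleton(last_point, stub_models)
```

Each skeleton point depends only on the last observed track point, which is why no aligned input sequence is needed at prediction time.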
Preferably, S4 comprises gridding the longitude and latitude of all track point information in the AIS data set by formulas (1) and (2),
[Formula (1): equation image not reproduced in this text]
[Formula (2): equation image not reproduced in this text]
wherein $\delta_{lng}$ and $\delta_{lat}$ denote the grid sizes for longitude and latitude, set to 0.004° and 0.005° respectively; $\min(lng)$ is the minimum longitude in the AIS data set; $\min(lat)$ is the minimum latitude in the AIS data set; $lng_i$ is the longitude of the i-th track point; and $lat_i$ is the latitude of the i-th track point;
obtaining the coupling point $x_j$ of the track point $x_i$ according to formula (3), where $P_{ij}$ denotes a track point pair,
[Formula (3): equation image not reproduced in this text]
wherein $x_i = \{lat_i, lng_i, v_i, d_i, h_i, l_i, t_i\}^T$, whose components are respectively the latitude, longitude, speed, course, ship heading, ship length and time of the i-th track point; $x_j = \{lat_j, lng_j, v_j, d_j, h_j, l_j, t_j\}^T$, whose components are respectively the latitude, longitude, speed, course, ship heading, ship length and time of the j-th track point; $t_i$ is the time of the i-th track point; $t_j$ is the time of the j-th track point; $\varepsilon_1$ is the prediction time interval; $\varepsilon_2$ is the tolerated deviation; and $T$ is the AIS data set;
calculating the navigation distance $y_i = \{dlat_i, dlng_i\}^T$ of the i-th track point according to $P_{ij}$:
$dlng_i = lng_j - lng_i$ (4)
$dlat_i = lat_j - lat_i$ (5)
wherein $lng_i$ and $lat_i$ are the longitude and latitude of the i-th track point, $lng_j$ and $lat_j$ are the longitude and latitude of the j-th track point, and the j-th track point and the i-th track point form a track point pair.
Preferably, S7 comprises calculating the slope $k_i$ of the straight line between two adjacent groups of ship longitudes and latitudes according to formula (6), and calculating the slope deviation $\Delta k$ according to formula (7),
[Formula (6): equation image not reproduced in this text]
[Formula (7): equation image not reproduced in this text]
wherein $dlng_m$ is the m-th group longitude variation, $dlat_m$ is the m-th group latitude variation, $dlng_{m+1}$ is the (m+1)-th group longitude variation, and $dlat_{m+1}$ is the (m+1)-th group latitude variation,
and when $\Delta k$ is smaller than the fourth threshold, interpolating by polynomial interpolation, and otherwise interpolating by piecewise interpolation, and taking the skeleton points together with the interpolated longitudes and latitudes as the predicted ship track.
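The choice between the two interpolation methods can be sketched as follows. Formulas (6) and (7) are not available in this text, so the sketch assumes that the slope of a segment is the ratio of the latitude difference to the longitude difference between adjacent skeleton points, and that the slope deviation is the largest absolute change between consecutive slopes. Both definitions, and the names `segment_slopes` and `choose_interpolation`, are assumptions for illustration.

```python
import math


def segment_slopes(deltas):
    """Slopes of the segments joining adjacent skeleton points.

    deltas: list of (dlng_m, dlat_m) ordered by prediction interval.
    ASSUMPTION: slope = latitude difference / longitude difference.
    """
    slopes = []
    for (x0, y0), (x1, y1) in zip(deltas, deltas[1:]):
        dx, dy = x1 - x0, y1 - y0
        # A vertical segment has no finite slope; mark it as infinite.
        slopes.append(dy / dx if dx != 0 else math.copysign(math.inf, dy))
    return slopes


def choose_interpolation(deltas, threshold):
    """Return "polynomial" for a nearly straight track, else "piecewise".

    ASSUMPTION: the slope deviation is the largest absolute change
    between consecutive segment slopes.
    """
    slopes = segment_slopes(deltas)
    if len(slopes) < 2:
        return "polynomial"
    if any(math.isinf(s) for s in slopes):
        return "piecewise"  # treat vertical segments conservatively
    deviation = max(abs(b - a) for a, b in zip(slopes, slopes[1:]))
    return "polynomial" if deviation < threshold else "piecewise"


straight = [(0.01, 0.005), (0.02, 0.010), (0.03, 0.015)]
turning = [(0.01, 0.005), (0.02, 0.010), (0.021, 0.030)]
```

With a threshold of 0.1, the `straight` example selects polynomial interpolation and the `turning` example selects piecewise interpolation.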
Preferably, decomposing the AIS data set based on the time difference includes setting a first time threshold, calculating the time difference of all adjacent track points in the AIS data set, sequentially traversing the time difference, and if the time difference is greater than the first time threshold, regarding the track point at the previous moment in the adjacent track points corresponding to the time difference as a demarcation point, acquiring all the demarcation points, and decomposing the AIS data according to the demarcation point.
Preferably, the interpolating the decomposed AIS data set includes setting a second time threshold, and interpolating the decomposed AIS data set according to the second time threshold and a cubic spline interpolation method.
Preferably, the removing of the redundant data in the AIS data set includes setting a third time threshold, recalculating time differences of all adjacent track points in the AIS data set, sequentially traversing the time differences, acquiring the adjacent track points corresponding to the time differences if the time differences are smaller than the third time threshold, randomly deleting information of one of the track points, and repeatedly calculating the time differences of all the adjacent track points in the AIS data set until all the time differences are greater than the third time threshold.
Preferably, detecting and deleting abnormal track point information by means of the anomaly detection algorithm comprises calculating the normalized speed of the track points according to formula (8),
[Formula (8): equation image not reproduced in this text]
wherein $x_i$ denotes the i-th track point, $x_{i-1}$ denotes the (i-1)-th track point, $grad[i]$ is the normalized speed of the i-th track point, $v_{i-1}$ is the speed of the (i-1)-th track point, $Dist(x_i, x_{i-1})$ is the physical distance between the i-th track point and the (i-1)-th track point, $t_i$ is the time of the i-th track point, and $t_{i-1}$ is the time of the (i-1)-th track point,
calculating the average of the speeds of all track points in the AIS data set, traversing all track points, calculating the difference between the normalized speed of each track point and the average, marking a track point as a speed anomaly when the difference is larger than a fifth threshold, drawing a box plot from the track points adjacent to each speed anomaly, marking track points above the upper boundary or below the lower boundary of the box plot as position anomalies, and deleting all speed anomalies and position anomalies from the AIS data.
The invention provides a method for predicting a ship track step by step based on AIS dynamic information, in which each AIS record of a selected ship is processed independently. By training Xgboost models, the subsequent track points of the ship are predicted from its last track point and assembled into a complete track, which removes the constraint that the input of an RNN-based model must be a time series and improves the accuracy of ship track prediction.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
FIG. 1 is a flow chart of the method of the present invention;
FIG. 2 is a flow chart of anomaly detection according to the present invention;
FIG. 3 is a flow chart of the Xgboost model training process of the present invention;
FIG. 4 is a flow chart of the Xgboost model optimization prediction process of the present invention;
FIG. 5 is a diagram of the overall model architecture of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be obtained by a person skilled in the art without inventive step based on the embodiments of the present invention, are within the scope of protection of the present invention.
FIG. 1 is a flow chart of the method of the present invention. As shown in FIG. 1, the method of this embodiment may include the following.
A method for predicting a ship track step by step based on AIS dynamic information comprises:
S1, acquiring, from the AIS, the track point information generated while the ship is sailing, wherein the track point information comprises IMO (International Maritime Organization number), MMSI (Maritime Mobile Service Identity), speed, course over ground, ship heading, longitude, latitude, time and ship length, and storing all track point information generated during the voyage in an AIS data set;
Because of the harsh communication environment at sea, the raw AIS data, which contain information such as IMO, MMSI, speed, course over ground, ship heading, longitude and latitude, may include fields that are clearly incorrect. To obtain a cleaner data set, the abnormal records are removed first. Abnormal records fall into the following categories:
(1) Records with an invalid MMSI number. The MMSI is a sequence of 9 digits that uniquely identifies a ship, and some records return sequences that are too short or all zeros;
(2) Records whose course lies outside the normal range. The course should lie within 0-360 degrees;
(3) Records whose speed lies outside the normal range. The speed should not exceed 25 miles per hour.
S2, preprocessing the AIS data set and sorting the preprocessed AIS data set by time, wherein the preprocessing comprises deleting abnormal records, and detecting and deleting abnormal track point information by means of an anomaly detection algorithm;
Deleting abnormal records comprises deleting data whose MMSI number has fewer than 9 digits or is all zeros, whose course is less than 0 degrees or greater than 360 degrees, or whose speed is less than 0 miles per hour or greater than 25 miles per hour.
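The three deletion rules can be expressed as a simple record filter. The following is a minimal sketch; the field names `mmsi`, `course` and `speed` and the function names are hypothetical, and the thresholds are the ones stated above.

```python
def is_valid_record(rec):
    """Return True if an AIS record passes the three rule-based checks."""
    mmsi = str(rec["mmsi"])
    # Rule 1: MMSI shorter than 9 digits, or consisting only of zeros.
    if len(mmsi) < 9 or set(mmsi) == {"0"}:
        return False
    # Rule 2: course outside 0-360 degrees.
    if not 0.0 <= rec["course"] <= 360.0:
        return False
    # Rule 3: speed below 0 or above 25 (unit as stated in the text).
    if not 0.0 <= rec["speed"] <= 25.0:
        return False
    return True


def clean_records(records):
    """Keep only the records that pass every check."""
    return [r for r in records if is_valid_record(r)]


raw_records = [
    {"mmsi": "412345678", "course": 90.0, "speed": 12.0},   # valid
    {"mmsi": "000000000", "course": 90.0, "speed": 12.0},   # all zeros
    {"mmsi": "4123", "course": 90.0, "speed": 12.0},        # too short
    {"mmsi": "412345679", "course": 361.0, "speed": 12.0},  # bad course
    {"mmsi": "412345670", "course": 10.0, "speed": 30.0},   # bad speed
]
cleaned = clean_records(raw_records)
```

Only the first record survives the filter in this example.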
Even after the obviously erroneous records have been eliminated, the track of the ship may still contain outliers with anomalous position information. Such points provide misleading information and inevitably interfere with the model's prediction of the ship's position. It is therefore necessary to further detect and delete the abnormal track points.
Detecting and deleting abnormal track point information by means of the anomaly detection algorithm comprises calculating the normalized speed of the track points according to formula (1),
[Formula (1): equation image not reproduced in this text]
wherein $grad[i]$ is the normalized speed of the i-th track point, $v_{i-1}$ is the speed of the (i-1)-th track point, $Dist(x_i, x_{i-1})$ is the physical distance between the i-th track point and the (i-1)-th track point, and $t_i$ and $t_{i-1}$ are the times of the i-th track point and the (i-1)-th track point respectively.
For AIS records within the same voyage, the deviation between the actual average speed and the recorded speed should stay within a reasonable range; a point whose speed is unreasonable is called a suspected speed anomaly. Suspected speed anomalies can be selected by the 3-sigma rule. Under this rule, for normally distributed AIS speed data, about 68.27%, 95.45% and 99.73% of the values lie within one, two and three standard deviations (σ) of the mean respectively, and points that deviate from the mean by more than 3 standard deviations are regarded as speed anomalies.
The anomaly detection flow chart is shown in FIG. 2. The average of the speeds of all track points in the AIS data set is calculated, all track points are traversed, and the difference between the normalized speed of each track point and the average is calculated. When the difference is larger than a fifth threshold, that is, when it exceeds 3 standard deviations, the track point is marked as a speed anomaly. A box plot is then drawn from the track points adjacent to each speed anomaly, and track points above the upper boundary or below the lower boundary of the box plot are marked as position anomalies. All speed anomalies and position anomalies are deleted from the AIS data.
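The speed part of this check can be sketched in Python. Because the formula image is missing, the sketch assumes that the normalized speed is the average speed implied by two consecutive positions divided by the speed reported at the earlier point, and it applies the 3-sigma rule to the normalized values themselves. Both choices are one possible reading of the text, not the patent's exact formula. The great-circle distance helper, the field names and the assumption that speed is in nautical miles per hour are also illustrative. The box plot position check is not covered.

```python
import math
import statistics

EARTH_RADIUS_NM = 3440.065  # mean Earth radius in nautical miles


def haversine_nm(lat1, lng1, lat2, lng2):
    """Great-circle distance between two positions, in nautical miles."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lng2 - lng1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_NM * math.asin(math.sqrt(a))


def normalized_speeds(points):
    """ASSUMED form: grad[i] = Dist(x_i, x_{i-1}) / ((t_i - t_{i-1}) * v_{i-1}).

    points: dicts with lat, lng, speed (nautical miles per hour, assumed)
    and t (seconds), sorted by t. Entry k of the result belongs to
    point k + 1.
    """
    grads = []
    for prev, cur in zip(points, points[1:]):
        hours = (cur["t"] - prev["t"]) / 3600.0
        dist = haversine_nm(prev["lat"], prev["lng"], cur["lat"], cur["lng"])
        if hours <= 0 or prev["speed"] <= 0:
            grads.append(float("nan"))  # ratio undefined for this pair
        else:
            grads.append(dist / hours / prev["speed"])
    return grads


def speed_outlier_indices(grads, n_sigma=3.0):
    """Indices of points whose normalized speed is beyond n_sigma deviations."""
    valid = [g for g in grads if not math.isnan(g)]
    if len(valid) < 2:
        return []
    mean = statistics.fmean(valid)
    std = statistics.pstdev(valid)
    return [k + 1 for k, g in enumerate(grads)
            if not math.isnan(g) and abs(g - mean) > n_sigma * std]
```

A value of `grad[i]` close to 1 means the reported speed is consistent with the distance actually covered; a large value suggests a position jump or a wrong speed field.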
S3, performing time sequence alignment on the sorted AIS data sets, including decomposing the AIS data sets based on time difference, interpolating the decomposed AIS data sets, and removing redundant data in the AIS data sets;
For a continuous track, if the time interval between two consecutive AIS messages exceeds a time threshold, the track is split at that point into two unrelated tracks; if a track contains several such points, it is split into several unrelated tracks that are not contiguous in time.
Decomposing the AIS data set based on the time difference comprises setting a first time threshold, calculating the time difference of all adjacent track points in the AIS data set, traversing the time difference in sequence, if the time difference is greater than the first time threshold, regarding the track point at the previous moment in the adjacent track points corresponding to the time difference as a demarcation point, acquiring all demarcation points, and decomposing the AIS data according to the demarcation points;
interpolating the decomposed AIS data set comprises setting a second time threshold, and interpolating the decomposed AIS data set according to the second time threshold and a cubic spline interpolation method;
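The time-gap decomposition described above can be sketched as a single pass over the time-sorted track points. The function name `split_by_time_gap` and the field name `t` (time in seconds) are hypothetical.

```python
def split_by_time_gap(points, max_gap_s):
    """Split a time-sorted track wherever the gap exceeds max_gap_s.

    The earlier point of an over-long gap acts as the demarcation point
    and closes the current segment; the later point opens the next one.
    """
    segments = []
    current = []
    for p in points:
        if current and p["t"] - current[-1]["t"] > max_gap_s:
            segments.append(current)
            current = []
        current.append(p)
    if current:
        segments.append(current)
    return segments


# Usage: a 10-minute first time threshold splits this track in two.
track = [{"t": t} for t in (0, 60, 120, 4000, 4060)]
segments = split_by_time_gap(track, max_gap_s=600)
```

Each resulting segment can then be interpolated independently.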
Because the minimum transmission interval of AIS data can be on the order of seconds, a large number of valid samples are collected during certain periods of a voyage. These samples closely resemble one another; training on all of them gives results similar to, or even worse than, training on a single representative sample, and it also increases the amount of computation. Samples whose recording times differ by less than 1 minute are therefore considered to have similar characteristics, and only one of them is kept as a valid sample. All records can be resampled at 1-minute intervals to reduce redundancy. After resampling, there is no more than one point pair within any one minute; preferably, the second time threshold may be set to 1 minute.
The decomposed AIS data set is interpolated by cubic spline interpolation: each time interval $t_i \sim t_{i+1}$ is approximated by a cubic polynomial as shown in formula (2), where i denotes the i-th track in the track set.
$S_i(t) = a_i t^3 + b_i t^2 + c_i t + d_i$ (2)
where $a_i$, $b_i$, $c_i$ and $d_i$ are the coefficients to be calculated under constraints on the first and second derivatives. Once the approximating cubic polynomials are obtained, the intermediate values between adjacent skeleton points can be derived and the selected track can be evaluated at the target points, which completes the interpolation.
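A cubic spline of this kind can be sketched in plain Python. The sketch assumes natural boundary conditions (zero second derivative at both ends), which the text does not specify, and it writes each piece in powers of $(t - t_j)$ rather than $t$; the two forms are equivalent after expansion. The function name `natural_cubic_spline` is hypothetical.

```python
from bisect import bisect_right


def natural_cubic_spline(ts, ys):
    """Return a function evaluating the natural cubic spline through (ts, ys).

    ts must be strictly increasing with at least two entries.
    ASSUMPTION: natural boundary conditions (second derivative is zero
    at the first and last knot).
    """
    n = len(ts) - 1
    h = [ts[i + 1] - ts[i] for i in range(n)]
    alpha = [0.0] * (n + 1)
    for i in range(1, n):
        alpha[i] = (3.0 * (ys[i + 1] - ys[i]) / h[i]
                    - 3.0 * (ys[i] - ys[i - 1]) / h[i - 1])
    # Forward sweep of the tridiagonal system for the quadratic coefficients.
    l = [1.0] * (n + 1)
    mu = [0.0] * (n + 1)
    z = [0.0] * (n + 1)
    for i in range(1, n):
        l[i] = 2.0 * (ts[i + 1] - ts[i - 1]) - h[i - 1] * mu[i - 1]
        mu[i] = h[i] / l[i]
        z[i] = (alpha[i] - h[i - 1] * z[i - 1]) / l[i]
    # Back substitution gives the coefficients of every piece.
    b = [0.0] * n
    c = [0.0] * (n + 1)
    d = [0.0] * n
    for j in range(n - 1, -1, -1):
        c[j] = z[j] - mu[j] * c[j + 1]
        b[j] = (ys[j + 1] - ys[j]) / h[j] - h[j] * (c[j + 1] + 2.0 * c[j]) / 3.0
        d[j] = (c[j + 1] - c[j]) / (3.0 * h[j])

    def evaluate(t):
        # Locate the piece containing t, clamping to the first or last piece.
        j = min(max(bisect_right(ts, t) - 1, 0), n - 1)
        dt = t - ts[j]
        return ys[j] + b[j] * dt + c[j] * dt ** 2 + d[j] * dt ** 3

    return evaluate


# Usage: interpolate longitude at t = 90 s from irregularly timed samples.
times = [0.0, 60.0, 150.0, 240.0]
lngs = [121.000, 121.010, 121.030, 121.020]
lng_at = natural_cubic_spline(times, lngs)
lng_90 = lng_at(90.0)
```

Longitude and latitude would each get their own spline over the same time stamps, and the splines would then be sampled on a regular time grid.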
Removing the redundant data in the AIS data set comprises setting a third time threshold, recalculating the time difference of all adjacent track points in the AIS data set, sequentially traversing the time difference, acquiring the adjacent track points corresponding to the time difference if the time difference is smaller than the third time threshold, randomly deleting information of one track point, repeatedly calculating the time difference of all the adjacent track points in the AIS data set until all the time differences are larger than the third time threshold.
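The redundancy removal loop can be sketched directly from this description. The function name `thin_points` and the field name `t` are hypothetical; the optional `rng` argument is added only to make the random choice reproducible. The loop stops once no adjacent pair is closer than the threshold.

```python
import random


def thin_points(points, min_gap_s, rng=None):
    """Randomly drop one point of each too-close adjacent pair.

    points are assumed sorted by t (seconds). The scan restarts after
    every deletion, as in the text, so the cost is quadratic in the
    worst case.
    """
    rng = rng or random.Random()
    pts = list(points)
    changed = True
    while changed:
        changed = False
        for i in range(len(pts) - 1):
            if pts[i + 1]["t"] - pts[i]["t"] < min_gap_s:
                # Delete either the earlier or the later point at random.
                del pts[i + rng.randint(0, 1)]
                changed = True
                break
    return pts


dense = [{"t": t} for t in (0, 10, 20, 70, 75, 140)]
thinned = thin_points(dense, min_gap_s=60, rng=random.Random(0))
```

Every deletion shortens the list by one point, so the loop always terminates.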
S4, the longitude and latitude of all track point information in the AIS data set are gridded by formulas (3) and (4),
[Formula (3): equation image not reproduced in this text]
[Formula (4): equation image not reproduced in this text]
wherein $\delta_{lng}$ and $\delta_{lat}$ denote the grid sizes for longitude and latitude, set to 0.004° and 0.005° respectively; $\min(lng)$ is the minimum longitude in the AIS data set; $\min(lat)$ is the minimum latitude in the AIS data set; $lng_i$ is the longitude of the i-th track point; and $lat_i$ is the latitude of the i-th track point.
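One common way to grid coordinates with these quantities is to take the offset from the minimum, divide by the grid size and round down. The sketch below assumes that form, since the formula images are not available in this text; the function name `grid_indices` is hypothetical.

```python
import math


def grid_indices(lngs, lats, d_lng=0.004, d_lat=0.005):
    """Map each (lng, lat) to integer grid cell indices.

    ASSUMED form: index = floor((value - minimum) / grid size).
    The default grid sizes are the ones stated in the text.
    """
    lng0, lat0 = min(lngs), min(lats)
    return [(math.floor((lng - lng0) / d_lng),
             math.floor((lat - lat0) / d_lat))
            for lng, lat in zip(lngs, lats)]


cells = grid_indices([121.000, 121.006, 121.0101],
                     [38.000, 38.0049, 38.012])
```

Points that fall into the same cell receive the same index pair, which discretizes the study area.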
Let $T = \{x_i \mid i = 1, 2, \ldots, m\}$ denote a track consisting of m points, where $x_i$ is the i-th record of the whole track. A track point pair is defined as two points in the same track that are separated by approximately a certain period of time (e.g., 5 minutes); "approximately" means that a very small deviation ($\varepsilon_2$) is tolerated. The coupling point $x_j$ of the track point $x_i$ is obtained according to formula (5), where $P_{ij}$ denotes a track point pair,
[Formula (5): equation image not reproduced in this text]
wherein $x_i = \{lat_i, lng_i, v_i, d_i, h_i, l_i, t_i\}^T$, whose components are respectively the latitude, longitude, speed, course, ship heading, ship length and time of the i-th track point; $x_j = \{lat_j, lng_j, v_j, d_j, h_j, l_j, t_j\}^T$, whose components are respectively the latitude, longitude, speed, course, ship heading, ship length and time of the j-th track point; $t_i$ is the time of the i-th track point; $t_j$ is the time of the j-th track point; $\varepsilon_1$ is the prediction time interval; $\varepsilon_2$ is the tolerated deviation; and $T$ is the AIS data set.
According to $P_{ij}$, the navigation distance of the i-th track point is denoted $y_i = \{dlat_i, dlng_i\}^T$, where $dlng_i$ is calculated by formula (6) and $dlat_i$ by formula (7):
$dlng_i = lng_j - lng_i$ (6)
$dlat_i = lat_j - lat_i$ (7)
wherein $lng_i$ and $lat_i$ are the longitude and latitude of the i-th track point, $lng_j$ and $lat_j$ are the longitude and latitude of the j-th track point, and the j-th track point and the i-th track point form a track point pair.
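Building the track point pairs and their targets can be sketched as follows. The sketch reads the pairing condition as $|t_j - t_i - \varepsilon_1| \le \varepsilon_2$, which is an assumption consistent with the symbol definitions above, and it keeps the candidate closest to the target time when more than one qualifies, which is a design choice of this sketch. The function and field names are hypothetical.

```python
from bisect import bisect_left


def build_pairs(points, eps1_s, eps2_s):
    """Return (track point, target) samples for one prediction interval.

    points : dicts with lng, lat and t (seconds), sorted by t.
    eps1_s : prediction time interval; eps2_s : tolerated deviation.
    ASSUMED pairing rule: |t_j - t_i - eps1| <= eps2 with j after i.
    """
    times = [p["t"] for p in points]
    samples = []
    for i, p in enumerate(points):
        target_t = p["t"] + eps1_s
        k = bisect_left(times, target_t)
        best = None
        # Only the neighbours of the insertion position can be closest.
        for j in (k - 1, k):
            if i < j < len(points):
                err = abs(times[j] - target_t)
                if err <= eps2_s and (best is None or err < best[0]):
                    best = (err, j)
        if best is not None:
            q = points[best[1]]
            target = {"dlng": q["lng"] - p["lng"],
                      "dlat": q["lat"] - p["lat"]}
            samples.append((p, target))
    return samples


# Usage: one point per minute, 5-minute interval, 5-second tolerance.
track = [{"t": 60 * k, "lng": 121.0 + 0.002 * k, "lat": 38.0 + 0.001 * k}
         for k in range(11)]
samples = build_pairs(track, eps1_s=300, eps2_s=5)
```

Points near the end of a track have no partner within the tolerance and therefore produce no training sample.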
S5, obtaining the navigation distance of all track points, and constructing training samples from the longitude, latitude, speed, course, ship heading, ship length and navigation distance of all track points. Based on navigation knowledge, the longitude, latitude, speed, course over ground, ship heading and ship length are selected from the track point information as the input features of the Xgboost model. Speed and heading are the key quantities for establishing a ship kinematics model, and the course over ground can serve as an auxiliary correction to the heading. From the longitude and latitude features, the model can learn hydrological characteristics of the study area. Since ships of different lengths have different sailing tendencies, ship length is in practice another effective predictor.
Dividing training samples into a training set and a testing set, constructing an Xgboost model, training a first Xgboost model with longitude variation as output and a second Xgboost model with latitude variation as output by the training set, wherein the Xgboost model training process is shown in figure 3, an objective function in the Xgboost model training process is set as a formula (8),
Figure BDA0003756178880000091
The first term is the loss function, which uses a log-likelihood cost function to measure the difference between the current predicted value and the true value. The second term is a penalty term that prevents overfitting; it is positively correlated with the number of leaf nodes and the leaf scores of the generated tree. Here $f_t(x_i)$ is the navigation distance that the currently generated tree predicts for the ship at track point $x_i$ after the time $\epsilon_1$,
Figure BDA0003756178880000092
is the cumulative distance prediction computed by all previously generated trees, $y_i$ is the actual distance, and $\Omega(f_t)$ is the score of the current tree structure.
The objective function $Obj^{(t)}$ is approximated by a second-order Taylor expansion and then differentiated with respect to $f_t(x_i)$ to find the optimal solution that minimizes the objective function, as shown in formula (9),
Figure BDA0003756178880000093
wherein g is i Is a loss function of l pairs
Figure BDA0003756178880000094
First derivative of, h i Is a loss function of l pairs
Figure BDA0003756178880000095
T is the number of leaf nodes in the current tree, γ is a penalty coefficient for the number of leaf nodes, and λ is a penalty coefficient for the score of leaf nodes. And traversing all the characteristics of the sample and possible splitting nodes, calculating gains of the objective function before and after splitting, and selecting the maximum node gain to compare with a gain threshold value. And if the number of the splitting nodes is too large or the score sum of the leaf nodes is lower, terminating the tree splitting and starting the next round of iteration. Dividing a training set into an internal training set and an internal testing set, optimizing an Xgboost model by using a 5-fold cross validation mode, optimizing two trained models by using the testing set, evaluating the prediction performance of the models by using evaluation indexes such as AUC (Area Under Curve), ACC (ACCURACY, accuracy rate), MSE (Mean Square Error) and F1-score (F1 value), and evaluating parameters in the models according to the indexesAnd adjusting to obtain an optimal model.
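The split search described above can be illustrated with a small pure-Python sketch. Formula (9) is only available as an image, so the expressions below are the standard XGBoost closed forms (optimal leaf weight $-G/(H+\lambda)$ and the corresponding split gain) and are assumed to match it; the function names are hypothetical, and the example gradients correspond to a squared-error loss chosen only for illustration, not the log-likelihood loss named in the text.

```python
def leaf_weight(G, H, lam):
    """Optimal leaf score for summed gradients G and summed hessians H."""
    return -G / (H + lam)

def split_gain(GL, HL, GR, HR, lam, gamma):
    """Reduction of the regularized objective obtained by splitting a leaf."""
    def score(G, H):
        return G * G / (H + lam)
    return 0.5 * (score(GL, HL) + score(GR, HR) - score(GL + GR, HL + HR)) - gamma

def best_split(xs, grads, hess, lam=1.0, gamma=0.0):
    """Scan all thresholds of one feature; return (gain, threshold) of the best."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    G, H = sum(grads), sum(hess)
    GL = HL = 0.0
    best = (float("-inf"), None)
    for a, b in zip(order, order[1:]):
        GL += grads[a]
        HL += hess[a]
        if xs[a] == xs[b]:
            continue  # no threshold can separate equal feature values
        gain = split_gain(GL, HL, G - GL, H - HL, lam, gamma)
        if gain > best[0]:
            best = (gain, (xs[a] + xs[b]) / 2)
    return best
```

In a full tree builder this scan would be repeated for every feature, and the split would be kept only if the best gain exceeds the gain threshold.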
A single track point is input into each of the two models to obtain the longitude variation (distance moved in the longitude direction) and the latitude variation (distance moved in the latitude direction) in the grid coordinate system. The longitude and latitude of the ship after the prediction time interval are then obtained from the longitude and latitude of the track point and the output longitude and latitude variations. The Xgboost model optimization and prediction flow is shown in FIG. 4.
However, a single future coordinate is not sufficient to construct a complete ship track. A group of Xgboost models therefore needs to be trained, and subsequent coordinates within a specified time period are predicted by varying the time interval; these subsequent coordinates are the skeleton points.
S6, assigning k different prediction time intervals to $\epsilon_1$ and training a first Xgboost model and a second Xgboost model for each prediction time interval to obtain k Xgboost model groups, whose standard symbol definitions are given by formulas (10), (11) and (12),
Figure BDA0003756178880000101
Figure BDA0003756178880000102
$F = \{(\phi_k(x_i; \epsilon_1), \psi_k(x_i; \epsilon_1)) \mid \epsilon_1 \in \{slot_1, slot_2, \ldots, slot_k\}\} \quad (12)$
where $x_i$ is the feature vector of the i-th sample, and $\phi_k$ and $\psi_k$ respectively denote the latitude and longitude components of the navigation distance predicted by the k-th model group. The possible values of $\epsilon_1$ are set to 1, 5, 10, 15, 20, 25 and 30 minutes, and $slot_k$ is the prediction time interval of the k-th model group.
A single track point is input into the Xgboost model groups to obtain k groups of longitude variation and latitude variation. k groups of ship longitudes and latitudes are then obtained from the longitude and latitude of the track point and the k groups of variations, and are taken as the skeleton points of the ship track.
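A minimal sketch of how the skeleton points could be assembled from the k model groups is given below. The trained Xgboost models are replaced by plain callables, so `predict_skeleton` and `constant_velocity_group` are hypothetical names and the constant-velocity stand-in is only an assumption used to make the example self-contained.

```python
def predict_skeleton(point, model_groups):
    """point: dict with at least 'lat' and 'lng'.
    model_groups: {interval_minutes: (predict_dlng, predict_dlat)}, where each
    predictor maps a track point to a coordinate variation.
    Returns a list of (interval, lat, lng) sorted by interval."""
    skeleton = []
    for slot in sorted(model_groups):
        predict_dlng, predict_dlat = model_groups[slot]
        # Future position = current position + predicted variation.
        lng = point["lng"] + predict_dlng(point)
        lat = point["lat"] + predict_dlat(point)
        skeleton.append((slot, lat, lng))
    return skeleton

def constant_velocity_group(slot, dlng_per_min, dlat_per_min):
    """Hypothetical stand-in for one trained (longitude, latitude) model pair."""
    return (lambda p: dlng_per_min * slot, lambda p: dlat_per_min * slot)
```

For example, `predict_skeleton({"lat": 30.0, "lng": 120.0}, {s: constant_velocity_group(s, 0.001, 0.002) for s in (1, 5, 10)})` yields one skeleton point per prediction interval, ordered by interval.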
Once the group of Xgboost models has been trained, it can be used to predict the positions of the skeleton points in the future track of the ship. The skeleton points are then connected by interpolation, which fits the coordinates at intermediate times and expands the discrete skeleton points into a complete track. Different interpolation methods differ in accuracy and in the resources they consume. Polynomial interpolation, for example, is simple and easy to use, but when the degree of the polynomial is high the Runge phenomenon may occur and make the interpolation curve unstable. Piecewise interpolation, such as cubic spline interpolation, avoids the Runge phenomenon, but it places higher requirements on the choice of interpolation nodes and increases the amount of calculation.
Because the Xgboost prediction is a single-point prediction of the longitude and latitude coordinates of the skeleton points, the slope $k_i$ of the straight line connecting any two adjacent skeleton points can be calculated before interpolation, and tracks can be distinguished by the slope deviation $\Delta k$. $\Delta k$ is the absolute value of the difference between the average slope over all pairs of adjacent skeleton points and the slope of the line through the first and last points, and it reflects how much the curve changes. In general, the smaller $\Delta k$ is, the closer the fitted track is to a straight line, and the larger $\Delta k$ is, the more complex the fitted curve. Tracks close to a straight line use polynomial interpolation, and complex tracks use piecewise interpolation.
S7, calculating the slope $k_i$ of the straight line formed between two adjacent groups of ship longitudes and latitudes according to formula (13), and calculating the slope deviation $\Delta k$ according to formula (14),
Figure BDA0003756178880000111
Figure BDA0003756178880000112
where $dlng_m$ is the m-th group longitude variation, $dlat_m$ is the m-th group latitude variation, $dlng_{m+1}$ is the (m+1)-th group longitude variation, and $dlat_{m+1}$ is the (m+1)-th group latitude variation.
When $\Delta k$ is smaller than a fourth threshold, the track is a simple, nearly linear track and is interpolated by a polynomial interpolation method; otherwise the track is a complex track and is interpolated by a piecewise interpolation method. The skeleton points and the longitudes and latitudes obtained by interpolation are taken as the predicted ship track.
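The classification step can be sketched as follows. Formulas (13) and (14) are only available as images, so the code follows the verbal definition: the slope of each segment between adjacent skeleton points, and $\Delta k$ as the absolute difference between the mean segment slope and the slope of the line through the first and last points. The function names and the handling of vertical segments are assumptions.

```python
def slope_deviation(points):
    """points: list of (lng, lat) skeleton points in time order (at least 2)."""
    def slope(p, q):
        dx = q[0] - p[0]
        if dx == 0:
            # Assumption: vertical segments are not handled in this sketch.
            raise ValueError("vertical segment: slope undefined")
        return (q[1] - p[1]) / dx
    slopes = [slope(p, q) for p, q in zip(points, points[1:])]
    mean_slope = sum(slopes) / len(slopes)
    return abs(mean_slope - slope(points[0], points[-1]))

def choose_interpolation(points, fourth_threshold):
    """Nearly linear tracks use polynomial interpolation, others piecewise."""
    if slope_deviation(points) < fourth_threshold:
        return "polynomial"
    return "piecewise"
```

One property of this reading is worth noting: when the skeleton points are equally spaced in longitude, the mean segment slope equals the end-to-end slope, so $\Delta k$ is zero regardless of the shape of the track.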
Denote the k prediction time intervals as $[t_1, t_2, t_3, \ldots, t_N]$ and the corresponding coordinates as $[y_0, y_1, y_2, \ldots, y_n]$, with $f[t_k] = y_k$.
For simple linear trajectories, the Newton interpolation method, a form of polynomial interpolation, is used. The difference quotients of f(t) are defined as:
$f[t_k] = f(t_k), \quad k = 1, 2, 3, \ldots, n \quad (15)$
Figure BDA0003756178880000113
The recurrence formula for the higher-order difference quotients is:
Figure BDA0003756178880000114
The prediction curve S(t) fitted through the skeleton points is therefore given by formula (18):
$S(t) = a_0 + a_1 (t - t_1) + \ldots + a_N (t - t_1)(t - t_2) \cdots (t - t_N) \quad (18)$
where $a_k = f[t_1, t_2, \ldots, t_k]$, $k = 1, 2, 3, \ldots, n$.
For complex trajectories, the cubic spline interpolation method, a form of piecewise interpolation, is used to fit the curve. Each time interval $t_i \sim t_{i+1}$ is approximated by a cubic polynomial as shown in formula (1), where i represents the i-th track in the track set. Once the approximating cubic polynomials have been obtained, intermediate values between adjacent skeleton points can be derived, and the prediction of the selected track is completed at the target point.
The overall model of the method for predicting the ship track step by step based on AIS dynamic information comprises: acquiring historical AIS data; cleaning the AIS data; detecting and deleting abnormal AIS data points; performing time sequence alignment on the AIS data; selecting track point fields as features to train the Xgboost models; predicting the longitude and latitude coordinates of the skeleton points with the Xgboost models; and performing classified interpolation on the skeleton points to obtain the fitted track. The overall model structure of this embodiment is shown in FIG. 5.
The overall beneficial effects are as follows: in the method for predicting the ship track step by step based on AIS dynamic information, each AIS record of the selected ship is processed independently, and Xgboost models are trained to predict the subsequent track points of the ship from its last track point and to form a complete track. This removes the constraint that the input of RNN-based models must be a time sequence and improves the accuracy of ship track prediction.
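A cubic spline through the skeleton points can be sketched without external libraries as below. The boundary condition is not stated in this passage, so a natural spline (zero second derivative at both ends) is assumed; the function names are hypothetical, and one spline would be fitted per coordinate as a function of time.

```python
from bisect import bisect_right

def natural_spline_moments(ts, ys):
    """Second derivatives M_i of the natural cubic spline (M_0 = M_n = 0)."""
    n = len(ts) - 1
    h = [ts[i + 1] - ts[i] for i in range(n)]
    M = [0.0] * (n + 1)
    if n < 2:
        return M  # two points only: the spline is a straight line
    diag = [2.0 * (h[i - 1] + h[i]) for i in range(1, n)]
    rhs = [6.0 * ((ys[i + 1] - ys[i]) / h[i] - (ys[i] - ys[i - 1]) / h[i - 1])
           for i in range(1, n)]
    # Thomas algorithm; row r corresponds to interior knot r + 1.
    for r in range(1, n - 1):
        w = h[r] / diag[r - 1]
        diag[r] -= w * h[r]
        rhs[r] -= w * rhs[r - 1]
    x = [0.0] * (n - 1)
    x[-1] = rhs[-1] / diag[-1]
    for r in range(n - 3, -1, -1):
        x[r] = (rhs[r] - h[r + 1] * x[r + 1]) / diag[r]
    M[1:n] = x
    return M

def spline_eval(ts, ys, M, t):
    """Evaluate the spline at t (clamped to the first or last segment)."""
    i = min(max(bisect_right(ts, t) - 1, 0), len(ts) - 2)
    h = ts[i + 1] - ts[i]
    a, b = ts[i + 1] - t, t - ts[i]
    return ((M[i] * a ** 3 + M[i + 1] * b ** 3) / (6.0 * h)
            + (ys[i] / h - M[i] * h / 6.0) * a
            + (ys[i + 1] / h - M[i + 1] * h / 6.0) * b)
```

The spline passes through every skeleton point, and for data lying on a straight line all moments are zero, so the result reduces to linear interpolation.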
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (7)

1. A method for predicting ship tracks step by step based on AIS dynamic information is characterized by comprising the following steps of,
S1, acquiring, according to the AIS, track point information generated during navigation of the ship, wherein the track point information comprises IMO, MMSI, speed, course over ground, ship heading, longitude, latitude, time and ship length, and storing all track point information generated during navigation in an AIS data set;
S2, preprocessing the AIS data set and sorting the preprocessed AIS data set in time order, wherein the preprocessing comprises deleting abnormal records and detecting and deleting abnormal track point information by an anomaly detection algorithm, and wherein deleting abnormal records comprises deleting data whose MMSI number has fewer than 9 digits or is all zeros, whose course is less than 0 degrees or greater than 360 degrees, or whose speed is less than 0 miles per hour or greater than 25 miles per hour;
S3, performing time sequence alignment on the sorted AIS data set, including decomposing the AIS data set based on time differences, interpolating the decomposed AIS data set, and removing redundant data from the AIS data set;
S4, gridding the longitude and latitude of all track point information in the AIS data set, acquiring the coupling points of all track points, representing each track point and its coupling point as a track point pair, and calculating the navigation distance of each track point from its track point pair;
S5, obtaining the navigation distances of all track points, constructing training samples from the longitude, latitude, speed, heading, ship length and navigation distance of all track points, and dividing the training samples into a training set and a test set,
constructing Xgboost models, training on the training set a first Xgboost model with the longitude variation as output and a second Xgboost model with the latitude variation as output, and optimizing the two trained models with the test set, wherein optimizing the two trained models with the test set comprises optimizing each of the two models by 5-fold cross validation, feeding the test set into the models, calculating evaluation indexes of the model predictions, and adjusting the model parameters according to the evaluation indexes to obtain the two models with the best evaluation indexes, the evaluation indexes including but not limited to AUC, ACC, MSE and F1-score,
inputting a single track point into each of the two models to obtain the output longitude variation and latitude variation, and obtaining the longitude and latitude of the ship after one prediction time interval from the longitude and latitude of the track point and the output longitude variation and latitude variation,
S6, setting k different prediction time intervals, training a first Xgboost model and a second Xgboost model for each prediction time interval, inputting a single track point into each pair of models to obtain k groups of longitude variation and latitude variation, obtaining k groups of ship longitudes and latitudes from the longitude and latitude of the track point and the k groups of longitude variation and latitude variation, and taking the k groups of ship longitudes and latitudes as skeleton points of the ship track,
S7, calculating the slope and the slope deviation of the straight lines formed between two adjacent groups of ship longitudes and latitudes; when the slope deviation is smaller than a fourth threshold, interpolating by a polynomial interpolation method, and otherwise interpolating by a piecewise interpolation method; and taking the skeleton points and the longitudes and latitudes obtained by interpolation as the predicted ship track.
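The record-level cleaning rule of step S2 can be illustrated with a short Python sketch. The field names (`mmsi`, `course`, `speed`, `t`) and function names are hypothetical, the speed is assumed to be expressed in the same unit as the stated limits, and the anomaly detection part of S2 is not covered here.

```python
def is_valid_record(rec):
    """Apply the S2 rules for deleting abnormal records to one AIS record."""
    mmsi = str(rec["mmsi"])
    if len(mmsi) < 9 or set(mmsi) == {"0"}:
        return False  # MMSI shorter than 9 digits or all zeros
    if not (0 <= rec["course"] <= 360):
        return False  # course outside 0 to 360 degrees
    if not (0 <= rec["speed"] <= 25):
        return False  # speed outside 0 to 25
    return True

def preprocess(records):
    """Drop abnormal records, then sort the remainder by time."""
    return sorted((r for r in records if is_valid_record(r)), key=lambda r: r["t"])
```

Keeping the MMSI as a string avoids losing leading zeros when checking its length.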
2. The method for predicting ship tracks step by step based on AIS dynamic information according to claim 1, wherein S4 comprises gridding the longitude and latitude of all track point information in the AIS data set by formulas (1) and (2),
Figure FDA0003756178870000021
Figure FDA0003756178870000022
where $\delta_{lng}$ and $\delta_{lat}$ denote the grid sizes in longitude and latitude and are set to 0.004° and 0.005° respectively, min(lng) is the minimum longitude in the AIS data set, min(lat) is the minimum latitude in the AIS data set, $lng_i$ is the longitude of the i-th track point, and $lat_i$ is the latitude of the i-th track point;
obtaining the coupling point $x_j$ of track point $x_i$ according to formula (3), where $P_{ij}$ denotes a track point pair,
Figure FDA0003756178870000023
$x_i = \{lat_i, lng_i, v_i, d_t, h_i, l_i, t\}^T$, where $lat_i$, $lng_i$, $v_i$, $d_t$, $h_i$, $l_i$ and $t$ respectively denote the latitude, longitude, speed, course over ground, ship heading, ship length and time of the i-th track point; $x_j = \{lat_j, lng_j, v_j, d_t, h_j, l_j, t\}^T$, where $lat_j$, $lng_j$, $v_j$, $d_t$, $h_j$, $l_j$ and $t$ respectively denote the latitude, longitude, speed, course over ground, ship heading, ship length and time of the j-th track point,
Figure FDA0003756178870000024
is the time of the i-th track point,
Figure FDA0003756178870000025
is the time of the j-th track point, $\epsilon_1$ is the prediction time interval, $\epsilon_2$ is the deviation, and T is the AIS data set,
calculating, according to $P_{ij}$, the navigation distance $y_i = \{dlat_i, dlng_i\}^T$ of the i-th track point,
$dlng_i = lng_j - lng_i \quad (4)$
$dlat_i = lat_j - lat_i \quad (5)$
where $lng_i$ and $lat_i$ are the longitude and latitude of the i-th track point, $lng_j$ and $lat_j$ are the longitude and latitude of the j-th track point, and the j-th and i-th track points form a track point pair.
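The gridding in this claim can be sketched as follows. Formulas (1) and (2) are only available as images, so the mapping used here (offset from the minimum coordinate divided by the grid size, rounded down) is an assumption consistent with the symbols defined above; the function name and dictionary keys are hypothetical.

```python
import math

def grid_indices(points, d_lng=0.004, d_lat=0.005):
    """Map each point's longitude/latitude to integer grid indices.
    points: list of dicts with keys 'lng' and 'lat' (degrees)."""
    min_lng = min(p["lng"] for p in points)
    min_lat = min(p["lat"] for p in points)
    return [
        (math.floor((p["lng"] - min_lng) / d_lng),
         math.floor((p["lat"] - min_lat) / d_lat))
        for p in points
    ]
```

Points that fall exactly on a cell boundary may land in either neighbouring cell because of floating-point rounding, which a production implementation would need to handle explicitly.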
3. The method for predicting ship tracks step by step based on AIS dynamic information according to claim 1, wherein S7 comprises calculating the slope $k_i$ of the straight line between two adjacent groups of ship longitudes and latitudes according to formula (6), and calculating the slope deviation $\Delta k$ according to formula (7),
Figure FDA0003756178870000031
Figure FDA0003756178870000032
where $dlng_m$ is the m-th group longitude variation, $dlat_m$ is the m-th group latitude variation, $dlng_{m+1}$ is the (m+1)-th group longitude variation, and $dlat_{m+1}$ is the (m+1)-th group latitude variation,
and when $\Delta k$ is smaller than a fourth threshold, interpolating by a polynomial interpolation method, otherwise interpolating by a piecewise interpolation method, and taking the skeleton points and the longitudes and latitudes obtained by interpolation as the predicted ship track.
4. The method for predicting ship tracks step by step based on AIS dynamic information according to claim 1, wherein decomposing the AIS data set based on time differences comprises setting a first time threshold, calculating the time differences between all adjacent track points in the AIS data set, and traversing the time differences in sequence; if a time difference is greater than the first time threshold, the earlier of the two adjacent track points corresponding to that time difference is regarded as a boundary point; all boundary points are acquired, and the AIS data are decomposed at the boundary points.
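The decomposition in this claim can be sketched as a single pass over the time-sorted track. The function name and the `t` key are hypothetical; each returned sub-track ends at a boundary point, i.e. at a point whose successor is more than the first time threshold later.

```python
def split_by_time_gap(track, first_threshold):
    """track: list of dicts with key 't', sorted by time.
    Returns the list of sub-tracks obtained by cutting after each boundary point."""
    segments, current = [], []
    for idx, p in enumerate(track):
        current.append(p)
        is_last = idx == len(track) - 1
        # A point is a boundary point when the gap to the next point is too large.
        if is_last or track[idx + 1]["t"] - p["t"] > first_threshold:
            segments.append(current)
            current = []
    return segments
```

For records at times 0, 10, 20, 500 and 510 with a threshold of 60, the track is cut after the record at time 20, giving sub-tracks of three and two points.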
5. The AIS dynamic information-based step-by-step ship trajectory prediction method according to claim 4, wherein the interpolating the decomposed AIS data set includes setting a second time threshold, and interpolating the decomposed AIS data set according to the second time threshold and a cubic spline interpolation method.
6. The method for predicting ship tracks step by step based on AIS dynamic information according to claim 5, wherein removing redundant data from the AIS data set comprises setting a third time threshold, recalculating the time differences between all adjacent track points in the AIS data set, and traversing the time differences in sequence; if a time difference is smaller than the third time threshold, the two adjacent track points corresponding to that time difference are acquired and the information of one of them, chosen at random, is deleted; the time differences between all adjacent track points in the AIS data set are then recalculated, until all time differences are larger than the third time threshold.
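The redundancy removal in this claim can be sketched as below. The function name and the `t` key are hypothetical, and the optional `rng` argument is an addition that makes the random choice reproducible. The sketch deletes only when a gap is strictly smaller than the threshold, so gaps exactly equal to the threshold are kept, which is an assumption about the boundary case.

```python
import random

def remove_redundant(track, third_threshold, rng=None):
    """track: list of dicts with key 't', sorted by time.
    Repeatedly deletes one point of any adjacent pair that is too close in time."""
    rng = rng or random.Random()
    pts = list(track)
    i = 0
    while i < len(pts) - 1:
        if pts[i + 1]["t"] - pts[i]["t"] < third_threshold:
            # Delete one of the two points at random.
            del pts[i + rng.randrange(2)]
            # Step back so the pair affected by the deletion is checked again.
            i = max(i - 1, 0)
        else:
            i += 1
    return pts
```

Because each deletion only widens the gaps around the removed point, the loop terminates with every remaining gap at least as large as the threshold.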
7. The AIS dynamic information-based step-by-step ship track prediction method according to claim 1, wherein the detecting and deleting of the abnormal track point information obtained based on the abnormal detection algorithm comprises calculating the normalized speed of the track points according to formula (8),
Figure FDA0003756178870000033
wherein x is i Representing the ith trace point, x i-1 Represents the ith-1 track point, grad [ i]Normalized velocity, v, for the ith trace point i-1 Speed of the i-1 st track point, Dist (x) i ,x i-1 ) Representing the physical distance between the ith trace point and the (i-1) th trace point,
Figure FDA0003756178870000034
the time of the ith trace point is,
Figure FDA0003756178870000035
the time of the i-1 st trace point,
calculating the average value of the speeds of all track points in the AIS data set, traversing all track points, calculating the difference value between the normalized speed of the track points and the average value, marking the track points as speed abnormal points when the difference value is larger than a fifth threshold value, drawing a box line graph according to the adjacent track points of the speed abnormal points, marking the track points which exceed the upper boundary of the box line graph or are lower than the lower boundary of the box line graph as position abnormal points, and deleting all the speed abnormal points and the position abnormal points in the AIS data.
CN202210863446.1A 2022-07-20 2022-07-20 Method for predicting ship track step by step based on AIS dynamic information Pending CN115100247A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210863446.1A CN115100247A (en) 2022-07-20 2022-07-20 Method for predicting ship track step by step based on AIS dynamic information

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210863446.1A CN115100247A (en) 2022-07-20 2022-07-20 Method for predicting ship track step by step based on AIS dynamic information

Publications (1)

Publication Number Publication Date
CN115100247A true CN115100247A (en) 2022-09-23

Family

ID=83298278

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210863446.1A Pending CN115100247A (en) 2022-07-20 2022-07-20 Method for predicting ship track step by step based on AIS dynamic information

Country Status (1)

Country Link
CN (1) CN115100247A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115563889A (en) * 2022-12-06 2023-01-03 三亚海兰寰宇海洋信息科技有限公司 Target object anchoring prediction method, device and equipment
CN115730263A (en) * 2022-11-28 2023-03-03 中国人民解放军91977部队 Ship behavior pattern detection method and device
CN116778437A (en) * 2023-08-15 2023-09-19 中国铁塔股份有限公司 Target ship track monitoring method and device, electronic equipment and readable storage medium

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115730263A (en) * 2022-11-28 2023-03-03 中国人民解放军91977部队 Ship behavior pattern detection method and device
CN115730263B (en) * 2022-11-28 2023-08-22 中国人民解放军91977部队 Ship behavior pattern detection method and device
CN115563889A (en) * 2022-12-06 2023-01-03 三亚海兰寰宇海洋信息科技有限公司 Target object anchoring prediction method, device and equipment
CN116778437A (en) * 2023-08-15 2023-09-19 中国铁塔股份有限公司 Target ship track monitoring method and device, electronic equipment and readable storage medium
CN116778437B (en) * 2023-08-15 2023-10-27 中国铁塔股份有限公司 Target ship track monitoring method and device, electronic equipment and readable storage medium

Similar Documents

Publication Publication Date Title
CN115100247A (en) Method for predicting ship track step by step based on AIS dynamic information
CN112906830B (en) Automatic generation method of ship optimal route based on AIS big data
CN113032502A (en) Ship anomaly detection method based on improved track segment DBSCAN clustering
CN108960421B (en) Improved online forecasting method for speed of unmanned surface vehicle based on BP neural network
CN106779137A (en) A kind of method that ship oil consumption is predicted according to sea situation and operating condition
CN113240199B (en) Port ship track prediction method based on DILATE _ TLSTM
CN115618251B (en) Ship track prediction method and device, electronic equipment and storage medium
Zhang et al. A study on the method for cleaning and repairing the probe vehicle data
CN111695299A (en) Mesoscale vortex trajectory prediction method
CN113190636B (en) Offshore road network construction method and system
CN113283653B (en) Ship track prediction method based on machine learning and AIS data
CN116342657B (en) TCN-GRU ship track prediction method, system, equipment and medium based on coding-decoding structure
CN112785077A (en) Travel demand prediction method and system based on space-time data
CN115660137B (en) Accurate estimation method for wind wave navigation energy consumption of ship
CN110737267A (en) Multi-objective optimization method for unmanned ships and intelligent comprehensive management and control system for unmanned ships
CN114254767A (en) Meteorological hydrological feature prediction method and system based on Stacking ensemble learning
CN115600733A (en) Ship track prediction method and device
CN116608861A (en) Ship track behavior abnormality detection method, system, device and storage medium
CN112800075B (en) Ship manipulation prediction database updating method based on six-degree-of-freedom attitude data of real ship
CN114565176B (en) Long-term ship track prediction method
CN115456258A (en) Method for predicting transportation capacity of competitor ship and computer readable medium
CN113887590B (en) Target typical track and area analysis method
CN111678512B (en) Star sensor and gyroscope combined satellite attitude determination method based on factor graph
CN114595770A (en) Long time sequence prediction method for ship track
Anne et al. Prediction of Ship Track Anomaly based on AIS data using Long Short-Term Memory (LSTM) and DBSCAN

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination