CN112434814A - Method for analyzing shipping economic potential based on multi-source heterogeneous information fusion algorithm - Google Patents

Method for analyzing shipping economic potential based on multi-source heterogeneous information fusion algorithm

Publication number
CN112434814A
Authority
CN
China
Prior art keywords
bayesian network
analysis
channel
evidence
shipping
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011430916.2A
Other languages
Chinese (zh)
Inventor
汪杨骏
张韧
刘科峰
单雨龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National University of Defense Technology
Original Assignee
National University of Defense Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National University of Defense Technology
Priority to CN202011430916.2A
Publication of CN112434814A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N7/00 Computing arrangements based on specific mathematical models
    • G06N7/01 Probabilistic graphical models, e.g. probabilistic networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computational Mathematics (AREA)
  • Algebra (AREA)
  • Probability & Statistics with Applications (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

A method for analyzing shipping economic potential based on a multi-source heterogeneous information fusion algorithm is disclosed. The method comprises the following steps: 1) constructing a Bayesian network for shipping economic potential analysis through literature review and expert consultation; 2) evaluating and quantifying the uncertainty of the collected evidence, and, using the quantification results, fusing and discretizing the multi-source heterogeneous information into new samples; 3) dividing the newly generated samples into a training set and a test set, wherein the training set is fed into the constructed Bayesian network, whose parameters are trained by the maximum likelihood method, and the test set is used to test the accuracy of Bayesian network inference; 4) analyzing the shipping economic potential of different channels with the Bayesian network, and introducing a standardized information flow method to study the influence of each variable on the whole Bayesian network. The uncertainty of each variable is determined by qualitative and quantitative evaluation, and the propagation of uncertainty among variables is carried out through the Bayesian network.

Description

Method for analyzing shipping economic potential based on multi-source heterogeneous information fusion algorithm
Technical field:
The invention relates to the fusion of multi-source heterogeneous information, and in particular to an algorithm that quantifies uncertainty and fuses multi-source heterogeneous information; the algorithm is applied to the analysis of shipping economic potential on the Arctic channel and the traditional channel.
Background art:
A Bayesian network (BN), also called a belief network, consists of a directed acyclic graph (DAG) and conditional probability tables (CPTs).
Generally, there are three different ways to construct a Bayesian network:
(1) The nodes (variables, sometimes also called influencing factors) of the Bayesian network are determined by domain experts, who then also determine the network structure from their knowledge and specify its distribution parameters. A Bayesian network constructed in this way is built entirely under expert guidance; because human knowledge is limited, the resulting network may deviate considerably from the data accumulated in practice.
(2) The nodes of the Bayesian network are determined by domain experts, and the structure and parameters of the Bayesian network are learned from a large amount of training data. This approach is entirely data-driven and highly adaptable, and it has become feasible with the continuing development of artificial intelligence, data mining, and machine learning. How to learn the structure and parameters of a Bayesian network from data has become a hot topic in Bayesian network research.
(3) The nodes of the Bayesian network are determined by domain experts, the structure of the network is specified from expert knowledge, and the parameters of the network are learned from data by machine learning. This approach is a compromise between the first two, and can greatly improve learning efficiency when the relationships between the variables in the domain are clear.
Maritime trade is an important link in international logistics and plays a very important role in the global trade market. However, the existing maritime transport industry faces risks of pirate attacks, political instability, and route congestion. As Arctic sea ice continues to melt, the emergence of Arctic channels offers new possibilities for sea transport. Both the shipping industry and academia are evaluating the feasibility of the Northeast Passage becoming a reliable new route in the foreseeable future, and especially its economic potential.
The existing literature offers several theoretical frameworks for analyzing the economic potential of ocean transportation, including scenario-based derivation and simulation. While existing research has discussed at length the various aspects that affect shipping economics, the uncertainty involved has not been analyzed effectively. In fact, uncertainty arising from differences in evidence quality may bias the economic analysis during calculation and can even lead to the opposite conclusion. To address these problems, the invention constructs a novel information fusion algorithm based on the Bayesian network and applies it to the analysis of shipping economic potential. The aim is to introduce uncertainty analysis into economic benefit analysis, so as to evaluate the potential economic benefits of shipping on the Arctic channel and the traditional channel under incomplete information (large climate model errors, little navigation support experience, and inconsistent expert knowledge). In the algorithm, the uncertainty of each variable is determined by qualitative and quantitative evaluation, and the propagation of uncertainty among variables is carried out through the Bayesian network. In addition, an information flow method is introduced into the model to perform sensitivity analysis on each variable.
References:
Bai, C., Zhang, R., Bao, S., Liang, X.S., Guo, W., 2018. Forecasting the tropical cyclone genesis over the Northwest Pacific through identifying the causal factors in cyclone-climate interactions. J. Atmos. Ocean. Technol. 35, 247–259. https://doi.org/10.1175/JTECH-D-17-0109.1
Kiiski, T., 2017. Feasibility of Commercial Cargo Shipping Along the Northern Sea Route. University of Turku, Turku.
Liang, X.S., 2014. Unraveling the cause-effect relation between time series. Phys. Rev. E - Stat. Nonlinear, Soft Matter Phys. 90. https://doi.org/10.1103/PhysRevE.90.052150
Montewka, J., Goerlandt, F., Kujala, P., 2014. On a systematic perspective on risk for formal safety assessment (FSA). Reliab. Eng. Syst. Saf. 127, 77–85. https://doi.org/10.1016/j.ress.2014.03.009
Summary of the invention:
To solve the above problems, the invention uses a novel multi-source heterogeneous information fusion algorithm to analyze shipping economic potential quantitatively.
The technical scheme adopted by the invention is as follows: a method for analyzing shipping economic potential based on a multi-source heterogeneous information fusion algorithm, comprising the following steps:
Step one: constructing a Bayesian network for shipping economic potential analysis through literature review and expert consultation;
1.1 Screening the variables to be considered in the shipping economic potential analysis of the Arctic channel and the traditional channel, and taking these variables as the nodes of the Bayesian network.
1.2 Analyzing the causality between the variables in the shipping economic potential analysis of the Arctic channel and the traditional channel, so as to construct the edges of the Bayesian network.
Step two: evaluating and quantifying the uncertainty of the collected evidence, and, using the quantification results, fusing and discretizing the multi-source heterogeneous information into new samples;
2.1 Collecting and evaluating the data involved in the shipping economic potential analysis of the Arctic channel and the traditional channel.
2.2 Collecting and evaluating the models involved in the shipping economic potential analysis of the Arctic channel and the traditional channel.
2.3 Fusing the multi-source heterogeneous evidence and generating discretized samples.
Step three: dividing the newly generated samples into a training set and a test set, wherein the training set is fed into the constructed Bayesian network, whose parameters are trained by the maximum likelihood method, and the test set is used to test the accuracy of Bayesian network inference;
3.1 Dividing the new discrete evidence into training samples and test samples, inputting the training samples into the Bayesian network, and training the Bayesian network parameters by the maximum likelihood method.
3.2 Inputting the test samples into the trained Bayesian network, and checking the accuracy of the Bayesian network economic potential analysis.
Step four: analyzing the shipping economic potential of different channels with the Bayesian network, and introducing a standardized information flow method to study the influence of each variable on the whole Bayesian network;
4.1 Performing sensitivity analysis on each variable in the Bayesian network using the absolute value of the correlation coefficient, and identifying the nodes (variables) that strongly influence shipping economic potential.
4.2 Performing sensitivity analysis on each variable in the Bayesian network using the standardized information flow method, identifying the nodes (variables) that strongly influence shipping economic potential, and comparing where the results of the standardized information flow method and the correlation coefficient absolute value method agree and differ.
Specifically, step one: construction of the Bayesian network. In the novel information fusion algorithm provided by the invention, a Bayesian network is used to build an inference model for the economic potential of the Arctic channel and the traditional channel, so that the optimal route can be selected. The Bayesian network combines principles from graph theory and probability theory to describe intuitively the probabilistic relationships among data, information, and model output, and the parameters of the whole network structure or of individual nodes can be adjusted at any time as conditions change. In addition, the Bayesian network can learn causal relationships among variables from a large number of cases and supports sensitivity analysis of the variables.
The Bayesian network is a probabilistic graphical model consisting of nodes and the directed edges that connect them. The variables to be considered in the shipping economic potential analysis of the Arctic channel and the traditional channel are screened to serve as the nodes of the Bayesian network, and causality analysis among these variables is used to construct its edges.
In the probabilistic graphical model, $\Delta = \{G(V, A), P\}$, where $G(V, A)$ is the graph structure of the Bayesian network, $V = \{V_1, \dots, V_n\}$ are the nodes (random variables) of the Bayesian network, and $A$ is the set of edges connecting the nodes (the relationships among the nodes). The conditional probability $P$ reflects the strength of the relationships among the nodes, namely:
$$P(V) = P(V_1, \dots, V_n) = \prod_{i=1}^{n} P(V_i \mid Pa(V_i)) \qquad (1)$$
where $Pa(V_i)$ denotes the parent nodes of the variable $V_i$.
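The factorization above can be illustrated with a minimal Python sketch. The three-node network, its state names, and all probabilities below are hypothetical and chosen only for illustration; they are not taken from the network in FIG. 2.

```python
from itertools import product

# Hypothetical three-node network: fuel_price -> voyage_cost <- distance.
# All names and probabilities are illustrative, not values from the patent.
parents = {
    "fuel_price": (),
    "distance": (),
    "voyage_cost": ("fuel_price", "distance"),
}

# Conditional probability tables: cpt[node][parent_states][state] = probability.
cpt = {
    "fuel_price": {(): {"low": 0.6, "high": 0.4}},
    "distance": {(): {"short": 0.5, "long": 0.5}},
    "voyage_cost": {
        ("low", "short"): {"low": 0.9, "high": 0.1},
        ("low", "long"): {"low": 0.6, "high": 0.4},
        ("high", "short"): {"low": 0.5, "high": 0.5},
        ("high", "long"): {"low": 0.1, "high": 0.9},
    },
}


def joint_probability(assignment):
    """Return P(V_1, ..., V_n) as the product of P(V_i | Pa(V_i))."""
    prob = 1.0
    for node, node_parents in parents.items():
        parent_states = tuple(assignment[p] for p in node_parents)
        prob *= cpt[node][parent_states][assignment[node]]
    return prob


states = {
    "fuel_price": ["low", "high"],
    "distance": ["short", "long"],
    "voyage_cost": ["low", "high"],
}
# Summing the joint probability over every assignment should give 1.
total = sum(
    joint_probability(dict(zip(states, combo)))
    for combo in product(*states.values())
)
```

Storing each table keyed by the tuple of parent states keeps the product in `joint_probability` a direct transcription of the factorization.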
In the multi-source heterogeneous information fusion algorithm, a conditional probability distribution table among variables can be trained from evidences (including models, data, assumptions and the like), and uncertainty existing in information fusion can be transmitted through a Bayesian network.
Step two: evaluation and quantification of evidence
The invention introduces uncertainty into the shipping economic potential analysis framework, because uncertainty in the evidence can bias the analysis and can even produce results that are diametrically opposed to the facts. The evaluation framework proposed by the invention builds on the method of Montewka et al. (2014) for the quantitative evaluation of evidence. That method scores the uncertainty of evidence quantitatively from multiple angles, on a scale from 1 to 5 that represents increasing strength of the evidence in a given aspect. In this method, the quantity, quality, and completeness of the data, together with the empirical and theoretical validity of the models, form the direct indicators of evidence uncertainty; conceptual limitation measures the time and money required to use the data and models in different contexts; and selectable space evaluates the uncertainty of the evidence indirectly by counting the number of evidence types available for a variable. The results of the evaluation of the different pieces of evidence are shown in Table 2.
The new evidence is formed by fusing samples from different sources according to their confidence levels; the calculation formula is:
Figure BDA0002820573330000031
where
Figure BDA0002820573330000032
is the new sample of the i-th variable, $V_j$ is the j-th type of evidence (data, model, or assumption) for the variable
Figure BDA0002820573330000033
and $Index_{jn}$ is the index term used in formula (2). For a given variable, samples are drawn using the confidence levels of the different data sources and models as their respective weights, and a random perturbation is added to each variable according to its confidence level, so that the original data are fused into a new data sample. After the evidence for all variables has been updated, the uncertainty can be propagated through the Bayesian network, written as:
$$P(V_a \mid V_b) = \frac{P(V_b \mid V_a)\,P(V_a)}{P(V_b)} \qquad (3)$$
Based on formula (2), the different pieces of evidence are fused according to the confidence levels obtained from their evaluation to yield new samples, and the fused new samples of each variable are then discretized.
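The fusion and discretization step can be sketched in Python as follows. This is only an illustration under stated assumptions, not the patent's formula (2), which is not reproduced in the text: the sketch assumes that a source is sampled with probability proportional to its confidence score, that the perturbation is Gaussian and shrinks as confidence grows, and that discretization uses equal-width bins. The function names, the `noise_scale` parameter, and the example values are all hypothetical.

```python
import random


def fuse_samples(sources, confidences, n_samples, noise_scale=0.1, seed=0):
    """Draw fused samples for one variable.

    sources: list of lists of observations, one list per evidence source.
    confidences: one confidence weight per source (higher = more trusted).
    Assumption: a source is picked with probability proportional to its
    confidence, and the Gaussian noise shrinks as confidence grows.
    """
    rng = random.Random(seed)
    max_conf = max(confidences)
    fused = []
    for _ in range(n_samples):
        idx = rng.choices(range(len(sources)), weights=confidences, k=1)[0]
        value = rng.choice(sources[idx])
        # Less trusted sources receive a larger relative perturbation.
        sigma = noise_scale * abs(value) * (1.0 - confidences[idx] / (max_conf + 1.0))
        fused.append(value + rng.gauss(0.0, sigma))
    return fused


def discretize(values, n_bins):
    """Map each value to an equal-width bin index in [0, n_bins - 1]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # guard against a constant series
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


# Hypothetical fuel price observations from two sources with scores 4.5 and 2.0.
fuel_price_sources = [[400.0, 420.0, 410.0], [380.0, 450.0]]
fused = fuse_samples(fuel_price_sources, confidences=[4.5, 2.0], n_samples=200)
bins = discretize(fused, n_bins=5)
```

The discrete bin indices in `bins` are the kind of input a discrete Bayesian network expects for each node.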
Step three: bayesian network training and prediction
The invention divides the discretized new samples into training samples and test samples and feeds them into the Bayesian network constructed in step one. The parameters of each node are trained on the training samples by maximum likelihood estimation, and the result is verified on the test samples.
Step four: sensitivity analysis
Sensitivity analysis aims to study the effect of a single variable on the entire bayesian network.
The invention uses the standardized information flow method (Bai et al., 2018) and the absolute value of the correlation coefficient to analyze the sensitivity of the variables.
First, the standardized information flow method is introduced. For any two time series $X_i$ and $X_j$, the information flow transmitted from $X_j$ to $X_i$ can be expressed as:
$$T_{j \to i} = \frac{C_{ii} C_{ij} C_{j,di} - C_{ij}^{2} C_{i,di}}{C_{ii}^{2} C_{jj} - C_{ii} C_{ij}^{2}} \qquad (4)$$
where $C_{ij}$ is the covariance of $X_i$ and $X_j$, and $C_{i,dj}$ is the covariance of $X_i$ and $\dot{X}_j$, where $\dot{X}_j$ is the approximation of $dX_j/dt$ by the Euler forward difference scheme:
$$\dot{X}_{j,n} = \frac{X_{j,n+k} - X_{j,n}}{k \Delta t} \qquad (5)$$
where $\Delta t$ is the time step and $k \geq 1$ is a tunable integer parameter (Liang (2014) discusses the choice of $k$, which is generally 1, and is 2 only for highly chaotic and extremely densely sampled series). $T_{j \to i} = 0$ indicates that $X_j$ does not cause $X_i$ to change; $T_{j \to i} > 0$ indicates that $X_j$ causes $X_i$ to change. Generally, the larger the value, the stronger the causal relationship between $X_j$ and $X_i$.
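A minimal Python sketch of the information flow estimate from Liang (2014) is given below, using sample covariances and the Euler forward difference. The function names and the synthetic series are assumptions made for illustration: the test series are built so that $X_j$ drives $X_i$ but not the reverse, and no data from the patent are used.

```python
import random


def covariance(a, b):
    """Sample covariance of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (n - 1)


def information_flow(x_i, x_j, dt=1.0, k=1):
    """Estimate the information flow from x_j to x_i (Liang, 2014)."""
    # Euler forward difference of X_i with step k.
    dx_i = [(x_i[n + k] - x_i[n]) / (k * dt) for n in range(len(x_i) - k)]
    xi, xj = x_i[:-k], x_j[:-k]  # align the series with the difference
    c_ii, c_jj, c_ij = covariance(xi, xi), covariance(xj, xj), covariance(xi, xj)
    c_i_di, c_j_di = covariance(xi, dx_i), covariance(xj, dx_i)
    numerator = c_ii * c_ij * c_j_di - c_ij ** 2 * c_i_di
    denominator = c_ii ** 2 * c_jj - c_ii * c_ij ** 2
    return numerator / denominator


# Synthetic example: x_j is an autonomous AR(1) process, x_i depends on x_j.
rng = random.Random(42)
x_j, x_i = [0.0], [0.0]
for _ in range(5000):
    x_i.append(0.6 * x_i[-1] + 0.8 * x_j[-1] + rng.gauss(0, 1))
    x_j.append(0.5 * x_j[-1] + rng.gauss(0, 1))

t_j_to_i = information_flow(x_i, x_j)  # expected to be clearly nonzero
t_i_to_j = information_flow(x_j, x_i)  # expected to be close to zero
```

The asymmetry between `t_j_to_i` and `t_i_to_j` is what distinguishes the information flow from a symmetric measure such as correlation.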
To make the causal relationship between two variables more evident, Bai et al. (2018), starting from the differential equations, adopted an improved, standardized information flow scheme, written as:
Figure BDA0002820573330000038
where abs denotes the absolute value function, and the random variable
Figure BDA0002820573330000041
can be obtained from (Liang, 2014).
Figure BDA0002820573330000042
measures the influence on $X_i$ of the information flow transmitted from $X_j$ to $X_i$, relative to the influence of other random processes on $X_i$; its value lies in $[0, 1]$. The larger the value, the more pronounced the influence of $X_j$ on $X_i$; conversely, a small value indicates a weak causal relationship between the two.
Second, the absolute value of the correlation coefficient indicates the strength of the correlation between $X_j$ and $X_i$: the larger the absolute value, the stronger the correlation. This index serves as a control against which the results of the standardized information flow analysis are compared.
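The control method can be sketched in Python as a ranking of variables by the absolute value of the Pearson correlation coefficient with the target. The function names, variable names, and numbers below are hypothetical and serve only to show the ranking step.

```python
import math


def abs_correlation(a, b):
    """Absolute value of the Pearson correlation coefficient."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return abs(cov) / math.sqrt(var_a * var_b)


def rank_by_abs_correlation(variables, target):
    """Sort variable names by |r| with the target, strongest first."""
    scores = {name: abs_correlation(values, target) for name, values in variables.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical samples: cost tracks fuel_price closely and season only loosely.
cost = [10.0, 12.0, 15.0, 19.0, 24.0]
variables = {
    "fuel_price": [1.0, 2.0, 3.0, 4.0, 5.0],
    "season": [3.0, 1.0, 4.0, 1.0, 5.0],
}
ranking = rank_by_abs_correlation(variables, cost)
```

Unlike the information flow, this score is symmetric in its two arguments, so it cannot indicate the direction of influence.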
Beneficial effects: The variables to be considered in the shipping economic potential analysis of the Arctic channel and the traditional channel are screened to serve as the nodes of the Bayesian network, and causality analysis among these variables is used to construct its edges. Unlike traditional economic potential analysis methods, the invention evaluates the quality of the evidence, quantifies its uncertainty, and finally fuses the multi-source heterogeneous evidence. Furthermore, uncertainty in the evidence can be propagated between nodes through the Bayesian network. The evaluation framework proposed by the invention builds on the method of Montewka et al. (2014) for the quantitative evaluation of evidence.
Description of the drawings:
FIG. 1 is the workflow of the novel multi-source heterogeneous fusion algorithm in shipping economic potential analysis (cylinders denote relevant evidence, rectangles denote key techniques, and ellipses denote intermediate or final results);
FIG. 2 is the Bayesian network for shipping economic potential analysis of the Arctic channel and the traditional channel; in FIG. 2, V1 is ship tonnage, V2 is fuel price, V3 is load, and so on;
FIG. 3 is the new sample distribution of each variable; the new sample distributions of the 24 variables given in FIG. 2 are shown in 24 graphs in total;
FIG. 4 is a graph of the impact (standard information flow) of variables (V1-V22) on economic cost for arctic channel (V23) (top graph) and economic cost for traditional channel (V24) (bottom graph);
FIG. 5 shows the influence (absolute values of the correlation coefficients) of the variables (V1-V22) on the economic cost of the Arctic route (V23) (upper graph) and the economic cost of the conventional route (V24) (lower graph).
Detailed description of the embodiments:
To make the objects, technical solutions, and advantages of the invention clearer, the invention is described in further detail below with reference to 4 tables, 5 drawings, and a calculation example.
FIG. 1 is the workflow of the novel multi-source heterogeneous fusion algorithm in shipping economic potential analysis; it gives a general description of the core principle of the multi-source heterogeneous information fusion algorithm and of its application in shipping economic potential analysis.
Step one: Based on literature review and expert consultation, the variables and models that may affect the economic potential of the Arctic channel and the traditional channel are sorted out, and the Bayesian network for analyzing the shipping economic potential of the Arctic channel and the traditional channel is constructed accordingly (see FIG. 2).
Step two: The quantitative evidence evaluation framework of the invention is shown in Table 1. The quantitative evaluation of evidence covers: data quantity, quality, and completeness; empirical validity and theoretical validity of the models; conceptual limitations of the models and data; and selectable space.
The framework is used to evaluate quantitatively the variables (nodes) and models (edges) involved in the Bayesian network; the evaluation results are shown in Table 2. The data evaluated and quantified include: sea ice data, environmental data, ship data, unit fuel price, conventional ship cost, ice-class ship cost 1, ice-class ship cost 2, capital cost of the traditional channel, capital cost of the Arctic channel, operating cost of the traditional channel, operating cost premium, and voyage cost. The models include a navigability model, fuel consumption models 1, 2, and 3, and speed models 1 and 2.
The specific evaluation opinions are as follows. The sea ice ensemble forecast model behind the sea ice data fits the historical data well and shows good predictive skill on the test set, but its simulation of sea ice extremes is mediocre. The geographic data are derived from high-precision measurements and therefore score excellently on every aspect of the evaluation. The ship data and fuel price data both come from real-time online data and are highly reliable. The conventional shipbuilding price and ice-class ship cost data 1 are derived from Clarkson world fleet data: the quantity is sufficient, the quality is reliable, the data are complete, the conceptual limitation is small under a unified US dollar settlement framework, and various ship types are covered (size, type, ice class, and so on), so the selectable space is wide. The ice-class ship cost data 2 come from different literature sources: the quantity is limited, the quality depends on the experts' understanding of the problem, the data are incomplete, the experts' backgrounds differ, and assessing the price premium is time-consuming and laborious; the selectable space is large. The capital costs of the traditional channel and the Arctic channel are mainly determined by ship cost, interest period (years), and loan proportion. Interest rate, interest period, and loan proportion data for different regions can be downloaded freely from the internet; the data volume is sufficient, the quality is reliable, the data are complete, and the selectable space is ample. In addition, interest is settled uniformly in US dollars, so the conceptual limitation is small. The operating costs of the traditional channel are also derived from Clarkson world fleet data.
The premiums on the different operating costs come from the experts' knowledge of the problem and their estimates of the future; because the experts' backgrounds differ, assessing the premium is time-consuming and laborious, and the selectable space is large. The voyage data come from authoritative data published on official websites and are highly credible.
Fuel consumption is obtained mainly from three models. The first is fitted to Clarkson world fleet data; after validity testing, the statistical model agrees well with measurements. The second fits fuel consumption from the statistical relationship between speed and fuel consumption and has been adopted by many scholars, but Kiiski (2017) points out that this relationship underestimates fuel consumption and can serve only as a lower bound. The third is another statistical fitting relationship between ship speed and fuel consumption, which has also passed validity testing but is less widely adopted by other scholars. Speed model 1 is a dynamic relationship established by studying the physical interaction between sea ice and the ship, so it has solid theoretical support; speed model 2 is a statistical relationship obtained from ship navigation data, but its results are coarse.
The method is different from the traditional economic potential analysis method in that the quality of the evidence is evaluated, the uncertainty of the evidence is quantified, and finally the multi-source heterogeneous evidence is fused. Furthermore, uncertainty in evidence may be transmitted between nodes over a bayesian network.
Step three: The evaluated and quantified evidence is fused based on formula (2), and the new samples generated by fusion are discretized as shown in Table 3. The new samples cover ship size, fuel price, load, specific fuel consumption, interest rate, term (years), loan proportion, Arctic speed, conventional fuel consumption, Arctic fuel consumption, extra fuel consumption, conventional capital cost, Arctic capital premium, conventional operating cost, Arctic operating premium, conventional voyage cost, Arctic voyage cost, season, Arctic channel distance, conventional channel distance, unit-distance Arctic channel cost, unit-distance conventional channel cost, Arctic channel cost, and conventional channel cost.
The distribution of the new samples is shown in FIG. 3. The figure shows that, in the generated new samples, variables 19-20 follow a uniform distribution; variables 10-11, 17, 21, and 23 follow a power-law distribution; variables 14, 16, 22, and 24 follow a normal distribution; and the samples of variables 2-7 are concentrated in several regions. The diversity of the sample distributions reflects the complexity of economic potential evaluation. The samples are divided into a training set and a test set; the training set is fed into the Bayesian network, and the parameters of the Bayesian network are trained by maximum likelihood estimation, that is, the known sample results are used to infer the parameter values most likely to have produced them. The likelihood function can be written as:
Figure BDA0002820573330000051
then, the maximum likelihood estimation method solves:
Figure BDA0002820573330000052
the value of theta corresponding to the function when the function obtains the maximum value is the model parameter of the Bayesian network constructed by the invention.
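For a discrete Bayesian network with complete data, the maximum likelihood estimate of each conditional probability table entry is the relative frequency of the child state among the samples that share the same parent states. The Python sketch below illustrates this for one node; the function name `fit_cpt`, the node names, and the samples are hypothetical.

```python
from collections import Counter, defaultdict


def fit_cpt(samples, node, node_parents):
    """Maximum likelihood CPT for one node: normalized co-occurrence counts."""
    counts = defaultdict(Counter)
    for sample in samples:
        parent_states = tuple(sample[p] for p in node_parents)
        counts[parent_states][sample[node]] += 1
    return {
        parent_states: {
            state: c / sum(state_counts.values())
            for state, c in state_counts.items()
        }
        for parent_states, state_counts in counts.items()
    }


# Hypothetical discretized training samples for a two-node fragment.
training_samples = [
    {"fuel_price": "high", "voyage_cost": "high"},
    {"fuel_price": "high", "voyage_cost": "high"},
    {"fuel_price": "high", "voyage_cost": "low"},
    {"fuel_price": "low", "voyage_cost": "low"},
]
cpt_voyage_cost = fit_cpt(training_samples, "voyage_cost", ("fuel_price",))
```

Parent-state combinations that never occur in the training set receive no entry in this sketch; a practical implementation would need a smoothing rule or prior for them.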
And substituting the test set into the trained Bayesian network, so that the probability of the test data in the trained Bayesian network is as follows:
Figure BDA0002820573330000053
The test performance of the trained Bayesian network is shown in Table 4. The results show that the Bayesian network inference is 92.54% accurate when the optimal channel of the test sample is the Arctic channel, and 94.11% accurate when the optimal channel of the test sample is the conventional channel.
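Accuracies reported separately for each true class, as above, can be computed as in the following Python sketch. The function name and the labels are hypothetical; the figures in Table 4 are not reproduced here.

```python
def per_class_accuracy(true_labels, predicted_labels):
    """Fraction of correctly inferred samples for each true class."""
    totals, correct = {}, {}
    for truth, guess in zip(true_labels, predicted_labels):
        totals[truth] = totals.get(truth, 0) + 1
        correct[truth] = correct.get(truth, 0) + (truth == guess)
    return {label: correct[label] / totals[label] for label in totals}


# Hypothetical test-set labels: the optimal channel for each test sample.
truth = ["arctic", "arctic", "arctic", "arctic", "conventional", "conventional"]
inferred = ["arctic", "arctic", "arctic", "conventional", "conventional", "conventional"]
accuracy = per_class_accuracy(truth, inferred)
```

Reporting accuracy per true class avoids a single overall figure hiding poor performance on the less frequent channel.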
Step four: FIG. 4 shows the influence of the variables on the economic cost of the Arctic channel and on the economic cost of the conventional channel (standardized information flow), and FIG. 5 shows the same influence measured by the absolute value of the correlation coefficient. Combining the two methods, the main factors affecting the economic cost of the Arctic channel are the unit-distance Arctic channel cost (V21), conventional capital cost (V12), conventional fuel consumption (V9), ship size (V1), unit-distance conventional channel cost (V22), Arctic voyage cost (V17), and Arctic speed (V8). The main factors affecting the economic cost of the conventional channel are conventional fuel consumption (V9), ship size (V1), conventional capital cost (V12), conventional operating cost (V14), unit-distance Arctic channel cost (V21), conventional voyage cost (V16), Arctic voyage cost (V17), and Arctic fuel consumption (V10). In summary, the factors affecting the economic cost of the Arctic channel include the unit-distance conventional channel cost, conventional capital cost, and conventional fuel consumption, while the factors affecting the economic cost of the conventional channel likewise include the unit-distance Arctic channel cost, Arctic voyage cost, and Arctic fuel consumption. This means that the channels are not independent of each other, and the shipping economics of the conventional channel are also affected by the rise of the Arctic channel.
TABLE 1 framework for quantitative evaluation of evidence
Figure BDA0002820573330000061
TABLE 2 evaluation and quantitative analysis of the data and models used
Figure BDA0002820573330000062
Figure BDA0002820573330000071
TABLE 3 New sample discretization of variables
Figure BDA0002820573330000072
TABLE 4 Bayesian network parameter training results
Figure BDA0002820573330000073

Claims (5)

1. A method for analyzing shipping economic potential based on a multi-source heterogeneous information fusion algorithm, characterized by comprising the following steps:
step one: constructing a Bayesian network for shipping economic potential analysis through literature review and expert consultation;
1.1 screening variables to be considered in the economic potential analysis of shipping of the arctic channel and the traditional channel, and taking the variables as nodes of the Bayesian network;
1.2 causality analysis between variables in the economic potential analysis of shipping of the arctic channel and the traditional channel is carried out, so that the edge of the Bayesian network is constructed;
step two: evaluating and quantifying uncertainty of the collected evidence, and combining a quantification result to fuse and disperse the multivariate heterogeneous information into a new sample;
2.1 data collection and evaluation involved in the economic potential analysis of shipping in arctic channel and traditional channel;
2.2 model collection and evaluation involved in the economic potential analysis of shipping of the arctic channel and the traditional channel;
2.3, fusion and discretization sample generation of multi-source heterogeneous evidence;
step three: dividing newly generated samples into a training set and a test set, wherein the training set is brought into the constructed Bayesian network, the parameters of the Bayesian network are trained by adopting a maximum likelihood method, and the test set is used for testing the accuracy of Bayesian network reasoning;
3.1 dividing the new discrete evidence into a training sample and a test sample, inputting the training sample into the Bayesian network, and training the Bayesian network parameters by adopting a maximum likelihood method;
3.2 inputting the test sample into a trained Bayesian network, and checking the accuracy of the Bayesian network economic potential analysis;
step four: analyzing shipping economic potentials of different channels according to the Bayesian network, and introducing a standard information flow method to study the sensitivity of each variable to the whole Bayesian network;
4.1, carrying out sensitivity analysis on each variable in the Bayesian network by adopting the absolute value of the correlation coefficient, and excavating nodes (variables) with large influence on shipping economic potential;
and 4.2, carrying out sensitivity analysis on each variable in the Bayesian network by adopting a standard information flow method, excavating nodes (variables) which have large influence on shipping economic potential, and comparing the consistency and difference of results obtained by the standard information flow method and a correlation coefficient absolute value method.
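Sub-step 4.1 can be illustrated with a minimal Python sketch. It assumes the samples are available as equal-length numeric sequences keyed by variable name and that the Pearson coefficient is meant; the function name `rank_by_abs_correlation` and the toy values are hypothetical and are not taken from the patent.

```python
import numpy as np

def rank_by_abs_correlation(data, target):
    """Rank every variable by |Pearson r| with the target variable.

    data   : dict mapping a variable name to a 1-D numeric sequence
    target : name of the variable whose sensitivity is studied
    Returns a list of (name, |r|) pairs, largest first.
    """
    y = np.asarray(data[target], dtype=float)
    scores = {}
    for name, values in data.items():
        if name == target:
            continue
        x = np.asarray(values, dtype=float)
        # np.corrcoef returns the 2x2 correlation matrix of x and y
        scores[name] = abs(np.corrcoef(x, y)[0, 1])
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Illustrative toy data only (hypothetical variable names and values)
toy = {
    "V_a": [1, 2, 3, 4, 5],
    "V_b": [1, -1, 1, -1, 1],
    "cost": [2, 4, 6, 8, 10],
}
ranking = rank_by_abs_correlation(toy, "cost")
```

The absolute value discards the sign of the relationship, so the ranking reflects only the strength of the linear association, not its direction.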
2. The method for analyzing shipping economic potential based on a multi-source heterogeneous information fusion algorithm according to claim 1, characterized in that: the quantitative evaluation of the evidence in step two covers: data quality, data quantity and integrity, empirical validity of the model, theoretical validity of the model, conceptual limitations of the model and data, and selectable space;
the quantitative evaluation of the evidence is applied to the variables (nodes) and models (edges) involved in the Bayesian network; the data evaluated and quantitatively analyzed comprise: sea ice data, environmental data, ship data, fuel unit price, conventional ship cost, ice ship cost, second ice ship cost, traditional channel capital cost, Arctic channel capital cost, traditional channel operation cost, operation cost premium, and voyage number cost; the models comprise a navigability model, a fuel consumption model, a second fuel consumption model, a third fuel consumption model, a navigational speed model and a second navigational speed model.
3. The method for analyzing shipping economic potential based on a multi-source heterogeneous information fusion algorithm according to claim 1, characterized in that: step two specifically comprises the following substeps:
(3.1) collecting and evaluating the data involved in the analysis of the shipping economic potential of the Arctic channel and the traditional channel;
(3.2) collecting and evaluating the models involved in the analysis of the shipping economic potential of the Arctic channel and the traditional channel;
and (3.3) fusing the multi-source heterogeneous evidence and generating discretized samples.
4. The method for analyzing shipping economic potential based on a multi-source heterogeneous information fusion algorithm according to claim 1, characterized in that: step three specifically comprises the following substeps:
(4.1) dividing the new discretized evidence into training samples and test samples, inputting the training samples into the Bayesian network, and training the parameters of the Bayesian network by the maximum likelihood method;
and (4.2) inputting the test samples into the trained Bayesian network, and checking the accuracy of the Bayesian network economic potential analysis.
5. The method for analyzing shipping economic potential based on a multi-source heterogeneous information fusion algorithm according to claim 1, characterized in that:
in step one, construction of the Bayesian network: a Bayesian network is used to establish an inference model for the economic potential of the Arctic channel and the traditional channel, so as to select the optimal route; the Bayesian network combines principles of graph theory and probability theory, describes the probabilistic relationships among data, information and model output intuitively, and allows the structure of the whole network or the parameters of its nodes to be adjusted at any time as conditions change; in addition, the Bayesian network can learn causal relationships among variables from a large number of cases and supports sensitivity analysis of the variables;
the Bayesian network is a probabilistic graphical model consisting of nodes and directed edges connecting the nodes; the variables to be considered in the analysis of the shipping economic potential of the Arctic channel and the traditional channel are screened and used as the nodes of the Bayesian network; the causality between these variables is analyzed, thereby constructing the edges of the Bayesian network;
in the probabilistic graphical model, $\Delta = \{G(V, A), P\}$, where $G(V, A)$ is the graph structure of the Bayesian network, $V = \{V_1, \dots, V_n\}$ is the set of nodes (random variables) of the Bayesian network, and $A$ is the set of edges connecting the nodes, representing the relationships between them; the strength of these relationships is expressed by the conditional probability $P$, namely:
Figure FDA0002820573320000021
wherein $Pa(V_i)$ denotes the parent nodes of the variable $V_i$;
in the multi-source heterogeneous information fusion algorithm, the conditional probability tables among the variables are obtained by training on models, data and assumptions, and the uncertainty present in the information fusion is propagated through the Bayesian network;
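As a hedged illustration of how conditional probabilities attached to the nodes define the whole model, the Python sketch below evaluates the standard Bayesian network factorization, in which the joint probability is the product of $P(V_i \mid Pa(V_i))$ over all nodes. That this is the formula referred to in step one is an assumption, since the formula image is not reproduced here. The three-node chain, its state names and all probability values are hypothetical.

```python
# Hypothetical three-node chain: ice -> speed -> cost.
# Each node lists its parents; each conditional probability table (CPT)
# maps a tuple of parent states to a distribution over the node's states.
parents = {"ice": (), "speed": ("ice",), "cost": ("speed",)}

cpt = {
    "ice": {(): {"low": 0.7, "high": 0.3}},
    "speed": {
        ("low",): {"fast": 0.8, "slow": 0.2},
        ("high",): {"fast": 0.3, "slow": 0.7},
    },
    "cost": {
        ("fast",): {"cheap": 0.9, "dear": 0.1},
        ("slow",): {"cheap": 0.4, "dear": 0.6},
    },
}

def joint_probability(assignment):
    """Multiply P(node | parents) over all nodes for one full assignment."""
    p = 1.0
    for node, node_parents in parents.items():
        parent_states = tuple(assignment[q] for q in node_parents)
        p *= cpt[node][parent_states][assignment[node]]
    return p

# Probability of one complete configuration of the toy network
p_example = joint_probability({"ice": "low", "speed": "fast", "cost": "cheap"})
```

Storing one table per node keeps the number of parameters proportional to the size of each node's parent set rather than to the size of the full joint distribution.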
step two: evaluation and quantification of evidence
the evaluation framework, adapted from the Montewka method, is used to evaluate the evidence quantitatively; the uncertainty of the evidence is graded from several angles, with integer grades from 1 to 5 indicating increasing strength of the evidence in a given respect; in this method, the quantity, quality and integrity of the data and the empirical and theoretical validity of the model are direct indicators of the uncertainty of the evidence, the conceptual limitation measures the time and money required by the adopted data and model in different contexts, and the selectable space evaluates the uncertainty of the evidence indirectly by counting the number of evidence types available for each variable; the evaluation results for the different pieces of evidence are shown in Table 2;
TABLE 2 evaluation and quantitative analysis of the data and models used
Figure FDA0002820573320000022
Figure FDA0002820573320000031
the fused evidence is formed by combining samples from different sources weighted by their confidence levels, and the calculation formula is as follows:
Figure FDA0002820573320000032
wherein the quantity shown in Figure FDA0002820573320000033 is the new sample of the i-th variable, $V_j$ is the j-th class of evidence (data, model or assumption) for the variable shown in Figure FDA0002820573320000039, and $Index_{jn}$ is the n-th evaluation index of the j-th evidence; the same variable is sampled using the confidence degrees of the different samples and models as their respective weights, and a random perturbation is added to each variable according to its confidence degree, so that the original data are fused into a new data sample; after the evidence of all variables has been updated, the uncertainty can be propagated through the Bayesian network, written as:
$P(V_a \mid V_b) = P(V_b \mid V_a)\,P(V_a)/P(V_b)$ (3)
the confidence degrees obtained by evaluating the different evidence and the corresponding models are fused based on formula (2) to obtain new samples, and the fused new samples of all variables are discretized;
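The fusion described in step two can be sketched as follows. Since formula (2) is not reproduced in this text, the sketch is only one plausible reading of the prose: the confidence of each evidence source is assumed to be the mean of its 1 to 5 grades divided by 5, sources are drawn with probability proportional to confidence, and the Gaussian perturbation scale is assumed to shrink as confidence grows. The function names `fuse_evidence` and `discretize`, the quantile binning and the three-bin default are all hypothetical choices.

```python
import numpy as np

def fuse_evidence(sources, scores, n_new, rng):
    """Draw n_new fused samples of one variable from several evidence sources.

    sources : list of 1-D arrays, one per evidence source (data or model output)
    scores  : list of lists of integer grades (1..5), one list per source
    rng     : numpy random Generator
    """
    # Assumed confidence measure: mean grade scaled to (0, 1]
    conf = np.array([np.mean(s) / 5.0 for s in scores])
    weights = conf / conf.sum()
    picks = rng.choice(len(sources), size=n_new, p=weights)
    fused = np.empty(n_new)
    for t, j in enumerate(picks):
        base = rng.choice(sources[j])
        # Assumed perturbation: less confident sources get more noise
        noise_scale = (1.0 - conf[j]) * np.std(sources[j])
        fused[t] = base + rng.normal(0.0, noise_scale)
    return fused

def discretize(values, n_bins=3):
    """Map continuous samples to integer labels 0..n_bins-1 by quantile."""
    inner_edges = np.quantile(values, np.linspace(0, 1, n_bins + 1)[1:-1])
    return np.digitize(values, inner_edges)
```

With this reading, a source graded 5 on every criterion contributes its samples unperturbed, while weaker sources are both sampled less often and blurred more.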
step three: Bayesian network training and prediction
the new discretized samples are divided into training samples and test samples and fed into the Bayesian network constructed in step one; the parameters of each node are trained on the training samples using maximum likelihood estimation, and the results are verified on the test samples;
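For discrete nodes, maximum likelihood estimation of a conditional probability table reduces to relative frequencies of the observed states for each parent configuration. The Python sketch below shows this for a single node, together with a simple accuracy check on held-out samples. It assumes samples are dictionaries mapping variable names to discrete states; the helper names `fit_cpt`, `split` and `accuracy` and the toy variables are hypothetical.

```python
from collections import Counter, defaultdict

def split(samples, train_fraction=0.8):
    """Split a list of samples into a training part and a test part."""
    cut = int(len(samples) * train_fraction)
    return samples[:cut], samples[cut:]

def fit_cpt(samples, node, parent_names):
    """Maximum likelihood CPT: count states per parent configuration."""
    counts = defaultdict(Counter)
    for row in samples:
        key = tuple(row[p] for p in parent_names)
        counts[key][row[node]] += 1
    return {
        key: {state: n / sum(c.values()) for state, n in c.items()}
        for key, c in counts.items()
    }

def accuracy(cpt, test, node, parent_names):
    """Share of test samples whose state equals the most probable state."""
    hits = 0
    for row in test:
        dist = cpt.get(tuple(row[p] for p in parent_names))
        # Parent configurations never seen in training count as misses
        if dist and max(dist, key=dist.get) == row[node]:
            hits += 1
    return hits / len(test)
```

Pure frequency counts assign zero probability to states that never occur in the training set, so a small amount of smoothing is a common refinement when samples are scarce.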
step four: sensitivity analysis
sensitivity analysis studies the effect of a single variable on the entire Bayesian network;
the sensitivity to each variable is analyzed using the absolute value of the correlation coefficient and the standardized information flow method;
the standardized information flow method is as follows: for any two time series $X_i$ and $X_j$, the information flow transmitted from $X_j$ to $X_i$ is expressed as:
Figure FDA0002820573320000034
wherein $C_{ij}$ is the covariance of $X_i$ and $X_j$, and $C_{i,dj}$ is the covariance of $X_i$ and the quantity shown in Figure FDA00028205733200000310, which is the approximation of $dX_j/dt$ obtained with the Euler forward difference scheme, written as:
Figure FDA0002820573320000035
where $\Delta t$ is the time step and $k = 1, 2, \dots, n$ is an adjustable parameter (Liang, 2014, discusses the value of $k$, which is generally 1, and 2 only for highly chaotic and extremely densely sampled series); in general, $T_{j \to i} = 0$ means that $X_j$ does not cause a change in $X_i$, while $T_{j \to i} > 0$ means that $X_j$ causes a change in $X_i$; generally, the larger the value, the stronger the causal relationship between $X_j$ and $X_i$;
an improved information flow scheme is adopted, written as:
Figure FDA0002820573320000036
where abs denotes the absolute value function and the quantity shown in Figure FDA0002820573320000037 is a random variable; the quantity shown in Figure FDA0002820573320000038 measures the information flow transmitted from $X_j$ to $X_i$ relative to the influence of other random processes on $X_i$; its value range is $[0, 1]$; the larger the value, the more pronounced the influence of $X_j$ on $X_i$; otherwise, the causal relationship between the two is weak.
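The standardized information flow of step four builds on the pairwise information flow $T_{j \to i}$. As a hedged illustration, the Python sketch below implements the bivariate maximum-likelihood estimator from Liang (2014), the reference named in the claim, with the Euler forward difference for the time derivative. Because the patent's formula images are not reproduced in this text, the correspondence with the claim's formulas and index convention is an assumption, the function name `information_flow` is hypothetical, and the improved (normalized) variant is not implemented.

```python
import numpy as np

def information_flow(x_i, x_j, k=1, dt=1.0):
    """Estimate T_{j->i}, the information flow from series x_j to series x_i.

    Assumed estimator (Liang, 2014):
        T_{j->i} = (C_ii*C_ij*C_{j,di} - C_ij**2 * C_{i,di})
                   / (C_ii**2 * C_jj - C_ii * C_ij**2)
    where C_{.,di} is the covariance with the forward difference of x_i.
    """
    x_i = np.asarray(x_i, dtype=float)
    x_j = np.asarray(x_j, dtype=float)
    # Euler forward difference approximation of dX_i/dt with step k
    dx_i = (x_i[k:] - x_i[:-k]) / (k * dt)
    xi, xj = x_i[:-k], x_j[:-k]
    # 3x3 sample covariance matrix of (X_i, X_j, dX_i/dt)
    c = np.cov(np.vstack([xi, xj, dx_i]))
    c_ii, c_ij, c_jj = c[0, 0], c[0, 1], c[1, 1]
    c_i_di, c_j_di = c[0, 2], c[1, 2]
    numerator = c_ii * c_ij * c_j_di - c_ij ** 2 * c_i_di
    denominator = c_ii ** 2 * c_jj - c_ii * c_ij ** 2
    return numerator / denominator
```

The estimator is asymmetric by construction: `information_flow(a, b)` and `information_flow(b, a)` generally differ, which is what allows it to indicate a direction of influence where a correlation coefficient cannot.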
CN202011430916.2A 2020-12-07 2020-12-07 Method for analyzing shipping economic potential based on multi-source heterogeneous information fusion algorithm Pending CN112434814A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011430916.2A CN112434814A (en) 2020-12-07 2020-12-07 Method for analyzing shipping economic potential based on multi-source heterogeneous information fusion algorithm

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011430916.2A CN112434814A (en) 2020-12-07 2020-12-07 Method for analyzing shipping economic potential based on multi-source heterogeneous information fusion algorithm

Publications (1)

Publication Number Publication Date
CN112434814A true CN112434814A (en) 2021-03-02

Family

ID=74691669

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011430916.2A Pending CN112434814A (en) 2020-12-07 2020-12-07 Method for analyzing shipping economic potential based on multi-source heterogeneous information fusion algorithm

Country Status (1)

Country Link
CN (1) CN112434814A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113010572A (en) * 2021-03-18 2021-06-22 杭州码全信息科技有限公司 Public digital life scene rule model prediction early warning method based on deep Bayesian network
CN114114358A (en) * 2021-11-24 2022-03-01 中国人民解放军国防科技大学 Arctic sea ice thickness spatial resolution improving method based on multi-source satellite data fusion
CN114114358B (en) * 2021-11-24 2024-05-28 中国人民解放军国防科技大学 North sea ice thickness spatial resolution improvement method based on multi-source satellite data fusion
CN116304991A (en) * 2023-05-16 2023-06-23 广东省科学院广州地理研究所 Multi-source heterogeneous species distribution data fusion method and device
CN116304991B (en) * 2023-05-16 2023-08-08 广东省科学院广州地理研究所 Multi-source heterogeneous species distribution data fusion method and device


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination