CN118014605A - Port operation carbon emission prediction method based on WOA-BPNN model - Google Patents
- Publication number
- CN118014605A (application CN202311532311.8A)
- Authority
- CN
- China
- Prior art keywords
- woa
- data
- bpnn
- model
- carbon emission
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0201—Market modelling; Market analysis; Collecting market data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/004—Artificial life, i.e. computing arrangements simulating life
- G06N3/006—Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/08—Logistics, e.g. warehousing, loading or distribution; Inventory or stock management
- G06Q10/083—Shipping
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/018—Certifying business or products
- G06Q30/0185—Product, service or business identity fraud
Abstract
The invention discloses a port operation carbon emission prediction method based on a WOA-BPNN model. After the sample data are normalized, the numbers of input-layer, hidden-layer and output-layer nodes of the BPNN and the thresholds and weights of the nodes in each layer are determined. The swarm intelligence optimization algorithm WOA is then applied, with the population position as the optimization parameter, and iterated to obtain the optimal population position. The optimal population position is used as the thresholds and weights of the BPNN nodes, and the BPNN model is trained. The optimal learning rate and training batch parameters are determined by parameter tuning to obtain the trained model with the minimum prediction error, and the trained model is used to predict the carbon emission of port operation. Because WOA optimizes the initial structural parameters of the BPNN, the initial parameters of the model are optimal values under the set criterion rather than random values, which improves the accuracy of the predicted port operation carbon emission and strengthens the generalization capability of the prediction model.
Description
Technical Field
The invention relates to the field of prediction of carbon emission of water transport ports, in particular to carbon emission prediction of water transport infrastructures such as ports and wharfs in a full life cycle operation stage.
Background Art
Accurately predicting the carbon emissions generated during the operation stage of the full life cycle of infrastructure such as ports and wharves is a necessary capability for energy conservation and carbon reduction in the water transport sector. The basic approach takes the carbon emission of the port or wharf as the prediction target and, based on characteristic data of the port or wharf and historical statistics of its operational carbon emissions, performs nonlinear fitting with methods such as mathematical statistics, artificial intelligence or hybrid models to obtain an accurate predicted value of carbon emission.
The existing carbon emission prediction methods in the traffic field are mainly divided into two types:
1. Prediction methods based on mathematical statistics: these collect the statistical correlations between influencing factors (such as energy consumption and macroeconomic indicators) and carbon emissions, and build data-driven prediction models on mathematical statistics, probability theory and stochastic process theory. Common examples are time series models, grey models and the autoregressive integrated moving average (ARIMA) model. Such methods use only macroscopic and historical data, do not consider the mechanism by which carbon emissions are produced, and are typically used to predict macroscopic trends in carbon emissions.
2. Prediction methods based on artificial intelligence: based on a large amount of characteristic data of port and wharf infrastructure and a large amount of carbon-emission-related statistical data, these achieve accurate prediction of carbon emission through an artificial intelligence nonlinear mapping model. Common artificial intelligence methods for carbon emission prediction in engineering include regression algorithms, decision tree algorithms, Bayesian networks, support vector machines, the KNN algorithm, BP neural networks, ant colony algorithms and random forests. These methods are built on typical machine learning models, can perform data mining and feature extraction on large volumes of input data, and can take full account of the complexity of the factors that influence carbon emission.
Table 1. Advantages and limitations of conventional machine learning algorithms
However, the above prior art has at least the following problems. Prediction methods based on mathematical statistics cannot analyze the mechanism by which carbon emissions are generated, and so cannot produce practical guidance for energy conservation and emission reduction in water transport infrastructure such as ports and wharves. Conventional machine learning algorithms, such as regression algorithms, decision tree algorithms, support vector machines, the KNN algorithm and ant colony algorithms, are prone to underfitting, overfitting and sensitivity to noise, and their prediction accuracy and generalization capability are often inferior to those of neural network models, as shown in Table 1.
A prediction algorithm based on the back-propagation neural network (BPNN) model constructs a multi-layer network that maps inputs such as port characteristic data and historical operation data nonlinearly to the predicted carbon emission output, and trains the weights and thresholds of the network by error back propagation and gradient descent so as to minimize the output error. The algorithm achieves high accuracy, but the network structure parameters and the weights and thresholds of the neurons in each layer are initialized randomly, so the prediction easily falls into a local optimum rather than the global optimum. In addition, the generalization capability of the prediction model is poor and the training time is long.
Disclosure of Invention
To solve the above technical problems, the invention provides a BPNN carbon emission prediction model (WOA-BPNN) based on a swarm intelligence algorithm, the Whale Optimization Algorithm (WOA). It improves the ability of the port and wharf carbon emission prediction model to search for the global optimum, and thereby improves prediction accuracy and generalization capability.
The complete technical scheme of the invention comprises the following steps:
A port operation carbon emission prediction method based on a WOA-BPNN model uses WOA to optimize the initial structural parameters of the BPNN, so that the initial parameters of the model are optimal values under the set criterion rather than random values, thereby improving the accuracy of the predicted port operation carbon emission and strengthening the generalization capability of the prediction model. The method specifically comprises the following steps:
(1) Standardizing the sample data and dividing them into a training set and a test set; the data include at least: port operation basic data, port operation and maintenance activity detailed data, and the monthly carbon emission of each operation and maintenance unit project;
(2) Determining the structural parameters of the BPNN: setting the number m of input-layer nodes, the number o of hidden-layer nodes, the number n of output-layer nodes, and the thresholds and weights of the nodes in each layer of the neural network, according to the port characteristic data, the historical operation data and the operation unit project carbon emission data to be predicted;
(3) Applying the swarm intelligence optimization algorithm WOA: setting the initial values of the whale population parameters, including the population size, population position, number of iterations, upper and lower bounds of the search space and dynamic weights; randomly generating the whale population; and iterating with the population position as the optimization parameter to obtain the optimal population position;
(4) Taking the optimal population position as the thresholds and weights of the BPNN nodes and training the BPNN model, where the weights and thresholds of the nodes are updated iteratively by gradient descent with error back propagation during training;
(5) After the iteration converges, obtaining the optimal BPNN weights and thresholds of the nodes in each layer;
(6) Determining the optimal learning rate, training batch and other parameters by parameter tuning to obtain the trained model with the minimum prediction error, which completes training;
(7) Predicting with the trained model to obtain the predicted value of the carbon emission of port operation.
Further, in step (1), the port operation basic data include: port container throughput, passenger throughput, bulk cargo throughput, ore throughput, petroleum and gas throughput, etc.; the quantity of liquid bulk cargo, the quantity of dry bulk cargo, the number of containers, the weight of roll-on/roll-off vehicles, and the number of roll-on/roll-off vehicles; port berth length and number of port berths;
The port operation and maintenance activity basic data include: the operation year and month, the average number of workers in the month, the total electricity consumed in the month, the total tap water consumed in the month, the total fossil energy consumed in the month, and the total warm water consumed in the month;
The port operation and maintenance activity detailed data include: the type and amount of energy consumed in operating each machine and each piece of infrastructure in the port, and their working time; the consumption of the various materials required for operation and maintenance; the type and number of in-plant transport equipment, and their load weight and transport distance; and the number, type and floor area of the buildings on site;
The monthly carbon emission of the operation and maintenance unit projects includes: the monthly carbon emission of main production activities, the carbon emission of auxiliary production activities, the supply chain carbon emission, and the total carbon emission.
Further, in step (1), the normalization includes EDA exploratory data analysis and data feature engineering.
The EDA exploratory data analysis includes: after the data are input, reading the number of rows i and the number of columns j and confirming that they match the dimensions of the input data; checking the data types with the pandas DataFrame.info() and DataFrame.isna().sum() methods, where int and float are normal data types and the object type indicates that the column mixes numbers, character strings and other types; inspecting the content of such a column with combine.xxx.value_counts(), where xxx is the name of the object-type column; after an abnormal value is found, replacing it with the column mean using the pandas replace() method and converting the column to a single data type with the pandas astype() method, which completes the unification of data types; then using pandas functions to observe the distribution of the data through statistical indexes such as data type, data volume, extreme values, mean and variance; truncating outliers according to the distribution statistics and the box plots of the abnormal variables; and finally, because the int64 and float64 data types are in use, constructing a memory compression function that compresses each variable according to its type (int or float) and the memory length required by the maximum and minimum values of its column.
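The type check and type unification described above can be sketched as follows. This is a minimal illustration, not the patent's code: the column name powerConsumption, the sample values and the helper unify_column_type are hypothetical, and the data frame is called df here rather than combine.

```python
import numpy as np
import pandas as pd

def unify_column_type(df: pd.DataFrame, column: str, bad_values: list) -> pd.DataFrame:
    """Replace abnormal entries of a mixed-type column with the column mean, then cast to float64."""
    out = df.copy()
    # Turn the known abnormal entries into NaN, then coerce anything else non-numeric to NaN.
    numeric = pd.to_numeric(out[column].replace(bad_values, np.nan), errors="coerce")
    # The mean is computed over the valid entries only.
    out[column] = numeric.fillna(numeric.mean()).astype("float64")
    return out

# Hypothetical sample: monthly electricity use with one stray string.
df = pd.DataFrame({"powerConsumption": [120.0, "unknown", 180.0]})
print(df.shape)                                # (rows i, columns j)
df.info()                                      # dtype overview; the mixed column shows as object
print(df.isna().sum())                         # missing values per column
print(df["powerConsumption"].value_counts())   # inspect the content of the object column
clean = unify_column_type(df, "powerConsumption", ["unknown"])
```

After the call, the stray string has been replaced by the mean of the two valid readings and the column is stored as float64.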
The data feature engineering performs feature extraction on the data processed by the EDA exploratory data analysis. It comprises: removing irrelevant columns; discrete feature encoding; date feature encoding; feature expansion; feature normalization; and dimensionality reduction by principal component analysis (PCA).
Removing irrelevant columns includes: reading the number of distinct values in each column, and deleting variables that have only one value or data that follow no pattern.
The discrete feature encoding includes: selecting the discrete feature columns from the data distribution results and applying one-hot encoding to them, which maps each category to integer indicator values: the indicated category is marked with the integer 1 and all other categories with 0. Specifically, the machine energy consumption type machineComsumptionType includes a gasoline consumption type, a diesel consumption type and an electricity consumption type; it is split into a gasoline indicator machineComsumptionType_1, a diesel indicator machineComsumptionType_2 and an electricity indicator machineComsumptionType_3. If the machine consumes gasoline, machineComsumptionType_1 is 1, machineComsumptionType_2 is 0 and machineComsumptionType_3 is also 0, and so on.
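A minimal sketch of this one-hot encoding with pandas. The sample rows and the assignment of the suffixes 1, 2 and 3 to gasoline, diesel and electricity follow the order given in the text; the string labels used for the raw categories are an assumption.

```python
import pandas as pd

# Hypothetical raw column; the category labels are illustrative.
df = pd.DataFrame({"machineComsumptionType": ["gasoline", "diesel", "electricity", "gasoline"]})

# Map each category to the numeric suffix used in the text (1 = gasoline, 2 = diesel, 3 = electricity).
codes = df["machineComsumptionType"].map({"gasoline": 1, "diesel": 2, "electricity": 3})

# One indicator column per category: 1 for the indicated type, 0 otherwise.
one_hot = pd.get_dummies(codes, prefix="machineComsumptionType", dtype=int)
encoded = pd.concat([df, one_hot], axis=1)
print(encoded)
```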
The date feature encoding includes: extracting the date-related columns of the data set, constructing the date_proc and date_transform functions, and converting the dates into year and month form.
The feature expansion includes: adding the values of the container throughput, bulk cargo throughput and ore throughput data; multiplying the values of the number of machines in operation and the machine operating time; analyzing the correlation between each feature and the carbon emission, and selecting the features with higher correlation by inspecting a correlation heat map; and constructing a feature crossing function that crosses the categorical features with the numerical features to generate statistical features. For each categorical feature (each feature in cat_col), the feature crossing function calculates the maximum, minimum and median of each numerical feature (each feature in num_col) and merges these newly generated features into the original data frame.
The feature normalization includes: converting the original data x into data y bounded within a specific range by using the maximum and minimum values of the variable, which eliminates the influence of dimension and magnitude. The mathematical expression is as follows:
y = (x - x_min) / (x_max - x_min)
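A minimal sketch of range-based (min-max) scaling, assuming the usual form in which each column is shifted by its minimum and divided by its range. The function name min_max_scale, the default target range [0, 1] and the handling of constant columns are assumptions made for illustration.

```python
import numpy as np

def min_max_scale(x, lower=0.0, upper=1.0):
    """Scale each column of x into [lower, upper] using its own minimum and maximum."""
    x = np.asarray(x, dtype=float)
    x_min, x_max = x.min(axis=0), x.max(axis=0)
    # Constant columns have zero range; divide by 1 instead so they map to `lower`.
    span = np.where(x_max > x_min, x_max - x_min, 1.0)
    return lower + (x - x_min) / span * (upper - lower)

# Hypothetical throughput figures of very different magnitude.
scaled = min_max_scale([[2.0, 1000.0], [4.0, 3000.0], [6.0, 2000.0]])
print(scaled)
```

Scaling every feature to the same interval keeps large-magnitude inputs such as throughput from dominating small-magnitude inputs such as berth counts during training.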
The dimensionality reduction by PCA (principal component analysis) includes: projecting the original features onto new coordinate axes by a linear transformation so that the variance of the data along the new axes is maximized, and then reducing the data to the set number of dimensions using the decomposition module of sklearn.
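To make the projection concrete, the sketch below implements PCA directly with NumPy's singular value decomposition instead of calling scikit-learn, so that the linear transformation is visible. The function name pca_reduce and the random sample data are assumptions; in practice the PCA class in sklearn.decomposition performs the same kind of projection.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project X onto its first n_components principal axes."""
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)            # PCA operates on mean-centered data
    # Rows of vt are the principal axes, ordered by decreasing explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))                 # hypothetical 20 samples with 5 features
Z = pca_reduce(X, n_components=2)
print(Z.shape)
```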
Further, in step (3), the dimension of the population is determined by the neural network structure parameters, which include: the connection weights between the input layer and the hidden layer, the connection weights between the hidden layer and the output layer, and the thresholds of all neuron nodes.
Further, the total number of weights of the neural network nodes over all layers is:
Weight number = (n × o) + (o × m);
The total number of thresholds is:
Threshold number = n + m.
Further, in step (3), the population size of WOA is: population size = (n × o) + (o × m) + n + m.
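The counts above can be checked with a few lines of Python. The function name woa_dimension and the layer sizes in the example are hypothetical. The threshold count follows the formula as stated in the text (n + m); many BPNN formulations instead assign one threshold to each hidden and output node (o + n), so this is a sketch of the stated formulas only.

```python
def woa_dimension(m: int, o: int, n: int):
    """Return (weight count, threshold count, total) for m input, o hidden and n output nodes."""
    weight_count = (n * o) + (o * m)     # hidden-output plus input-hidden connections
    threshold_count = n + m              # as stated in the text
    return weight_count, threshold_count, weight_count + threshold_count

# Hypothetical network: 10 inputs, 8 hidden nodes, 4 outputs.
print(woa_dimension(10, 8, 4))
```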
Further, in step (3), during the WOA iteration, the initial value of the population position is set to a random value within the upper and lower bounds of the search space.
Further, in step (3), during the WOA iteration, the whale hunting behavior is of three types: random search, shrinking encirclement, and spiral path.
Further, the probability of random search decreases from large to small, and is determined by the value of |A|. Shrinking encirclement and the spiral path each have a probability of 50%, determined by the value of p.
The above whale hunting principle is expressed by the following mathematical expression:
where p is the random selection probability between the two behaviors, with values in the interval [0, 1]; X(t) is the current population position; X(t+1) is the population position after the iteration; X*(t) is the best solution among the current population positions; a takes values in the interval [0, 2] and decreases linearly; and r is a random value in [0, 1]. When |A| ≤ 1, shrinking encirclement or the spiral path is selected. When |A| > 1, the population position is generated randomly, which simulates the random search behavior of the whale population.
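A minimal sketch of a WOA loop that follows the selection rule described here: random search when |A| > 1, otherwise shrinking encirclement or the spiral path with equal probability. The update formulas are those of the standard Whale Optimization Algorithm from the literature and may differ from the expression in the source, which is not shown in this text; the function name woa_minimize, the spiral constant b and the clipping to the search bounds are assumptions.

```python
import numpy as np

def woa_minimize(objective, dim, n_whales=20, n_iter=100,
                 lower=-1.0, upper=1.0, seed=0, b=1.0):
    """Minimize `objective` over [lower, upper]^dim; returns (best position, best value)."""
    rng = np.random.default_rng(seed)
    # Initial positions are random values inside the search bounds.
    pos = rng.uniform(lower, upper, size=(n_whales, dim))
    fitness = np.array([objective(x) for x in pos])
    best = pos[fitness.argmin()].copy()
    best_val = float(fitness.min())
    for t in range(n_iter):
        a = 2.0 * (1.0 - t / n_iter)              # decreases linearly from 2 towards 0
        for i in range(n_whales):
            A = 2.0 * a * rng.random() - a        # A lies in [-a, a]
            C = 2.0 * rng.random()
            p = rng.random()
            if abs(A) > 1.0:                      # random search around a random whale
                x_rand = pos[rng.integers(n_whales)]
                new = x_rand - A * np.abs(C * x_rand - pos[i])
            elif p < 0.5:                         # shrinking encirclement of the best whale
                new = best - A * np.abs(C * best - pos[i])
            else:                                 # spiral path towards the best whale
                l = rng.uniform(-1.0, 1.0)
                new = np.abs(best - pos[i]) * np.exp(b * l) * np.cos(2.0 * np.pi * l) + best
            pos[i] = np.clip(new, lower, upper)   # keep the whale inside the bounds
            val = float(objective(pos[i]))
            if val < best_val:                    # remember the best position seen so far
                best, best_val = pos[i].copy(), val
    return best, best_val

def sphere(x):
    # Simple test objective: sum of squares, minimized at the origin.
    return float(np.sum(x ** 2))

best, val = woa_minimize(sphere, dim=5, n_whales=10, n_iter=30, lower=-5.0, upper=5.0, seed=1)
print(val)
```

In the WOA-BPNN setting, the objective would be the prediction error of a BPNN whose weights and thresholds are read from the position vector; the sphere function stands in for it here only to keep the example self-contained.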
This process loops until the iteration converges. Convergence is judged by error functions, namely the mean squared error (MSE) and the mean absolute percentage error (MAPE) of the values predicted by the neural network model. The optimal structural parameters of the neural network model are thereby obtained.
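The two error measures named here can be written out as below, using their usual definitions. The function names and the sample values are illustrative; MAPE is returned in percent and assumes that no true value is zero.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error."""
    y_true, y_pred = np.asarray(y_true, dtype=float), np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (undefined when a true value is zero)."""
    y_true, y_pred = np.asarray(y_true, dtype=float), np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100.0)

# Hypothetical monthly emissions and predictions.
print(mse([100.0, 200.0], [110.0, 180.0]))
print(mape([100.0, 200.0], [110.0, 180.0]))
```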
Further, the iteration convergence condition of step (5) is: the error of the predicted value is less than the set error value.
Further, in step (4), the iterative formula is:
where w is a weight, b is a threshold, and a is an intermediate parameter in the iterative process.
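One gradient-descent update with error back propagation can be sketched as follows for a network with a single hidden layer. The source's own formula is not shown in this text, so the sigmoid hidden activation, the linear output, the mean squared error loss, the learning rate lr and the reading of a as the hidden-layer activation are all assumptions; the function name bp_step is hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bp_step(X, y, W1, b1, W2, b2, lr=0.1):
    """One gradient-descent update; returns the new parameters and the loss before the update."""
    # Forward pass.
    a = sigmoid(X @ W1 + b1)                 # hidden activations, shape (N, o)
    y_hat = a @ W2 + b2                      # linear output, shape (N, n)
    err = y_hat - y
    loss = float(np.mean(err ** 2))
    # Backward pass: gradients of the mean squared error.
    grad_out = 2.0 * err / err.size          # dL/dy_hat
    gW2 = a.T @ grad_out
    gb2 = grad_out.sum(axis=0)
    grad_hidden = (grad_out @ W2.T) * a * (1.0 - a)   # chain rule through the sigmoid
    gW1 = X.T @ grad_hidden
    gb1 = grad_hidden.sum(axis=0)
    # Move each weight and threshold against its gradient.
    return W1 - lr * gW1, b1 - lr * gb1, W2 - lr * gW2, b2 - lr * gb2, loss

rng = np.random.default_rng(0)
X, y = rng.normal(size=(8, 3)), rng.normal(size=(8, 1))      # hypothetical data
W1, b1 = rng.normal(size=(3, 4)), np.zeros(4)                # in WOA-BPNN these would come from WOA
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)
nW1, nb1, nW2, nb2, loss = bp_step(X, y, W1, b1, W2, b2, lr=0.1)
print(loss)
```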
Compared with the prior art, the invention realizes the prediction of carbon emission in the operation stage of the full life cycle of a port. In the field of port carbon emission prediction, EDA data analysis and data feature engineering are used to mine the port data features fully and to increase the value of the data in model training. The swarm intelligence optimization algorithm WOA is combined with the BPNN model to overcome the limitation caused by random initial values of the BPNN. WOA iterates to search for the optimal values of the BPNN structural parameters over the global range, and these values are then used in BPNN training, which strengthens the global search capability and prevents the model from falling into a local optimum. The method can improve the prediction accuracy of the model, strengthen its generalization capability and accelerate its convergence.
Drawings
FIG. 1 is a flow chart of a port carbon emission prediction method based on a WOA-BPNN model.
Fig. 2 is a schematic diagram of a neural network topology.
Fig. 3 is a schematic diagram of whale hunting in the whale optimization algorithm.
Fig. 4 is a schematic graph of the spiral path of the whale optimization algorithm.
Fig. 5 is a schematic graph of the shrinking encirclement path of the whale optimization algorithm.
Detailed Description
The technical solution of the present invention is described in further detail below with reference to the accompanying drawings. The described embodiments are illustrative only and do not limit the present application.
FIG. 1 is a flow chart of the carbon emission prediction method based on the WOA-BPNN model according to the present invention. As shown, the model comprises the following steps:
Step 01: Start the model.
Step 02: Obtain the input data. The data types include: 1. Port operation basic data: port container throughput, passenger throughput, bulk cargo throughput, ore throughput, petroleum and gas throughput, etc.; the quantity of liquid bulk cargo, the quantity of dry bulk cargo, the number of containers, the weight of roll-on/roll-off vehicles, and the number of roll-on/roll-off vehicles; port berth length and number of port berths. 2. Port operation and maintenance activity basic data: the operation year and month, the average number of workers in the month, the total electricity consumption in the month, the total tap water consumption in the month, the total heating in the month, and the total fossil energy consumption in the month. 3. Port operation and maintenance activity detailed data: the type and amount of energy consumed in operating each machine and each piece of infrastructure in the port, and their working time; the consumption of the various materials required for operation and maintenance; the type and number of in-plant transport equipment, and their load weight and transport distance; and the number, type and floor area of the buildings on site. 4. The monthly carbon emission of each operation and maintenance unit project, including the carbon emission of main production activities, the carbon emission of auxiliary production activities, the supply chain carbon emission, and the total carbon emission of each month. These 36 kinds of statistical data are collected, stored as arrays and input into the algorithm. The data are then normalized.
Data standardization is one of the innovations of the invention. Because the acquired data differ in unit, distribution and characteristics, the scale and magnitude they reflect are inconsistent, which makes subsequent operations difficult to carry out effectively. EDA exploratory data analysis and data feature engineering are therefore used to preprocess the data and uncover the patterns hidden in them. The specific data normalization procedure has an important influence on the final training result of the neural network. For this reason, the invention adopts the following specific EDA exploratory data analysis and data feature engineering measures.
Step 03: Perform EDA exploratory data analysis on the data. After the data are input, the number of rows i and the number of columns j are read and confirmed to match the dimensions of the input data.
First, the data characteristics are summarized in light of the characteristics of the project data. The data types are checked with the pandas DataFrame.info() and DataFrame.isna().sum() methods, where int and float are normal data types and the object type means that the column mixes several types such as numbers and character strings. The content of such a column is inspected with combine.xxx.value_counts(), where xxx is the name of the object-type column. After an abnormal value is found, it is replaced using the pandas replace() method, and the column is converted to a single data type using the pandas astype() method. This completes the unification of data types.
In addition, pandas functions are used to observe the distribution of the data through statistical indexes such as data type, data volume, extreme values, mean and variance. Outliers are truncated according to the distribution statistics and the box plots of the abnormal variables.
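A minimal sketch of box-plot-based outlier truncation. The text does not give the cut-off rule, so the common box plot convention (whiskers at 1.5 times the interquartile range beyond the quartiles) is assumed; the function name clip_outliers_iqr and the sample values are hypothetical.

```python
import pandas as pd

def clip_outliers_iqr(series: pd.Series, k: float = 1.5) -> pd.Series:
    """Truncate values lying more than k interquartile ranges outside the quartiles."""
    q1, q3 = series.quantile(0.25), series.quantile(0.75)
    iqr = q3 - q1
    return series.clip(lower=q1 - k * iqr, upper=q3 + k * iqr)

# Hypothetical monthly readings with one extreme value.
s = pd.Series([10, 12, 11, 13, 12, 100])
print(s.describe())           # count, mean, std, min, quartiles, max
clipped = clip_outliers_iqr(s)
print(clipped.tolist())
```

Truncation keeps the row and only pulls the extreme value back to the whisker, which avoids discarding the other features recorded for that month.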
Finally, all variables in the data set use the int64 and float64 data types, which wastes memory, so the data need to be compressed. A memory compression function is constructed that compresses each variable according to its type (int or float) and the memory length required by the maximum and minimum values of its column, which saves memory space.
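A memory compression function of this kind might look like the sketch below. The function name reduce_memory, the candidate types and the sample columns are assumptions; float columns are only narrowed to float32, which loses some precision, so this step should be checked against the accuracy the model needs.

```python
import numpy as np
import pandas as pd

def reduce_memory(df: pd.DataFrame) -> pd.DataFrame:
    """Downcast int and float columns to the smallest type that holds their min and max."""
    out = df.copy()
    for col in out.columns:
        kind = out[col].dtype.kind
        if kind == "i":
            c_min, c_max = out[col].min(), out[col].max()
            for candidate in (np.int8, np.int16, np.int32, np.int64):
                info = np.iinfo(candidate)
                if info.min <= c_min and c_max <= info.max:
                    out[col] = out[col].astype(candidate)
                    break
        elif kind == "f":
            c_min, c_max = out[col].min(), out[col].max()
            info = np.finfo(np.float32)
            if info.min <= c_min and c_max <= info.max:
                out[col] = out[col].astype(np.float32)
    return out

# Hypothetical columns.
df = pd.DataFrame({
    "berthCount": [3, 12, 40],
    "throughput": [1000, 40000, 250000],
    "power": [1.5, 2.5, 3.5],
})
small = reduce_memory(df)
print(small.dtypes)
```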
Step 04: and performing data characteristic engineering. This step includes extracting data features from the step processed in step 03. Mainly comprises the following procedures: removing irrelevant columns; encoding discrete features; date feature coding; expanding characteristics; feature normalization processing based on a polar difference method; and reducing the dimension by a PCA principal component analysis method.
Removing irrelevant columns: the number of distinct values in each column is read. Variables with only one value, such as the number of berths in a port, do not help training. Extremely unbalanced data likewise cannot provide useful information to the model. Data that follow no pattern, such as material type numbers, can also be deleted.
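The single-value case can be sketched as follows, assuming a hypothetical data frame; detecting extremely unbalanced or pattern-free columns would need additional, data-specific criteria that are not shown.

```python
import pandas as pd

# Hypothetical data: "berths" never changes, so it carries no information.
df = pd.DataFrame({"berths": [4, 4, 4], "throughput": [120, 135, 150]})

# Collect columns that contain at most one distinct value, then drop them.
constant_cols = [c for c in df.columns if df[c].nunique(dropna=False) <= 1]
df = df.drop(columns=constant_cols)
```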
Discrete feature encoding: discrete feature columns are selected from the data distribution results and One-hot encoded, which maps each category value to an integer indicator: the indicator of the category that applies is 1, and the indicators of all other categories are 0. For example, the equipment energy consumption type machineComsumptionType is classified into gasoline, diesel, and electricity consumption. It is split into a gasoline indicator machineComsumptionType_1, a diesel indicator machineComsumptionType_2, and an electricity indicator machineComsumptionType_3. If the equipment consumes gasoline, machineComsumptionType_1 is 1, machineComsumptionType_2 is 0, and machineComsumptionType_3 is also 0, and so on.
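The example above can be reproduced with pandas.get_dummies, as in the sketch below. The integer codes 1, 2, and 3 for gasoline, diesel, and electricity are assumed so that the generated column names match those in the text.

```python
import pandas as pd

# Assumed coding: 1 = gasoline, 2 = diesel, 3 = electricity.
df = pd.DataFrame({"machineComsumptionType": [1, 2, 3, 1]})

# One indicator column per category; dtype=int yields 0/1 instead of booleans.
encoded = pd.get_dummies(df, columns=["machineComsumptionType"], dtype=int)
```

The first row, a gasoline consumer, becomes 1, 0, 0 across the three indicator columns.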
Date feature encoding: the date-related columns in the data set are extracted, and date_proc and date_transform functions are constructed to convert them into year and month form. For example, the record date of the operation data, recordDate, is converted to recordDate_year and recordDate_mole.
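The text names the date_proc and date_transform functions but does not give their bodies, so the versions below are assumptions that only illustrate the year and month split. The month column is named with a _month suffix here for readability, which is also an assumption.

```python
import pandas as pd

def date_proc(series: pd.Series) -> pd.Series:
    """Parse raw date strings (assumed ISO-like, e.g. '2023-11-17') into datetimes."""
    return pd.to_datetime(series)

def date_transform(df: pd.DataFrame, col: str) -> pd.DataFrame:
    """Replace a date column with integer <col>_year and <col>_month columns."""
    out = df.copy()
    parsed = date_proc(out[col])
    out[f"{col}_year"] = parsed.dt.year
    out[f"{col}_month"] = parsed.dt.month
    return out.drop(columns=[col])

# Hypothetical operation records.
records = pd.DataFrame({"recordDate": ["2023-11-17", "2024-05-10"]})
encoded = date_transform(records, "recordDate")
```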
Feature expansion: throughput data such as container throughput, bulk cargo throughput, and ore throughput are summed. Data such as the number of machines in operation and their operating duration are multiplied, which expands the information available. At the same time, the correlation between each feature and carbon emission is analyzed, and features highly correlated with carbon emission are selected by inspecting a correlation heat map. A feature crossing function is constructed that crosses categorical features with numerical features to generate new statistical features. For each categorical feature (each feature in cat_col), the function calculates the maximum, minimum, and median of the numerical features (each feature in num_col) and merges these newly generated features into the original data frame.
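A feature crossing function along these lines could be written as below. The function name cross_features, the naming scheme of the generated columns, and the sample data are assumptions; only the cat_col and num_col parameters and the three statistics come from the text.

```python
import pandas as pd

def cross_features(df: pd.DataFrame, cat_col: list, num_col: list) -> pd.DataFrame:
    """Add per-category max, min and median of each numerical feature as new columns."""
    out = df.copy()
    for cat in cat_col:
        for num in num_col:
            grouped = df.groupby(cat)[num]
            for stat in ("max", "min", "median"):
                # transform() broadcasts the group statistic back to every row.
                out[f"{cat}_{num}_{stat}"] = grouped.transform(stat)
    return out

# Hypothetical data: operating hours grouped by machine type.
df = pd.DataFrame({
    "machineType": ["crane", "crane", "truck"],
    "hours": [10.0, 30.0, 5.0],
})
crossed = cross_features(df, cat_col=["machineType"], num_col=["hours"])
```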
Finally, the features are normalized with the MinMaxScaler function of sklearn, and PCA dimensionality reduction is then applied to obtain the final data dimensions. Normalization eliminates the influence of units and magnitudes by using the maximum and minimum values of a variable to convert the original data x into data y bounded to a specific range. The mathematical expression is as follows:
$$y = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$
PCA projects the original features onto new coordinate axes through a linear transformation so that the variance of the data along the new axes is maximized. Its purpose is to reduce the number of features, improve model training efficiency, and remove redundant information among the features. The existing data are reduced to a set dimension, such as 30 dimensions, using the PCA function of sklearn's decomposition module, yielding the final data.
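The normalization and dimensionality reduction can be sketched with scikit-learn as follows. The random feature matrix with 200 samples and 50 features is a stand-in for the engineered features and is purely hypothetical; the target of 30 components follows the example in the text.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import MinMaxScaler

# Hypothetical feature matrix: 200 samples, 50 features on very different scales.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50)) * rng.uniform(1, 100, size=50)

# Min-max normalization maps every column to the range [0, 1].
X_scaled = MinMaxScaler().fit_transform(X)

# PCA keeps the 30 directions of largest variance.
X_reduced = PCA(n_components=30).fit_transform(X_scaled)
```

In practice the scaler and the PCA model would be fitted on the training set only and then applied unchanged to the test set.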
Step 05: Data preprocessing is completed, yielding the input data required to train the model.
Step 11: The topology is determined according to the neural network structure of the BPNN, as shown in fig. 2. It comprises the number of input layer neuron nodes, the number of output layer neuron nodes, the number of hidden layers, and the number of hidden layer nodes. The number of BPNN input layer nodes equals the dimension of the preprocessed input data, and the number of BPNN output layer nodes is the total carbon emission output plus the number of unit-project carbon emission outputs in the port operation stage.
Step 12: The weights and thresholds of the BPNN model are initialized, including the connection weights between the input layer and the hidden layer, the connection weights between the hidden layer and the output layer, and the threshold of each neuron node. The BPNN weights and thresholds are set as the population parameter inputs of WOA, which is trained to obtain optimized initial values for the BPNN weights and thresholds (step 13).
From step 12 to step 13, the initial values of the neural network weights and thresholds are globally optimized with the swarm intelligence optimization algorithm WOA, as shown in figs. 3-5. In this process, the model structure of the WOA must first be determined (step 21), comprising: population size, maximum number of iterations, whale population positions, upper and lower bounds of the search space, and dynamic weights. The dimension of the population size is determined by the structural parameters of the BPNN. Let the number of input layer nodes of the neural network be m, the number of hidden layers be 1, the number of hidden layer nodes be o, and the number of output layer nodes be n. The total number of weights across the layers is:
Weight number = (n x o) + (o x m)
The total number of thresholds is:
Threshold number = n + m
The population size of WOA is:
Population size = (n x o) + (o x m) + n + m
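As a worked example of these counts, the sketch below evaluates the formulas exactly as stated. The function name woa_dimension and the sample sizes m = 30, o = 10, n = 2 are assumptions chosen for illustration.

```python
def woa_dimension(m: int, o: int, n: int) -> int:
    """Number of parameters encoded by one whale position, per the formulas in the text."""
    weights = (n * o) + (o * m)   # hidden-to-output plus input-to-hidden connections
    thresholds = n + m            # threshold count as stated in the text
    return weights + thresholds

# Hypothetical network: 30 inputs, 10 hidden nodes, 2 outputs.
size = woa_dimension(m=30, o=10, n=2)   # 320 weights + 32 thresholds
```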
Step 22: After the population size is set, the population positions, namely the weights and thresholds, are assigned values. The initial population positions are set to random values.
Step 23: The training error of the BPNN is used as the population fitness function.
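A fitness function of this kind might be sketched as below. Several details are assumptions not given in the text: the layout of the flat position vector, the tanh hidden activation, the linear output layer, the use of MAE as the training error, and one threshold per hidden and output node (o + n values), which is the layout a forward pass needs.

```python
import numpy as np

def decode(position: np.ndarray, m: int, o: int, n: int):
    """Split a flat position vector into weights and thresholds of an m-o-n network."""
    i = 0
    w1 = position[i:i + m * o].reshape(m, o); i += m * o   # input -> hidden weights
    w2 = position[i:i + o * n].reshape(o, n); i += o * n   # hidden -> output weights
    b1 = position[i:i + o]; i += o                         # hidden thresholds
    b2 = position[i:i + n]                                 # output thresholds
    return w1, b1, w2, b2

def fitness(position, X, y, m, o, n) -> float:
    """Mean absolute error of the network encoded by `position` on (X, y)."""
    w1, b1, w2, b2 = decode(position, m, o, n)
    hidden = np.tanh(X @ w1 + b1)     # assumed activation
    pred = hidden @ w2 + b2           # assumed linear output layer
    return float(np.mean(np.abs(pred - y)))

# Hypothetical check: an all-zero position predicts 0 everywhere.
m, o, n = 3, 4, 2
zero_position = np.zeros(m * o + o * n + o + n)
error = fitness(zero_position, np.ones((5, m)), np.ones((5, n)), m, o, n)
```

A lower return value means a fitter whale, so the optimizer minimizes this function.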
Step 24: During training, in order to escape locally optimal solutions, the population selects a randomly generated population position with a certain probability; this is called random search. The probability of random search decreases steadily over the iterations, and its value is determined by the parameter A. A is defined as:
$$\vec{A} = 2\vec{a}\cdot\vec{r} - \vec{a}$$
where $\vec{a}$ and $\vec{r}$ are intermediate variables of the WOA algorithm; $\vec{a}$ takes values in [0, 2] and decreases linearly, and $\vec{r}$ is a random value in [0, 1]. The values of $\vec{a}$ and $\vec{r}$ place $\vec{A}$ in the interval [-2, 2], so that $|\vec{A}|$ lies in [0, 2]. It is specified that when |A| > 1, population positions are randomly generated. When |A| ≤ 1, the population position either shrinks and encircles (step 25-1) or moves along a spiral path (step 25-2). The choice between shrinking encirclement and the spiral path is governed by a set value p: when p < 0.5, the shrinking encirclement movement is performed, and when p ≥ 0.5, the spiral path movement is performed. The mathematical expressions of the two movements are as follows:
Here, the symbols denote, in order, the next population position, the current population position, the current optimal population position, and the population position after this iteration step; l and b are constants.
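The branching described above can be sketched with the update rules of the standard whale optimization algorithm. Those rules are an assumption: they are the commonly published WOA equations, not necessarily the exact expressions of this embodiment, and the dynamic weights mentioned earlier are not modelled. The function name woa_step and all sample values are hypothetical.

```python
import numpy as np

def woa_step(X, best, t, max_iter, lb, ub, rng, b=1.0):
    """One WOA iteration over a population X of shape (size, dim)."""
    a = 2.0 * (1.0 - t / max_iter)            # decreases linearly from 2 to 0
    new_X = np.empty_like(X)
    for i, x in enumerate(X):
        A = 2.0 * a * rng.random() - a        # A lies in [-a, a]
        C = 2.0 * rng.random()
        p = rng.random()
        if abs(A) > 1:                        # random search around a random whale
            x_rand = X[rng.integers(len(X))]
            new_X[i] = x_rand - A * np.abs(C * x_rand - x)
        elif p < 0.5:                         # shrinking encirclement of the best whale
            new_X[i] = best - A * np.abs(C * best - x)
        else:                                 # spiral path towards the best whale
            l = rng.uniform(-1.0, 1.0)
            new_X[i] = np.abs(best - x) * np.exp(b * l) * np.cos(2 * np.pi * l) + best
    return np.clip(new_X, lb, ub)             # keep positions inside the search bounds

# Hypothetical population of 6 whales in a 4-dimensional search space.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(6, 4))
best = X[0].copy()
X_next = woa_step(X, best, t=0, max_iter=10, lb=-1.0, ub=1.0, rng=rng)
```

Because a shrinks towards 0, |A| > 1 becomes impossible in the second half of the run, which is how the probability of random search falls over the iterations.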
Steps 26-27: In each training iteration, the fitness of the population is calculated and used as the convergence condition; the population positions are updated continuously and the fitness is recalculated until it reaches a set standard value, at which point the output of the WOA model is the global optimal solution. The optimized BPNN structural parameters are then imported as the initial values of the BPNN model parameters (entering step 13), and training of the BPNN continues.
Step 14: After the optimized initial values of the BPNN weights and thresholds are obtained, the model is evaluated to obtain the prediction error as an MAE or MAPE value; the error is differentiated and propagated backward from the output layer to the input layer, and the thresholds and weights of each layer of the neural network are updated. Back propagation uses the gradient descent method to iteratively update the weights and thresholds (step 15), and the iterative formula is as follows:
Wherein w is a weight, b is a threshold, and a is an intermediate parameter in the iterative process.
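A gradient descent update of this kind can be sketched for a network with one hidden layer as below. The tanh activation, the mean squared error loss used for the gradients, the learning rate, and the synthetic data are all assumptions; the sketch illustrates back propagation in general rather than the exact iterative formula of the embodiment.

```python
import numpy as np

def train_bpnn(X, y, w1, b1, w2, b2, lr=0.05, epochs=500, tol=1e-3):
    """Batch gradient descent on a one-hidden-layer network (assumed MSE loss)."""
    w1, b1, w2, b2 = (p.copy() for p in (w1, b1, w2, b2))   # leave the inputs untouched
    for _ in range(epochs):
        hidden = np.tanh(X @ w1 + b1)                 # forward pass
        err = hidden @ w2 + b2 - y
        if np.mean(np.abs(err)) < tol:                # stop once the MAE is below tol
            break
        grad_pred = 2.0 * err / len(X)                # d(loss)/d(prediction)
        grad_hidden = (grad_pred @ w2.T) * (1.0 - hidden ** 2)   # tanh derivative
        w2 -= lr * (hidden.T @ grad_pred)             # output-layer weight update
        b2 -= lr * grad_pred.sum(axis=0)              # output-layer threshold update
        w1 -= lr * (X.T @ grad_hidden)                # hidden-layer weight update
        b1 -= lr * grad_hidden.sum(axis=0)            # hidden-layer threshold update
    return w1, b1, w2, b2

# Hypothetical regression task; `init` stands in for the WOA-optimized start values.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(64, 2))
y = 0.5 * (X[:, :1] + X[:, 1:])
init = (rng.normal(0, 0.5, (2, 4)), np.zeros(4), rng.normal(0, 0.5, (4, 1)), np.zeros(1))
trained = train_bpnn(X, y, *init)
```

The early exit on the MAE mirrors the convergence check in which training stops once the prediction error falls below a set value.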
Step 16: The convergence condition is checked; model training is complete when the prediction error is smaller than the set error value.
Step 17: The trained model is used to predict port carbon emissions.
The foregoing description is only a preferred embodiment of the present invention, and is not intended to limit the present invention, and any simple modification, variation and equivalent structural changes made to the above embodiment according to the technical substance of the present invention still fall within the scope of the technical solution of the present invention.
Claims (10)
1. A port operation carbon emission prediction method based on a WOA-BPNN model, characterized by comprising the following steps:
(1) Standardizing sample data and dividing the data into a training set and a test set; the data include at least: port operation basic data, detailed data on port operation and maintenance activities, and the monthly carbon emission of each type of operation and maintenance unit project;
(2) Determining structural parameters of the BPNN: setting the number m of input layer nodes, the number o of hidden layer nodes, the number n of output nodes, and the threshold and weight of each layer of neural network nodes according to port characteristic data, historical operation data, and the operation unit project carbon emission data to be predicted;
(3) Setting initial values of whale population parameters, including population size, population position, number of iterations, upper and lower bounds of the search space, and dynamic weights, by adopting the swarm intelligence optimization algorithm WOA, randomly generating a whale population, and carrying out iterative training with the population position as the optimization parameter to obtain an optimal population position;
(4) Taking the optimal population position as a threshold value and a weight value of the BPNN neural network node, training a BPNN model, and iterating the weight value and the threshold value of the neural network node by adopting a gradient descent method in error back propagation in the training process;
(5) After iteration convergence, obtaining an optimal BPNN weight threshold value of each layer of neural network node;
(6) The optimal learning rate and training batch parameters are established through parameter adjustment, a training model with the minimum prediction error is obtained, and training is completed;
(7) And predicting by using the trained model to obtain the predicted value of the carbon emission of the port operation.
2. The port operation carbon emission prediction method based on the WOA-BPNN model according to claim 1, wherein the standardization includes exploratory data analysis (EDA) and data feature engineering.
3. The method for predicting carbon emissions in port operations based on WOA-BPNN model as claimed in claim 1, wherein in step (3), the dimension of population size is determined by neural network structure parameters, comprising: the connection weight of the input layer and the hidden layer, the connection weight of the hidden layer and the output layer, and the threshold of all neuron nodes.
4. The port operation carbon emission prediction method based on the WOA-BPNN model as set forth in claim 3, wherein the sum of the weight numbers of the neural network nodes at each layer is:
Weight number = (n x o) + (o x m);
The sum of the threshold numbers is:
Threshold number = n+m.
5. The port operation carbon emission prediction method based on the WOA-BPNN model as defined in claim 4, wherein in step (3), the population size of WOA is: population size = (n x o) + (o x m) +n+m.
6. The method for predicting carbon emissions in port operations based on WOA-BPNN model as recited in claim 5, wherein in step (3), the initial value of the population position is set to a random value within the upper and lower bounds of the search space during WOA iteration.
7. The method for predicting carbon emissions in port operations based on WOA-BPNN model as recited in claim 6, wherein in step (3), the whale hunting behaviors during WOA iteration include three types: random search, shrinking encirclement, and spiral path movement.
8. The port operation carbon emission prediction method based on the WOA-BPNN model as claimed in claim 7, wherein the probability of random search decreases from large to small, and the probabilities of shrinking encirclement and the spiral path are each 50%.
9. The port operation carbon emission prediction method based on the WOA-BPNN model as set forth in claim 1, wherein the iterative convergence condition of step (5) is: the predicted value error is less than the error set point.
10. The port operation carbon emission prediction method based on the WOA-BPNN model as set forth in claim 1, wherein in step (4), the iterative formula is:
Wherein w is a weight, b is a threshold, and a is an intermediate parameter in the iterative process.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311532311.8A CN118014605A (en) | 2023-11-17 | 2023-11-17 | Port operation carbon emission prediction method based on WOA-BPNN model |
Publications (1)
Publication Number | Publication Date |
---|---|
CN118014605A (en) | 2024-05-10 |
Family
ID=90949298
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311532311.8A Pending CN118014605A (en) | 2023-11-17 | 2023-11-17 | Port operation carbon emission prediction method based on WOA-BPNN model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN118014605A (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||