WO2021196743A1 - Method and system for generating tropical cyclone intensity forecast information - Google Patents

Method and system for generating tropical cyclone intensity forecast information

Info

Publication number
WO2021196743A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
model
forecast
tropical cyclone
information
Prior art date
Application number
PCT/CN2020/136689
Other languages
English (en)
French (fr)
Inventor
刘健
靳晴文
简洪登
杜小平
范湘涛
Original Assignee
中国科学院空天信息创新研究院
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中国科学院空天信息创新研究院
Publication of WO2021196743A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/26 Government or public services
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00 Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10 Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Definitions

  • the invention relates to the technical field of meteorological information processing, in particular to a method and system for generating tropical cyclone intensity forecast information.
  • Tropical cyclones are among the most important catastrophic weather systems and can cause serious economic losses. To address the problem of tropical cyclone intensity forecasting, researchers have established various forecasting methods, the most common of which are weather persistence forecasting, dynamical forecasting, statistical-dynamical forecasting, and ensemble forecasting.
  • The forecast error of statistical-dynamical forecasting is smaller than that of numerical simulation forecasting.
  • However, statistical-dynamical forecasting cannot solve every problem, and large deviations often occur.
  • the present invention provides a method and system for generating tropical cyclone intensity forecast information, so as to achieve the purpose of improving forecast accuracy.
  • a method for generating tropical cyclone intensity forecast information comprising:
  • Preprocessing the to-be-processed data to obtain initial data including climate persistence feature factors, environmental feature factors, and brainstorm feature factors;
  • a preset forecast model is used to make a prediction from the initial data to obtain tropical cyclone intensity forecast information; the forecast model is a model obtained by training on training samples, and the training samples match the initial data.
  • the initial data includes climate persistence feature factors
  • the preprocessing of the to-be-processed data to obtain the initial data includes:
  • the tropical cyclone data is constructed with predictor data to obtain the climate persistence characteristic factor.
  • the initial data includes environmental feature factors
  • the preprocessing of the to-be-processed data to obtain the initial data includes:
  • the environmental information is constructed by adopting a preset construction mode to obtain environmental characteristic factors, and the preset construction mode represents a processing model capable of determining the relationship between various attribute information in the environmental information.
  • the method further includes creating a forecast model, including:
  • sample data including climate persistence feature factors, environmental feature factors, and brainstorm feature factors;
  • the prediction model includes an XGBoost model
  • the use of a preset prediction model to predict the initial data to obtain tropical cyclone intensity forecast information includes:
  • the XGBoost model and the classification regression tree are used to train the initial data to obtain tropical cyclone intensity forecast information.
  • a tropical cyclone intensity forecast information generation system includes:
  • the obtaining unit is used to obtain the to-be-processed data according to different data dimensions
  • the preprocessing unit is configured to preprocess the to-be-processed data to obtain initial data, where the initial data includes climate persistence feature factors, environmental feature factors, and brainstorm feature factors;
  • the prediction unit is configured to predict the initial data by using a preset forecast model to obtain tropical cyclone intensity forecast information.
  • the forecast model represents a model trained by training samples, and the training samples match the initial data.
  • the preprocessing unit includes:
  • the first acquisition subunit is used to acquire tropical cyclone data in the data to be processed
  • the first construction subunit is used to construct the forecast factor data of the tropical cyclone data according to the time information determined by the difference between the current time and the preset time to obtain the climate persistence characteristic factor.
  • the preprocessing unit includes:
  • the second acquiring subunit is used to acquire environmental information in the to-be-processed data
  • the second construction subunit is used to construct the environmental information using a preset construction mode to obtain environmental feature factors, and the preset construction mode represents a processing model that enables the relationship between various attribute information in the environmental information to be determined.
  • the system further includes a creation unit for creating a forecast model, and the creation unit includes:
  • the sample acquisition subunit is used to acquire sample data, the sample data including climate persistence feature factors, environmental feature factors, and brainstorm feature factors;
  • the verification subunit is used to verify the sample data to obtain target sample data, each sample of the target sample data includes parameters that meet specific conditions;
  • the training subunit is used to train the target sample data to obtain a prediction model.
  • the prediction model includes an XGBoost model
  • the prediction unit is specifically configured to:
  • the XGBoost model and the classification regression tree are used to train the initial data to obtain tropical cyclone intensity forecast information.
  • Compared with the prior art, the present invention provides a method and system for generating tropical cyclone intensity forecast information, which acquires to-be-processed data according to different data dimensions, preprocesses the to-be-processed data to obtain initial data, and uses a preset forecast model to make a prediction from the initial data to obtain tropical cyclone intensity forecast information. Because the initial data includes climate persistence feature factors, environmental feature factors, and brainstorm feature factors, the factors that influence tropical cyclones are fully used; combined with the preset forecast model, this makes the forecasting process more intelligent and objective, improves the tropical cyclone forecasting system, and improves forecast accuracy.
  • FIG. 1 is a schematic flowchart of a method for generating tropical cyclone intensity forecast information according to an embodiment of the present invention
  • FIG. 2 is a schematic diagram of a data processing flow provided by an embodiment of the present invention.
  • FIG. 3 is a schematic structural diagram of a tropical cyclone intensity forecast information generating system provided by an embodiment of the present invention.
  • An embodiment of the present invention provides a method for generating tropical cyclone intensity forecast information.
  • the method may include the following steps:
  • the data to be processed represents the currently obtained meteorological data, that is, the tropical cyclone intensity can be forecasted based on the meteorological data.
  • data can be obtained according to different data dimensions. For example, it can be based on meteorological data dimensions, environmental data dimensions, empirical data dimensions, and so on.
  • S102 Perform preprocessing on the data to be processed to obtain initial data.
  • the initial data obtained after processing includes: climate persistence characteristic factors, environmental characteristic factors, and brainstorm characteristic factors.
  • the climate persistence feature factor characterizes the feature factor constructed based on climate information
  • the environmental feature factor characterizes the feature factor constructed based on environmental information
  • the brainstorm feature factor characterizes the feature factor obtained based on the consensus of experts.
  • the forecast model is a model trained based on training samples, which can be used to predict tropical cyclone intensity forecast information. And the training samples of the model match the initial data, so that the initial data can be input to the forecast model to obtain forecast information.
  • The present invention provides a method for generating tropical cyclone intensity forecast information, which acquires to-be-processed data according to different data dimensions, preprocesses the to-be-processed data to obtain initial data, and uses a preset forecast model to make a prediction from the initial data to obtain tropical cyclone intensity forecast information. Because the initial data includes climate persistence feature factors, environmental feature factors, and brainstorm feature factors, the factors that influence tropical cyclones are fully used; combined with the preset forecast model, this makes the forecasting process more intelligent and objective, improves the tropical cyclone forecasting system, and improves forecast accuracy.
  • The basic data in the present invention, that is, the sample data used to train the model, can be derived from existing published data. For example, the CMA-STI best-track dataset for the Pacific region from 1979 to 2017 can be downloaded from the China Meteorological Administration.
  • The dataset includes latitude, longitude, 2-minute mean maximum sustained wind speed (near the tropical cyclone center), intensity category, and minimum pressure (near the tropical cyclone center).
  • In the present invention, a Northwest Pacific tropical cyclone is defined as a tropical cyclone that passed through or formed in the Northwest Pacific region. A tropical cyclone must have a lifetime of at least 48 hours. The study area lies north of the equator and west of 180°E.
  • From the ERA reanalysis data of the European Centre for Medium-Range Weather Forecasts (ECMWF), reanalysis data for the Northwest Pacific region from 1979 to 2017 are downloaded, including relative humidity, zonal wind, meridional wind, relative vorticity, divergence, and temperature at 200, 250, 300, 350, 400, 450, 500, 700, 750, 775, 800, 825, and 850 hPa.
  • The data mainly comprise climate persistence factors (from the best-track dataset obtained from the China Meteorological Administration) and environmental factors (from the atmosphere and ocean datasets derived from the ECMWF ERA reanalysis data).
  • FIG. 2 shows a schematic diagram of a data processing flow provided by an embodiment of the invention.
  • The data processing flow in FIG. 2 includes processing the data, obtaining the predictors, adjusting the model parameters, and running the model to obtain the prediction results. Specifically:
  • the data to be processed is preprocessed to obtain the initial data, including:
  • the tropical cyclone data is constructed with predictor data to obtain the climate persistence characteristic factor.
  • Following the climate persistence forecasting method, the predictors are constructed from the tropical cyclone data at the current time and at 6 h, 12 h, 18 h, and 24 h before the current time.
  • Climate persistence feature factors are tied to a forecast lead time; that is, the corresponding climate persistence feature factors can be constructed according to the desired lead time of the prediction.
  • The forecast lead time, that is, the preset duration mentioned above, can be set flexibly as needed.
  • In this embodiment, 72 climate persistence feature factors that may affect tropical cyclones are constructed, as shown in Table 1:
  • V25, V26, V27, V28: longitude difference between the current time and 6 h, 12 h, 18 h, 24 h earlier
  • V29, V30, V31, V32: pressure difference between the current time and 6 h, 12 h, 18 h, 24 h earlier
  • V33, V34, V35, V36: central wind speed difference between the current time and 6 h, 12 h, 18 h, 24 h earlier
  • V37, V38, V39, V40: zonal movement speed between the current time and 6 h, 12 h, 18 h, 24 h earlier
  • V41, V42, V43, V44: meridional movement speed between the current time and 6 h, 12 h, 18 h, 24 h earlier
  • V45, V46, V47, V48: resultant movement speed between the current time and 6 h, 12 h, 18 h, 24 h earlier
  • the preprocessing of the to-be-processed data to obtain the initial data includes:
  • the environmental information is constructed by adopting a preset construction mode to obtain environmental characteristic factors, and the preset construction mode represents a processing model capable of determining the relationship between various attribute information in the environmental information.
  • In selecting the environmental predictors, relative humidity, zonal wind, meridional wind, relative vorticity, divergence, and temperature are selected from the output of the numerical forecast model.
  • 24 environmental factors that may affect tropical cyclones are constructed as the predictors entered into the model.
  • The environmental factors all come from the reanalysis data of the European Centre for Medium-Range Weather Forecasts (ECMWF).
  • The data are on a 1° × 1° grid at 6-hour intervals.
  • The environmental factors are relative humidity, zonal wind, meridional wind, relative vorticity, divergence, and temperature at 200, 250, 300, 350, 400, 450, 500, 700, 750, 800, and 850 hPa. Divergence and relative humidity are computed at every grid point, using different methods based on the wind-field information and the cyclone center. Every environmental predictor is an average over one of several different radii.
  • The maximum value of the maximum potential intensity (MPI) is 80 m/s.
  • Feature engineering is a superset of activities that includes feature extraction and feature selection. Every step matters and none should be skipped. As a rule of thumb, the relative importance of the steps is: feature construction > feature extraction > feature selection.
  • The brainstorm factors correspond to feature construction. Brainstorming refers to spontaneous group discussion aimed at solving a problem or generating good ideas. To predict tropical cyclone intensity accurately, the present invention extracts several key features from a large body of published studies.
  • The potential predictors in traditional statistical typhoon intensity forecasting schemes include several quadratic terms and cosine functions. The scheme therefore includes the cosine of the latitude at the current time, and the square and the cube of the 2-minute mean maximum sustained wind speed near the tropical cyclone center at the current time, among others. Table 3 lists the 59 brainstorm factors.
  • an embodiment of the present invention also provides a method for creating a forecast model, and the method may include:
  • sample data including climate persistence feature factors, environmental feature factors, and brainstorm feature factors;
  • the best parameter combination can be selected through a specific function, so that the training of the model is more accurate.
  • the prediction model includes an XGBoost model
  • the use of a preset prediction model to predict the initial data to obtain tropical cyclone intensity forecast information includes:
  • the XGBoost model and the classification regression tree are used to train the initial data to obtain tropical cyclone intensity forecast information.
  • the XGBoost model can be used to predict energy consumption, traffic volume at intersections, image classification and other scenarios.
  • the XGBoost model, combining $M$ classification and regression trees, is expressed as $\{T_1(x_i, y_i), \ldots, T_M(x_i, y_i)\}$
  • it is trained on the predictors $x_i$ related to a tropical cyclone to forecast the future intensity $y'_i$:
  • $f_m$ is a tree, and $F$ represents the space of CART.
  • regularization can be used. The formula is as follows:
  • the loss function measures the difference between the predicted result $y'_i$ and the actual result $y_i$
  • $\tau$ represents a regularization parameter
  • $N$ represents the number of leaf nodes
  • is the leaf node score
  • are used to describe the level of regularization.
  • subsampling can also prevent overfitting.
  • at step $t$, one term represents the first derivative of the loss function and the other represents the second derivative of the loss function.
  • $I_j = \{\, i \mid q(x_i) = j \,\}$ is the instance set of leaf node $j$
  • q(x) is the optimized leaf node weight
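For orientation, the tree ensemble, regularized objective, and leaf-weight solution described in the preceding lines can be written in the notation commonly used in the general XGBoost literature. This is a sketch under that assumption, not the patent's own equations: the symbols $\gamma$, $\lambda$, $T$, $w_j$, $g_i$, and $h_i$ are the conventional ones, $\hat{y}_i$ stands for the forecast written $y'_i$ in the text, and the patent's symbols (for example $\tau$ for a regularization parameter and $N$ for the number of leaf nodes) may map onto them differently.

```latex
\begin{align}
\hat{y}_i &= \sum_{m=1}^{M} f_m(x_i), \qquad f_m \in F \\
\mathcal{L} &= \sum_{i} l(\hat{y}_i, y_i) + \sum_{m=1}^{M} \Omega(f_m), \qquad
\Omega(f) = \gamma T + \tfrac{1}{2}\,\lambda \sum_{j=1}^{T} w_j^{2} \\
g_i &= \frac{\partial\, l\bigl(y_i, \hat{y}_i^{(t-1)}\bigr)}{\partial\, \hat{y}_i^{(t-1)}}, \qquad
h_i = \frac{\partial^{2} l\bigl(y_i, \hat{y}_i^{(t-1)}\bigr)}{\partial \bigl(\hat{y}_i^{(t-1)}\bigr)^{2}} \\
I_j &= \{\, i \mid q(x_i) = j \,\}, \qquad
w_j^{*} = -\,\frac{\sum_{i \in I_j} g_i}{\sum_{i \in I_j} h_i + \lambda}
\end{align}
```

In this convention $T$ is the number of leaves of a tree, $w_j$ is the score of leaf $j$, and $q$ maps a sample to the leaf it falls into.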
  • the XGBoost model can accomplish this task. It has a variety of tunable parameters. This experiment is limited to running XGBoost in RStudio.
  • the eta parameter shrinks the feature weights at each update step, which makes the boosting process more conservative and helps prevent overfitting.
  • the gamma parameter is the minimum loss reduction required to further partition a leaf node of the tree.
  • the max_depth parameter is the maximum depth of a tree.
  • the min_child_weight parameter is the minimum sum of instance weights required in a child node.
  • the subsample parameter is the fraction of the observations sampled for each tree.
  • the colsample_bytree parameter is the fraction of variables (columns) used to construct each tree.
  • the extracted tropical cyclone sample database is a two-dimensional matrix
  • the XGBoost model can predict accurately from two-dimensional matrix data, but it is difficult to train all trees at once when optimizing the objective function by gradient descent, so the particle swarm algorithm is used to find the best result.
  • XGBoost is a boosted-tree algorithm that supports multi-threaded parallel computation and generates new trees iteratively, one round after another. In effect, it combines many weak learners with low individual performance into a strong learner with high accuracy. A single decision tree may not perform well, but combining the results of many trees generally yields more accurate predictions.
  • In essence, XGBoost builds K regression trees so that accuracy and generalization are good and the prediction error is as small as possible; an objective function that also keeps the number of leaf nodes small trains a better model, and it is solved by particle swarm optimization and quadratic optimization.
  • The optimal node and the smallest loss are used as the basis for splitting a tree, which yields a small tree. Splitting then continues in the same way to form new trees, each built on the predictions made so far. Iteration stops when the depth limit is reached, which gives the basic model; methods such as grid search are then used to tune the overall parameters.
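The idea of adding trees one round at a time, each built on the predictions made so far, can be illustrated with a deliberately simplified sketch. It is not XGBoost itself: it assumes a single feature, squared-error loss, depth-1 trees, and no regularization or second-order terms, and the names `fit_stump`, `boost`, and `mean_abs_error` are hypothetical.

```python
from typing import Callable, List

Tree = Callable[[float], float]


def fit_stump(x: List[float], residual: List[float]) -> Tree:
    """Fit a depth-1 regression tree (one split) by least squares."""
    best = None  # (sum of squared errors, threshold, left mean, right mean)
    for threshold in sorted(set(x))[:-1]:  # skip the max so both sides are non-empty
        left = [r for xi, r in zip(x, residual) if xi <= threshold]
        right = [r for xi, r in zip(x, residual) if xi > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:  # every x is identical, so no split is possible
        mean = sum(residual) / len(residual)
        return lambda xi: mean
    _, threshold, left_mean, right_mean = best
    return lambda xi: left_mean if xi <= threshold else right_mean


def boost(x: List[float], y: List[float], n_trees: int = 50, eta: float = 0.1) -> Tree:
    """Add stumps one at a time, each fitted to the current residuals."""
    base = sum(y) / len(y)  # start from the mean of the targets
    trees: List[Tree] = []
    pred = [base] * len(y)
    for _ in range(n_trees):
        residual = [yi - pi for yi, pi in zip(y, pred)]
        tree = fit_stump(x, residual)
        trees.append(tree)
        # eta shrinks each tree's contribution, as the eta parameter does
        pred = [pi + eta * tree(xi) for xi, pi in zip(x, pred)]
    return lambda xi: base + eta * sum(tree(xi) for tree in trees)


def mean_abs_error(model: Tree, x: List[float], y: List[float]) -> float:
    return sum(abs(model(xi) - yi) for xi, yi in zip(x, y)) / len(y)
```

Each stump alone is a weak learner; the sum of many shrunken stumps fits the training data far better than a single one.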
  • The correlation coefficient (CC), mean absolute error (MAE), and normalized root mean square error (NRMSE, expressed as a percentage) are selected as the metrics for evaluating the XGBoost model in the training and testing phases.
  • $y_{\mathrm{obs},i}$ is the observed value of the $i$-th sample
  • $y_{\mathrm{fore},i}$ is the predicted value of the $i$-th sample
  • $N$ is the number of all forecast samples, $\bar{y}_{\mathrm{obs}}$ is the average of the observed values, and $\bar{y}_{\mathrm{fore}}$ is the average of the predicted values.
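The three evaluation metrics can be sketched in a few lines. The exact formulas are not reproduced in the text, so this sketch assumes the usual Pearson correlation coefficient and MAE, and assumes that the NRMSE is the RMSE divided by the mean of the observations; other normalizations (for example by the observed range) are also in common use. The function names are hypothetical.

```python
import math
from typing import Sequence


def _mean(values: Sequence[float]) -> float:
    return sum(values) / len(values)


def correlation_coefficient(obs: Sequence[float], fore: Sequence[float]) -> float:
    """Pearson correlation between observed and forecast values."""
    mean_obs, mean_fore = _mean(obs), _mean(fore)
    cov = sum((o - mean_obs) * (f - mean_fore) for o, f in zip(obs, fore))
    spread_obs = math.sqrt(sum((o - mean_obs) ** 2 for o in obs))
    spread_fore = math.sqrt(sum((f - mean_fore) ** 2 for f in fore))
    return cov / (spread_obs * spread_fore)


def mean_absolute_error(obs: Sequence[float], fore: Sequence[float]) -> float:
    """Average absolute difference between forecast and observation."""
    return sum(abs(f - o) for o, f in zip(obs, fore)) / len(obs)


def nrmse_percent(obs: Sequence[float], fore: Sequence[float]) -> float:
    """RMSE divided by the mean observation, as a percentage (assumed normalization)."""
    rmse = math.sqrt(sum((f - o) ** 2 for o, f in zip(obs, fore)) / len(obs))
    return 100.0 * rmse / _mean(obs)
```

For example, with observations `[10, 20, 30, 40]` and forecasts `[12, 18, 33, 37]` in m/s, `mean_absolute_error` averages the absolute errors 2, 2, 3, and 3.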
  • the input parameters are the climate persistence factors, environmental factors, and brainstorm factors.
  • the output is the tropical cyclone intensity at forecast lead times of 6, 12, 18, and 24 hours.
  • with eta, gamma, max_depth, min_child_weight, subsample, and colsample_bytree searched over (0.01, 0.1, 1), (0.1, 0.5, 0.8), (2, 4, 6, 8), (2, 4, 8), 0.8, and 0.95 respectively, the best XGBoost performance was obtained. This setting results in 108 parameter combinations (3 × 3 × 4 × 3).
  • cross-validation is used to select the best parameter combination.
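The 108 combinations follow directly from the candidate values listed above. A minimal sketch that enumerates them is shown here; the dictionary keys use the underscore spelling of the XGBoost parameter names, and `GRID` and `parameter_combinations` are hypothetical names. Scoring each combination by cross-validation is not shown.

```python
from itertools import product
from typing import Dict, List, Sequence

# Candidate values as listed in the text; subsample and colsample_bytree are fixed.
GRID: Dict[str, Sequence[float]] = {
    "eta": (0.01, 0.1, 1),
    "gamma": (0.1, 0.5, 0.8),
    "max_depth": (2, 4, 6, 8),
    "min_child_weight": (2, 4, 8),
    "subsample": (0.8,),
    "colsample_bytree": (0.95,),
}


def parameter_combinations(grid: Dict[str, Sequence[float]]) -> List[Dict[str, float]]:
    """Return one dict per point of the Cartesian product of the candidate values."""
    names = list(grid)
    return [dict(zip(names, values)) for values in product(*(grid[n] for n in names))]


combos = parameter_combinations(GRID)
print(len(combos))  # 3 * 3 * 4 * 3 * 1 * 1 = 108
```

Each resulting dictionary is one candidate setting that a cross-validation loop would train and score.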
  • For comparison, a back-propagation neural network (BPNN) is used to predict tropical cyclone intensity with the same sample input parameters.
  • the 24-hour result of the BPNN method is 4.57 m/s.
  • the 24-hour lead-time MAE of the model of the present invention is 3.70 m/s.
  • given the same samples, the prediction results of the XGBoost model are better than those of the BPNN model.
  • the XGBoost model has a simple training process, low computational cost, and fast convergence, so it is well suited to predicting tropical cyclone intensity. This finding supports the use of the XGBoost model as a new method for predicting tropical cyclone intensity within 24 hours.
  • an embodiment of the present invention also provides a tropical cyclone intensity forecast information generating system, which includes:
  • the obtaining unit 10 is configured to obtain data to be processed according to different data dimensions
  • the preprocessing unit 20 is configured to preprocess the to-be-processed data to obtain initial data, where the initial data includes climate persistence feature factors, environmental feature factors, and brainstorm feature factors;
  • the prediction unit 30 is configured to predict the initial data by using a preset forecast model to obtain tropical cyclone intensity forecast information, the forecast model being a model obtained through training with training samples, and the training samples matching the initial data.
  • the preprocessing unit includes:
  • the first acquisition subunit is used to acquire tropical cyclone data in the data to be processed
  • the first construction subunit is used to construct the forecast factor data of the tropical cyclone data according to the time information determined by the difference between the current time and the preset time to obtain the climate persistence characteristic factor.
  • the preprocessing unit includes:
  • the second acquiring subunit is used to acquire environmental information in the to-be-processed data
  • the second construction subunit is used to construct the environmental information using a preset construction mode to obtain environmental feature factors, and the preset construction mode represents a processing model that enables the relationship between various attribute information in the environmental information to be determined.
  • the system further includes a creating unit for creating a forecast model, and the creating unit includes:
  • the sample acquisition subunit is used to acquire sample data, the sample data including climate persistence feature factors, environmental feature factors, and brainstorm feature factors;
  • the verification subunit is used to verify the sample data to obtain target sample data, each sample of the target sample data includes parameters that meet specific conditions;
  • the training subunit is used to train the target sample data to obtain a prediction model.
  • the prediction model includes an XGBoost model, and the prediction unit is specifically configured to:
  • the XGBoost model and the classification regression tree are used to train the initial data to obtain tropical cyclone intensity forecast information.
  • The present invention provides a tropical cyclone intensity forecast information generation system. An acquisition unit acquires to-be-processed data according to different data dimensions, a preprocessing unit preprocesses the to-be-processed data to obtain initial data, and a prediction unit uses a preset forecast model to make a prediction from the initial data to obtain tropical cyclone intensity forecast information. Because the initial data includes climate persistence feature factors, environmental feature factors, and brainstorm feature factors, the factors that influence tropical cyclones are fully used; combined with the preset forecast model, this makes the forecasting process more intelligent and objective, improves the tropical cyclone forecasting system, and improves forecast accuracy.

Abstract

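A method and system for generating tropical cyclone intensity forecast information: to-be-processed data are acquired according to different data dimensions (S101); the to-be-processed data are preprocessed to obtain initial data (S102); and a preset forecast model is used to make a prediction from the initial data to obtain tropical cyclone intensity forecast information (S103). Because the initial data include climate persistence feature factors, environmental feature factors, and brainstorm feature factors, the factors that influence tropical cyclones are fully used; combined with the preset forecast model, this makes the forecasting process more intelligent and objective, improves the tropical cyclone forecasting system, and improves forecast accuracy.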

Description

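Method and system for generating tropical cyclone intensity forecast information
Technical Field
The present invention relates to the technical field of meteorological information processing, and in particular to a method and system for generating tropical cyclone intensity forecast information.
Background
Tropical cyclones are among the most important catastrophic weather systems and can cause serious economic losses. To address the problem of tropical cyclone intensity forecasting, researchers have established various forecasting methods, the most common of which are weather persistence forecasting, dynamical forecasting, statistical-dynamical forecasting, and ensemble forecasting. The forecast error of statistical-dynamical forecasting is smaller than that of numerical simulation forecasting; however, statistical-dynamical forecasting cannot solve every problem, and large deviations often occur.
Although machine-learning-based methods for statistical-dynamical prediction of tropical cyclone intensity now exist, they also face many difficulties, for example because of the shortcomings of the machine learning models themselves. Existing schemes for predicting tropical cyclone intensity therefore all suffer from low forecast accuracy.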
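Summary of the Invention
In view of the above problems, the present invention provides a method and system for generating tropical cyclone intensity forecast information that improve forecast accuracy.
To achieve the above objective, the present invention provides the following technical solutions:
A method for generating tropical cyclone intensity forecast information, the method comprising:
acquiring to-be-processed data according to different data dimensions;
preprocessing the to-be-processed data to obtain initial data, the initial data including climate persistence feature factors, environmental feature factors, and brainstorm feature factors;
using a preset forecast model to make a prediction from the initial data to obtain tropical cyclone intensity forecast information, the forecast model being a model obtained by training on training samples, and the training samples matching the initial data.
Optionally, the initial data includes climate persistence feature factors, and preprocessing the to-be-processed data to obtain the initial data includes:
acquiring the tropical cyclone data in the to-be-processed data;
constructing predictor data from the tropical cyclone data according to time information determined by the difference between the current time and a preset time, to obtain the climate persistence feature factors.
Optionally, the initial data includes environmental feature factors, and preprocessing the to-be-processed data to obtain the initial data includes:
acquiring the environmental information in the to-be-processed data;
constructing the environmental information using a preset construction mode to obtain the environmental feature factors, the preset construction mode being a processing model that determines the relationships among the attributes in the environmental information.
Optionally, the method further includes creating a forecast model, including:
acquiring sample data, the sample data including climate persistence feature factors, environmental feature factors, and brainstorm feature factors;
verifying the sample data to obtain target sample data, each sample of the target sample data including parameters that meet specific conditions;
training on the target sample data to obtain the forecast model.
Optionally, the prediction model includes an XGBoost model, and using the preset forecast model to make a prediction from the initial data to obtain tropical cyclone intensity forecast information includes:
generating classification and regression trees from the target sample data;
using the XGBoost model and the classification and regression trees to train on the initial data to obtain tropical cyclone intensity forecast information.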
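A system for generating tropical cyclone intensity forecast information, the system comprising:
an acquisition unit, configured to acquire to-be-processed data according to different data dimensions;
a preprocessing unit, configured to preprocess the to-be-processed data to obtain initial data, the initial data including climate persistence feature factors, environmental feature factors, and brainstorm feature factors;
a prediction unit, configured to use a preset forecast model to make a prediction from the initial data to obtain tropical cyclone intensity forecast information, the forecast model being a model obtained by training on training samples, and the training samples matching the initial data.
Optionally, the preprocessing unit includes:
a first acquisition subunit, configured to acquire the tropical cyclone data in the to-be-processed data;
a first construction subunit, configured to construct predictor data from the tropical cyclone data according to time information determined by the difference between the current time and a preset time, to obtain the climate persistence feature factors.
Optionally, the preprocessing unit includes:
a second acquisition subunit, configured to acquire the environmental information in the to-be-processed data;
a second construction subunit, configured to construct the environmental information using a preset construction mode to obtain the environmental feature factors, the preset construction mode being a processing model that determines the relationships among the attributes in the environmental information.
Optionally, the system further includes a creation unit configured to create a forecast model, the creation unit including:
a sample acquisition subunit, configured to acquire sample data, the sample data including climate persistence feature factors, environmental feature factors, and brainstorm feature factors;
a verification subunit, configured to verify the sample data to obtain target sample data, each sample of the target sample data including parameters that meet specific conditions;
a training subunit, configured to train on the target sample data to obtain the forecast model.
Optionally, the prediction model includes an XGBoost model, and the prediction unit is specifically configured to:
generate classification and regression trees from the target sample data;
use the XGBoost model and the classification and regression trees to train on the initial data to obtain tropical cyclone intensity forecast information.
Compared with the prior art, the present invention provides a method and system for generating tropical cyclone intensity forecast information, which acquires to-be-processed data according to different data dimensions, preprocesses the to-be-processed data to obtain initial data, and uses a preset forecast model to make a prediction from the initial data to obtain tropical cyclone intensity forecast information. Because the initial data includes climate persistence feature factors, environmental feature factors, and brainstorm feature factors, the factors that influence tropical cyclones are fully used; combined with the preset forecast model, this makes the forecasting process more intelligent and objective, improves the tropical cyclone forecasting system, and improves forecast accuracy.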
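Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are merely embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 is a schematic flowchart of a method for generating tropical cyclone intensity forecast information according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a data processing flow provided by an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of a tropical cyclone intensity forecast information generation system provided by an embodiment of the present invention.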
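Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
The terms "first", "second", and the like in the specification, claims, and drawings of the present invention are used to distinguish different objects, not to describe a particular order. In addition, the terms "include" and "have" and any variants thereof are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that comprises a series of steps or units is not limited to the listed steps or units and may include steps or units that are not listed.
An embodiment of the present invention provides a method for generating tropical cyclone intensity forecast information. Referring to FIG. 1, the method may include the following steps:
S101. Acquire to-be-processed data according to different data dimensions.
The to-be-processed data represent the currently obtained meteorological data, from which the tropical cyclone intensity can be forecast. To facilitate processing and analysis, the data can be acquired according to different data dimensions, for example a meteorological data dimension, an environmental data dimension, and an empirical data dimension.
S102. Preprocess the to-be-processed data to obtain initial data.
Because the acquired to-be-processed data may have inconsistent formats or inaccurate time dimensions, they need to be preprocessed. In addition, because the initial data are to be input into the forecast model for prediction, the to-be-processed data must be preprocessed into a form the model can handle. The initial data obtained after processing include climate persistence feature factors, environmental feature factors, and brainstorm feature factors. The climate persistence feature factors are constructed from climate information, the environmental feature factors are constructed from environmental information, and the brainstorm feature factors are obtained from expert consensus.
S103. Use a preset forecast model to make a prediction from the initial data to obtain tropical cyclone intensity forecast information.
The forecast model is a model obtained by training on training samples and can be used to predict tropical cyclone intensity forecast information. The training samples of the model match the initial data, so the initial data can be input into the forecast model to obtain the forecast information.
The present invention provides a method for generating tropical cyclone intensity forecast information, which acquires to-be-processed data according to different data dimensions, preprocesses the to-be-processed data to obtain initial data, and uses a preset forecast model to make a prediction from the initial data to obtain tropical cyclone intensity forecast information. Because the initial data includes climate persistence feature factors, environmental feature factors, and brainstorm feature factors, the factors that influence tropical cyclones are fully used; combined with the preset forecast model, this makes the forecasting process more intelligent and objective, improves the tropical cyclone forecasting system, and improves forecast accuracy.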
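The data processing and prediction process in the embodiments of the present invention is described in detail below.
The basic data in the present invention, that is, the sample data used to train the model, can be derived from existing published data. For example, the CMA-STI best-track dataset for the Pacific region from 1979 to 2017 can be downloaded from the China Meteorological Administration. The dataset includes latitude, longitude, 2-minute mean maximum sustained wind speed (near the tropical cyclone center), intensity category, and minimum pressure (near the tropical cyclone center). In addition, in the present invention a Northwest Pacific tropical cyclone is defined as a tropical cyclone that passed through or formed in the Northwest Pacific region. A tropical cyclone must have a lifetime of at least 48 hours. The study area lies north of the equator and west of 180°E.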
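The sample-selection rule stated above (a lifetime of at least 48 hours, and a track that passes through or forms in the study area) can be sketched as a simple filter. The record layout, the names `TrackPoint`, `in_study_area`, and `is_eligible`, and the 0° to 360° east longitude convention are assumptions made for illustration; the text gives no western boundary for the study area, so none is applied here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TrackPoint:
    hours: float  # hours elapsed since the first best-track record
    lat: float    # degrees north (negative values are south of the equator)
    lon: float    # degrees east, assumed to lie in [0, 360)


def lifetime_hours(track: List[TrackPoint]) -> float:
    """Time between the first and last records of a track."""
    return track[-1].hours - track[0].hours


def in_study_area(point: TrackPoint) -> bool:
    """North of the equator and west of 180°E."""
    return point.lat > 0.0 and point.lon < 180.0


def is_eligible(track: List[TrackPoint]) -> bool:
    """Keep storms that last at least 48 h and enter the study area at some point."""
    return lifetime_hours(track) >= 48.0 and any(in_study_area(p) for p in track)
```

A storm recorded every 6 hours from hour 0 to hour 48 at 15°N, 140°E would pass this filter, while the same storm truncated at hour 18 would not.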
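From the ERA reanalysis data of the European Centre for Medium-Range Weather Forecasts (ECMWF), reanalysis data for the Northwest Pacific region from 1979 to 2017 are downloaded, including relative humidity, zonal wind, meridional wind, relative vorticity, divergence, and temperature at 200, 250, 300, 350, 400, 450, 500, 700, 750, 775, 800, 825, and 850 hPa. The forecasting scheme predicts statistical-dynamical tropical cyclone intensity based on the XGBoost model. The data mainly comprise climate persistence factors (from the best-track dataset obtained from the China Meteorological Administration) and environmental factors (from the atmosphere and ocean datasets derived from the ECMWF ERA reanalysis data).
FIG. 2 shows a schematic diagram of a data processing flow provided by an embodiment of the invention. The data processing flow in FIG. 2 includes processing the data, obtaining the predictors, adjusting the model parameters, and running the model to obtain the prediction results. Specifically: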
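When the initial data includes climate persistence feature factors, preprocessing the to-be-processed data to obtain the initial data includes:
acquiring the tropical cyclone data in the to-be-processed data;
constructing predictor data from the tropical cyclone data according to time information determined by the difference between the current time and a preset time, to obtain the climate persistence feature factors.
Following the climate persistence forecasting method, predictors are constructed from the tropical cyclone data in the track sample data above at the current time and at 6 h, 12 h, 18 h, and 24 h before the current time. Specifically, 72 climate persistence feature factors that may affect tropical cyclones are constructed and stored in a separate sample file.
Note that climate persistence feature factors are tied to a forecast lead time; that is, the corresponding climate persistence feature factors can be constructed according to the desired lead time of the prediction. The forecast lead time, that is, the preset duration mentioned above, can be set flexibly as needed. In this embodiment, 72 climate persistence feature factors that may affect tropical cyclones are constructed, as shown in Table 1: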
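Table 1 Climatology-persistence predictors
Factor code    Meaning
V1, V2, V3, V4    Latitude, longitude, central pressure and maximum central wind speed at the current time
V5, V6, V7, V8    Latitude, longitude, central pressure and maximum central wind speed 6 h earlier
V9, V10, V11, V12    Latitude, longitude, central pressure and maximum central wind speed 12 h earlier
V13, V14, V15, V16    Latitude, longitude, central pressure and maximum central wind speed 18 h earlier
V17, V18, V19, V20    Latitude, longitude, central pressure and maximum central wind speed 24 h earlier
V21, V22, V23, V24    Latitude difference between the current time and 6 h, 12 h, 18 h, 24 h earlier
V25, V26, V27, V28    Longitude difference between the current time and 6 h, 12 h, 18 h, 24 h earlier
V29, V30, V31, V32    Central pressure difference between the current time and 6 h, 12 h, 18 h, 24 h earlier
V33, V34, V35, V36    Central wind speed difference between the current time and 6 h, 12 h, 18 h, 24 h earlier
V37, V38, V39, V40    Zonal translation speed between the current time and 6 h, 12 h, 18 h, 24 h earlier
V41, V42, V43, V44    Meridional translation speed between the current time and 6 h, 12 h, 18 h, 24 h earlier
V45, V46, V47, V48    Resultant translation speed between the current time and 6 h, 12 h, 18 h, 24 h earlier
V49, V50, V51, V52    Zonal acceleration between the current time and 6 h, 12 h, 18 h, 24 h earlier
V53, V54, V55, V56    Meridional acceleration between the current time and 6 h, 12 h, 18 h, 24 h earlier
V57, V58, V59, V60    Resultant acceleration between the current time and 6 h, 12 h, 18 h, 24 h earlier
V61, V62, V63, V64    Zonal displacement between the current time and 6 h, 12 h, 18 h, 24 h earlier
V65, V66, V67, V68    Meridional displacement between the current time and 6 h, 12 h, 18 h, 24 h earlier
V69, V70, V71, V72    Resultant displacement between the current time and 6 h, 12 h, 18 h, 24 h earlier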
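The construction of the Table 1 factors can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `persistence_factors`, the key names, the flat-Earth distance conversion and the km and km/h units are all hypothetical, since the source does not give the exact formulas. The acceleration factors (V49-V60) are omitted because their definition is not stated, and date-line crossings are not handled.

```python
import math

LAGS_H = (6, 12, 18, 24)     # lags listed in Table 1
KM_PER_DEG_LAT = 111.195     # assumed length of one degree of latitude, in km
FIELDS = ("lat", "lon", "pres", "wind")


def persistence_factors(track):
    """Build a subset of the Table 1 factors (hypothetical helper).

    track: list of (lat_deg, lon_deg, central_pressure, max_wind) tuples at
    6-h steps, oldest first, with the current time as the last element.
    """
    if len(track) < 5:
        raise ValueError("need the current record plus four earlier 6-h records")
    current = track[-1]
    factors = {}
    # Raw values at the current time and at each lag (V1-V20).
    for step, lag in enumerate((0,) + LAGS_H):
        for name, value in zip(FIELDS, track[-1 - step]):
            factors[f"{name}_t-{lag}h"] = value
    for step, lag in enumerate(LAGS_H, start=1):
        past = track[-1 - step]
        # Differences between the current time and each lag (V21-V36).
        for name, now, then in zip(FIELDS, current, past):
            factors[f"d_{name}_{lag}h"] = now - then
        # Displacements in km, using an assumed flat-Earth approximation.
        mean_lat = math.radians((current[0] + past[0]) / 2.0)
        zonal = (current[1] - past[1]) * KM_PER_DEG_LAT * math.cos(mean_lat)
        meridional = (current[0] - past[0]) * KM_PER_DEG_LAT
        total = math.hypot(zonal, meridional)
        # Store displacements (V61-V72) and mean speeds in km/h (V37-V48).
        for name, dist in (("zonal", zonal), ("merid", meridional), ("total", total)):
            factors[f"{name}_disp_{lag}h"] = dist
            factors[f"{name}_speed_{lag}h"] = dist / lag
    return factors
```

Called with five consecutive 6-hourly best-track records, the function returns 60 of the 72 factors; the remaining 12 acceleration factors would be added once their definition is fixed.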
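When the initial data includes environmental feature factors, preprocessing the data to be processed to obtain the initial data includes:
acquiring the environmental information in the data to be processed;
constructing the environmental information using a preset construction mode to obtain the environmental feature factors, the preset construction mode representing a processing model capable of determining the relationships among the attribute information in the environmental information.
In selecting the environmental predictors, the predictors of traditional international statistical-dynamical models are followed: relative humidity, zonal wind, meridional wind, relative vorticity, divergence and temperature are selected from the output of numerical forecast models, and the "perfect prognosis" (PerfectProg) method is used to construct 24 environmental factors that may affect the tropical cyclone, which serve as the predictors entered into the model.
All environmental factors come from the ECMWF reanalysis data. The data have a resolution of 1° × 1° at 6-h intervals. The environmental factors are relative humidity, zonal wind, meridional wind, relative vorticity, divergence and temperature at 200, 250, 300, 350, 400, 450, 500, 700, 750, 800 and 850 hPa. Divergence and relative humidity are computed at every grid point, using the wind field information and different methods relative to the cyclone center. All environmental predictors are averages taken over different radii.
The sea surface temperature (SST) selected from the ERA reanalysis dataset is interpolated to the tropical cyclone center along the 1979-2017 best tracks to determine the relationship between SST and intensity. The resulting relationship for the maximum potential intensity (MPI) is MPI = A + B·e^(C(T − T_0)), where T is the sea surface temperature, A = 18.42 m/s, B = 51.47 m/s, C = 0.09687 °C^-1 and T_0 = 30.0 °C. The maximum MPI is 80 m/s.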
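As a worked illustration of the MPI relationship above, the following Python sketch evaluates the formula with the stated coefficients. The function name `mpi` and the way the 80 m/s maximum is applied (as a simple cap) are assumptions made for this example.

```python
import math

# Coefficients as stated in the text.
A = 18.42      # m/s
B = 51.47      # m/s
C = 0.09687    # 1/degC
T0 = 30.0      # degC
MPI_MAX = 80.0 # m/s, stated maximum MPI (applied here as a cap, an assumption)


def mpi(sst_c):
    """Maximum potential intensity (m/s) for a sea surface temperature in degC."""
    return min(A + B * math.exp(C * (sst_c - T0)), MPI_MAX)
```

At T = T_0 the exponential equals 1, so the formula gives A + B = 69.89 m/s; for much warmer water the cap of 80 m/s applies.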
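Table 2 Environmental predictors
Figure PCTCN2020136689-appb-000001
When building the forecast model, every feature needs to be analyzed; this analysis process is feature engineering. Feature engineering is a superset of a group of activities that includes feature extraction and feature selection. Each step is important and should not be neglected. As a rule of thumb, the relative importance of the steps follows the order: feature construction > feature extraction > feature selection. The brainstorm factors correspond to feature construction, which derives features from the raw data without regard to their importance. Brainstorming refers to a spontaneous group discussion aimed at solving a problem or generating good ideas. To predict tropical cyclone intensity accurately, the present invention extracts several key features from an extensive literature survey. The potential predictors in traditional statistical typhoon intensity prediction schemes include several quadratic terms and cosine functions. The scheme therefore includes the cosine of the latitude at the current time, the square of the 2-minute mean maximum sustained wind near the tropical cyclone center at the current time, the cube of that wind, and so on. Table 3 lists the 59 brainstorm factors.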
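The three brainstorm factors named in the text can be computed as in the short Python sketch below. The function and key names are hypothetical, and only the three factors explicitly mentioned are shown; the remaining factors of Table 3 are not reproduced here.

```python
import math


def brainstorm_factors(lat_deg, vmax):
    """Derived features named in the text (hypothetical helper).

    lat_deg: latitude at the current time, in degrees.
    vmax: 2-minute mean maximum sustained wind near the center at the current time.
    """
    return {
        "cos_lat": math.cos(math.radians(lat_deg)),  # cosine of the latitude
        "vmax_sq": vmax ** 2,                        # square of the wind
        "vmax_cu": vmax ** 3,                        # cube of the wind
    }
```

Such nonlinear transforms let the model use relationships that a linear combination of the raw track variables would not express directly.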
Table 3 Brainstorm predictors
Figure PCTCN2020136689-appb-000002
Figure PCTCN2020136689-appb-000003
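Correspondingly, an embodiment of the present invention further provides a method for creating the forecast model, which may include:
acquiring sample data, the sample data including climatology-persistence feature factors, environmental feature factors and brainstorm feature factors;
validating the sample data to obtain target sample data, each sample of the target sample data including parameters that satisfy specific conditions;
training on the target sample data to obtain the forecast model.
Validating the sample data may use a specific function to select the best parameter combination, so that the model is trained more accurately.
Correspondingly, the forecast model includes an XGBoost model, and using the preset forecast model to predict from the initial data and obtain tropical cyclone intensity forecast information includes:
generating classification and regression trees from the target sample data;
using the XGBoost model and the classification and regression trees to train on the initial data and obtain the tropical cyclone intensity forecast information.
The forecast model is described in detail below, taking the XGBoost model as an example.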
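The XGBoost model can be used in scenarios such as predicting energy consumption, forecasting traffic volume at intersections and image classification. The XGBoost model combines M classification and regression trees (CART), denoted {T_1(x_i, y_i), …, T_M(x_i, y_i)}, which are trained on the tropical-cyclone-related predictors x_i to forecast the future intensity y'_i: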
Figure PCTCN2020136689-appb-000004
Here f_m is a tree and F denotes the space of CARTs. To avoid overfitting, regularization can be used, expressed as follows:
Figure PCTCN2020136689-appb-000005
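Here l denotes the loss function, which measures the difference between the actual result y_i and the forecast result y'_i, and τ denotes the regularization term.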
Figure PCTCN2020136689-appb-000006
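N denotes the number of leaf nodes, α is the leaf node score, and
Figure PCTCN2020136689-appb-000007
and θ describe the level of regularization. Besides the regularization term, subsampling can also prevent overfitting.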
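For orientation, the model, objective and regularization term described above can be written in the standard XGBoost form shown below. This is a sketch that assumes the equations shown as images follow the usual formulation; the symbol \gamma for the second regularization coefficient is an assumption, because that symbol appears only in an equation image.

```latex
\begin{align}
y'_i &= \sum_{m=1}^{M} f_m(x_i), \qquad f_m \in F \\
L &= \sum_{i} l\left(y_i, y'_i\right) + \sum_{m=1}^{M} \tau(f_m) \\
\tau(f) &= \gamma N + \tfrac{1}{2}\,\theta \sum_{j=1}^{N} \alpha_j^{2}
\end{align}
```

In this form, \gamma penalizes the number of leaves N and \theta penalizes large leaf scores \alpha_j.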
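In the XGBoost model, the results of the individual trees are added together during prediction to obtain the final ensemble of trees, i.e. the final XGBoost model. The parameters of each tree (f_t) must be determined, including the tree structure and the score of each leaf node. The training method adds the result of one tree to the model at a time. The forecast value y_i^(t) at step t is obtained as follows: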
Figure PCTCN2020136689-appb-000008
The optimal tree is selected so that the objective is optimized at each step:
Figure PCTCN2020136689-appb-000009
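A second-order Taylor expansion is applied to the above expression:
Figure PCTCN2020136689-appb-000010
Figure PCTCN2020136689-appb-000011
denotes the first derivative of the loss function, and
Figure PCTCN2020136689-appb-000012
denotes the second derivative of the loss function. Removing the constant terms gives the following equation for step t: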
Figure PCTCN2020136689-appb-000013
Add the regularization term
Figure PCTCN2020136689-appb-000014
to the loss function. Substituting into the objective function and rearranging gives:
Figure PCTCN2020136689-appb-000015
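The parameters θ and
Figure PCTCN2020136689-appb-000016
determine the best tree. Here I_j = {i | q(x_i) = j} is the set of instances assigned to leaf node j, q(x) is the tree structure that maps a sample to a leaf node, and
Figure PCTCN2020136689-appb-000017
is the optimal value of the objective function.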
Figure PCTCN2020136689-appb-000018
Figure PCTCN2020136689-appb-000019
Figure PCTCN2020136689-appb-000020
Figure PCTCN2020136689-appb-000021
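In practice, reaching this optimum directly is difficult. The tree is therefore optimized one level at a time: the gain before and after splitting a node is computed, and the point with the largest gain is chosen as the split point. In the XGBoost algorithm, if a node is split into two leaf nodes, the score gain is as follows: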
Figure PCTCN2020136689-appb-000022
I_L and I_R denote the instance sets of the left and right nodes after the split,
Figure PCTCN2020136689-appb-000023
Figure PCTCN2020136689-appb-000024
I = I_L ∪ I_R
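The split selection described above can be sketched in Python as follows. The gain expression used here is the standard XGBoost one, 0.5 · [G_L²/(H_L + θ) + G_R²/(H_R + θ) − (G_L + G_R)²/(H_L + H_R + θ)] − γ, where G and H are sums of first and second derivatives of the loss over a node. It is assumed, not confirmed by the source, that the equation shown as an image has this form; the names `leaf_score`, `best_split` and `gamma` are hypothetical.

```python
def leaf_score(g_sum, h_sum, theta):
    """Structure score G^2 / (H + theta) of a single node."""
    return g_sum * g_sum / (h_sum + theta)


def best_split(grads, hess, theta=1.0, gamma=0.0):
    """Return (split_index, gain) for samples already sorted by one feature.

    Samples [0, split_index) go to the left node, the rest to the right node.
    grads and hess hold the first and second derivatives of the loss per sample.
    """
    g_total, h_total = sum(grads), sum(hess)
    parent = leaf_score(g_total, h_total, theta)
    best_index, best_gain = None, float("-inf")
    g_left = h_left = 0.0
    for i in range(1, len(grads)):
        # Move sample i-1 from the right node to the left node.
        g_left += grads[i - 1]
        h_left += hess[i - 1]
        gain = 0.5 * (leaf_score(g_left, h_left, theta)
                      + leaf_score(g_total - g_left, h_total - h_left, theta)
                      - parent) - gamma
        if gain > best_gain:
            best_index, best_gain = i, gain
    return best_index, best_gain
```

For a squared-error loss with initial prediction 0, targets (1, 1, 5, 5) give gradients (−1, −1, −5, −5) and unit second derivatives; the largest gain is then found at the boundary between the second and third samples, which separates the two groups of targets.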
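Among the many factors that affect tropical cyclone intensity, it is difficult to select the best predictors. The XGBoost model, however, can accomplish this task. It has a variety of tunable parameters. The scope of this experiment is limited to running XGBoost in RStudio. The eta parameter is the shrinkage step applied at each update: it reduces the feature weights, making the boosting process more conservative and preventing overfitting. The gamma parameter is the minimum loss reduction required to make a further partition on a leaf node of the tree. The max_depth parameter is the maximum depth of a tree. The min_child_weight parameter is the minimum sum of instance weights required in a child. The subsample parameter is the ratio of training instances that are sampled. The colsample_bytree parameter is the ratio of variables used to construct each tree.
Because the extracted tropical cyclone sample database is a two-dimensional matrix, the XGBoost model is chosen, as it can predict accurately from two-dimensional matrix data. However, when the objective function is optimized by gradient descent it is difficult to train all the trees at once, so particle swarm optimization is used to search for the optimal result. The best-track dataset is taken as the initial sample and missing values are filled in to produce a usable dataset. K regression trees and the objective function are built, and particle swarm optimization determines the optimal node and the minimum loss function, on which the tree split is based. Iteration stops when the maximum tree depth is reached, which yields the basic model; the model is then further optimized and the forecast results are output.
XGBoost is a boosting tree algorithm that supports multi-threaded parallel computation. It generates new trees through successive iterations, in effect combining many weak learners with low accuracy into one strong learner with high accuracy. An individual decision tree may not perform well, but combining the results of many trees gives a more accurate prediction. At its core, XGBoost builds K regression trees; a better model is trained with an objective function that aims for high accuracy, good generalization, small prediction error and as few leaf nodes as possible. Particle swarm optimization and quadratic optimization are used to determine the optimal node and the minimum loss function, on which the tree split is based, producing a small tree. Splitting then continues in the same way and new trees continue to be formed, each built as the optimal tree given the previous predictions. Iteration stops when the maximum depth is reached. This yields the basic model, after which methods such as grid search are used to optimize several parameters.
The correlation coefficient (CC), mean absolute error (MAE) and normalized root mean square error (NRMSE; expressed as a percentage) are chosen as the metrics for evaluating the XGBoost model in the training and testing stages.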
Figure PCTCN2020136689-appb-000025
Figure PCTCN2020136689-appb-000026
Figure PCTCN2020136689-appb-000027
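θ_obs,i is the observed value of the i-th sample and θ_fore,i is the forecast value of the i-th sample. N is the total number of forecast samples,
Figure PCTCN2020136689-appb-000028
is the mean of the observed values, and
Figure PCTCN2020136689-appb-000029
is the mean of the forecast values.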
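The three evaluation metrics can be sketched in Python as below. CC is taken to be the Pearson correlation coefficient and MAE the mean absolute difference. The normalization used for NRMSE appears only in an equation image, so dividing the RMSE by the range of the observations is an assumption of this sketch, as is the function name `evaluate`.

```python
import math


def evaluate(obs, fore):
    """Return (CC, MAE, NRMSE in percent) for observed and forecast values."""
    n = len(obs)
    mean_obs = sum(obs) / n
    mean_fore = sum(fore) / n
    # Pearson correlation coefficient between observations and forecasts.
    cov = sum((o - mean_obs) * (f - mean_fore) for o, f in zip(obs, fore))
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    var_fore = sum((f - mean_fore) ** 2 for f in fore)
    cc = cov / math.sqrt(var_obs * var_fore)
    # Mean absolute error.
    mae = sum(abs(o - f) for o, f in zip(obs, fore)) / n
    # Root mean square error, normalized by the observed range (assumption).
    rmse = math.sqrt(sum((o - f) ** 2 for o, f in zip(obs, fore)) / n)
    nrmse = 100.0 * rmse / (max(obs) - min(obs))
    return cc, mae, nrmse
```

For example, observations (10, 20, 30, 40) and forecasts (12, 18, 33, 37) have absolute errors 2, 2, 3 and 3, so the MAE is 2.5.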
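In the model, the input parameters are the climatology-persistence factors, the environmental factors and the brainstorm factors. The outputs are the tropical cyclone intensities at forecast lead times of 6, 12, 18 and 24 hours. Because parameter settings are very important for running XGBoost, the expand.grid() function is used to enumerate parameter combinations and select the best one. The best XGBoost performance was sought over the following values of eta, gamma, max_depth, min_child_weight, subsample and colsample_bytree: (0.01, 0.1, 1), (0.1, 0.5, 0.8), (2, 4, 6, 8), (2, 4, 8), 0.8 and 0.95, respectively. These settings give 108 parameter combinations (3 × 3 × 4 × 3). For the training samples from 1979-2005, cross-validation is used to obtain the best parameter combination.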
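The parameter grid described above can be enumerated as in the following Python sketch, which mirrors what R's expand.grid() does using itertools.product. The names `PARAM_GRID` and `expand_grid` are hypothetical; the parameter values are those listed in the text, and the cross-validation step itself is not shown.

```python
from itertools import product

# Candidate values as listed in the text.
PARAM_GRID = {
    "eta": (0.01, 0.1, 1),
    "gamma": (0.1, 0.5, 0.8),
    "max_depth": (2, 4, 6, 8),
    "min_child_weight": (2, 4, 8),
    "subsample": (0.8,),
    "colsample_bytree": (0.95,),
}


def expand_grid(grid):
    """Return one dict per combination of the candidate values."""
    names = list(grid)
    return [dict(zip(names, values)) for values in product(*grid.values())]


combinations = expand_grid(PARAM_GRID)
print(len(combinations))  # 3 * 3 * 4 * 3 * 1 * 1 = 108
```

Each dictionary in `combinations` could then be passed to a cross-validated training run, keeping the combination with the lowest validation error.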
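Table 4 Input factors
Predictor    Description
PF    Persistence factors (20)
CF    Climatology factors (52)
BF    Brainstorm factors (59)
MON    Month of the tropical cyclone
IC    Intensity category flag
EF    Environmental factors (24)
For comparison, a back-propagation neural network (BPNN) is used to predict tropical cyclone intensity with the same sample input parameters. The 24-hour MAE of the BPNN method is 4.57 m/s, whereas the 24-hour lead-time MAE of the model of the present invention is 3.70 m/s. The XGBoost model therefore outperforms the BPNN model given the same samples. Compared with artificial neural networks, the XGBoost model has a simple training process, low computational cost and fast convergence, so using it to predict tropical cyclone intensity is advantageous. This finding supports using the XGBoost model as a new method for forecasting tropical cyclone intensity within 24 hours.
Table 5 Comparison results
Method    Input parameters    MAE (m/s)
XGBoost model    Climatology-persistence factors, environmental factors and brainstorm factors    3.70
BPNN model    Climatology-persistence factors, environmental factors and brainstorm factors    4.57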
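Correspondingly, referring to Fig. 3, an embodiment of the present invention further provides a system for generating tropical cyclone intensity forecast information, the system including:
an acquisition unit 10, configured to acquire data to be processed according to different data dimensions;
a preprocessing unit 20, configured to preprocess the data to be processed to obtain initial data, the initial data including climatology-persistence feature factors, environmental feature factors and brainstorm feature factors;
a prediction unit 30, configured to use a preset forecast model to predict from the initial data and obtain tropical cyclone intensity forecast information, the forecast model representing a model obtained by training on training samples, the training samples matching the initial data.
On the basis of the above embodiment, the preprocessing unit includes:
a first acquisition subunit, configured to acquire the tropical cyclone data in the data to be processed;
a first construction subunit, configured to construct predictor data from the tropical cyclone data according to time information determined by the current time and preset time differences, to obtain the climatology-persistence feature factors.
On the basis of the above embodiment, the preprocessing unit includes:
a second acquisition subunit, configured to acquire the environmental information in the data to be processed;
a second construction subunit, configured to construct the environmental information using a preset construction mode to obtain the environmental feature factors, the preset construction mode representing a processing model capable of determining the relationships among the attribute information in the environmental information.
On the basis of the above embodiment, the system further includes a creation unit configured to create the forecast model, the creation unit including:
a sample acquisition subunit, configured to acquire sample data, the sample data including climatology-persistence feature factors, environmental feature factors and brainstorm feature factors;
a validation subunit, configured to validate the sample data to obtain target sample data, each sample of the target sample data including parameters that satisfy specific conditions;
a training subunit, configured to train on the target sample data to obtain the forecast model.
On the basis of the above embodiment, the forecast model includes an XGBoost model, and the prediction unit is specifically configured to:
generate classification and regression trees from the target sample data;
use the XGBoost model and the classification and regression trees to train on the initial data and obtain the tropical cyclone intensity forecast information.
The present invention provides a system for generating tropical cyclone intensity forecast information: the acquisition unit acquires data to be processed according to different data dimensions, the preprocessing unit preprocesses the data to be processed to obtain initial data, and the prediction unit uses a preset forecast model to predict from the initial data and obtain tropical cyclone intensity forecast information. Because the initial data includes climatology-persistence feature factors, environmental feature factors and brainstorm feature factors, the factors that influence tropical cyclones can be fully exploited and combined with the preset forecast model to predict the intensity forecast information. This makes the prediction more intelligent and objective, improves the tropical cyclone forecasting system and raises forecast accuracy.
The embodiments in this specification are described progressively; each embodiment focuses on its differences from the others, and the same or similar parts of the embodiments may be referred to one another. Since the apparatus disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief; for relevant details, see the description of the method.
The above description of the disclosed embodiments enables those skilled in the art to implement or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the present invention. Therefore, the present invention is not limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.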

Claims (10)

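  1. A method for generating tropical cyclone intensity forecast information, characterized in that the method comprises:
    acquiring data to be processed according to different data dimensions;
    preprocessing the data to be processed to obtain initial data, the initial data comprising climatology-persistence feature factors, environmental feature factors and brainstorm feature factors;
    predicting from the initial data using a preset forecast model to obtain tropical cyclone intensity forecast information, the forecast model representing a model obtained by training on training samples, the training samples matching the initial data.
  2. The method according to claim 1, characterized in that the initial data comprises climatology-persistence feature factors, and preprocessing the data to be processed to obtain the initial data comprises:
    acquiring tropical cyclone data in the data to be processed;
    constructing predictor data from the tropical cyclone data according to time information determined by the current time and preset time differences, to obtain the climatology-persistence feature factors.
  3. The method according to claim 1, characterized in that the initial data comprises environmental feature factors, and preprocessing the data to be processed to obtain the initial data comprises:
    acquiring environmental information in the data to be processed;
    constructing the environmental information using a preset construction mode to obtain the environmental feature factors, the preset construction mode representing a processing model capable of determining the relationships among the attribute information in the environmental information.
  4. The method according to claim 1, characterized in that the method further comprises creating the forecast model, comprising:
    acquiring sample data, the sample data comprising climatology-persistence feature factors, environmental feature factors and brainstorm feature factors;
    validating the sample data to obtain target sample data, each sample of the target sample data comprising parameters that satisfy specific conditions;
    training on the target sample data to obtain the forecast model.
  5. The method according to claim 4, characterized in that the forecast model comprises an XGBoost model, and predicting from the initial data using the preset forecast model to obtain tropical cyclone intensity forecast information comprises:
    generating classification and regression trees from the target sample data;
    training on the initial data using the XGBoost model and the classification and regression trees to obtain the tropical cyclone intensity forecast information.
  6. A system for generating tropical cyclone intensity forecast information, characterized in that the system comprises:
    an acquisition unit, configured to acquire data to be processed according to different data dimensions;
    a preprocessing unit, configured to preprocess the data to be processed to obtain initial data, the initial data comprising climatology-persistence feature factors, environmental feature factors and brainstorm feature factors;
    a prediction unit, configured to predict from the initial data using a preset forecast model to obtain tropical cyclone intensity forecast information, the forecast model representing a model obtained by training on training samples, the training samples matching the initial data.
  7. The system according to claim 6, characterized in that the preprocessing unit comprises:
    a first acquisition subunit, configured to acquire tropical cyclone data in the data to be processed;
    a first construction subunit, configured to construct predictor data from the tropical cyclone data according to time information determined by the current time and preset time differences, to obtain the climatology-persistence feature factors.
  8. The system according to claim 6, characterized in that the preprocessing unit comprises:
    a second acquisition subunit, configured to acquire environmental information in the data to be processed;
    a second construction subunit, configured to construct the environmental information using a preset construction mode to obtain the environmental feature factors, the preset construction mode representing a processing model capable of determining the relationships among the attribute information in the environmental information.
  9. The system according to claim 6, characterized in that the system further comprises a creation unit configured to create the forecast model, the creation unit comprising:
    a sample acquisition subunit, configured to acquire sample data, the sample data comprising climatology-persistence feature factors, environmental feature factors and brainstorm feature factors;
    a validation subunit, configured to validate the sample data to obtain target sample data, each sample of the target sample data comprising parameters that satisfy specific conditions;
    a training subunit, configured to train on the target sample data to obtain the forecast model.
  10. The system according to claim 9, characterized in that the forecast model comprises an XGBoost model, and the prediction unit is specifically configured to:
    generate classification and regression trees from the target sample data;
    train on the initial data using the XGBoost model and the classification and regression trees to obtain the tropical cyclone intensity forecast information.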
PCT/CN2020/136689 2020-03-31 2020-12-16 Method and system for generating tropical cyclone intensity forecast information WO2021196743A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010242733.1A CN111461427A (zh) 2020-03-31 2020-03-31 Method and system for generating tropical cyclone intensity forecast information
CN202010242733.1 2020-03-31

Publications (1)

Publication Number Publication Date
WO2021196743A1 true WO2021196743A1 (zh) 2021-10-07

Family

ID=71683504

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/136689 WO2021196743A1 (zh) 2020-03-31 2020-12-16 Method and system for generating tropical cyclone intensity forecast information

Country Status (2)

Country Link
CN (1) CN111461427A (zh)
WO (1) WO2021196743A1 (zh)

Also Published As

Publication number Publication date
CN111461427A (zh) 2020-07-28

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20928633

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20928633

Country of ref document: EP

Kind code of ref document: A1