WO2023193395A1 - Mixed-frequency data imputation model creation method based on functional data analysis

Info

Publication number
WO2023193395A1
Authority
WO
WIPO (PCT)
Prior art keywords
year
quarterly
growth rate
data
monthly
Prior art date
Application number
PCT/CN2022/115192
Other languages
French (fr)
Chinese (zh)
Inventor
林溪桥
程敏
蒙琦
覃惠玲
陈志君
卢纯颢
董贇
周春丽
岑剑峰
周珑
王鹏
Original Assignee
广西电网有限责任公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 广西电网有限责任公司
Publication of WO2023193395A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06393 Score-carding, benchmarking or key performance indicator [KPI] analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/20 Design optimisation, verification or simulation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/067 Enterprise or organisation modelling
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/10 Office automation; Time management
    • G06Q10/109 Time management, e.g. calendars, reminders, meetings or time accounting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2119/00 Details relating to the type or aim of the analysis or the optimisation
    • G06F2119/12 Timing analysis or timing optimisation

Definitions

  • Embodiments of the present application relate to the field of mixed-frequency data imputation, and in particular to a method for creating a mixed-frequency data imputation model based on functional data analysis.
  • the present invention combines the research method of mixed frequency data with the functional data analysis method, and extends the approximate relationship of the mixed frequency data to the basis function fitting method.
  • this achieves an approximate conversion of quarterly indicators into monthly indicators and provides richer data sources for economic modeling.
  • time-frequency data which includes quarterly data and potential monthly data
  • the approximate relationship includes the approximate relationship of quarterly indicators and potential monthly indicators and the approximate relationship of the quarterly logarithmic year-on-year growth rate of the quarterly indicator and the potential monthly logarithmic year-on-year growth rate;
  • a mixed-frequency data imputation model is created by fitting the eigenfunctions.
  • Y_j is the quarterly indicator, and the symbol shown in image PCTCN2022115192-appb-000002 is the potential monthly indicator.
  • the formula of the potential monthly year-on-year growth rate eigenfunction is:
  • y*(t) is the potential monthly year-on-year growth rate eigenfunction
  • φ_k(t) is the selected basis spline function
  • β_k is the corresponding coefficient
  • t is time.
  • the formula of the quarterly year-on-year growth rate intrinsic function is:
  • determine the approximate relationships of the time-frequency data, which include the approximate relationship between quarterly indicators and potential monthly indicators, and the approximate relationship between the quarterly logarithmic year-on-year growth rate of a quarterly indicator and the potential monthly logarithmic year-on-year growth rate.
  • construct the eigenfunctions based on the time-frequency data and the approximate relationships; the eigenfunctions include the potential monthly year-on-year growth rate eigenfunction and the quarterly year-on-year growth rate eigenfunction.
  • create a mixed-frequency data imputation model by fitting the eigenfunctions.
  • the present invention uses a functional data analysis method, combines the relationship between the growth rates of different time-frequency indicators, and uses a spline fitting method to fit the year-on-year growth rate of the corresponding monthly indicator through the year-on-year growth rate of the quarterly indicator.
  • the invention helps to solve the problem that some economic indicators only have quarterly data, but monthly data are often required in some economic analyses.
  • Figure 1 is a schematic flowchart of an embodiment of the method for creating a mixed-frequency data imputation model based on functional data analysis in this application.
  • the embodiment of the present application provides a method for creating a mixed-frequency data imputation model based on functional data analysis.
  • the time-frequency data includes quarterly data and potential monthly data. Specifically:
  • time-frequency analysis works by windowing the data and assuming that the data is stationary within each time window; a Fourier transform is then applied to extract the frequency-domain information within that window. The window is then slid forward along the time axis and the same processing is applied to the data in each window, which yields frequency information that varies over time; the result is the time-frequency data.
  • the approximate relationship includes the approximate relationship between quarterly indicators and potential monthly indicators, as well as the approximate relationship between the quarterly logarithmic year-on-year growth rate of quarterly indicators and the potential monthly logarithmic year-on-year growth rate;
  • <1> Determine the approximate relationship between quarterly indicators and potential monthly indicators, including:
  • the potential monthly indicator is composed of a combination of eigenfunctions and random noise, and the expression is as follows:
  • the eigenfunction of the potential monthly indicator is a linear combination of K basis spline functions, and the expression is as follows:
  • the potential monthly indicator can be expressed as a linear combination of basis spline functions plus random noise, as shown in the following formula:
  • fitting the eigenfunction targets the observable quarterly indicators, follows the smoothing-spline approach, and trades off fitting error against curve smoothness.
  • the second derivative of the fitted curve reflects the smoothness of the curve.
  • the optimization objective consists of minimizing the sum of squares of the residuals plus the smoothing penalty, and the expression is as follows:
  • the hyperparameters are usually selected based on the principle of minimizing the results of generalized cross validation (GCV).
  • df is the degree of freedom of the model.
  • fitting the high time-frequency (monthly) data, specifically:
  • the estimation result of the basis function coefficient is obtained from the optimization formula.
  • substituting the estimated basis function coefficients into the eigenfunction of the potential monthly indicator gives the fitted logarithmic year-on-year growth rate of the potential monthly indicator, as follows:
  • the fitted value is the estimate of the potential monthly logarithmic year-on-year growth rate in month t.
  • the estimate of the potential monthly logarithmic year-on-year growth rate of the quarterly indicator in month t can be combined with the actual monthly year-on-year growth rates of other monthly indicators for modeling; this completes the creation of the mixed-frequency data imputation model.
  • the disclosed systems, devices and methods can be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • the division of the units is only a logical function division. In actual implementation, there may be other division methods.
  • multiple units or components may be combined or can be integrated into another system, or some features can be ignored, or not implemented.
  • the coupling or direct coupling or communication connection between each other shown or discussed may be through some interfaces, indirect coupling or communication connection of devices or units, which may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place, or they may be distributed to multiple network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional unit in each embodiment of the present application can be integrated into one processing unit, each unit can exist physically alone, or two or more units can be integrated into one unit.
  • the above integrated units can be implemented in the form of hardware or software functional units.
  • if the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium.
  • the technical solution of the present application, in essence, or the part of it that contributes to the existing technology, or all or part of the technical solution, can be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of this application.
  • the aforementioned storage media include a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disk, and other media that can store program code.

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Human Resources & Organizations (AREA)
  • Strategic Management (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Economics (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Development Economics (AREA)
  • Tourism & Hospitality (AREA)
  • Educational Administration (AREA)
  • General Business, Economics & Management (AREA)
  • Quality & Reliability (AREA)
  • Operations Research (AREA)
  • Marketing (AREA)
  • Game Theory and Decision Science (AREA)
  • Computer Hardware Design (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Geometry (AREA)
  • Data Mining & Analysis (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

Disclosed in the present application is a mixed-frequency data imputation model creation method based on functional data analysis. The method comprises: collecting time-frequency data; determining approximate relationships between the time-frequency data, the approximate relationships comprising an approximate relationship between a quarterly indicator and a potential monthly indicator, and an approximate relationship between a quarterly logarithmic year-on-year growth rate of the quarterly indicator and a potential monthly logarithmic year-on-year growth rate; constructing an intrinsic function according to the time-frequency data and the approximate relationships, the intrinsic function comprising a potential monthly year-on-year growth rate intrinsic function and a quarterly year-on-year growth rate intrinsic function; and creating a mixed-frequency data imputation model by fitting the intrinsic function. The present invention uses a functional data analysis method; by combining the relationship between the growth rates of indicators at different time frequencies with a spline fitting method, the year-on-year growth rate of the corresponding monthly indicator can be fitted from the year-on-year growth rate of the quarterly indicator. The present invention helps to solve the problem that some economic indicators only have quarterly data while monthly data is often needed in economic analysis.

Description

A method for creating a mixed-frequency data imputation model based on functional data analysis

Technical field

Embodiments of the present application relate to the field of mixed-frequency data imputation, and in particular to a method for creating a mixed-frequency data imputation model based on functional data analysis.

Background

In the era of big data, data sources are richer, and data from different fields, at different time frequencies, and of different types can be obtained. In economic analysis, and especially in index construction, a wide variety of indicators must be incorporated into a model. Because different indicators are observed at different time frequencies, analysis and modeling are often difficult. Some indicators (such as electricity data) are updated monthly, whereas some macroeconomic indicators (such as GDP) are published only once a quarter. Because of this mismatch, the monthly data can only be summed into quarterly data during modeling. This reduces the sample size, fails to capture the cyclical behavior of the indicators, and loses a large amount of information, which makes the modeling and analysis inaccurate.

At present, research on mixed-frequency data focuses mainly on constructing business cycle factor models and pays little attention to imputing missing values in mixed-frequency data. The present invention combines mixed-frequency data methods with functional data analysis and extends the approximate relationships between mixed-frequency data to basis function fitting. This achieves an approximate conversion of quarterly indicators into monthly indicators and provides richer data sources for economic modeling.

Summary of the invention

Embodiments of the present application provide a method for creating a mixed-frequency data imputation model based on functional data analysis. By using functional data analysis, combining the relationship between the growth rates of indicators at different time frequencies, and applying spline fitting, the year-on-year growth rate of a monthly indicator can be fitted from the year-on-year growth rate of the corresponding quarterly indicator. The invention helps to solve the problem that some economic indicators only have quarterly data while monthly data is often needed in economic analysis.

A first aspect of the present application provides a method for creating a mixed-frequency data imputation model based on functional data analysis, comprising:

collecting time-frequency data, where the time-frequency data includes quarterly data and potential (latent) monthly data;

determining approximate relationships of the time-frequency data, where the approximate relationships include the approximate relationship between the quarterly indicator and the potential monthly indicator, and the approximate relationship between the quarterly logarithmic year-on-year growth rate of the quarterly indicator and the potential monthly logarithmic year-on-year growth rate;

constructing intrinsic functions (also rendered as eigenfunctions) from the time-frequency data and the approximate relationships, where the intrinsic functions include a potential monthly year-on-year growth rate intrinsic function and a quarterly year-on-year growth rate intrinsic function;

creating the mixed-frequency data imputation model by fitting the intrinsic functions.

Optionally, the approximate relationship between the quarterly indicator and the potential monthly indicator is:

[Equation image PCTCN2022115192-appb-000001]

where Y_j is the quarterly indicator and the symbol in image PCTCN2022115192-appb-000002 is the potential monthly indicator.

Optionally, the approximate relationship between the quarterly logarithmic year-on-year growth rate of the quarterly indicator and the potential monthly logarithmic year-on-year growth rate is:

[Equation image PCTCN2022115192-appb-000003]

where y_t is the quarterly logarithmic year-on-year growth rate of the quarterly indicator and the symbol in image PCTCN2022115192-appb-000004 is the potential monthly logarithmic year-on-year growth rate.

Optionally, the formula of the potential monthly year-on-year growth rate intrinsic function is:

[Equation image PCTCN2022115192-appb-000005]

where y*(t) is the potential monthly year-on-year growth rate intrinsic function, φ_k(t) is the selected basis spline function, β_k is the corresponding coefficient, and t is time.

Optionally, the formula of the quarterly year-on-year growth rate intrinsic function is:

[Equation image PCTCN2022115192-appb-000006]

where y(t) is the quarterly year-on-year growth rate intrinsic function, φ_k(t) is the selected basis spline function, β_k is the corresponding coefficient, and t is time.

As can be seen from the above technical solution: time-frequency data is collected, where the time-frequency data includes quarterly data and potential monthly data; approximate relationships of the time-frequency data are determined, where the approximate relationships include the approximate relationship between the quarterly indicator and the potential monthly indicator, and the approximate relationship between the quarterly logarithmic year-on-year growth rate of the quarterly indicator and the potential monthly logarithmic year-on-year growth rate; intrinsic functions are constructed from the time-frequency data and the approximate relationships, where the intrinsic functions include the potential monthly year-on-year growth rate intrinsic function and the quarterly year-on-year growth rate intrinsic function; and the mixed-frequency data imputation model is created by fitting the intrinsic functions. By using functional data analysis, combining the relationship between the growth rates of indicators at different time frequencies, and applying spline fitting, the year-on-year growth rate of a monthly indicator can be fitted from the year-on-year growth rate of the corresponding quarterly indicator. The invention helps to solve the problem that some economic indicators only have quarterly data while monthly data is often needed in economic analysis.

Description of the drawings

Figure 1 is a schematic flowchart of an embodiment of the method for creating a mixed-frequency data imputation model based on functional data analysis in this application.

Detailed description

Embodiments of the present application provide a method for creating a mixed-frequency data imputation model based on functional data analysis. By using functional data analysis, combining the relationship between the growth rates of indicators at different time frequencies, and applying spline fitting, the year-on-year growth rate of a monthly indicator can be fitted from the year-on-year growth rate of the corresponding quarterly indicator. The invention helps to solve the problem that some economic indicators only have quarterly data while monthly data is often needed in economic analysis.

Referring to Figure 1, an embodiment of the method for creating a mixed-frequency data imputation model based on functional data analysis includes:

101. Collect time-frequency data, where the time-frequency data includes quarterly data and potential monthly data.

In this embodiment, the time-frequency data needed to create the model must be collected before the model is created. The time-frequency data consists of quarterly data and potential monthly data. Specifically:

Historical monthly power data and historical quarterly power data are collected first, and time-frequency analysis is then performed on them to obtain the time-frequency data (quarterly data and potential monthly data). Time-frequency analysis works by windowing the data and assuming that the data is stationary within each time window; a Fourier transform is then applied to extract the frequency-domain information within that window. The window is then slid forward along the time axis and the same processing is applied to the data in each window, which yields frequency information that varies over time; the result is the time-frequency data.
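The windowed Fourier transform described above can be sketched in a few lines. This is only an illustration of the general principle, not the procedure used in the application: the function name `sliding_window_spectrum`, the rectangular (untapered) window, the window length, the step, and the synthetic monthly series are all assumptions made for the example.

```python
import cmath
import math


def sliding_window_spectrum(series, window, step=1):
    """Return one magnitude spectrum per window position (hypothetical helper)."""
    spectra = []
    for start in range(0, len(series) - window + 1, step):
        segment = series[start:start + window]
        spectrum = []
        # Plain DFT of the segment, which is treated as stationary inside the window.
        for k in range(window // 2 + 1):
            coeff = sum(
                x * cmath.exp(-2j * math.pi * k * n / window)
                for n, x in enumerate(segment)
            )
            spectrum.append(abs(coeff) / window)
        spectra.append(spectrum)
    return spectra


# Synthetic (made-up) monthly series: a 12-month cycle around a level of 100.
monthly = [100 + 10 * math.sin(2 * math.pi * t / 12) for t in range(48)]

# A 24-month window moved forward 12 months at a time.
spectra = sliding_window_spectrum(monthly, window=24, step=12)
```

Each entry of `spectra` describes the frequency content of one window, so the list as a whole shows how that content changes as the window slides along the time axis.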

In this embodiment, the historical monthly power data and historical quarterly power data collected first may be in any data format, which is not specifically limited here.

102. Determine the approximate relationships of the time-frequency data, where the approximate relationships include the approximate relationship between the quarterly indicator and the potential monthly indicator, and the approximate relationship between the quarterly logarithmic year-on-year growth rate of the quarterly indicator and the potential monthly logarithmic year-on-year growth rate.

After the time-frequency data (quarterly data and potential monthly data) is obtained, it is processed and its approximate relationships are determined. Specifically:

<1> Determine the approximate relationship between the quarterly indicator and the potential monthly indicator:

The approximate relationship between the quarterly indicator and the potential monthly indicator is as follows:

[Equation image PCTCN2022115192-appb-000007]

This formula follows from the approximate relationship between the arithmetic mean and the geometric mean. Assume that the quarterly indicator published in month t is Y_t and that the potential monthly indicator is the quantity shown in image PCTCN2022115192-appb-000008. Their underlying relationship is:

[Equation image PCTCN2022115192-appb-000009]

Because economic indicators do not change much from month to month, the arithmetic mean can be approximated by the geometric mean. This gives the approximate relationship between the quarterly indicator and the potential monthly indicator:

[Equation image PCTCN2022115192-appb-000010]
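A quick numerical check shows why the arithmetic mean can stand in for the geometric mean when monthly values are close to each other. The three monthly values below are made up, and the sketch does not assume whether the quarterly figure in the application is a monthly average or a monthly sum; it only compares the two means.

```python
import math


def arithmetic_mean(values):
    return sum(values) / len(values)


def geometric_mean(values):
    # exp of the mean of logs; requires strictly positive values.
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical monthly levels within one quarter (small month-to-month changes).
months = [102.0, 100.0, 98.5]

am = arithmetic_mean(months)
gm = geometric_mean(months)

# Relative gap between the two means; small when the months are similar.
rel_gap = (am - gm) / am
```

With month-to-month changes of a few percent, `rel_gap` is on the order of one part in ten thousand, which is the sense in which the approximation in the text is reasonable.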

<2> Determine the approximate relationship between the quarterly logarithmic year-on-year growth rate of the quarterly indicator and the potential monthly logarithmic year-on-year growth rate:

The quarterly logarithmic year-on-year growth rate of the quarterly indicator is given by:

[Equation image PCTCN2022115192-appb-000011]

The potential monthly logarithmic year-on-year growth rate is given by:

[Equation image PCTCN2022115192-appb-000012]

The approximate relationship between the quarterly logarithmic year-on-year growth rate of the quarterly indicator and the potential monthly logarithmic year-on-year growth rate is as follows:

[Equation image PCTCN2022115192-appb-000013]

This formula is derived from the approximate relationship between the quarterly indicator and the potential monthly indicator. Substituting that relationship into the two growth rate definitions above gives:

[Equation image PCTCN2022115192-appb-000014]
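Because the equation images are not reproduced here, the derivation can only be sketched under stated assumptions. The sketch below assumes a level relationship of the usual geometric-mean form, a year-on-year lag of 12 months, and the notation Y*_t for the potential monthly indicator and y*_t for its logarithmic year-on-year growth rate; none of these is confirmed by the visible text. The constant c is 0 if the quarterly figure is a monthly average and ln 3 if it is a monthly sum, and it cancels in either case.

```latex
% Assumed level relationship (c = 0 for a monthly average, c = \ln 3 for a sum):
\ln Y_t \approx c + \tfrac{1}{3}\bigl(\ln Y^*_t + \ln Y^*_{t-1} + \ln Y^*_{t-2}\bigr)

% Logarithmic year-on-year growth rates, with a lag of 12 months:
y_t = \ln Y_t - \ln Y_{t-12}, \qquad y^*_t = \ln Y^*_t - \ln Y^*_{t-12}

% Subtracting the level relationship at t-12 from the one at t cancels c:
y_t \approx \tfrac{1}{3}\bigl(y^*_t + y^*_{t-1} + y^*_{t-2}\bigr)
```

Under these assumptions the quarterly growth rate is approximately the average of the three potential monthly growth rates in the quarter.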

103. Construct intrinsic functions from the time-frequency data and the approximate relationships, where the intrinsic functions include the potential monthly year-on-year growth rate intrinsic function and the quarterly year-on-year growth rate intrinsic function.

Once the approximate relationships of the time-frequency data have been determined, that is, the approximate relationship between the quarterly indicator and the potential monthly indicator and the approximate relationship between the quarterly logarithmic year-on-year growth rate of the quarterly indicator and the potential monthly logarithmic year-on-year growth rate, this embodiment constructs the intrinsic functions from those relationships and from the collected and processed time-frequency data. Specifically:

<1> The potential monthly indicator is the sum of an intrinsic function and random noise:

[Equation image PCTCN2022115192-appb-000015]

where y*(t) is the intrinsic function of the potential monthly indicator and the term in image PCTCN2022115192-appb-000016 is random noise.

The intrinsic function of the potential monthly indicator is a linear combination of K basis functions:

[Equation image PCTCN2022115192-appb-000017]

where φ_k(t) is the selected basis function and β_k is the corresponding coefficient. A Fourier basis is usually chosen for periodic data, and a B-spline basis for series without obvious periodicity.

Combining the two expressions above, the potential monthly indicator can be written as a linear combination of basis functions plus random noise:

[Equation image PCTCN2022115192-appb-000018]
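The basis expansion described in words above, a linear combination y*(t) = β_1 φ_1(t) + ... + β_K φ_K(t), can be illustrated with a small Fourier basis. This is a minimal sketch and not the implementation in the application: the function names, the ordering of the basis (constant, sine, cosine, ...), the 12-month period, and the coefficient values are all assumptions chosen for the example.

```python
import math


def fourier_basis(t, n_basis, period=12.0):
    """Evaluate n_basis Fourier basis functions at time t: 1, sin, cos, sin(2.), ..."""
    values = [1.0]
    harmonic = 1
    while len(values) < n_basis:
        angle = 2.0 * math.pi * harmonic * t / period
        values.append(math.sin(angle))
        if len(values) < n_basis:
            values.append(math.cos(angle))
        harmonic += 1
    return values


def intrinsic_function(t, beta, period=12.0):
    """Linear combination sum_k beta_k * phi_k(t) of the basis functions."""
    phi = fourier_basis(t, len(beta), period)
    return sum(b * p for b, p in zip(beta, phi))


# Hypothetical coefficients: a 5% baseline growth rate with a seasonal swing.
beta = [0.05, 0.02, -0.01]

# Smooth monthly growth-rate curve evaluated at months 1..12.
monthly_growth = [intrinsic_function(t, beta) for t in range(1, 13)]
```

For a series without clear periodicity, the same structure applies with B-spline basis functions in place of `fourier_basis`; only the basis evaluation changes.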

<二>季度指标本征函数的构造结合了潜在月度指标的本征函数以及季度指标与潜在月度指标之间的近似关系;其具体构造过程如下:将式与式联立得到:<2> The construction of the quarterly indicator eigenfunction combines the eigenfunction of the potential monthly indicator and the approximate relationship between the quarterly indicator and the potential monthly indicator; the specific construction process is as follows: combine Eq. with Eq. to get:

Figure PCTCN2022115192-appb-000019

This yields the eigenfunction of the quarterly indicator, whose expression is:

Figure PCTCN2022115192-appb-000020
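How a quarterly basis follows from a monthly one can be sketched as below. The exact approximate relationship is given only in the equation images, so this sketch assumes, purely for illustration, that a quarterly growth rate is the equal-weight average of the three monthly growth rates in that quarter. The aggregation matrix `A`, the four-function basis, and the coefficients are hypothetical.

```python
import numpy as np

n_quarters = 4
months = np.arange(1, 3 * n_quarters + 1, dtype=float)

# Hypothetical monthly basis: constant, linear trend, annual sine and cosine.
Phi = np.column_stack([
    np.ones_like(months),
    months / 12.0,
    np.sin(2 * np.pi * months / 12.0),
    np.cos(2 * np.pi * months / 12.0),
])

# Assumed aggregation: row j averages the three months of quarter j (weights 1/3).
A = np.kron(np.eye(n_quarters), np.full((1, 3), 1.0 / 3.0))

# Quarterly basis: the same coefficients beta_k apply to aggregated basis functions.
Psi = A @ Phi

beta = np.array([0.04, 0.01, 0.02, -0.01])   # illustrative coefficients
quarterly_curve = Psi @ beta                  # quarterly eigenfunction values
monthly_curve = Phi @ beta                    # monthly eigenfunction values
```

Because aggregation is linear, the quarterly and monthly curves share one coefficient vector, which is what lets a fit to quarterly observations determine the monthly curve.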

104. Create the mixed-frequency data imputation model by fitting the eigenfunctions.

The eigenfunctions are fitted with the observable quarterly indicator as the target, following the smoothing-spline idea of trading off fitting error against curve smoothness: the smaller the fitting error, the less smooth the fitted curve. The optimization objective of this application therefore adds, to the residual sum of squares being minimized, a penalty term equal to the integral of the squared second derivative of the fitted curve. The second derivative of the fitted curve reflects how smooth the curve is.

The optimization objective is the residual sum of squares plus the smoothness penalty, expressed as follows:

Figure PCTCN2022115192-appb-000021

Substituting the eigenfunction of the quarterly indicator, the objective function is equivalent to:

Figure PCTCN2022115192-appb-000022
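A penalized least-squares fit of this kind can be sketched as follows. This is an illustration under stated assumptions, not the application's implementation: the integral of the squared second derivative is replaced by a discrete stand-in (squared second differences of the monthly curve), the quarterly aggregation uses assumed 1/3 weights, and the basis, data, and function names `second_difference_penalty` and `fit_penalized` are hypothetical.

```python
import numpy as np

def second_difference_penalty(Phi):
    """R = Phi^T D^T D Phi, where D takes second differences on the monthly grid.

    beta^T R beta is then the sum of squared second differences of Phi @ beta,
    a discrete stand-in for the integrated squared second derivative.
    """
    n = Phi.shape[0]
    D = np.diff(np.eye(n), n=2, axis=0)      # shape (n - 2, n)
    return Phi.T @ D.T @ D @ Phi

def fit_penalized(Psi, y, R, lam):
    """Minimize ||y - Psi beta||^2 + lam * beta^T R beta via the normal equations."""
    return np.linalg.solve(Psi.T @ Psi + lam * R, Psi.T @ y)

def roughness(beta, R):
    """Value of the penalty term beta^T R beta."""
    return float(beta @ R @ beta)

n_quarters = 8
months = np.arange(1, 3 * n_quarters + 1, dtype=float)
Phi = np.column_stack([                       # hypothetical monthly basis
    np.ones_like(months), months / 12.0,
    np.sin(2 * np.pi * months / 12.0), np.cos(2 * np.pi * months / 12.0),
])
A = np.kron(np.eye(n_quarters), np.full((1, 3), 1.0 / 3.0))  # assumed averaging
Psi = A @ Phi
R = second_difference_penalty(Phi)

rng = np.random.default_rng(1)
beta_true = np.array([0.05, 0.01, 0.02, -0.01])
y = Psi @ beta_true + rng.normal(scale=0.002, size=n_quarters)  # synthetic data

beta_rough = fit_penalized(Psi, y, R, lam=0.0)    # pure least squares
beta_smooth = fit_penalized(Psi, y, R, lam=10.0)  # heavier smoothing
```

Raising `lam` moves the solution toward curves with smaller second differences at the cost of a larger residual, which is the trade-off the objective encodes.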

where {y_t : t = 1, ..., T} is the sequence of observed logarithmic year-on-year growth rates of the quarterly indicator, y(t) is the eigenfunction of the quarterly indicator, β_k is the coefficient of the corresponding basis function, and λ is a hyperparameter. λ is chosen according to the desired smoothness of the eigenfunction, usually as the value that minimizes the generalized cross-validation (GCV) criterion.

The generalized cross-validation (GCV) criterion is expressed as follows:

Figure PCTCN2022115192-appb-000023

where df is the number of degrees of freedom of the model.
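Selecting λ by GCV can be sketched as a grid search. The application's GCV expression is available only as an image, so this sketch assumes one common form, GCV(λ) = n · SSE / (n − df)², with df taken as the trace of the hat matrix; that form, the candidate grid, the synthetic data, and the names `gcv_score` and `select_lambda` are assumptions for illustration.

```python
import numpy as np

def gcv_score(Psi, y, R, lam):
    """Return (GCV, df) for penalty weight lam, with df = trace of the hat matrix."""
    n = y.size
    H = Psi @ np.linalg.solve(Psi.T @ Psi + lam * R, Psi.T)  # maps y to fitted values
    resid = y - H @ y
    df = float(np.trace(H))
    return n * float(resid @ resid) / (n - df) ** 2, df

def select_lambda(Psi, y, R, grid):
    """Pick the grid value with the smallest GCV score."""
    scores = [gcv_score(Psi, y, R, lam)[0] for lam in grid]
    return grid[int(np.argmin(scores))]

n_quarters = 12
months = np.arange(1, 3 * n_quarters + 1, dtype=float)
Phi = np.column_stack([                       # hypothetical monthly basis
    np.ones_like(months), months / 12.0,
    np.sin(2 * np.pi * months / 12.0), np.cos(2 * np.pi * months / 12.0),
])
A = np.kron(np.eye(n_quarters), np.full((1, 3), 1.0 / 3.0))  # assumed averaging
Psi = A @ Phi
D = np.diff(np.eye(months.size), n=2, axis=0)
R = Phi.T @ D.T @ D @ Phi                     # discrete roughness penalty (assumed)

rng = np.random.default_rng(2)
y = Psi @ np.array([0.05, 0.01, 0.02, -0.01]) + rng.normal(scale=0.003, size=n_quarters)

grid = [10.0 ** p for p in range(-4, 5)]      # candidate values of lambda
best_lam = select_lambda(Psi, y, R, grid)
df_unpenalized = gcv_score(Psi, y, R, 0.0)[1]
df_heavy = gcv_score(Psi, y, R, 1e4)[1]
```

With no penalty the degrees of freedom equal the number of basis functions, and they shrink as λ grows, so GCV balances residual size against effective model complexity.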

Fitting the high-frequency (monthly) data, specifically: the estimates of the basis function coefficients are obtained by solving the optimization problem above. Substituting
Figure PCTCN2022115192-appb-000024
into the eigenfunction of the potential monthly indicator gives the fitted logarithmic year-on-year growth rate of the potential monthly indicator:

Figure PCTCN2022115192-appb-000025

where
Figure PCTCN2022115192-appb-000026
is the estimate of the potential monthly logarithmic year-on-year growth rate in month t. In actual index modeling, this estimate for the quarterly indicator can be combined with the actual monthly year-on-year growth rates of other monthly indicators, which completes the creation of the mixed-frequency data imputation model.
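The final step, evaluating the monthly estimates from the fitted coefficients, can be sketched end to end. This is a simplified illustration: it assumes equal 1/3 aggregation weights, uses a hypothetical four-function basis and noise-free synthetic quarterly data, and sets λ = 0 (plain least squares) for brevity.

```python
import numpy as np

def basis(t):
    """Hypothetical monthly basis: constant, linear trend, annual sine and cosine."""
    t = np.asarray(t, dtype=float)
    return np.column_stack([
        np.ones_like(t), t / 12.0,
        np.sin(2 * np.pi * t / 12.0), np.cos(2 * np.pi * t / 12.0),
    ])

n_quarters = 8
months = np.arange(1, 3 * n_quarters + 1)
Phi = basis(months)                                           # shape (24, 4)
A = np.kron(np.eye(n_quarters), np.full((1, 3), 1.0 / 3.0))   # assumed averaging
Psi = A @ Phi                                                 # quarterly basis

# Synthetic quarterly observations generated from known coefficients.
beta_true = np.array([0.05, 0.01, 0.02, -0.01])
y_quarterly = Psi @ beta_true

# Estimate the coefficients from quarterly data only (lambda = 0 for brevity).
beta_hat, *_ = np.linalg.lstsq(Psi, y_quarterly, rcond=None)

# Plug the estimates into the monthly basis to obtain monthly growth-rate estimates.
y_monthly_hat = Phi @ beta_hat
```

Averaging `y_monthly_hat` within each quarter reproduces the fitted quarterly values, so the monthly series stays consistent with the quarterly observations under the assumed aggregation.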

In summary: time-frequency data are collected, comprising quarterly data and potential monthly data; the approximate relationships of the time-frequency data are determined, comprising the approximate relationship between the quarterly indicator and the potential monthly indicator and the approximate relationship between the quarterly logarithmic year-on-year growth rate of the quarterly indicator and the potential monthly logarithmic year-on-year growth rate; eigenfunctions are constructed from the time-frequency data and the approximate relationships, comprising the potential monthly year-on-year growth rate eigenfunction and the quarterly year-on-year growth rate eigenfunction; and the mixed-frequency data imputation model is created by fitting the eigenfunctions. The invention uses functional data analysis, combined with the relationship between the growth rates of indicators at different frequencies and with spline fitting, to derive the year-on-year growth rate of the corresponding monthly indicator from the year-on-year growth rate of a quarterly indicator. This helps address the situation in which some economic indicators are available only quarterly while economic analysis often requires monthly data.

Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the systems, devices and units described above may be found in the corresponding processes of the foregoing method embodiments and are not repeated here.

In the several embodiments provided in this application, it should be understood that the disclosed systems, devices and methods may be implemented in other ways. For example, the device embodiments described above are merely illustrative: the division into units is only a division by logical function, and other divisions are possible in an actual implementation. Multiple units or components may be combined or integrated into another system, and some features may be omitted or not executed. In addition, the mutual coupling, direct coupling or communication connections shown or discussed may be indirect couplings or communication connections through interfaces, devices or units, and may be electrical, mechanical or of another form.

The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, the functional units in the embodiments of this application may be integrated into one processing unit, may each exist physically on their own, or two or more units may be integrated into one unit. The integrated unit may be implemented in hardware or as a software functional unit.

If the integrated unit is implemented as a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. On this understanding, the technical solution of this application in essence, or the part of it that contributes over the prior art, or all or part of the technical solution, may be embodied as a software product. The computer software product is stored in a storage medium and includes instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or some of the steps of the methods described in the embodiments of this application. The storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, or an optical disc.

Claims (5)

1. A method for creating a mixed-frequency data imputation model based on functional data analysis, characterized by comprising:
collecting time-frequency data, the time-frequency data comprising quarterly data and potential monthly data;
determining approximate relationships of the time-frequency data, the approximate relationships comprising an approximate relationship between a quarterly indicator and a potential monthly indicator and an approximate relationship between the quarterly logarithmic year-on-year growth rate of the quarterly indicator and the potential monthly logarithmic year-on-year growth rate;
constructing eigenfunctions according to the time-frequency data and the approximate relationships, the eigenfunctions comprising a potential monthly year-on-year growth rate eigenfunction and a quarterly year-on-year growth rate eigenfunction; and
creating the mixed-frequency data imputation model by fitting the eigenfunctions.
2. The creation method according to claim 1, characterized in that the approximate relationship between the quarterly indicator and the potential monthly indicator is:
Figure PCTCN2022115192-appb-100001
where Y_j is the quarterly indicator and
Figure PCTCN2022115192-appb-100002
is the potential monthly indicator.
3. The creation method according to claim 1, characterized in that the approximate relationship between the quarterly logarithmic year-on-year growth rate of the quarterly indicator and the potential monthly logarithmic year-on-year growth rate is:
Figure PCTCN2022115192-appb-100003
where y_t is the quarterly logarithmic year-on-year growth rate of the quarterly indicator and
Figure PCTCN2022115192-appb-100004
is the potential monthly logarithmic year-on-year growth rate.
4. The creation method according to claim 1, characterized in that the formula of the potential monthly year-on-year growth rate eigenfunction is:
Figure PCTCN2022115192-appb-100005
where y*(t) is the potential monthly year-on-year growth rate eigenfunction, φ_k(t) is the chosen basis spline function, β_k is the corresponding coefficient, and t is time.
5. The creation method according to claim 1, characterized in that the formula of the quarterly year-on-year growth rate eigenfunction is:
Figure PCTCN2022115192-appb-100006
where y(t) is the quarterly year-on-year growth rate eigenfunction, φ_k(t) is the chosen basis spline function, β_k is the corresponding coefficient, and t is time.
PCT/CN2022/115192 2022-04-07 2022-08-26 Mixed-frequency data imputation model creation method based on functional data analysis WO2023193395A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210361787.9 2022-04-07
CN202210361787.9A CN114781837A (en) 2022-04-07 2022-04-07 Method for creating frequency mixing data complementary value model based on functional data analysis

Publications (1)

Publication Number Publication Date
WO2023193395A1 true WO2023193395A1 (en) 2023-10-12

Family

ID=82427168

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/115192 WO2023193395A1 (en) 2022-04-07 2022-08-26 Mixed-frequency data imputation model creation method based on functional data analysis

Country Status (2)

Country Link
CN (1) CN114781837A (en)
WO (1) WO2023193395A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114781837A (en) * 2022-04-07 2022-07-22 广西电网有限责任公司 Method for creating frequency mixing data complementary value model based on functional data analysis

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106548285A (en) * 2016-11-04 2017-03-29 广西电网有限责任公司电力科学研究院 The bulk sale power predicating method that meter and small power station exert oneself
CN107220764A (en) * 2017-05-25 2017-09-29 北京中电普华信息技术有限公司 A kind of electricity sales amount Forecasting Methodology compensated based on preamble analysis and factor and device
US20180239586A1 (en) * 2017-02-21 2018-08-23 International Business Machines Corporation Optimizing data approximation analysis using low power circuitry
US20200129130A1 (en) * 2018-10-26 2020-04-30 Firstbeat Technologies Oy Minimum heart rate value approximation
CN113988550A (en) * 2021-10-15 2022-01-28 广西电网有限责任公司 Power dynamic comprehensive evaluation method suitable for frequency mixing data
CN114781837A (en) * 2022-04-07 2022-07-22 广西电网有限责任公司 Method for creating frequency mixing data complementary value model based on functional data analysis

Also Published As

Publication number Publication date
CN114781837A (en) 2022-07-22

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22936323

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE