CN102982229A - Multi-assortment commodity price expectation data pre-processing method based on neural networks - Google Patents
- Publication number
- CN102982229A CN2012103253686A CN201210325368A
- Authority
- CN
- China
- Prior art keywords
- commodity
- price
- data
- magnitude
- neural network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Description
Technical Field
The invention belongs to the field of data processing and relates in particular to a neural-network-based data preprocessing method for multi-variety commodity price prediction. It can be applied to the preprocessing of commodity price data in commodity price prediction analysis and in commodity sales decision support systems.
Background Art
Commodity price prediction methods are the basis of market forecast analysis and of commodity production and sales decisions. They are an important topic in market forecasting and play a key role in many problems of commodity production and sales, and the data preprocessing used within a prediction method strongly affects that method's generality and accuracy. With the development of network technology and the spread of online stores, research on commodity price prediction has received increasing attention in recent years. Commodity price prediction can be treated as a time-series data processing and data analysis problem with three aspects: data acquisition, data processing and the prediction model. Public price data from stock, futures and electricity markets are relatively easy to obtain, and the models used for price prediction mainly include least-squares regression, neural networks, grey Markov chains, wavelet theory and the GM(1,1) model. Regarding the acquisition of consumer commodity price data, the preprocessing of such data and dynamic price prediction, from 2010 to 2012 Zhu Quanyin et al. presented methods for extracting and mining commodity sales data, as well as Web-based preprocessing and dynamic prediction methods for commodity prices (Quanyin Zhu, Yunyang Yan, Jin Ding and Yu Zhang. The Commodities Price Extracting for Shop Online. 2010 International Conference on Future Information Technology and Management Engineering, Changzhou, Jiangsu, China, Dec. 2010, Vol. 2, pp. 317-320; Quanyin Zhu, Yunyang Yan, Jin Ding and Jin Qian. The Case Study for Price Extracting of Mobile Phone Sell Online. IEEE 2nd International Conference on Software Engineering and Service Science, Beijing, China, July 2011, pp. 281-295; Quanyin Zhu, Sunqun Cao, Jin Ding and Zhengyin Hah. Research on the Price Forecast without Complete Data based on Web Mining. 2011 Distributed Computing and Applications to Business, Engineering and Science, Wuxi, Jiangsu, China, Oct. 2011, pp. 120-123; Quanyin Zhu, Hong Zhou, Yunyang Yan, Jin Qian and Pei Zhou. Commodities Price Dynamic Trend Analysis Based on Web Mining. The International Conference on Multimedia Information Networking and Security, Shanghai, China, Nov. 2011, pp. 524-527; Jianping Deng, Fengwen Cao, Quanyin Zhu and Yu Zhang. The Web Data Extracting and Application for Shop Online Based on Commodities Classified. Communications in Computer and Information Science, Vol. 234(4): 189-197; Quanyin Zhu, Suqun Cao, Pei Zhou, Yunyang Yan, Hong Zhou. Integrated Price Forecast based on Dichotomy Backfilling and Disturbance Factor Algorithm. International Review on Computers and Software, 2011, Vol. 6(6): 1089-1093; Quan-yin Zhu, Pei Zhou, Yun-Yang Yan, Yong-Hua Yin. Exchange Rate Forecasting based on Adaptive Sliding Window and RBF Neural Network. International Review on Computers and Software, 2011, Vol. 6(7): 1290-1296; Jiajun Zong, Quanyin Zhu. Price Forecasting for Agricultural Products Based on BP and RBF Neural. ICSESS 2012, pp. 607-610; Hong Zhou, Quanyin Zhu, Pei Zhou. A Hybrid Price Forecasting Based on Linear Backfilling and Sliding Window Algorithm. International Review on Computers and Software, 2011, Vol. 6(6): 1131-1134; Wang Hongyan, Zhu Quanyin, Yan Yunyang, Qian Jin. Comparison of Two WEB Mining Algorithms for Commodity Price Data. Microelectronics and Computers, 2011, Vol. 28(19): 168-172).
RBF (Radial Basis Function) neural network:
An RBF network is a feed-forward neural network. It imitates the neural structure of the human brain, in which receptive fields are locally tuned and overlap one another, so it has a strong biological basis and can approximate any nonlinear function. It is a feed-forward network with three layers. The first layer is the input layer, made up of signal source nodes. The second layer is the hidden layer; the transformation function of a hidden unit is a locally distributed, non-negative, nonlinear function that is radially symmetric about a center point and decays away from it. The number of hidden units is determined by the needs of the problem being described. The third layer is the output layer, and the network output is a linear weighting of the hidden-unit outputs. The input-layer nodes only pass the input signal to the hidden layer. The basis functions of the hidden layer are nonlinear and produce a localized response to the input signal: each hidden node has a parameter vector called its center, which is compared with the network input vector to produce a radially symmetric response. A hidden node gives a meaningful non-zero response only when the input falls within a small specified region. The response lies between 0 and 1, and the closer the input is to the center of the basis function, the larger the hidden node's response. The output units are linear, that is, they form a linear weighted combination of the hidden-node outputs.
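The layer structure described above can be illustrated with a minimal Python sketch (Python is used here instead of MATLAB so the example is self-contained). The Gaussian form of the basis function, the names `gaussian_rbf` and `rbf_forward`, and the sample centers and weights are illustrative assumptions; they are not taken from the patent and do not reproduce MATLAB's internal scaling of the spread.

```python
import math

def gaussian_rbf(x, center, spread):
    """Response of one hidden unit (assumed Gaussian form).

    Equals 1 when x coincides with the center and decays toward 0
    as the Euclidean distance to the center grows.
    """
    dist_sq = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
    return math.exp(-dist_sq / (2.0 * spread ** 2))

def rbf_forward(x, centers, spread, weights, bias=0.0):
    """Output unit: linear weighted combination of the hidden responses."""
    hidden = [gaussian_rbf(x, c, spread) for c in centers]
    return bias + sum(w * h for w, h in zip(weights, hidden))

# Hypothetical two-unit network with hand-picked centers.
centers = [[1.0, 1.0], [5.0, 5.0]]
near = gaussian_rbf([1.0, 1.1], centers[0], spread=1.0)  # input close to the center
far = gaussian_rbf([4.0, 4.0], centers[0], spread=1.0)   # input far from the center
```

In this sketch `near` is larger than `far`, which matches the statement that a hidden node responds more strongly the closer the input lies to its center.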
BP (Back Propagation) neural network:
A BP network is a multilayer feed-forward network trained with the error back-propagation algorithm. It can learn and store a large number of input-output pattern mappings without the mathematical equations describing those mappings having to be known in advance. Its learning rule uses the steepest descent method: the weights and thresholds of the network are adjusted continually through back propagation so that the network's sum of squared errors is minimized. A BP neural network is a three-layer feed-forward network consisting of an input layer, a hidden layer and an output layer. The input-layer neurons receive input information from outside and pass it to the neurons of the middle layer. The middle layer is the internal information processing layer and is responsible for transforming the information; depending on the transformation capability required, it can be designed with a single hidden layer or with several hidden layers. The information passed from the last hidden layer to the output-layer neurons is processed further, which completes one forward-propagation pass of learning, and the output layer delivers the processing result to the outside. When the actual output differs from the expected output, the error back-propagation phase begins. Starting from the output layer, the error is propagated back layer by layer toward the hidden and input layers, and the weights of each layer are corrected by gradient descent on the error. This repeated cycle of forward propagation of information and back propagation of error is the process by which the weights of each layer are adjusted, and it is also the learning and training process of the neural network. It continues until the error of the network output falls to an acceptable level or until a preset number of learning iterations is reached.
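The forward-propagation and error back-propagation cycle can be sketched as follows in Python. This is a minimal illustration under stated assumptions, not the patent's or MATLAB's training routine: a single hidden layer with tanh units, a linear output, per-sample gradient descent, and a fixed number of epochs. The name `train_bp`, the learning rate, and the toy data are all hypothetical.

```python
import math
import random

def train_bp(samples, targets, hidden=3, lr=0.05, epochs=500, seed=0):
    """Train a one-hidden-layer network; return the squared-error sum per epoch."""
    rng = random.Random(seed)
    n_in = len(samples[0])
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    b2 = 0.0
    history = []
    for _ in range(epochs):
        sse = 0.0
        for x, t in zip(samples, targets):
            # Forward propagation: input -> hidden (tanh) -> linear output.
            h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                 for row, b in zip(w1, b1)]
            y = sum(w * hi for w, hi in zip(w2, h)) + b2
            err = y - t
            sse += err * err
            # Back propagation: push the output error back through each layer.
            for k in range(hidden):
                grad_h = err * w2[k] * (1.0 - h[k] ** 2)  # uses w2 before its update
                w2[k] -= lr * err * h[k]
                for i in range(n_in):
                    w1[k][i] -= lr * grad_h * x[i]
                b1[k] -= lr * grad_h
            b2 -= lr * err
        history.append(sse)
    return history

# Toy data: learn y = x on [0, 1].
samples = [[0.0], [0.25], [0.5], [0.75], [1.0]]
targets = [0.0, 0.25, 0.5, 0.75, 1.0]
history = train_bp(samples, targets)
```

The list `history` records the sum of squared errors per epoch, which is the quantity the text says training drives down until it is acceptable or the iteration limit is reached.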
When the above algorithms are used for price prediction, both the prediction accuracy and the learning time are highly uncertain. Some parameters of the MATLAB functions used by the algorithms are user-defined, and this adds to the uncertainty in learning time and prediction accuracy, which severely limits the use of the algorithms for commodity price prediction. To make better use of these algorithms, many improved price prediction methods have been proposed: k-means clustering stock price prediction based on a BP neural network model; IPO underpricing prediction based on an adaptive BP neural network algorithm; an agricultural product price prediction model based on a time-series model with combined BP neural networks; an improved crude oil price prediction based on the wavelet transform and an RBF neural network; nonlinear time-series prediction based on a dynamic RBF neural network; and so on. These improved methods are narrowly targeted and lack generality: each applies only to one commodity or one category of commodity. Their fixed parameters also make them inflexible, so prediction accuracy cannot be guaranteed for different commodities within the same category. Because they lack flexibility and generality, these improved methods cannot meet sellers' pressing need for market forecast analysis and sales decisions across different categories of consumer goods. It is therefore necessary either to find a prediction method that applies to the prices of different categories of commodity and of different commodities within a category, or to find a data preprocessing method for the prices of different categories of commodity, so that the prediction method gains better generality and higher prediction accuracy.
Summary of the Invention
The purpose of the invention is to combine a method that normalizes the order of magnitude of the original data with improved RBF and BP neural network prediction methods. The improved RBF and BP neural networks are used to compute the optimal magnitude for commodity price data mined from web pages; the price data are then preprocessed by normalizing them to that magnitude; and the improved RBF and BP neural networks are then used to predict commodity prices. This raises the prediction accuracy of the RBF and BP neural networks and improves their generality across different commodities.
The technical solution of the invention is to preprocess the data mined from web pages by normalizing the order of magnitude of the original data, to compute the optimal magnitude of the commodity price data with the improved RBF and BP neural networks on the magnitude-normalized data sets, to preprocess the price data by normalizing them to the computed optimal magnitude, and then to complete the market price prediction for the commodity.
To make the solution easier to understand, its theoretical basis is first described as follows:
In neural-network-based price prediction, many improved data preprocessing methods have been proposed, and all have achieved clear improvements. These methods are narrowly targeted, however, and neglect the flexibility and generality of the prediction method, which severely limits the improved methods. Preprocessing that normalizes the order of magnitude of the original data improves both the generality and the accuracy of the prediction method. For the price data of a single commodity, it reduces the relative fluctuation range of the data, which makes the prediction method more stable and more accurate for that commodity. For the price data of different commodities, it reduces the relative differences between commodities while also reducing the fluctuation range for each individual commodity, which makes the prediction method more stable, more general and more accurate. The improved RBF and BP neural networks then predict prices on the magnitude-normalized data and achieve higher prediction accuracy.
Specifically, the solution carries out magnitude normalization of the original data and commodity price prediction with the improved RBF and BP neural networks through the following steps:
Step 1. Extract the name, model, type and price data of commodities from web pages and build a data set of $h$ commodities $X=\{A_1,A_2,\ldots,A_h\}$. Let $n$ prices be extracted for the $i$-th commodity, $A_i=\{x_1,x_2,\ldots,x_n\}$, where $i\in[1,h]$ and $x_1,x_2,\ldots,x_n$ are the $n$ prices extracted for commodity $A_i$;
Step 2. Compute the price magnitude of each of the different commodities, giving the price magnitudes $M=\{b_1,b_2,\ldots,b_h\}$;
Step 3. Define a prediction sample containing $z$ data points, and let $D$ be the total number of prices to be predicted;
Step 4. Select the prediction model;
Step 5. If the selected prediction model is the RBF neural network, perform steps 6 to 12; if the selected prediction model is the BP neural network, perform steps 14 to 21;
Step 6. Set the model training function to newrbe(P, T, SPREAD) from the technical computing language MATLAB. This function designs an exact radial basis network, where P is the input vector, T is the target vector and SPREAD is the spread of the radial basis functions. Set the model prediction function to MATLAB's sim('MODEL', PARAMETERS), which simulates a neural network, where MODEL is the trained network model and PARAMETERS is the input vector. Set $j$ different spread values $Spreads=\{spread_1,spread_2,\ldots,spread_j\}$;
Step 7. Normalize the order of magnitude of the sales prices of commodity $A_i$ to the magnitude $b_i$, obtaining [equation image missing];
Step 8. Pass the input vector P and the target vector T to the training function newrbe(P, T, SPREAD) and train $j$ different networks $net_{ij}=\mathrm{newrbe}(P,T,spread_j)$. Build the prediction sample $Test=[t_1,t_2,\ldots,t_z]$, [equation image missing];
Step 9. The $j$ predicted values for day $n+1$ of commodity $A_i$ are $Y_{ij}=\mathrm{sim}(net_{ij},Test)$. Let the best predicted value for day $n+1$ of commodity $A_i$ be $y_i$, $y_i\in Y_{ij}$;
Step 10. Define the coupling weights $W=(w_1,w_2,w_3)$. Let the three best-predicting spread values for day $n+1$ of commodity $A_i$ be $Bspread_{i1}\in Spreads$, $Bspread_{i2}\in Spreads$ and $Bspread_{i3}\in Spreads$, and compute the optimal spread value [equation image missing];
Step 11. Train the fixed network $net=\mathrm{newrbe}(P,T,Bspread)$;
Step 12. Feed the best predicted value $y_i$ back into the prediction sample for the next prediction. In the new prediction sample $[t_1,t_2,\ldots,t_z]$, $t_1$ takes the value of $t_2$ from the previous sample, $t_2$ takes the value of the previous $t_3$, and so on up to $t_{z-1}$, which takes the value of the previous $t_z$; finally $t_z=y_i$. This gives the new prediction sample $Test=[t_1,t_2,\ldots,t_z]$, and the predicted value for day $n+2$ of the commodity is $y_i=\mathrm{sim}(net,Test)$;
Step 13. Repeat step 12 to obtain all predicted values of commodity $A_i$. Repeat steps 7 to 12 to obtain the predicted values of all commodities in data set $X$ at different magnitudes, and obtain the best prediction magnitude $O$, $O\in M$;
Step 14. Set the model training functions to NET = newff(P, T, NEURON) and NET′ = train(NET, P, T) from the technical computing language MATLAB. newff() creates a feed-forward BP network, where P is the input vector, T is the target vector and NEURON is the number of hidden-layer neurons; train() trains a neural network, where NET is the feed-forward BP network that was created. The model prediction function is NET′(Test), where Test is the prediction sample. Set $j$ different hidden-layer neuron counts $Neurons=\{neuron_1,neuron_2,\ldots,neuron_j\}$;
Step 15. Normalize the order of magnitude of the sales prices of commodity $A_i$ to the magnitude $b_i$, obtaining [equation image missing];
Step 16. Pass the input vector P and the target vector T to the training functions NET = newff(P, T, NEURON) and NET′ = train(NET, P, T) and train $j$ different networks $net_{ij}=\mathrm{newff}(P,T,neuron_j)$, $net_{ij}=\mathrm{train}(net_{ij},P,T)$. Build the prediction sample $Test=[t_1,t_2,\ldots,t_z]$, [equation image missing];
Step 17. The $j$ predicted values for day $n+1$ of commodity $A_i$ are $Y_{ij}=net_{ij}(Test)$. Let the best predicted value for day $n+1$ of commodity $A_i$ be $y_i$, $y_i\in Y_{ij}$;
Step 18. Define the coupling weights $W=(w_1,w_2,w_3)$. Let the three best-predicting hidden-layer neuron counts for day $n+1$ of commodity $A_i$ be $Bneuron_{i1}\in Neurons$, $Bneuron_{i2}\in Neurons$ and $Bneuron_{i3}\in Neurons$, and compute the optimal number of hidden-layer neurons [equation image missing];
Step 19. Train the fixed network $net=\mathrm{newff}(P,T,Bneuron)$, $net=\mathrm{train}(net,P,T)$;
Step 20. Feed the best predicted value $y_i$ back into the prediction sample for the next prediction. In the new prediction sample $[t_1,t_2,\ldots,t_z]$, $t_1$ takes the value of $t_2$ from the previous sample, $t_2$ takes the value of the previous $t_3$, and so on up to $t_{z-1}$, which takes the value of the previous $t_z$; finally $t_z=y_i$. This gives the new prediction sample $Test=[t_1,t_2,\ldots,t_z]$, and the predicted value for day $n+2$ of the commodity is $y_i=net(Test)$;
Step 21. Repeat step 20 to obtain all predicted values of commodity $A_i$. Repeat steps 15 to 20 to obtain the predicted values of all commodities in data set $X$ at different magnitudes, and obtain the best prediction magnitude $O$, $O\in M$.
Extracting the name, model, type and price data of commodities from web pages in step 1 means using any Web data extraction algorithm to extract the name, model, type and price that a web page displays for a commodity. Here $x_1,x_2,\ldots,x_n$ may be $n$ prices of the $i$-th commodity $A_i$ extracted from a single web page, or $n$ average prices extracted from several web pages.
Step 2 computes, from the price data of any commodity, the magnitude of that commodity's price data.
Steps 3 to 5 set the parameters and select the prediction model for the price prediction of any commodity; $z$ is typically 3, 5 or 7, and $D$ is typically 3 or 7.
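The roles of $z$ and $D$ can be illustrated with a short Python sketch of the rolling prediction described in steps 12 and 20: a window of $z$ values is fed to a trained model, and each prediction replaces the oldest value in the window until $D$ prices have been produced. The names `roll_window` and `forecast` and the placeholder model `last_plus_one` are hypothetical; any callable that maps a window to one predicted value could stand in for the trained RBF or BP network.

```python
def roll_window(window, prediction):
    """Drop the oldest value and append the newest prediction (steps 12 and 20)."""
    return window[1:] + [prediction]

def forecast(window, predict, steps):
    """Produce `steps` predictions, feeding each one back into the window."""
    window = list(window)  # copy so the caller's list is not modified
    predictions = []
    for _ in range(steps):
        y = predict(window)
        predictions.append(y)
        window = roll_window(window, y)
    return predictions

# Placeholder model (an assumption for illustration only): last value plus one.
def last_plus_one(window):
    return window[-1] + 1

# z = 3 values in the prediction sample, D = 3 prices to predict.
result = forecast([1, 2, 3], last_plus_one, steps=3)
print(result)  # [4, 5, 6]
```

Because every prediction after the first depends on earlier predictions, errors can accumulate as $D$ grows, which may be one reason the text suggests small values of $D$.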
The technical computing language MATLAB used in steps 6 and 14 is a product of MathWorks; the version is R2011b.
Steps 6 to 12 produce, under the improved RBF neural network, the predicted values for any commodity from its price data on different dates on one web page, or from its average price data on different dates across several web pages.
Steps 14 to 20 produce, under the improved BP neural network, the predicted values for any commodity from its price data on different dates on one web page, or from its average price data on different dates across several web pages.
In steps 6, 8, 14 and 16, the input vector P is the training sample set and the target vector T is the data set of values used to train and test the predictions.
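One plausible way to build P and T from a price series is sketched below in Python. The patent does not spell out the construction, so the sliding-window layout here is an assumption: each training sample holds $z$ consecutive prices and its target is the price that follows. The name `make_training_set` is hypothetical, and samples are stored as rows, whereas MATLAB's newrbe and newff expect one sample per column.

```python
def make_training_set(prices, z):
    """Return (P, T): windows of z consecutive prices and the price after each window."""
    if z < 1 or len(prices) <= z:
        raise ValueError("need more than z prices and z >= 1")
    count = len(prices) - z
    P = [prices[k:k + z] for k in range(count)]
    T = [prices[k + z] for k in range(count)]
    return P, T

# Five daily prices with z = 3 give two training samples.
P, T = make_training_set([1, 2, 3, 4, 5], z=3)
print(P)  # [[1, 2, 3], [2, 3, 4]]
print(T)  # [4, 5]
```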
The preset value of $j$ is typically 40 in step 6 and 10 in step 14.
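With $j$ candidate settings, steps 9 and 17 need some way to rank the candidates. The text does not state the ranking criterion, so the Python sketch below assumes one: candidates are ordered by the absolute error of their prediction against a known price. The name `three_best` and the sample numbers are hypothetical.

```python
def three_best(candidates, predictions, actual):
    """Return the three candidate settings whose predictions are closest to `actual`."""
    ranked = sorted(zip(candidates, predictions),
                    key=lambda pair: abs(pair[1] - actual))
    return [candidate for candidate, _ in ranked[:3]]

# Four hypothetical spread values and the price each resulting network predicted.
spreads = [0.5, 1.0, 1.5, 2.0]
predicted = [10.4, 10.1, 9.7, 12.0]
best = three_best(spreads, predicted, actual=10.0)
print(best)  # [1.0, 1.5, 0.5]
```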
Steps 7 and 15 normalize the magnitude of any commodity's price data to a uniform magnitude. If the magnitude of the commodity's price data already equals the normalization magnitude, no magnitude normalization is applied; if the two differ, the price data are preprocessed by magnitude normalization. The magnitude is typically 1, 10, 100 or 1000.
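A minimal Python sketch of this normalization follows. The normalization formula is not reproduced in this text, so two assumptions are made and should be read as such: the magnitude of a series is taken as the power of ten at or below its mean price, and normalization rescales the series by the ratio of the target magnitude to that magnitude. The names `price_magnitude` and `normalize_magnitude` are hypothetical.

```python
import math

def price_magnitude(prices):
    """Assumed definition: the largest power of ten not exceeding the mean price."""
    mean_price = sum(prices) / len(prices)
    if mean_price <= 0:
        raise ValueError("prices must have a positive mean")
    return 10.0 ** math.floor(math.log10(mean_price))

def normalize_magnitude(prices, target):
    """Rescale prices to the target magnitude; leave them unchanged if it already matches."""
    source = price_magnitude(prices)
    if source == target:
        return list(prices)  # same magnitude: no preprocessing, as the text specifies
    factor = target / source
    return [p * factor for p in prices]

# Prices in the thousands rescaled to magnitude 10.
scaled = normalize_magnitude([4520.0, 4600.0], target=10)
```

Under these assumptions, predictions made on the rescaled series would be mapped back to real prices by dividing by the same factor.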
The coupling weights defined in steps 10 and 18 are $W=[2,4,2]$.
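The formula that combines the three best values with these weights is not reproduced in this text. The Python sketch below therefore assumes a normalized weighted average, which is one common reading of "coupling weights" and not a confirmed reconstruction of the patent's equation. The names `couple` and `couple_neurons` are hypothetical, as is the rounding used to obtain a whole number of neurons.

```python
def couple(best_values, weights=(2, 4, 2)):
    """Assumed coupling rule: weighted average of the three best values."""
    if len(best_values) != len(weights):
        raise ValueError("need one weight per value")
    return sum(w * v for w, v in zip(weights, best_values)) / sum(weights)

def couple_neurons(best_counts, weights=(2, 4, 2)):
    """Same rule for hidden-layer sizes, rounded to a whole number of neurons."""
    return round(couple(best_counts, weights))

bspread = couple([1.0, 2.0, 3.0])   # (2*1 + 4*2 + 2*3) / 8
bneuron = couple_neurons([4, 6, 8])  # (2*4 + 4*6 + 2*8) / 8
print(bspread, bneuron)  # 2.0 6
```

With the weights 2, 4, 2 the middle of the three values counts twice as much as either neighbor, so the order in which the three best values are supplied matters; the text does not say how they are ordered.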
Compared with the data preprocessing methods used in existing price prediction techniques, the invention takes the price data of commodities mined from web pages, uses the improved RBF and BP neural networks to compute the optimal magnitude of the original price data, and applies a uniform magnitude normalization to the original price data using that magnitude. With this magnitude-normalization preprocessing, the fluctuation range of the price data is reduced for any particular commodity, and the differences between the price data of different commodities are reduced. This overcomes the limitations that the data preprocessing of existing price prediction methods shows when applied to different commodities, and improves both the generality of the prediction method and its prediction accuracy.
Brief Description of the Drawings
Fig. 1 is a flowchart of a specific embodiment of the invention.
Detailed Description
The technical solution of the invention is described in detail below with reference to the accompanying drawing:
As shown in Fig. 1, the embodiment of the invention proceeds according to the following steps:
Step 1. Extract the name, model, type, and price data of commodities from web pages, and build a data set of h commodities X={A_1,A_2,...,A_h}. Let the number of price data points extracted for the i-th commodity be n, so that A_i={x_1,x_2,...,x_n}, where i∈[1,h] and x_1,x_2,...,x_n are the n price data points extracted for commodity A_i;
Step 2. Compute the price magnitude of each of the h commodities, obtaining the price magnitudes M={b_1,b_2,...,b_h};
Step 3. Define a prediction sample containing z data points, and let D be the total number of prices to be predicted;
Step 4. Select the prediction model;
Step 5. If the selected prediction model is the RBF neural network, perform steps 6 to 12; if the selected prediction model is the BP neural network, perform steps 14 to 21;
Step 6. Set the model training function to the newrbe(P,T,SPREAD) function of the technical computing language MATLAB, which designs an exact radial basis network, where P is the input vector, T is the target vector, and SPREAD is the spread of the radial basis functions. Set the model prediction function to the MATLAB function sim('MODEL',PARAMETERS), which simulates a neural network, where MODEL is the trained network model and PARAMETERS is the input vector. Set j different radial basis function spread values Spreads={spread_1,spread_2,...,spread_j};
Step 7. Normalize the magnitude of the sale price data of commodity A_i to magnitude b_i, obtaining the normalized price data;
Step 8. Substitute the input vector P and the target vector T into the training function newrbe(P,T,SPREAD) to train j different networks net_ij=newrbe(P,T,spread_j), and build the prediction sample Test=[t_1,t_2,...,t_z];
Step 9. Compute the j predicted values of commodity A_i for day n+1, Y_ij=sim(net_ij,Test), and let the best predicted value of commodity A_i for day n+1 be y_i, y_i∈Y_ij;
Step 10. Define the coupling weight W=(w_1,w_2,w_3). Let the three best-predicting radial basis function spread values for day n+1 of commodity A_i be Bspread_i1∈Spreads, Bspread_i2∈Spreads, and Bspread_i3∈Spreads, and obtain the optimal radial basis function spread value Bspread;
Step 11. Train the fixed network net=newrbe(P,T,Bspread);
Step 12. Feed the best predicted value y_i back into the prediction sample for the next prediction by shifting the sample one position: in the new prediction sample [t_1,t_2,...,t_z], t_1 takes the value of t_2 from the previous sample, t_2 takes the value of t_3 from the previous sample, and so on, until t_(z-1) takes the value of t_z from the previous sample and t_z=y_i. This yields the new prediction sample Test=[t_1,t_2,...,t_z], and the predicted value of the commodity for day n+2 is y_i=sim(net,Test);
Step 13. Repeat step 12 to obtain all predicted values of commodity A_i; repeat steps 7 to 12 to obtain the predicted values of all commodities in data set X at the different magnitudes, and obtain the best prediction magnitude O, O∈M;
Step 14. Set the model training functions to the MATLAB functions NET=newff(P,T,NEURON) and NET'=train(NET,P,T), where newff() creates a feed-forward BP network, P is the input vector, T is the target vector, NEURON is the number of hidden-layer neurons, train() trains a neural network, and NET is the created feed-forward BP network. Set the model prediction function to NET'(Test), where Test is the prediction sample. Set j different hidden-layer neuron counts Neurons={neuron_1,neuron_2,...,neuron_j};
Step 15. Normalize the magnitude of the sale price data of commodity A_i to magnitude b_i, obtaining the normalized price data;
Step 16. Substitute the input vector P and the target vector T into the training functions NET=newff(P,T,NEURON) and NET'=train(NET,P,T) to train j different networks net_ij=newff(P,T,neuron_j), net_ij=train(net_ij,P,T), and build the prediction sample Test=[t_1,t_2,...,t_z];
Step 17. Compute the j predicted values of commodity A_i for day n+1, Y_ij=net_ij(Test), and let the best predicted value of commodity A_i for day n+1 be y_i, y_i∈Y_ij;
Step 18. Define the coupling weight W=(w_1,w_2,w_3). Let the three best-predicting hidden-layer neuron counts for day n+1 of commodity A_i be Bneuron_i1∈Neurons, Bneuron_i2∈Neurons, and Bneuron_i3∈Neurons, and obtain the optimal hidden-layer neuron count Bneuron;
Step 19. Train the fixed network net=newff(P,T,Bneuron), net=train(net,P,T);
Step 20. Feed the best predicted value y_i back into the prediction sample for the next prediction by shifting the sample one position: in the new prediction sample [t_1,t_2,...,t_z], t_1 takes the value of t_2 from the previous sample, t_2 takes the value of t_3 from the previous sample, and so on, until t_(z-1) takes the value of t_z from the previous sample and t_z=y_i. This yields the new prediction sample Test=[t_1,t_2,...,t_z], and the predicted value of the commodity for day n+2 is y_i=net(Test);
Step 21. Repeat step 20 to obtain all predicted values of commodity A_i; repeat steps 15 to 20 to obtain the predicted values of all commodities in data set X at the different magnitudes, and obtain the best prediction magnitude O, O∈M.
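The sample-shifting rule of steps 12 and 20 can be illustrated with a short Python sketch. It is not the MATLAB code of the embodiment: the names `rolling_forecast` and `mean_predictor` are hypothetical, and the window mean merely stands in for a trained network so that the feedback loop can be run on its own.

```python
from typing import Callable, List, Sequence


def rolling_forecast(
    window: Sequence[float],
    predict: Callable[[Sequence[float]], float],
    horizon: int,
) -> List[float]:
    """Forecast `horizon` future prices, feeding each prediction back in."""
    sample = list(window)          # Test = [t_1, ..., t_z]
    forecasts = []
    for _ in range(horizon):       # D predictions in total
        y = predict(sample)        # plays the role of sim(net, Test)
        forecasts.append(y)
        sample = sample[1:] + [y]  # shift left by one, then set t_z = y
    return forecasts


def mean_predictor(sample: Sequence[float]) -> float:
    """Hypothetical stand-in for a trained network: the window mean."""
    return sum(sample) / len(sample)


# z = 3 observed prices, D = 2 prices to predict.
result = rolling_forecast([2.0, 4.0, 6.0], mean_predictor, horizon=2)
```

Each pass drops the oldest value and appends the newest prediction, so from the second forecast onward the model is partly predicting from its own output.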
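In step 1, extracting the name, model, type, and price data of commodities from web pages means using any Web data extraction algorithm to extract the name, model, type, and price data that a commodity displays on a web page. Here x_1,x_2,...,x_n may be n price data points of the i-th commodity A_i extracted from a single web page, or n average price data points extracted from multiple web pages.
Step 2 computes, from the price data of any commodity, the magnitude of that commodity's price data.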
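This text does not spell out how the magnitude in step 2 is computed. The Python sketch below shows one plausible reading, under two assumptions that are not taken from the patent: the magnitude of a single price is the largest power of ten not exceeding it, and the magnitude of a series is that of its mean price. The function names are hypothetical.

```python
from typing import Sequence


def magnitude(value: float) -> float:
    """Return the power of ten m with m <= value < 10 * m (value > 0)."""
    if value <= 0:
        raise ValueError("price must be positive")
    m = 1.0
    while value >= m * 10:   # grow m for prices of 10 and above
        m *= 10
    while value < m:         # shrink m for prices below 1
        m /= 10
    return m


def series_magnitude(prices: Sequence[float]) -> float:
    """Assumed rule: a series has the magnitude of its mean price."""
    return magnitude(sum(prices) / len(prices))
```

Under this reading, a series quoted around 6.5 has magnitude 1, one around 82 has magnitude 10, and one around 650 has magnitude 100, which matches the values 1, 10, 100, and 1000 that the description mentions.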
Steps 3 to 5 set the parameters and select the prediction model for the price prediction of any one commodity; z is generally 3, 5, or 7, and D is generally 3 or 7.
The technical computing language MATLAB referred to in steps 6 and 14 is a product of MathWorks; the version used is R2011b.
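For readers without MATLAB, the idea behind an exact radial basis network (one basis function per training sample, with weights chosen so that the training targets are reproduced) can be sketched in plain Python. This is a simplified illustration and not a reimplementation of newrbe: it assumes a Gaussian basis whose response falls to 0.5 at a distance of `spread`, it omits any output bias term, and all function names are hypothetical.

```python
import math
from typing import Callable, List, Sequence


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):                    # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def gaussian(u: Sequence[float], v: Sequence[float], spread: float) -> float:
    """Gaussian basis equal to 0.5 when u and v are `spread` apart."""
    d2 = sum((p - q) ** 2 for p, q in zip(u, v))
    return math.exp(-math.log(2.0) * d2 / spread ** 2)


def fit_exact_rbf(
    samples: Sequence[Sequence[float]],
    targets: Sequence[float],
    spread: float,
) -> Callable[[Sequence[float]], float]:
    """Fit one basis function per (distinct) training sample."""
    gram = [[gaussian(s, c, spread) for c in samples] for s in samples]
    weights = solve(gram, list(targets))

    def predict(x: Sequence[float]) -> float:
        return sum(w * gaussian(x, c, spread) for w, c in zip(weights, samples))

    return predict


# Three windows of z = 3 prices and the price that followed each one.
net = fit_exact_rbf(
    [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 6.0]],
    [4.0, 6.0, 5.0],
    spread=1.0,
)
```

Evaluating `net` at a training sample returns that sample's target up to rounding, which is what "exact" means here. Trying several `spread` values, as step 6 does with its j candidates, changes only how the network behaves between training samples.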
Steps 6 to 12 produce, under the improved RBF neural network, the predicted values for any one commodity from its price data on different dates in a single web page, or from its average price data on different dates across multiple web pages.
Steps 14 to 20 produce, under the improved BP neural network, the predicted values for any one commodity from its price data on different dates in a single web page, or from its average price data on different dates across multiple web pages.
The input vector P in steps 6, 8, 14, and 16 is the training sample set, and the target vector T is the data set of target values used for training and testing the predictions.
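How P and T are assembled from a price series is not detailed in this text. A common construction, shown here purely as an assumption, slides a window of z consecutive prices over the series and uses the price that follows each window as its target. The function name is hypothetical, and samples are stored as rows, whereas MATLAB networks expect them as columns.

```python
from typing import List, Sequence, Tuple


def make_training_pairs(
    prices: Sequence[float], z: int
) -> Tuple[List[List[float]], List[float]]:
    """Slide a window of length z over prices; the next price is the target."""
    if z < 1 or len(prices) <= z:
        raise ValueError("need more than z prices")
    inputs: List[List[float]] = []
    targets: List[float] = []
    for start in range(len(prices) - z):
        inputs.append(list(prices[start:start + z]))  # one training sample
        targets.append(prices[start + z])             # the price that followed
    return inputs, targets


P, T = make_training_pairs([1.0, 2.0, 3.0, 4.0, 5.0], z=3)
```

With n prices and window length z this yields n - z training pairs, and the last z prices of the series then form the first prediction sample Test.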
The preset value of j in step 6 is generally 40, and the preset value of j in step 14 is generally 10.
In steps 7 and 15, the price data of any commodity is normalized to a unified magnitude. If the magnitude of a commodity's price data is the same as the normalization magnitude, no magnitude normalization is applied to that commodity's price data; if it differs, the price data undergoes magnitude normalization preprocessing. The magnitude is generally 1, 10, 100, or 1000.
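The rule in steps 7 and 15 can be sketched as follows, assuming (the formula itself is not reproduced in this text) that normalization is a rescaling by the ratio of the target magnitude to the series' own magnitude. The function name `normalize_magnitude` is hypothetical.

```python
from typing import List, Sequence


def normalize_magnitude(
    prices: Sequence[float], own: float, target: float
) -> List[float]:
    """Rescale prices from magnitude `own` to magnitude `target`."""
    if own == target:
        return list(prices)  # same magnitude: no preprocessing is applied
    # Assumed rule: multiply by target / own, a power of ten.
    return [p * target / own for p in prices]


# A series of magnitude 100 brought down to magnitude 1.
scaled = normalize_magnitude([650.0, 655.0], own=100.0, target=1.0)
```

Because the rescaling is a constant factor per series, relative (percentage) errors computed on the rescaled data are directly comparable across commodities.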
The coupling weight defined in steps 10 and 18 is w=[2,4,2].
To better demonstrate the effectiveness of the method, the daily average prices of 8 different RMB exchange rates from January 1, 2011 to December 31, 2011, extracted from web pages, were used as raw data. The magnitudes of the raw data were computed as 1, 10, and 100, and magnitude normalization experiments were carried out on the raw data.
Under the improved RBF neural network, the average prediction errors for the raw data without magnitude normalization and for data normalized to magnitudes 100, 10, and 1 were as follows:
Currency | No normalization | Magnitude 100 | Magnitude 10 | Magnitude 1 |
---|---|---|---|---|
Australian dollar | 3.9% | 3.9% | 1.77% | 0.98% |
Hong Kong dollar | 11.82% | 1.97% | 0.94% | 0.94% |
Canadian dollar | 640.84% | 640.84% | 0.39% | 0.41% |
US dollar | 21571.04% | 21571.04% | 0.11% | 0.09% |
Euro | 1.66% | 1.66% | 224438.5% | 1.12% |
Japanese yen | 1.15% | 1233177% | 1.05% | 1.29% |
Swiss franc | 2.77% | 2.77% | 1.57% | 1.91% |
Singapore dollar | 28959.17% | 28959.17% | 0.50% | 0.16% |
Experiment average | 6399.04% | 160544.8% | 28055.6% | 0.86% |
The conclusion is that normalizing the data magnitude to 1 gave the best prediction results, with an average prediction accuracy of 99.14%.
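The reported experiment average and accuracy for the RBF network at magnitude 1 can be checked with a few lines of Python. The sketch assumes that the accuracy is 100% minus the mean of the per-currency errors, which is consistent with the reported figures but is not stated explicitly in the text.

```python
# Per-currency average errors (%) reported for the RBF network at magnitude 1,
# in the order AUD, HKD, CAD, USD, EUR, JPY, CHF, SGD.
errors_mag1 = [0.98, 0.94, 0.41, 0.09, 1.12, 1.29, 1.91, 0.16]


def mean_error(errors):
    """Arithmetic mean of the per-commodity errors, in percent."""
    return sum(errors) / len(errors)


def accuracy(errors):
    """Assumed definition: 100% minus the mean error."""
    return 100.0 - mean_error(errors)


overall = mean_error(errors_mag1)   # about 0.8625, reported as 0.86%
acc = accuracy(errors_mag1)         # about 99.1375, reported as 99.14%
```

The same check applied to the other magnitudes shows why they score so poorly: a single currency with an error in the thousands of percent dominates the mean.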
Under the improved BP neural network, the average prediction errors for the raw data without magnitude normalization and for data normalized to magnitudes 100, 10, and 1 were as follows:
Currency | No normalization | Magnitude 100 | Magnitude 10 | Magnitude 1 |
---|---|---|---|---|
Australian dollar | 1.31% | 1.31% | 1.09% | 0.39% |
Hong Kong dollar | 0.17% | 0.24% | 0.28% | 0.18% |
Canadian dollar | 0.28% | 0.46% | 0.13% | 0.37% |
US dollar | 0.26% | 0.03% | 0.48% | 0.40% |
Euro | 1.41% | 2.21% | 1.38% | 1.43% |
Japanese yen | 1.24% | 1.14% | 2.54% | 1.18% |
Swiss franc | 1.65% | 1.59% | 1.93% | 1.74% |
Singapore dollar | 0.21% | 2.98% | 0.06% | 0.28% |
Experiment average | 0.82% | 1.25% | 0.99% | 0.75% |
The conclusion is that normalizing the data magnitude to 1 gave the best prediction results, with an average prediction accuracy of 99.25%.
The experimental data above demonstrate the generality of this data preprocessing method across different commodities of the same kind. To demonstrate its generality across different kinds of commodities, the weekly average prices of 10 different agricultural products over 59 weeks from January 2011 to February 2012, extracted from web pages, were used as raw data. The magnitudes of the raw data were computed as 1 and 10, and magnitude normalization preprocessing experiments were carried out on the raw data.
Under the improved RBF neural network, the average prediction errors for the raw data without magnitude normalization and for data normalized to magnitudes 10 and 1 were as follows:
Product | No normalization | Magnitude 10 | Magnitude 1 |
---|---|---|---|
Beef | 3149934% | 3149934% | 2.44% |
Soybean oil | 17.96% | 17.96% | 17.96% |
Eggs | 1.61% | 1.61% | 1.61% |
Peanut oil | 2.89% | 2.89% | 0.91% |
Flour | 0.11% | 0.12% | 0.11% |
Pork | 542574.4% | 542574.4% | 7.35% |
Rice | 0.34% | 0.34% | 0.34% |
White granulated sugar | 0.44% | 0.44% | 0.44% |
Blended oil | 6.61% | 6.61% | 0.13% |
Mutton | 325260% | 325260% | 0.41% |
Experiment average | 401779.9% | 401779.9% | 3.17% |
The conclusion is that normalizing the data magnitude to 1 gave the best prediction results, with an average prediction accuracy of 96.83%.
The present invention can be combined with a computer system to predict commodity prices automatically.
The present invention creatively proposes a neural-network-based data preprocessing method for multi-variety commodity price prediction and applies it to the preprocessing of price data such as RMB exchange rates and agricultural product prices. Commodity prices are then predicted on the preprocessed data with the improved RBF neural network and the BP neural network, which improves the generality of the prediction method, achieves higher prediction accuracy, and has high practical value.
The neural-network-based data preprocessing method for multi-variety commodity price prediction proposed by the present invention can be used not only for data preprocessing in price prediction for RMB exchange rates and for the production and sale of agricultural products, but also for data preprocessing in price prediction for other consumer commodities.
Claims (11)
Priority Applications (1)
Application Number | Publication Number | Priority Date | Filing Date | Title |
---|---|---|---|---|
CN201210325368.6A | CN102982229B (en) | 2012-09-06 | 2012-09-06 | Data preprocessing method for multi-variety commodity price prediction based on neural networks |
Publications (2)
Publication Number | Publication Date |
---|---|
CN102982229A (en) | 2013-03-20 |
CN102982229B (en) | 2016-06-08 |
Family
ID=47856241
Family Applications (1)
Application Number | Status | Publication Number | Priority Date | Filing Date | Title |
---|---|---|---|---|---|
CN201210325368.6A | Expired - Fee Related | CN102982229B (en) | 2012-09-06 | 2012-09-06 | Data preprocessing method for multi-variety commodity price prediction based on neural networks |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN102982229B (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1353380A (en) * | 2000-11-14 | 2002-06-12 | 亚洲证券交易所有限公司 | Method of providing finance data comment and data processing system |
EP1901062A1 (en) * | 2005-05-23 | 2008-03-19 | Keio University | Method of taste measuring, taste sensor therefor and taste measuring apparatus |
CN101853480A (en) * | 2009-03-31 | 2010-10-06 | 北京邮电大学 | A Forex Trading Method Based on Neural Network Forecasting Model |
US20110087627A1 (en) * | 2009-10-08 | 2011-04-14 | General Electric Company | Using neural network confidence to improve prediction accuracy |
CN102934131A (en) * | 2010-04-14 | 2013-02-13 | 西门子公司 | Method for computer-aided learning of recurrent neural networks to model dynamical systems |
Non-Patent Citations (3)
Title |
---|
吕欣: "基于神经网络股票价格预测模型及系统的研究", 《中国优秀硕士学位论文全文数据库 经济与管理科学辑 》 * |
李自珍: "基于神经网络的期货价格预测与模型实现", 《中国优秀硕士学位论文全文数据库 经济与管理科学辑 》 * |
罗长寿: "基于神经网络与遗传算法的蔬菜市场价格预测方法研究", 《科技通报》 * |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104732411A (en) * | 2015-03-27 | 2015-06-24 | 中国农业科学院农业信息研究所 | Agricultural product consumption guide method based on BP neural network |
CN104732411B (en) * | 2015-03-27 | 2018-01-26 | 中国农业科学院农业信息研究所 | A Consumption Guidance Method of Agricultural Products Based on BP Neural Network |
CN104715295A (en) * | 2015-04-03 | 2015-06-17 | 江苏物联网研究发展中心 | Chinese chemical fertilizer price index prediction method based on BP neural network |
CN104732435A (en) * | 2015-04-03 | 2015-06-24 | 中国农业科学院农业信息研究所 | Agricultural product supply and demand matching system and method |
CN107203828A (en) * | 2017-06-22 | 2017-09-26 | 中北大学 | A kind of reinforcing bar price expectation method, system and platform |
CN108230043A (en) * | 2018-01-31 | 2018-06-29 | 安庆师范大学 | A kind of product area pricing method based on Grey Neural Network Model |
WO2019165692A1 (en) * | 2018-02-27 | 2019-09-06 | 平安科技(深圳)有限公司 | Carbon futures price prediction method, apparatus, computer device and storage medium |
CN109508461A (en) * | 2018-12-29 | 2019-03-22 | 重庆猪八戒网络有限公司 | Order price prediction technique, terminal and medium based on Chinese natural language processing |
CN111402042A (en) * | 2020-02-17 | 2020-07-10 | 中信建投证券股份有限公司 | Data analysis and display method for stock market large disc state analysis |
CN111402042B (en) * | 2020-02-17 | 2023-10-27 | 中信建投证券股份有限公司 | Data analysis and display method for stock market big disk shape analysis |
Also Published As
Publication number | Publication date |
---|---|
CN102982229B (en) | 2016-06-08 |
Legal Events
Code | Title | Description |
---|---|---|
C06 | Publication | |
PB01 | Publication | |
C10 | Entry into substantive examination | |
SE01 | Entry into force of request for substantive examination | |
CB02 | Change of applicant information | Address after: 223005 Jiangsu city of Huaian Province Higher Education Park Mei Cheng Road No. 1 Huaiyin Institute of Technology computer and software engineering, building 11, room 416; Applicant after: HUAIYIN INSTITUTE OF TECHNOLOGY; Address before: 223003 Jiangsu city of Huaian Province Higher Education Park Mei Cheng Road No. 1; Applicant before: HUAIYIN INSTITUTE OF TECHNOLOGY |
CB03 | Change of inventor or designer information | Inventor after: Zhu Quanyin, Zhou Hong, Yin Yonghua, Yan Yunyang, Cao Suqun, Zhu Fujian, Li Xiang, Hu Ronglin, Chen Ting, Jin Ying; Inventor before: Zhu Quanyin, Yin Yonghua, Yan Yunyang, Chen Ting, Cao Suqun |
COR | Change of bibliographic data | |
C14 | Grant of patent or utility model | |
GR01 | Patent grant | |
CP02 | Change in the address of a patent holder | Address after: 223400 8th floor, Anton building, 10 Haian Road, Lianshui County, Jiangsu; Patentee after: HUAIYIN INSTITUTE OF TECHNOLOGY; Address before: 223005 Room 416, Building 11, School of Computer and Software Engineering, Huaiyin Institute of Technology, No. 1 Meizheng East Road, Huai'an Higher Education Park, Jiangsu Province; Patentee before: HUAIYIN INSTITUTE OF TECHNOLOGY |
TR01 | Transfer of patent right | Effective date of registration: 20190409; Address after: 223005 No. 9 Haikou Road, Huaian Economic and Technological Development Zone, Jiangsu Province; Patentee after: HUAIAN FUN SOFWARE CO.,LTD.; Address before: 223400 8th floor, Anton building, 10 Haian Road, Lianshui County, Jiangsu; Patentee before: HUAIYIN INSTITUTE OF TECHNOLOGY |
TR01 | Transfer of patent right | Effective date of registration: 20210408; Address after: 214135 509, building D, Xingye building, 97-1, Linghu Avenue, Xinwu District, Wuxi City, Jiangsu Province; Patentee after: Wuxi manlai Software Co.,Ltd.; Address before: 223005 No. 9 Haikou Road, Huaian Economic and Technological Development Zone, Jiangsu Province; Patentee before: HUAIAN FUN SOFWARE Co.,Ltd. |
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20160608; Termination date: 20210906 |