CN102982229A - Multi-assortment commodity price expectation data pre-processing method based on neural networks - Google Patents

Info

Publication number
CN102982229A
CN102982229A CN2012103253686A CN201210325368A CN102982229A CN 102982229 A CN102982229 A CN 102982229A CN 2012103253686 A CN2012103253686 A CN 2012103253686A CN 201210325368 A CN201210325368 A CN 201210325368A CN 102982229 A CN102982229 A CN 102982229A
Authority
CN
China
Prior art keywords
commodity
price
data
magnitude
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2012103253686A
Other languages
Chinese (zh)
Other versions
CN102982229B (en
Inventor
朱全银
尹永华
严云洋
陈婷
曹苏群
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuxi Manlai Software Co ltd
Original Assignee
Huaiyin Institute of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huaiyin Institute of Technology filed Critical Huaiyin Institute of Technology
Priority to CN201210325368.6A priority Critical patent/CN102982229B/en
Publication of CN102982229A publication Critical patent/CN102982229A/en
Application granted granted Critical
Publication of CN102982229B publication Critical patent/CN102982229B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a neural-network-based data preprocessing method for multi-variety commodity price prediction. An improved radial basis function (RBF) neural network and an improved back propagation (BP) neural network are used to calculate the optimal order of magnitude for commodity price data obtained from websites. The commodity price data are then preprocessed by normalizing them to this optimal order of magnitude. This improves the prediction accuracy of the RBF and BP neural networks and their generality across the prices of different kinds of commodities.

Description

A data preprocessing method for multi-variety commodity price prediction based on neural networks

Technical field

The invention belongs to the field of data processing and in particular relates to a neural-network-based data preprocessing method for multi-variety commodity price prediction. It can be applied to the preprocessing of data for commodity price prediction in commodity price prediction analysis and in commodity sales decision support systems.

Background art

Commodity price prediction underpins market forecast analysis and decisions about commodity production and sales. It is an important problem in market forecasting and plays a key role in commodity production and sales, and the data preprocessing method used within a prediction method strongly affects that method's generality and accuracy. With the development of network technology and the spread of online stores, research on commodity price prediction has attracted growing attention in recent years. Commodity price prediction can be treated as a time-series data processing and analysis problem with three parts: data acquisition, data processing, and the prediction model. Public price data from stock, futures, and electricity markets are relatively easy to obtain, and the models mainly used for price prediction include least-squares regression, neural networks, grey Markov chains, wavelet theory, and the GM(1,1) model. For consumer commodities, between 2010 and 2012 Zhu Quanyin et al. presented methods for acquiring price data, extracting and mining commodity sales data, and Web-based preprocessing and dynamic prediction of commodity prices (Quanyin Zhu, Yunyang Yan, Jin Ding and Yu Zhang. The Commodities Price Extracting for Shop Online. 2010 International Conference on Future Information Technology and Management Engineering, Changzhou, Jiangsu, China, Dec. 2010, Vol. 2, pp. 317-320; Quanyin Zhu, Yunyang Yan, Jin Ding and Jin Qian. The Case Study for Price Extracting of Mobile Phone Sell Online. IEEE 2nd International Conference on Software Engineering and Service Science, Beijing, China, July 2011, pp. 281-295; Quanyin Zhu, Sunqun Cao, Jin Ding and Zhengyin Hah. Research on the Price Forecast without Complete Data based on Web Mining. 2011 Distributed Computing and Applications to Business, Engineering and Science, Wuxi, Jiangsu, China, Oct. 2011, pp. 120-123; Quanyin Zhu, Hong Zhou, Yunyang Yan, Jin Qian and Pei Zhou. Commodities Price Dynamic Trend Analysis Based on Web Mining. The International Conference on Multimedia Information Networking and Security, Shanghai, China, Nov. 2011, pp. 524-527; Jianping Deng, Fengwen Cao, Quanyin Zhu, and Yu Zhang. The Web Data Extracting and Application for Shop Online Based on Commodities Classified. Communications in Computer and Information Science, Vol. 234(4): 189-197; Quanyin Zhu, Suqun Cao, Pei Zhou, Yunyang Yan, Hong Zhou. Integrated Price Forecast based on Dichotomy Backfilling and Disturbance Factor Algorithm. International Review on Computers and Software, 2011, Vol. 6(6): 1089-1093; Quan-yin Zhu, Pei Zhou, Yun-Yang Yan, Yong-Hua Yin. Exchange Rate Forecasting based on Adaptive Sliding Window and RBF Neural Network. International Review on Computers and Software, 2011, Vol. 6(7): 1290-1296; Jiajun Zong, Quanyin Zhu. Price Forecasting for Agricultural Products Based on BP and RBF Neural. ICSESS 2012, pp. 607-610; Hong Zhou, Quanyin Zhu, Pei Zhou. A Hybrid Price Forecasting Based on Linear Backfilling and Sliding Window Algorithm. International Review on Computers and Software, 2011, Vol. 6(6): 1131-1134; Wang Hongyan, Zhu Quanyin, Yan Yunyang, Qian Jin. Comparison of Two Web Mining Algorithms for Commodity Price Data. Microelectronics and Computer, 2011, Vol. 28(19): 168-172).

RBF (Radial Basis Function) neural network:

An RBF network is a feed-forward neural network that models the locally tuned, overlapping receptive fields found in the human brain. It has a strong biological basis and can approximate any nonlinear function. It is a three-layer feed-forward network. The first layer is the input layer, made up of signal source nodes. The second layer is the hidden layer, whose units use a locally distributed, non-negative, nonlinear transfer function that is radially symmetric about a center point and decays with distance from it; the number of hidden units is determined by the problem being modeled. The third layer is the output layer, whose output is a linear weighting of the hidden-unit outputs. The input-layer nodes only pass the input signal to the hidden layer. The hidden-layer basis functions are nonlinear and respond locally to the input: each hidden node has a parameter vector called its center, which is compared with the network input vector to produce a radially symmetric response. A hidden node gives a meaningful non-zero response only when the input falls within a small specified region; the response lies between 0 and 1, and the closer the input is to the center of the basis function, the larger the response. The output units are linear, that is, they form a linear weighted combination of the hidden-node outputs.
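The three-layer structure described above can be sketched in Python with NumPy. This is an illustrative stand-in, not the MATLAB implementation the patent relies on: the Gaussian form of the basis function, the choice of one hidden unit per training sample, and the names `gaussian_rbf`, `fit_exact_rbf`, and `predict_rbf` are all assumptions made for the example.

```python
import numpy as np

def gaussian_rbf(X, centers, spread):
    """Hidden-layer responses in (0, 1]; larger when an input is closer to a center."""
    # Pairwise squared Euclidean distances between inputs and centers.
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * spread ** 2))

def fit_exact_rbf(X, y, spread):
    """Assumed design: one hidden unit per training sample, linear output weights."""
    H = gaussian_rbf(X, X, spread)
    # The output layer is linear, so its weights come from a linear solve.
    weights = np.linalg.lstsq(H, y, rcond=None)[0]
    return X.copy(), weights

def predict_rbf(X_new, centers, weights, spread):
    """Output layer: linear weighted combination of the hidden responses."""
    return gaussian_rbf(X_new, centers, spread) @ weights

# Toy data, purely illustrative.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0.0, 1.0, 4.0, 9.0])
centers, weights = fit_exact_rbf(X, y, spread=1.0)
fitted = predict_rbf(X, centers, weights, spread=1.0)
```

With one center per training point the network reproduces the training targets, and the `spread` argument controls how quickly each hidden response decays away from its center.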

BP (Back Propagation) neural network:

A BP network is a multilayer feed-forward network trained with the error back-propagation algorithm. It can learn and store a large number of input-output pattern mappings without the mathematical equations describing those mappings being known in advance. Its learning rule uses the steepest descent method: back propagation repeatedly adjusts the weights and thresholds of the network to minimize the network's sum of squared errors. A BP neural network is a three-layer feed-forward network comprising an input layer, a hidden layer, and an output layer. The input-layer neurons receive input from the outside and pass it to the neurons of the middle layer. The middle layer is the internal information processing layer responsible for transforming the information; depending on the transformation capability required, it can be designed with a single hidden layer or several hidden layers. The information passed from the last hidden layer to the output-layer neurons is processed further, which completes one forward-propagation pass of learning, and the output layer delivers the result to the outside. When the actual output differs from the expected output, the error back-propagation phase begins. Starting from the output layer, the error is propagated back layer by layer toward the hidden and input layers, and the weights of each layer are corrected by gradient descent on the error. This repeated cycle of forward propagation of information and back propagation of error is the process by which the weights of each layer are continually adjusted, that is, the learning and training process of the neural network. It continues until the network output error falls to an acceptable level or a preset number of training iterations is reached.
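The forward and backward passes described above can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the MATLAB `newff`/`train` routines the patent uses: a single tanh hidden layer, a linear output, half the mean squared error as the loss (a scaled form of the sum of squared errors), and the function name `train_bp` with its default parameters are all chosen for the example.

```python
import numpy as np

def train_bp(X, y, hidden=4, lr=0.05, epochs=2000, tol=1e-6, seed=0):
    """Train a one-hidden-layer network by steepest descent with back propagation."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.5, size=(X.shape[1], hidden))  # input -> hidden weights
    b1 = np.zeros(hidden)                                  # hidden thresholds
    W2 = rng.normal(0.0, 0.5, size=hidden)                 # hidden -> output weights
    b2 = 0.0                                               # output threshold
    losses = []
    for _ in range(epochs):
        # Forward propagation of the input through the hidden and output layers.
        h = np.tanh(X @ W1 + b1)
        out = h @ W2 + b2
        err = out - y
        losses.append(0.5 * np.mean(err ** 2))
        if losses[-1] < tol:  # stop once the error is acceptable
            break
        # Back propagation of the error, output layer first.
        g_out = err / len(y)
        g_W2 = h.T @ g_out
        g_b2 = g_out.sum()
        g_h = np.outer(g_out, W2) * (1.0 - h ** 2)
        g_W1 = X.T @ g_h
        g_b1 = g_h.sum(axis=0)
        # Steepest-descent update of weights and thresholds.
        W1 -= lr * g_W1
        b1 -= lr * g_b1
        W2 -= lr * g_W2
        b2 -= lr * g_b2
    return (W1, b1, W2, b2), losses

# Toy data, purely illustrative.
X = np.linspace(-1.0, 1.0, 20).reshape(-1, 1)
y = X[:, 0] ** 2
params, losses = train_bp(X, y)
```

The loop ends on either of the two stopping conditions named in the text: an acceptable error (`tol`) or a preset number of iterations (`epochs`).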

When the above algorithms are used for price prediction, both their prediction accuracy and their training time are highly uncertain. Some parameters of the MATLAB functions used by the algorithms are user-defined, and the uncertainty in choosing them adds to the uncertainty in training time and prediction accuracy, which severely limits the use of these algorithms for commodity price prediction. To make better use of these algorithms, many improved price prediction methods have been proposed: stock price prediction with k-means clustering based on a BP neural network model; IPO underpricing prediction with an adaptive algorithm based on a BP neural network; an agricultural product price prediction model based on a time-series model with combined BP neural networks; an improved crude oil price prediction based on the wavelet transform and an RBF neural network; nonlinear time-series prediction based on a dynamic RBF neural network; and others. These improved methods are narrowly targeted and lack generality: each applies to only one commodity or one class of commodities, and their fixed parameters make them inflexible, so prediction accuracy cannot be guaranteed across different commodities of the same class. Lacking flexibility and generality, they cannot meet sellers' pressing need for market forecast analysis and sales decisions across different kinds of consumer commodities. What is needed, therefore, is either a prediction method that applies to the prices of different kinds of commodities or of different commodities of the same kind, or a data preprocessing method for the prices of different kinds of commodities, so that prediction methods become more general and more accurate.

Summary of the invention

The purpose of the invention is to combine a method for normalizing the order of magnitude of raw data with improved RBF and BP neural network prediction methods. The improved RBF and BP neural networks are used to calculate the optimal order of magnitude for commodity price data mined from web pages; the price data are preprocessed by normalizing them to that optimal order of magnitude; and the improved RBF and BP neural networks are then used to predict commodity prices. This improves the prediction accuracy of the RBF and BP neural networks and their generality across the prices of different commodities.

The technical solution of the invention preprocesses the data mined from web pages by normalizing the order of magnitude of the raw data. On the magnitude-normalized data sets, the improved RBF and BP neural networks are used to calculate the optimal magnitude of the commodity price data. The price data are then preprocessed by normalizing them to that optimal order of magnitude, and the market price of the commodity is predicted.

To aid understanding of the invention, its theoretical basis is first described as follows:

In neural-network-based price prediction, many improved data preprocessing methods have been proposed, and all have brought clear improvements. These methods are narrowly targeted, however, and neglect the flexibility and generality of the prediction method, which severely limits the improved price prediction methods. Preprocessing that normalizes the order of magnitude of the raw data improves both the generality and the accuracy of a prediction method. For the price data of a single commodity, it reduces the relative fluctuation range of the data, which makes the prediction method more stable and more accurate for that commodity. For the price data of different commodities, it reduces the relative differences between the data of different commodities while also reducing the relative fluctuation range for each commodity, which makes the prediction method more stable, more general, and more accurate. Applying the improved RBF and BP neural networks to the magnitude-normalized price data therefore yields more accurate commodity price predictions.

Specifically, the invention carries out commodity price prediction with raw-data magnitude normalization and the improved RBF and BP neural networks through the following steps:

Step 1. Extract the name, model, type, and price data of the commodities from web pages and build a data set of $h$ commodities, $X=\{A_1, A_2, \ldots, A_h\}$. Let $n$ price data be extracted for the $i$-th commodity, $A_i=\{x_1, x_2, \ldots, x_n\}$, where $i\in[1,h]$ and $x_1, x_2, \ldots, x_n$ are the $n$ price data extracted for commodity $A_i$;

Step 2. Calculate the price magnitude of each of the $h$ commodities to obtain the price magnitudes $M=\{b_1, b_2, \ldots, b_h\}$;

Step 3. Define a prediction sample containing $z$ data and the total number $D$ of prices to be predicted;

Step 4. Select the prediction model;

Step 5. If the selected prediction model is the RBF neural network, perform steps 6 to 12; if it is the BP neural network, perform steps 14 to 21;

Step 6. Set the model training function to the function newrbe(P, T, SPREAD) of the technical computing language MATLAB, which designs an exact radial basis network, where P is the input vector, T is the target vector, and SPREAD is the spread of the radial basis functions. Set the model prediction function to the MATLAB function sim('MODEL', PARAMETERS), which simulates a neural network, where MODEL is the trained network model and PARAMETERS is the input vector. Set $j$ different spread values $Spreads=\{spread_1, spread_2, \ldots, spread_j\}$;

Step 7. Normalize the order of magnitude of the sales prices of commodity $A_i$ to the magnitude $b_i$, obtaining

[Formula image BSA00000773631700031 not reproduced in the source text]

Step 8. Pass the input vector P and the target vector T to the training function newrbe(P, T, SPREAD) and train $j$ different networks, $net_{ij}=\mathrm{newrbe}(P, T, spread_j)$. Build the prediction sample $Test=[t_1, t_2, \ldots, t_z]$,

[Formula image BSA00000773631700041 not reproduced in the source text]

Step 9. Compute the $j$ predicted values of commodity $A_i$ for day $n+1$, $Y_{ij}=\mathrm{sim}(net_{ij}, Test)$, and let the best predicted value for day $n+1$ be $y_i$, $y_i\in Y_{ij}$;

Step 10. Define the coupling weights $W=(w_1, w_2, w_3)$. Let the three spread values that give the best predictions of commodity $A_i$ for day $n+1$ be $Bspread_{i1}\in Spreads$, $Bspread_{i2}\in Spreads$, and $Bspread_{i3}\in Spreads$, and compute the optimal spread value

$$Bspread=\frac{Bspread_{i1}\,w_1+Bspread_{i2}\,w_2+Bspread_{i3}\,w_3}{w_1+w_2+w_3};$$

Step 11. Train the fixed network net = newrbe(P, T, Bspread);

Step 12. Use the best predicted value $y_i$ in the prediction sample for the next prediction, as follows: in the new prediction sample $[t_1, t_2, \ldots, t_z]$, $t_1$ equals $t_2$ of the previous prediction sample, $t_2$ equals $t_3$ of the previous sample, and so on up to $t_{z-1}$, which equals $t_z$ of the previous sample, while $t_z=y_i$. This gives the new prediction sample $Test=[t_1, t_2, \ldots, t_z]$, and the predicted value of the commodity for day $n+2$ is $y_i=\mathrm{sim}(net, Test)$;

Step 13. Repeat step 12 to obtain all predicted values of commodity $A_i$; repeat steps 7 to 12 to obtain the predicted values of all commodities in data set $X$ at the different orders of magnitude, and obtain the best prediction order of magnitude $O$, $O\in M$;

Step 14. Set the model training functions to the MATLAB functions NET = newff(P, T, NEURON) and NET' = train(NET, P, T), where newff() creates a feed-forward BP network, P is the input vector, T is the target vector, NEURON is the number of hidden-layer neurons, train() trains a neural network, and NET is the created feed-forward BP network. The model prediction function is NET'(Test), where Test is the prediction sample. Set $j$ different values for the number of hidden-layer neurons, $Neurons=\{neuron_1, neuron_2, \ldots, neuron_j\}$;

Step 15. Normalize the order of magnitude of the sales prices of commodity $A_i$ to the magnitude $b_i$, obtaining

[Formula image BSA00000773631700043 not reproduced in the source text]

Step 16. Pass the input vector P and the target vector T to the training functions NET = newff(P, T, NEURON) and NET' = train(NET, P, T) and train $j$ different networks, $net_{ij}=\mathrm{newff}(P, T, neuron_j)$, $net_{ij}=\mathrm{train}(net_{ij}, P, T)$. Build the prediction sample $Test=[t_1, t_2, \ldots, t_z]$,

[Formula image BSA00000773631700044 not reproduced in the source text]

Step 17. Compute the $j$ predicted values of commodity $A_i$ for day $n+1$, $Y_{ij}=net_{ij}(Test)$, and let the best predicted value for day $n+1$ be $y_i$, $y_i\in Y_{ij}$;

Step 18. Define the coupling weights $W=(w_1, w_2, w_3)$. Let the three hidden-layer neuron counts that give the best predictions of commodity $A_i$ for day $n+1$ be $Bneuron_{i1}\in Neurons$, $Bneuron_{i2}\in Neurons$, and $Bneuron_{i3}\in Neurons$, and compute the optimal number of hidden-layer neurons

$$Bneuron=\frac{Bneuron_{i1}\,w_1+Bneuron_{i2}\,w_2+Bneuron_{i3}\,w_3}{w_1+w_2+w_3};$$

Step 19. Train the fixed network net = newff(P, T, Bneuron), net = train(net, P, T);

Step 20. Use the best predicted value $y_i$ in the prediction sample for the next prediction, as follows: in the new prediction sample $[t_1, t_2, \ldots, t_z]$, $t_1$ equals $t_2$ of the previous prediction sample, $t_2$ equals $t_3$ of the previous sample, and so on up to $t_{z-1}$, which equals $t_z$ of the previous sample, while $t_z=y_i$. This gives the new prediction sample $Test=[t_1, t_2, \ldots, t_z]$, and the predicted value of the commodity for day $n+2$ is $y_i=net(Test)$;

Step 21. Repeat step 20 to obtain all predicted values of commodity $A_i$; repeat steps 15 to 20 to obtain the predicted values of all commodities in data set $X$ at the different orders of magnitude, and obtain the best prediction order of magnitude $O$, $O\in M$.

The extraction of the name, model, type, and price data in step 1 means using any Web data extraction algorithm to extract the name, model, type, and price data that a web page displays for a commodity. Here $x_1, x_2, \ldots, x_n$ may be $n$ price data extracted for the $i$-th commodity $A_i$ from a single web page, or $n$ average price data extracted from multiple web pages.

Step 2 calculates, from the price data of any commodity, the order of magnitude of that commodity's price data.
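The text does not spell out how the magnitude is computed, so the following Python sketch is only one plausible reading. It assumes the magnitude is the power of ten at or just below the mean price; the function name `price_magnitude` and the sample prices are hypothetical.

```python
import math

def price_magnitude(prices):
    """Assumed definition: the power of ten at or just below the mean price."""
    mean_price = sum(prices) / len(prices)
    if mean_price <= 0:
        raise ValueError("prices must have a positive mean")
    return 10 ** math.floor(math.log10(mean_price))

# Hypothetical price series for two commodities.
phone_magnitude = price_magnitude([2599.0, 2610.0, 2588.0])  # thousands
book_magnitude = price_magnitude([35.5, 36.0, 34.9])         # tens
```

Other definitions (for example, one based on the median or the maximum price) would fit the description equally well.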

Steps 3 to 5 set the parameters and select the prediction model for predicting the price of any commodity; $z$ is typically 3, 5, or 7, and $D$ is typically 3 or 7.
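The roles of $z$ and $D$ can be illustrated with a short Python sketch of the rolling prediction used in steps 12 and 20: a window of $z$ values is fed to the model, the prediction is appended, and the oldest value is dropped, $D$ times in total. The function `rolling_forecast`, the use of the last $z$ observed prices as the first prediction sample, and the toy predictor are assumptions for illustration; any trained model could be passed as `predict`.

```python
from typing import Callable, List, Sequence

def rolling_forecast(history: Sequence[float],
                     predict: Callable[[List[float]], float],
                     z: int, D: int) -> List[float]:
    """Predict D future values, feeding each prediction back into a window of z values."""
    if len(history) < z:
        raise ValueError("history must contain at least z values")
    window = list(history[-z:])  # assumed initial prediction sample [t_1, ..., t_z]
    forecasts = []
    for _ in range(D):
        y = predict(window)
        forecasts.append(y)
        # Shift the window: new t_k = old t_(k+1), and new t_z = y.
        window = window[1:] + [y]
    return forecasts

# Toy predictor (linear extrapolation) standing in for a trained network.
def extrapolate(window: List[float]) -> float:
    return window[-1] + (window[-1] - window[-2])

forecasts = rolling_forecast([1.0, 2.0, 3.0, 4.0, 5.0], extrapolate, z=3, D=3)
```

Because each prediction becomes an input to the next one, errors can accumulate as $D$ grows, which may be one reason small values of $D$ are typical.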

The technical computing language MATLAB in steps 6 and 14 is a product of MathWorks; the version used is R2011b.

Steps 6 to 12 produce the predictions of the improved RBF neural network for any commodity, using either its price data on different dates from a single web page or its average price data on different dates from multiple web pages.

Steps 14 to 20 produce the predictions of the improved BP neural network for any commodity, using either its price data on different dates from a single web page or its average price data on different dates from multiple web pages.

In steps 6, 8, 14, and 16, the input vector P is the training sample set and the target vector T is the data set of expected values used to test the predictions during training.
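The formulas that define P and T are images missing from this text, so the following Python sketch shows only a common sliding-window construction that is consistent with a prediction sample of $z$ values. It is an assumption, not the patent's definition: each row of `P` holds $z$ consecutive prices and the matching entry of `T` is the price that follows them. The name `make_training_set` is hypothetical, and MATLAB would store samples as columns rather than rows.

```python
import numpy as np

def make_training_set(prices, z):
    """Build sliding-window inputs P (one row per sample) and next-value targets T."""
    prices = np.asarray(prices, dtype=float)
    if len(prices) <= z:
        raise ValueError("need more than z prices to build a training set")
    # Row k is [x_(k+1), ..., x_(k+z)]; its target is x_(k+z+1).
    P = np.array([prices[k:k + z] for k in range(len(prices) - z)])
    T = prices[z:]
    return P, T

# Toy series, purely illustrative.
P, T = make_training_set([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], z=3)
```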

The preset value of $j$ is typically 40 in step 6 and typically 10 in step 14.

Steps 7 and 15 normalize the order of magnitude of any commodity's price data to a unified magnitude. If the order of magnitude of a commodity's price data equals the normalization magnitude, no magnitude normalization is applied to that commodity's data; if it differs, the data are preprocessed by magnitude normalization. The magnitude is typically 1, 10, 100, or 1000.
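The rule above can be sketched in Python as follows. The normalization formula itself is an image missing from this text, so the scaling used here (multiplying by the ratio of the target magnitude to the commodity's own magnitude) is an assumption, as are the function names `magnitude_of`, `normalize_to_magnitude`, and `restore`, and the magnitude definition based on the mean price.

```python
import math

def magnitude_of(prices):
    """Assumed definition: the power of ten at or just below the mean price."""
    mean_price = sum(prices) / len(prices)
    return 10 ** math.floor(math.log10(mean_price))

def normalize_to_magnitude(prices, target):
    """Scale prices to the target magnitude; leave them untouched if already there."""
    own = magnitude_of(prices)
    if own == target:
        return list(prices), 1.0  # same magnitude: no preprocessing
    factor = target / own
    return [p * factor for p in prices], factor

def restore(values, factor):
    """Map predictions made on normalized data back to the original price scale."""
    return [v / factor for v in values]

# Hypothetical prices in the thousands, normalized to magnitude 10.
scaled, factor = normalize_to_magnitude([2599.0, 2610.0], target=10)
unchanged, unit = normalize_to_magnitude([25.0, 26.0], target=10)
```

Keeping the scaling factor lets predictions made on the normalized series be converted back to real prices with `restore`.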

The coupling weights defined in steps 10 and 18 are $W=[2, 4, 2]$.
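As a worked illustration of the weighted average in steps 10 and 18, the Python sketch below combines three best parameter values with these weights. The function name `coupled_value` and the sample values are hypothetical; the text does not say which of the three best values receives the middle weight, nor whether a fractional neuron count is rounded, so neither is assumed here.

```python
def coupled_value(best_three, weights=(2, 4, 2)):
    """Weighted average of the three best parameter values (spreads or neuron counts)."""
    if len(best_three) != len(weights):
        raise ValueError("expected one weight per value")
    return sum(v * w for v, w in zip(best_three, weights)) / sum(weights)

# Hypothetical best spread values: (1*2 + 2*4 + 3*2) / 8
bspread = coupled_value([1.0, 2.0, 3.0])
# Hypothetical best neuron counts: (4*2 + 6*4 + 8*2) / 8
bneuron = coupled_value([4, 6, 8])
```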

Compared with the data preprocessing methods used in existing price prediction approaches, the invention takes price data of commodities mined from web pages, uses the improved RBF and BP neural networks to calculate the optimal order of magnitude of the raw price data, and applies that magnitude to normalize the raw price data uniformly. For a given commodity, this magnitude-normalization preprocessing reduces the fluctuation range of its price data; across different commodities, it reduces the differences between their price data. It thereby overcomes the limitations that the data preprocessing of existing price prediction methods imposes when they are applied to different commodities, and it improves both the generality and the accuracy of the prediction method.

Description of drawings

Fig. 1 is a flowchart of a specific embodiment of the present invention.

Detailed description of embodiments

The technical solution of the present invention is described in detail below with reference to the accompanying drawing:

As shown in Fig. 1, the embodiment of the present invention proceeds according to the following steps:

Step 1. Extract the name, model, type, and price data of commodities from web pages and build a data set of h commodities, $X = \{A_1, A_2, \ldots, A_h\}$. Let n prices be extracted for the i-th commodity, $A_i = \{x_1, x_2, \ldots, x_n\}$, where $i \in [1, h]$ and $x_1, x_2, \ldots, x_n$ are the n prices extracted for commodity $A_i$;

Step 2. Compute the price order of magnitude of each commodity, obtaining the price magnitudes $M = \{b_1, b_2, \ldots, b_h\}$;

Step 3. Define a prediction sample containing z data values, and the total number D of prices to be predicted;

Step 4. Select the prediction model;

Step 5. If the selected prediction model is the RBF neural network, perform steps 6 to 12; if the selected prediction model is the BP neural network, perform steps 14 to 21;

Step 6. Set the model training function to newrbe(P, T, SPREAD) from the technical computing language MATLAB, which designs an exact radial basis network; P is the input vector, T is the target vector, and SPREAD is the spread of the radial basis functions. Set the model prediction function to MATLAB's sim('MODEL', PARAMETERS), which simulates a neural network; MODEL is the trained network model and PARAMETERS is the input vector. Define j different spread values, $Spreads = \{spread_1, spread_2, \ldots, spread_j\}$;

Step 7. Normalize the order of magnitude of the sales prices of commodity $A_i$ to the magnitude $b_i$, obtaining (formula image BSA00000773631700061);

Step 8. Pass the input vector P and the target vector T to the training function newrbe(P, T, SPREAD) and train j different networks, $net_{ij}$ = newrbe(P, T, $spread_j$); build the prediction sample $Test = [t_1, t_2, \ldots, t_z]$, (formula image BSA00000773631700062);

Step 9. The j predicted values of commodity $A_i$ for day $n+1$ are $Y_{ij}$ = sim($net_{ij}$, Test); let the best predicted value of commodity $A_i$ for day $n+1$ be $y_i$, $y_i \in Y_{ij}$;

Step 10. Define the coupling weights $W = (w_1, w_2, w_3)$. Let the three spread values that give the best predictions of commodity $A_i$ for day $n+1$ be $Bspread_{i1}, Bspread_{i2}, Bspread_{i3} \in Spreads$; the optimal spread is then

$$Bspread = \frac{Bspread_{i1} w_1 + Bspread_{i2} w_2 + Bspread_{i3} w_3}{w_1 + w_2 + w_3};$$

Step 11. Train the fixed network net = newrbe(P, T, Bspread);

Step 12. Feed the best predicted value $y_i$ back into the prediction sample for the next prediction: in the new prediction sample $[t_1, t_2, \ldots, t_z]$, $t_1$ takes the previous sample's $t_2$, $t_2$ takes the previous sample's $t_3$, and so on up to $t_{z-1}$, which takes the previous sample's $t_z$, while $t_z = y_i$. This gives the new prediction sample $Test = [t_1, t_2, \ldots, t_z]$, and the predicted value of the commodity for day $n+2$ is $y_i$ = sim(net, Test);

Step 13. Repeat step 12 to obtain all predicted values of commodity $A_i$; repeat steps 7 to 12 to obtain the predicted values of all commodities in data set X at the different orders of magnitude, and obtain the best prediction order of magnitude O, $O \in M$;

Step 14. Set the model training functions to NET = newff(P, T, NEURON) and NET' = train(NET, P, T) from the technical computing language MATLAB. newff() creates a feed-forward BP network, where P is the input vector, T is the target vector, and NEURON is the number of hidden-layer neurons; train() trains a neural network, where NET is the created feed-forward BP network. The model prediction function is NET'(Test), where Test is the prediction sample. Define j different hidden-layer neuron counts, $Neurons = \{neuron_1, neuron_2, \ldots, neuron_j\}$;

Step 15. Normalize the order of magnitude of the sales prices of commodity $A_i$ to the magnitude $b_i$, obtaining (formula image BSA00000773631700064);

Step 16. Pass the input vector P and the target vector T to the training functions NET = newff(P, T, NEURON) and NET' = train(NET, P, T) and train j different networks, $net_{ij}$ = newff(P, T, $neuron_j$), $net_{ij}$ = train($net_{ij}$, P, T); build the prediction sample $Test = [t_1, t_2, \ldots, t_z]$, (formula image BSA00000773631700071);

Step 17. The j predicted values of commodity $A_i$ for day $n+1$ are $Y_{ij} = net_{ij}(Test)$; let the best predicted value of commodity $A_i$ for day $n+1$ be $y_i$, $y_i \in Y_{ij}$;

Step 18. Define the coupling weights $W = (w_1, w_2, w_3)$. Let the three hidden-layer neuron counts that give the best predictions of commodity $A_i$ for day $n+1$ be $Bneuron_{i1}, Bneuron_{i2}, Bneuron_{i3} \in Neurons$; the optimal number of hidden-layer neurons is then

$$Bneuron = \frac{Bneuron_{i1} w_1 + Bneuron_{i2} w_2 + Bneuron_{i3} w_3}{w_1 + w_2 + w_3};$$

Step 19. Train the fixed network net = newff(P, T, Bneuron), net = train(net, P, T);

Step 20. Feed the best predicted value $y_i$ back into the prediction sample for the next prediction: in the new prediction sample $[t_1, t_2, \ldots, t_z]$, $t_1$ takes the previous sample's $t_2$, $t_2$ takes the previous sample's $t_3$, and so on up to $t_{z-1}$, which takes the previous sample's $t_z$, while $t_z = y_i$. This gives the new prediction sample $Test = [t_1, t_2, \ldots, t_z]$, and the predicted value of the commodity for day $n+2$ is $y_i$ = net(Test);

Step 21. Repeat step 20 to obtain all predicted values of commodity $A_i$; repeat steps 15 to 20 to obtain the predicted values of all commodities in data set X at the different orders of magnitude, and obtain the best prediction order of magnitude O, $O \in M$.
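
The window update in steps 12 and 20 can be sketched in a few lines of Python. This is only an illustration, not the patent's MATLAB implementation: the names `roll_window`, `forecast`, and `mean_predictor` are hypothetical, and `mean_predictor` merely stands in for the trained network call (sim(net, Test) or net(Test)).

```python
from typing import Callable, List, Sequence


def roll_window(window: Sequence[float], prediction: float) -> List[float]:
    """Drop the oldest value t_1 and append the newest prediction as t_z."""
    return list(window[1:]) + [prediction]


def forecast(window: Sequence[float],
             predict: Callable[[Sequence[float]], float],
             horizon: int) -> List[float]:
    """Produce `horizon` successive predictions, feeding each one back in."""
    current = list(window)
    results: List[float] = []
    for _ in range(horizon):
        y = predict(current)              # stands in for sim(net, Test)
        results.append(y)
        current = roll_window(current, y)  # new Test sample for the next day
    return results


def mean_predictor(sample: Sequence[float]) -> float:
    """Hypothetical placeholder model: the mean of the window."""
    return sum(sample) / len(sample)
```

With a window of z values and `horizon` set to D, `forecast` returns the D predicted prices that step 13 or step 21 collects for one commodity.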

Extracting the name, model, type, and price data of commodities from web pages in step 1 means using any Web data extraction algorithm to extract the name, model, type, and price data that a web page displays for a commodity. The values $x_1, x_2, \ldots, x_n$ may be n prices of the i-th commodity $A_i$ extracted from a single web page, or n average prices computed from multiple web pages.

Step 2 computes, for any commodity, the order of magnitude of that commodity's price data.
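
The text does not say how the order of magnitude in step 2 is computed. One plausible reading, offered purely as an assumption, is the power of ten just below the mean price. The function name `price_magnitude` is hypothetical.

```python
import math
from typing import Sequence


def price_magnitude(prices: Sequence[float]) -> float:
    """Return 10**k, where k is the integer part of log10 of the mean price.

    Assumption: the mean price is what determines the magnitude; the source
    only states that typical magnitudes are 1, 10, 100, and 1000.
    """
    if not prices or any(p <= 0 for p in prices):
        raise ValueError("prices must be a non-empty sequence of positive values")
    mean_price = sum(prices) / len(prices)
    return 10.0 ** math.floor(math.log10(mean_price))
```

Under this reading, a series quoted around 6.5 has magnitude 1, one around 83 has magnitude 10, and one around 650 has magnitude 100.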

Steps 3 to 5 set the parameters and select the prediction model for predicting the price of any one commodity; z is typically 3, 5, or 7, and D is typically 3 or 7.

The technical computing language MATLAB referred to in steps 6 and 14 is a product of MathWorks; the version used is R2011b.

Steps 6 to 12 produce, with the improved RBF neural network, the predicted values for any one commodity, either from its prices on different dates on a single web page or from its average prices on different dates across multiple web pages.

Steps 14 to 20 produce, with the improved BP neural network, the predicted values for any one commodity, either from its prices on different dates on a single web page or from its average prices on different dates across multiple web pages.

In steps 6, 8, 14, and 16, the input vector P is the training sample set, and the target vector T is the data set of values against which the training predictions are tested.
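
The formula that defines P, T, and Test survives only as an image placeholder, so the construction below is an assumption: a standard sliding window in which each run of z consecutive prices is one training input and the price that follows it is the target. The names `make_training_set` and `last_window` are hypothetical.

```python
from typing import List, Sequence, Tuple


def make_training_set(prices: Sequence[float],
                      z: int) -> Tuple[List[List[float]], List[float]]:
    """Build sliding-window inputs (P) and next-value targets (T)."""
    if z < 1 or len(prices) <= z:
        raise ValueError("need more than z prices and z >= 1")
    inputs = [list(prices[k:k + z]) for k in range(len(prices) - z)]
    targets = [prices[k + z] for k in range(len(prices) - z)]
    return inputs, targets


def last_window(prices: Sequence[float], z: int) -> List[float]:
    """The most recent z prices, used as the first prediction sample Test."""
    return list(prices[-z:])
```

Each inner list here is one sample; a MATLAB toolbox call would expect the samples arranged as columns of a matrix instead.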

The preset value of j is typically 40 in step 6 and typically 10 in step 14.
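
Steps 9 and 17 evaluate j candidate settings and steps 10 and 18 keep the three that predict best. The source does not state the ranking criterion, so the sketch below assumes the candidates are ranked by absolute error against a known actual price; `three_best` is a hypothetical name.

```python
from typing import List, Sequence


def three_best(candidates: Sequence[float],
               predictions: Sequence[float],
               actual: float) -> List[float]:
    """Return the three candidate settings whose predictions are closest
    to the actual value, best first (assumed criterion: absolute error)."""
    if len(candidates) != len(predictions) or len(candidates) < 3:
        raise ValueError("need at least three candidates, one prediction each")
    ranked = sorted(zip(candidates, predictions),
                    key=lambda pair: abs(pair[1] - actual))
    return [candidate for candidate, _ in ranked[:3]]
```

How the three survivors are paired with $w_1$, $w_2$, and $w_3$ is likewise not specified; returning them best-first is just one convenient convention.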

In steps 7 and 15, the price data of each commodity are normalized to one unified order of magnitude. If a commodity's price data already have the target order of magnitude, no magnitude normalization is applied to them; if their order of magnitude differs from the target, they are rescaled to it. The order of magnitude is typically 1, 10, 100, or 1000.
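
A minimal sketch of this rule, assuming that rescaling means multiplying every price by the ratio of the target magnitude to the commodity's own magnitude (the exact formula is an image that did not survive in the text). The name `normalize_magnitude` is hypothetical.

```python
from typing import List, Sequence


def normalize_magnitude(prices: Sequence[float],
                        current: float,
                        target: float) -> List[float]:
    """Rescale prices from magnitude `current` to magnitude `target`.

    Prices already at the target magnitude are returned unchanged,
    matching the rule that no preprocessing is applied in that case.
    """
    if current <= 0 or target <= 0:
        raise ValueError("magnitudes must be positive")
    if current == target:
        return list(prices)
    return [x * target / current for x in prices]
```

Predictions made on rescaled data would need the inverse scaling before being compared with raw prices.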

The coupling weights defined in steps 10 and 18 are w = [2, 4, 2].
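
The coupling in steps 10 and 18 is a weighted average of the three best candidate values. The following sketch evaluates it with the stated weights; the function name `couple` is hypothetical.

```python
from typing import Sequence


def couple(values: Sequence[float],
           weights: Sequence[float] = (2, 4, 2)) -> float:
    """Weighted average of the three best candidate values:
    (v1*w1 + v2*w2 + v3*w3) / (w1 + w2 + w3)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)
```

For example, candidate spreads 0.5, 1.0, and 1.5 couple to (1 + 4 + 3) / 8 = 1.0. For hidden-layer neuron counts the result would presumably have to be rounded to an integer; the source does not say how.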

To demonstrate the effectiveness of the method, daily average prices of 8 different RMB exchange rates from January 1, 2011 to December 31, 2011, extracted from web pages, were used as raw data. The orders of magnitude of the raw data were computed to be 1, 10, and 100, and magnitude-normalization experiments were run on them.

With the improved RBF neural network, the mean errors were as follows.
Without magnitude normalization (raw data): Australian dollar 3.9%, Hong Kong dollar 11.82%, Canadian dollar 640.84%, US dollar 21571.04%, euro 1.66%, Japanese yen 1.15%, Swiss franc 2.77%, Singapore dollar 28959.17%; overall mean error 6399.04%.
Normalized to magnitude 100: Australian dollar 3.9%, Hong Kong dollar 1.97%, Canadian dollar 640.84%, US dollar 21571.04%, euro 1.66%, Japanese yen 1233177%, Swiss franc 2.77%, Singapore dollar 28959.17%; overall mean error 160544.8%.
Normalized to magnitude 10: Australian dollar 1.77%, Hong Kong dollar 0.94%, Canadian dollar 0.39%, US dollar 0.11%, euro 224438.5%, Japanese yen 1.05%, Swiss franc 1.57%, Singapore dollar 0.50%; overall mean error 28055.6%.
Normalized to magnitude 1: Australian dollar 0.98%, Hong Kong dollar 0.94%, Canadian dollar 0.41%, US dollar 0.09%, euro 1.12%, Japanese yen 1.29%, Swiss franc 1.91%, Singapore dollar 0.16%; overall mean error 0.86%.
Normalizing the data to magnitude 1 gave the best predictions, with a mean prediction accuracy of 99.14%.
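
The reported figures are consistent with the overall error being the plain mean of the per-currency errors and the accuracy being 100% minus that mean; this relationship is inferred from the numbers, not stated in the text. A short check using the magnitude-1 RBF results (the name `summarize` is hypothetical):

```python
from typing import Dict, Tuple


def summarize(errors_pct: Dict[str, float]) -> Tuple[float, float]:
    """Return (mean error %, accuracy %), taking accuracy as 100 - mean error."""
    mean_error = sum(errors_pct.values()) / len(errors_pct)
    return round(mean_error, 2), round(100.0 - mean_error, 2)


# Per-currency mean errors reported for the RBF network at magnitude 1.
rbf_magnitude_1 = {
    "AUD": 0.98, "HKD": 0.94, "CAD": 0.41, "USD": 0.09,
    "EUR": 1.12, "JPY": 1.29, "CHF": 1.91, "SGD": 0.16,
}
mean_error, accuracy = summarize(rbf_magnitude_1)
```

Here `mean_error` and `accuracy` reproduce the 0.86% and 99.14% quoted for this experiment.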

With the improved BP neural network, the mean errors were as follows.
Without magnitude normalization (raw data): Australian dollar 1.31%, Hong Kong dollar 0.17%, Canadian dollar 0.28%, US dollar 0.26%, euro 1.41%, Japanese yen 1.24%, Swiss franc 1.65%, Singapore dollar 0.21%; overall mean error 0.82%.
Normalized to magnitude 100: Australian dollar 1.31%, Hong Kong dollar 0.24%, Canadian dollar 0.46%, US dollar 0.03%, euro 2.21%, Japanese yen 1.14%, Swiss franc 1.59%, Singapore dollar 2.98%; overall mean error 1.25%.
Normalized to magnitude 10: Australian dollar 1.09%, Hong Kong dollar 0.28%, Canadian dollar 0.13%, US dollar 0.48%, euro 1.38%, Japanese yen 2.54%, Swiss franc 1.93%, Singapore dollar 0.06%; overall mean error 0.99%.
Normalized to magnitude 1: Australian dollar 0.39%, Hong Kong dollar 0.18%, Canadian dollar 0.37%, US dollar 0.40%, euro 1.43%, Japanese yen 1.18%, Swiss franc 1.74%, Singapore dollar 0.28%; overall mean error 0.75%.
Normalizing the data to magnitude 1 gave the best predictions, with a mean prediction accuracy of 99.25%.

The experiments above show that the preprocessing method generalizes across different commodities of the same kind. To show that it also generalizes across different kinds of commodities, weekly average prices of 10 different agricultural products over 59 weeks, from January 2011 to February 2012, extracted from web pages, were used as raw data. The orders of magnitude of the raw data were computed to be 1 and 10, and magnitude-normalization experiments were run on them.

With the improved RBF neural network, the mean errors were as follows.
Without magnitude normalization (raw data): beef 3149934%, soybean oil 17.96%, eggs 1.61%, peanut oil 2.89%, flour 0.11%, pork 542574.4%, rice 0.34%, white sugar 0.44%, blended oil 6.61%, mutton 325260%; overall mean error 401779.9%.
Normalized to magnitude 10: beef 3149934%, soybean oil 17.96%, eggs 1.61%, peanut oil 2.89%, flour 0.12%, pork 542574.4%, rice 0.34%, white sugar 0.44%, blended oil 6.61%, mutton 325260%; overall mean error 401779.9%.
Normalized to magnitude 1: beef 2.44%, soybean oil 17.96%, eggs 1.61%, peanut oil 0.91%, flour 0.11%, pork 7.35%, rice 0.34%, white sugar 0.44%, blended oil 0.13%, mutton 0.41%; overall mean error 3.17%.
Normalizing the data to magnitude 1 gave the best predictions, with a mean prediction accuracy of 96.83%.

The invention can be combined with a computer system to predict commodity prices automatically.

The invention proposes a neural-network-based data preprocessing method for multi-variety commodity price prediction and applies it to price data such as RMB exchange rates and agricultural product prices. The improved RBF neural network and BP neural network then predict commodity prices from the preprocessed data, which improves the generality of the prediction method, yields higher prediction accuracy, and gives the method high practical value.

The proposed data preprocessing method can be used not only for price prediction of RMB exchange rates and in the production and sale of agricultural products, but also for price prediction of other consumer commodities.

Claims (11)

1. A neural-network-based data preprocessing method for multi-variety commodity price prediction, characterized in that: an improved RBF neural network and BP neural network are used to compute the optimal order of magnitude of commodity price data mined from web pages, and the commodity price data are normalized to the computed optimal order of magnitude, thereby improving the prediction accuracy of the RBF neural network and the BP neural network and also their generality for price prediction of different commodities; the method specifically comprises the following steps:
Step 1. Extract the name, model, type, and price data of commodities from web pages and build a data set of h commodities, $X = \{A_1, A_2, \ldots, A_h\}$. Let n prices be extracted for the i-th commodity, $A_i = \{x_1, x_2, \ldots, x_n\}$, where $i \in [1, h]$ and $x_1, x_2, \ldots, x_n$ are the n prices extracted for commodity $A_i$;
Step 2. Compute the price order of magnitude of each commodity, obtaining the price magnitudes $M = \{b_1, b_2, \ldots, b_h\}$;
Step 3. Define a prediction sample containing z data values, and the total number D of prices to be predicted;
Step 4. Select the prediction model;
Step 5. If the selected prediction model is the RBF neural network, perform steps 6 to 12; if the selected prediction model is the BP neural network, perform steps 14 to 21;
Step 6. Set the model training function to newrbe(P, T, SPREAD) from the technical computing language MATLAB, which designs an exact radial basis network; P is the input vector, T is the target vector, and SPREAD is the spread of the radial basis functions. Set the model prediction function to MATLAB's sim('MODEL', PARAMETERS), which simulates a neural network; MODEL is the trained network model and PARAMETERS is the input vector. Define j different spread values, $Spreads = \{spread_1, spread_2, \ldots, spread_j\}$;
Step 7. Normalize the order of magnitude of the sales prices of commodity $A_i$ to the magnitude $b_i$, obtaining (formula image FSA00000773631600011);
Step 8. Pass the input vector P and the target vector T to the training function newrbe(P, T, SPREAD) and train j different networks, $net_{ij}$ = newrbe(P, T, $spread_j$); build the prediction sample $Test = [t_1, t_2, \ldots, t_z]$, (formula image FSA00000773631600012);
Step 9. The j predicted values of commodity $A_i$ for day $n+1$ are $Y_{ij}$ = sim($net_{ij}$, Test); let the best predicted value of commodity $A_i$ for day $n+1$ be $y_i$, $y_i \in Y_{ij}$;
Step 10. Define the coupling weights $W = (w_1, w_2, w_3)$. Let the three spread values that give the best predictions of commodity $A_i$ for day $n+1$ be $Bspread_{i1}, Bspread_{i2}, Bspread_{i3} \in Spreads$; the optimal spread is then
$$Bspread = \frac{Bspread_{i1} w_1 + Bspread_{i2} w_2 + Bspread_{i3} w_3}{w_1 + w_2 + w_3};$$
Step 11. Train the fixed network net = newrbe(P, T, Bspread);
Step 12. Feed the best predicted value $y_i$ back into the prediction sample for the next prediction: in the new prediction sample $[t_1, t_2, \ldots, t_z]$, $t_1$ takes the previous sample's $t_2$, $t_2$ takes the previous sample's $t_3$, and so on up to $t_{z-1}$, which takes the previous sample's $t_z$, while $t_z = y_i$. This gives the new prediction sample $Test = [t_1, t_2, \ldots, t_z]$, and the predicted value of the commodity for day $n+2$ is $y_i$ = sim(net, Test);
Step 13. Repeat step 12 to obtain all predicted values of commodity $A_i$; repeat steps 7 to 12 to obtain the predicted values of all commodities in data set X at the different orders of magnitude, and obtain the best prediction order of magnitude O, $O \in M$;
Step 14. Set the model training functions to NET = newff(P, T, NEURON) and NET' = train(NET, P, T) from the technical computing language MATLAB. newff() creates a feed-forward BP network, where P is the input vector, T is the target vector, and NEURON is the number of hidden-layer neurons; train() trains a neural network, where NET is the created feed-forward BP network. The model prediction function is NET'(Test), where Test is the prediction sample. Define j different hidden-layer neuron counts, $Neurons = \{neuron_1, neuron_2, \ldots, neuron_j\}$;
Step 15. Normalize the order of magnitude of the sales prices of commodity $A_i$ to the magnitude $b_i$, obtaining (formula image FSA00000773631600021);
Step 16. Pass the input vector P and the target vector T to the training functions NET = newff(P, T, NEURON) and NET' = train(NET, P, T) and train j different networks, $net_{ij}$ = newff(P, T, $neuron_j$), $net_{ij}$ = train($net_{ij}$, P, T); build the prediction sample $Test = [t_1, t_2, \ldots, t_z]$, (formula image FSA00000773631600022);
Step 17. The j predicted values of commodity $A_i$ for day $n+1$ are $Y_{ij} = net_{ij}(Test)$; let the best predicted value of commodity $A_i$ for day $n+1$ be $y_i$, $y_i \in Y_{ij}$;
Step 18. Define the coupling weights $W = (w_1, w_2, w_3)$. Let the three hidden-layer neuron counts that give the best predictions of commodity $A_i$ for day $n+1$ be $Bneuron_{i1}, Bneuron_{i2}, Bneuron_{i3} \in Neurons$; the optimal number of hidden-layer neurons is then
$$Bneuron = \frac{Bneuron_{i1} w_1 + Bneuron_{i2} w_2 + Bneuron_{i3} w_3}{w_1 + w_2 + w_3};$$
Step 19. Train the fixed network net = newff(P, T, Bneuron), net = train(net, P, T);
Step 20. Feed the best predicted value $y_i$ back into the prediction sample for the next prediction: in the new prediction sample $[t_1, t_2, \ldots, t_z]$, $t_1$ takes the previous sample's $t_2$, $t_2$ takes the previous sample's $t_3$, and so on up to $t_{z-1}$, which takes the previous sample's $t_z$, while $t_z = y_i$. This gives the new prediction sample $Test = [t_1, t_2, \ldots, t_z]$, and the predicted value of the commodity for day $n+2$ is $y_i$ = net(Test);
Step 21. Repeat step 20 to obtain all predicted values of commodity $A_i$; repeat steps 15 to 20 to obtain the predicted values of all commodities in data set X at the different orders of magnitude, and obtain the best prediction order of magnitude O, $O \in M$.
2. The neural-network-based data preprocessing method for multi-variety commodity price prediction according to claim 1, characterized in that: extracting the name, model, type and price data of commodities from web pages in step 1 means using any Web data extraction algorithm to extract the name, model, type and price data of a commodity as displayed on a web page; wherein x_1, x_2, ..., x_n may be n price data items of the i-th commodity A_i extracted from a single web page, or n average price data items extracted from multiple web pages.
3. The neural-network-based data preprocessing method for multi-variety commodity price prediction according to claim 1, characterized in that: step 2 computes, from the price data of any commodity, the order of magnitude of that commodity's price data.
4. The neural-network-based data preprocessing method for multi-variety commodity price prediction according to claim 1, characterized in that: steps 3 to 5 set the parameters and select the prediction model for the price prediction of any one commodity, wherein the value of z is generally 3, 5 or 7, and the value of D is generally 3 or 7.
5. The neural-network-based data preprocessing method for multi-variety commodity price prediction according to claim 1, characterized in that: the technical computing language MATLAB used in steps 6 and 14 is a product of MathWorks, version R2011b.
6. The neural-network-based data preprocessing method for multi-variety commodity price prediction according to claim 1, characterized in that: steps 6 to 12 obtain, for any one commodity, the predicted values under the improved RBF neural network of its price data on different dates from one web page, or of its average price data on different dates from multiple web pages.
7. The neural-network-based data preprocessing method for multi-variety commodity price prediction according to claim 1, characterized in that: steps 14 to 20 obtain, for any one commodity, the predicted values under the improved BP neural network of its price data on different dates from one web page, or of its average price data on different dates from multiple web pages.
8. The neural-network-based data processing method for multi-variety commodity price prediction according to claim 1, characterized in that: the input vector P in steps 6, 8, 14 and 16 is the training sample set, and the target vector T is the data set used to train and test the predicted values.
9. The neural-network-based data processing method for multi-variety commodity price prediction according to claim 1, characterized in that: the preset value of j in step 6 is generally 40, and the preset value of j in step 14 is generally 10.
10. The neural-network-based data processing method for multi-variety commodity price prediction according to claim 1, characterized in that: steps 7 and 15 normalize the order of magnitude of any commodity's price data to a unified magnitude; if the order of magnitude of the commodity's price data is the same as the normalization magnitude, no magnitude normalization preprocessing is applied to it; if the order of magnitude of the commodity's price data differs from the normalization magnitude, magnitude normalization preprocessing is applied to it; the magnitude is generally 1, 10, 100 or 1000.
11. The neural-network-based data preprocessing method for multi-variety commodity price prediction according to claim 1, characterized in that: the coupling weight defined in steps 10 and 18 is w = [2, 4, 2].
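The magnitude handling described in claims 3 and 10 can be illustrated with a minimal Python sketch. This is not the patent's implementation: the function names `order_of_magnitude` and `normalize_to_magnitude` are hypothetical, and both the definition of magnitude as the power of ten of the mean price and the linear rescaling are assumptions. The claims state only that price data whose magnitude matches the unified magnitude is left unchanged and that data with a different magnitude is normalized.

```python
import math
from statistics import mean


def order_of_magnitude(prices):
    """Return the power of ten of the mean price (assumed definition, not from the patent)."""
    avg = mean(prices)
    if avg <= 0:
        raise ValueError("prices must have a positive mean")
    return 10 ** math.floor(math.log10(avg))


def normalize_to_magnitude(prices, target):
    """Rescale prices so their order of magnitude equals `target` (e.g. 1, 10, 100, 1000).

    Data already at the target magnitude is returned unchanged, as in claim 10.
    The linear rescaling itself is an illustrative assumption.
    """
    current = order_of_magnitude(prices)
    if current == target:
        return list(prices)
    scale = target / current
    return [p * scale for p in prices]


if __name__ == "__main__":
    laptop_prices = [1200, 1350, 1280]   # hypothetical price series, magnitude 1000
    print(order_of_magnitude(laptop_prices))
    print(normalize_to_magnitude(laptop_prices, 10))
```

Under these assumptions, a prediction model would be trained on the rescaled series, and its predicted values multiplied by the inverse scale factor to return to the original price range.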
CN201210325368.6A 2012-09-06 2012-09-06 Neural-network-based data preprocessing method for multi-variety commodity price prediction Expired - Fee Related CN102982229B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210325368.6A CN102982229B (en) 2012-09-06 2012-09-06 Neural-network-based data preprocessing method for multi-variety commodity price prediction

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210325368.6A CN102982229B (en) 2012-09-06 2012-09-06 Neural-network-based data preprocessing method for multi-variety commodity price prediction

Publications (2)

Publication Number Publication Date
CN102982229A (en) 2013-03-20
CN102982229B (en) 2016-06-08

Family

ID=47856241

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210325368.6A Expired - Fee Related CN102982229B (en) 2012-09-06 2012-09-06 Neural-network-based data preprocessing method for multi-variety commodity price prediction

Country Status (1)

Country Link
CN (1) CN102982229B (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104715295A (en) * 2015-04-03 2015-06-17 江苏物联网研究发展中心 Chinese chemical fertilizer price index prediction method based on BP neural network
CN104732411A (en) * 2015-03-27 2015-06-24 中国农业科学院农业信息研究所 Agricultural product consumption guide method based on BP neural network
CN104732435A (en) * 2015-04-03 2015-06-24 中国农业科学院农业信息研究所 Agricultural product supply and demand matching system and method
CN107203828A (en) * 2017-06-22 2017-09-26 中北大学 A rebar price prediction method, system and platform
CN108230043A (en) * 2018-01-31 2018-06-29 安庆师范大学 A regional product pricing method based on a grey neural network model
CN109508461A (en) * 2018-12-29 2019-03-22 重庆猪八戒网络有限公司 Order price prediction method, terminal and medium based on Chinese natural language processing
WO2019165692A1 (en) * 2018-02-27 2019-09-06 平安科技(深圳)有限公司 Carbon futures price prediction method, apparatus, computer device and storage medium
CN111402042A (en) * 2020-02-17 2020-07-10 中信建投证券股份有限公司 Data analysis and display method for stock market index pattern analysis

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1353380A (en) * 2000-11-14 2002-06-12 亚洲证券交易所有限公司 Method of providing finance data comment and data processing system
EP1901062A1 (en) * 2005-05-23 2008-03-19 Keio University Method of taste measuring, taste sensor therefor and taste measuring apparatus
CN101853480A (en) * 2009-03-31 2010-10-06 北京邮电大学 A Forex Trading Method Based on Neural Network Forecasting Model
US20110087627A1 (en) * 2009-10-08 2011-04-14 General Electric Company Using neural network confidence to improve prediction accuracy
CN102934131A (en) * 2010-04-14 2013-02-13 西门子公司 Method for computer-aided learning of recurrent neural networks to model dynamical systems

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
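Lü Xin: "Research on a Neural-Network-Based Stock Price Prediction Model and System", China Master's Theses Full-text Database, Economics and Management Sciences *
Li Zizhen: "Neural-Network-Based Futures Price Prediction and Model Implementation", China Master's Theses Full-text Database, Economics and Management Sciences *
Luo Changshou: "Research on Vegetable Market Price Prediction Methods Based on Neural Networks and Genetic Algorithms", Bulletin of Science and Technology *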

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104732411A (en) * 2015-03-27 2015-06-24 中国农业科学院农业信息研究所 Agricultural product consumption guide method based on BP neural network
CN104732411B (en) * 2015-03-27 2018-01-26 中国农业科学院农业信息研究所 A Consumption Guidance Method of Agricultural Products Based on BP Neural Network
CN104715295A (en) * 2015-04-03 2015-06-17 江苏物联网研究发展中心 Chinese chemical fertilizer price index prediction method based on BP neural network
CN104732435A (en) * 2015-04-03 2015-06-24 中国农业科学院农业信息研究所 Agricultural product supply and demand matching system and method
CN107203828A (en) * 2017-06-22 2017-09-26 中北大学 A rebar price prediction method, system and platform
CN108230043A (en) * 2018-01-31 2018-06-29 安庆师范大学 A regional product pricing method based on a grey neural network model
WO2019165692A1 (en) * 2018-02-27 2019-09-06 平安科技(深圳)有限公司 Carbon futures price prediction method, apparatus, computer device and storage medium
CN109508461A (en) * 2018-12-29 2019-03-22 重庆猪八戒网络有限公司 Order price prediction method, terminal and medium based on Chinese natural language processing
CN111402042A (en) * 2020-02-17 2020-07-10 中信建投证券股份有限公司 Data analysis and display method for stock market index pattern analysis
CN111402042B (en) * 2020-02-17 2023-10-27 中信建投证券股份有限公司 Data analysis and display method for stock market index pattern analysis

Also Published As

Publication number Publication date
CN102982229B (en) 2016-06-08

Similar Documents

Publication Publication Date Title
CN102982229B (en) Neural-network-based data preprocessing method for multi-variety commodity price prediction
Pang et al. An innovative neural network approach for stock market prediction
Lu et al. A CNN‐LSTM‐based model to forecast stock prices
Rezaei et al. Stock price prediction using deep learning and frequency decomposition
Kanwal et al. BiCuDNNLSTM-1dCNN—A hybrid deep learning-based predictive model for stock price prediction
Lin et al. Empirical mode decomposition–based least squares support vector regression for foreign exchange rate forecasting
Falavigna Financial ratings with scarce information: A neural network approach
Hu et al. New CBR adaptation method combining with problem–solution relational analysis for mechanical design
Ruíz et al. Parallel memetic algorithm for training recurrent neural networks for the energy efficiency problem
Deng et al. A novel hybrid optimization algorithm of computational intelligence techniques for highway passenger volume prediction
Agami et al. A neural network based dynamic forecasting model for Trend Impact Analysis
Dalal et al. TLIA: Time-series forecasting model using long short-term memory integrated with artificial neural networks for volatile energy markets
Nourbakhsh et al. Combining LSTM and CNN methods and fundamental analysis for stock price trend prediction
Pang et al. Stock Market Prediction based on Deep Long Short Term Memory Neural Network.
Jabeen et al. An LSTM based forecasting for major stock sectors using COVID sentiment
Babaei et al. GPT classifications, with application to credit lending
Ye et al. The prediction of stock price based on improved wavelet neural network
Shahzadi et al. A novel data driven approach for combating energy theft in urbanized smart grids using artificial intelligence
Shobeiry et al. Smart short-term load forecasting through coordination of LSTM-based models and feature engineering methods during the COVID-19 pandemic
CN115080868A (en) Product pushing method, product pushing device, computer equipment, storage medium and program product
Zhang et al. Market-level integrated detection against cyber attacks in real-time market operations by self-supervised learning
Abouhassan et al. Why Use Evolving Neuro-Fuzzy and Spiking Neural Networks for incremental and explainable learning of time series? A case study on predictive modelling of trade imports and outlier detection
Quan-Yin et al. A novel efficient adaptive sliding window model for week-ahead price forecasting
Abdelaziz et al. An Epsilon constraint method for selecting Indicators for use in Neural Networks for stock market forecasting
ERTUĞRUL A novel randomized recurrent artificial neural network approach: recurrent random vector functional link network

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 223005 Room 416, Building 11, School of Computer and Software Engineering, Huaiyin Institute of Technology, No. 1 Mei Cheng Road, Higher Education Park, Huai'an, Jiangsu Province

Applicant after: HUAIYIN INSTITUTE OF TECHNOLOGY

Address before: 223003 No. 1 Mei Cheng Road, Higher Education Park, Huai'an, Jiangsu Province

Applicant before: HUAIYIN INSTITUTE OF TECHNOLOGY

CB03 Change of inventor or designer information

Inventor after: Zhu Quanyin

Inventor after: Zhou Hong

Inventor after: Yin Yonghua

Inventor after: Yan Yunyang

Inventor after: Cao Suqun

Inventor after: Zhu Fujian

Inventor after: Li Xiang

Inventor after: Hu Ronglin

Inventor after: Chen Ting

Inventor after: Jin Ying

Inventor before: Zhu Quanyin

Inventor before: Yin Yonghua

Inventor before: Yan Yunyang

Inventor before: Chen Ting

Inventor before: Cao Suqun

COR Change of bibliographic data
C14 Grant of patent or utility model
GR01 Patent grant
CP02 Change in the address of a patent holder

Address after: 223400 8th floor, Anton building, 10 Haian Road, Lianshui County, Jiangsu.

Patentee after: HUAIYIN INSTITUTE OF TECHNOLOGY

Address before: 223005 Room 416, Building 11, School of Computer and Software Engineering, Huaiyin Institute of Technology, No. 1 Meizheng East Road, Huai'an Higher Education Park, Jiangsu Province

Patentee before: HUAIYIN INSTITUTE OF TECHNOLOGY

TR01 Transfer of patent right

Effective date of registration: 20190409

Address after: 223005 No. 9 Haikou Road, Huaian Economic and Technological Development Zone, Jiangsu Province

Patentee after: HUAIAN FUN SOFWARE CO.,LTD.

Address before: 223400 8th floor, Anton building, 10 Haian Road, Lianshui County, Jiangsu.

Patentee before: HUAIYIN INSTITUTE OF TECHNOLOGY

TR01 Transfer of patent right

Effective date of registration: 20210408

Address after: 214135 509, building D, Xingye building, 97-1, Linghu Avenue, Xinwu District, Wuxi City, Jiangsu Province

Patentee after: Wuxi manlai Software Co.,Ltd.

Address before: 223005 No. 9 Haikou Road, Huaian Economic and Technological Development Zone, Jiangsu Province

Patentee before: HUAIAN FUN SOFWARE Co.,Ltd.

CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20160608

Termination date: 20210906