CN107239532B

CN107239532B - Data mining method and device

Info

Publication number: CN107239532B
Application number: CN201710397745.XA
Authority: CN
Inventors: 李满天; 王玉; 张波
Original assignee: Beijing Jingdong Century Trading Co Ltd; Beijing Jingdong Shangke Information Technology Co Ltd
Current assignee: Beijing Jingdong Century Trading Co Ltd; Beijing Jingdong Shangke Information Technology Co Ltd
Priority date: 2017-05-31
Filing date: 2017-05-31
Publication date: 2020-07-31
Anticipated expiration: 2037-05-31
Also published as: CN107239532A

Abstract

The invention discloses a data mining method and device, and relates to the technical field of data mining. The method comprises the following steps: inputting the content data of the target into a content Convolution Neural Network (CNN) to obtain the content characteristic vector of the target; and inputting the content characteristic vector of the target into a deep neural network DNN to obtain the estimated data of the target. Therefore, more accurate estimated data of the target can be efficiently obtained under the condition that historical data of the target does not need to be provided.

Description

Data mining method and device

Technical Field

The invention relates to the technical field of data mining, in particular to a data mining method and device.

Background

The logistics scheduling is an important link of e-commerce website operation, and efficient logistics scheduling depends on accurate commodity sales volume estimation. Therefore, optimizing the sales volume estimation method is very important for reducing the operation cost of the e-commerce website and improving the shopping experience of the user. If the sales volume of the target commodity under the target time and region conditions can be accurately estimated, the inventory of the target commodity in each region can be scheduled and stocked in advance, or the consumer can be guided to reasonably arrange consumption in advance, so that the freight transportation and inventory cost can be saved, and the shopping experience of the user is improved.

In a traditional sales estimation method, historical sales data of a target commodity are accumulated, and the sales of the target commodity are estimated from that historical data. This method depends too heavily on the accumulated historical sales data of the target commodity and cannot generalize from one commodity to another, so the accuracy of sales estimates for new commodities is poor. New commodities are then easily overstocked or understocked, which causes further losses.

In another prior-art sales estimation method, the attributes of the target commodity are analyzed and used to associate the target commodity with other commodities. For example, the attributes of a book include its author, content classification, and the like. Sales statistics are then collected along different attribute dimensions for the target commodity, the associated commodities, or the commodities aggregated by attribute. Finally, the sales volume of the target commodity is predicted from the sales statistics of each dimension, such as the statistics by author and by content classification. The problem with this approach is that the strategy for associating commodities is simple and the number of associated commodities is too limited. For e-commerce commodities spanning many categories and brands, the method is oversimplified and highly subjective, so the accuracy of the estimated data is low. In addition, the estimation process requires a large amount of manual feature engineering, such as extracting statistics along multiple dimensions and designing various rules to extract associated commodities. This work depends on the experience of engineers and is easily influenced by subjectivity; the quality of the extracted features is uneven, many manual features are ineffective or even counterproductive, and optimizing the feature extraction rules takes a long time, so efficiency is low.

Disclosure of Invention

The invention solves the technical problem of how to efficiently acquire more accurate estimated data of a target under the condition of not providing historical data of the target.

According to an aspect of an embodiment of the present invention, there is provided a data mining method, including: inputting the content data of the target into a content Convolution Neural Network (CNN) to obtain the content characteristic vector of the target; and inputting the content characteristic vector of the target into a deep neural network DNN to obtain the estimated data of the target.

In some embodiments, the method further comprises: inputting historical data of the target into a statistic CNN to obtain a statistic characteristic vector of the target; and cascading the content characteristic vector of the target and the statistical characteristic vector of the target, and inputting the concatenated content characteristic vector and the statistical characteristic vector of the target into a deep neural network DNN to obtain the estimated data of the target.

In some embodiments, inputting the content data of the target into the content convolutional neural network CNN and acquiring the content feature vector of the target includes: encoding the text description of the target into a two-dimensional matrix by using a text depth representation model; and inputting the two-dimensional matrix into the content CNN, acquiring a text characteristic vector of the target, and taking the text characteristic vector as the content characteristic vector of the target.

In some embodiments, inputting the content data of the target into the content convolutional neural network CNN and acquiring the content feature vector of the target includes: encoding the picture of the target into a multidimensional matrix according to the pixel values of the picture; and inputting the multidimensional matrix into the content CNN, acquiring the picture characteristic vector of the target, and taking the picture characteristic vector as the content characteristic vector of the target.

In some embodiments, the content CNN includes a first content CNN and a second content CNN; inputting the content data of the target into the content convolutional neural network CNN and acquiring the content feature vector of the target includes: encoding the text description of the target into a two-dimensional matrix by using a text depth representation model; inputting the two-dimensional matrix into the first content CNN to obtain a text characteristic vector of the target; encoding the picture of the target into a multidimensional matrix according to the pixel values of the picture; inputting the multidimensional matrix into the second content CNN to obtain a picture characteristic vector of the target; and taking the text feature vector and the picture feature vector as content feature vectors of the target.

In some embodiments, inputting the historical data of the target into the statistics CNN, and obtaining the statistical feature vector of the target includes: cascading historical data of a target in different time periods into a time domain statistical sequence; and inputting the time domain statistical sequence into a statistical CNN, acquiring a time domain statistical characteristic vector of the target, and taking the time domain statistical characteristic vector as the statistical characteristic vector of the target.

In some embodiments, inputting the historical data of the target into the statistics CNN, and obtaining the statistical feature vector of the target includes: discretizing a statistical region of the target into a plurality of sub-regions; encoding historical data of a target in different sub-regions in the same time period into a region statistical matrix; and inputting the region statistical matrix into a statistical CNN, acquiring a region statistical characteristic vector of the target, and taking the region statistical characteristic vector as the statistical characteristic vector of the target.

In some embodiments, the statistical CNN includes a first statistical CNN and a second statistical CNN; inputting the historical data of the target into a statistic CNN, and acquiring the statistic feature vector of the target comprises the following steps: cascading the historical sales of the target in different time periods into a time domain statistical sequence; inputting the time domain statistical sequence into a first statistical CNN to obtain a time domain statistical characteristic vector of a target; discretizing a statistical region of the target into a plurality of sub-regions; encoding historical data of a target in different sub-regions in the same time period into a region statistical matrix; inputting the region statistical matrix into a second statistical CNN to obtain a region statistical characteristic vector of the target; and taking the time domain statistical feature vector and the region statistical feature vector as the statistical feature vector of the target.

In some embodiments, the content CNN is composed of a convolutional layer, a pooling layer, a fully-connected layer, and an output layer.

In some embodiments, the statistical CNN consists of a convolutional layer, a pooling layer, and an output layer.

In some embodiments, the DNN is comprised of a fully-connected layer and an output layer, with individual neurons in the fully-connected layer performing nonlinear activation function operations.

In some embodiments, the method further comprises: calculating the average square error of the actual data and the estimated data of the target; and transferring the average square error from an output layer of the combined neural network consisting of the content CNN, the statistical CNN and the DNN to an input layer by using a back propagation algorithm so as to iteratively update the parameters of each layer in the combined neural network.

In some embodiments, the method further comprises: and training a combined neural network consisting of the content CNN, the statistical CNN and the DNN by using the content data of the sample and the historical data of the sample.
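The average square error used for the parameter update can be illustrated with a minimal sketch. It assumes a single linear output neuron trained by plain gradient descent; the learning rate, the toy data, and the names `mse`, `predict`, and `sgd_step` are hypothetical and are not specified by the source, which applies back propagation through the whole combined neural network.

```python
def mse(actual, predicted):
    """Mean squared error between actual and estimated values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def predict(weights, bias, features):
    """Linear output neuron: weighted sum of the input features plus a bias."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def sgd_step(weights, bias, samples, targets, lr=0.01):
    """One gradient-descent update of a linear output layer under the MSE loss."""
    n = len(samples)
    grad_w = [0.0] * len(weights)
    grad_b = 0.0
    for x, y in zip(samples, targets):
        err = predict(weights, bias, x) - y  # d(loss)/d(prediction) is 2 * err / n
        for i, xi in enumerate(x):
            grad_w[i] += 2.0 * err * xi / n
        grad_b += 2.0 * err / n
    new_w = [w - lr * g for w, g in zip(weights, grad_w)]
    return new_w, bias - lr * grad_b


# Hypothetical toy data: two feature vectors with their actual sales.
samples = [[1.0, 2.0], [2.0, 1.0]]
targets = [5.0, 4.0]
w, b = [0.0, 0.0], 0.0
loss_before = mse(targets, [predict(w, b, x) for x in samples])
w, b = sgd_step(w, b, samples, targets, lr=0.05)
loss_after = mse(targets, [predict(w, b, x) for x in samples])
```

In a multilayer network the same error gradient is passed backwards layer by layer with the chain rule; the sketch shows only the last layer.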

According to another aspect of the embodiments of the present invention, there is provided a data mining apparatus including: the content characteristic vector acquisition module is configured to input the content data of the target into a content Convolutional Neural Network (CNN) and acquire the content characteristic vector of the target; and the estimated data acquisition module is configured to input the content feature vector of the target into the deep neural network DNN to acquire estimated data of the target.

In some embodiments, the apparatus further includes a statistical feature vector obtaining module configured to input the historical data of the target into a statistical CNN, and obtain a statistical feature vector of the target; the prediction data acquisition module is further configured to: and cascading the content characteristic vector of the target and the statistical characteristic vector of the target, and inputting the concatenated content characteristic vector and the statistical characteristic vector of the target into a deep neural network DNN to obtain the estimated data of the target.

In some embodiments, the content feature vector acquisition module is configured to: encode the text description of the target into a two-dimensional matrix by using a text depth representation model; and input the two-dimensional matrix into the content CNN, acquire a text characteristic vector of the target, and take the text characteristic vector as the content characteristic vector of the target.

In some embodiments, the content feature vector acquisition module is configured to: encode the picture of the target into a multidimensional matrix according to the pixel values of the picture; and input the multidimensional matrix into the content CNN, acquire the picture characteristic vector of the target, and take the picture characteristic vector as the content characteristic vector of the target.

In some embodiments, the content CNN includes a first content CNN and a second content CNN; the content feature vector acquisition module is configured to: encoding the text description of the target into a two-dimensional matrix by using a text depth representation model; inputting the two-dimensional matrix into the first content CNN to obtain a text characteristic vector of a target; coding the target picture into a multidimensional matrix according to the pixel value of the target picture; inputting the multidimensional matrix into second content CNN to obtain a picture characteristic vector of a target; and taking the text feature vector and the picture feature vector as content feature vectors of the target.

In some embodiments, the statistical feature vector acquisition module is configured to: cascading historical data of a target in different time periods into a time domain statistical sequence; and inputting the time domain statistical sequence into a statistical CNN, acquiring a time domain statistical characteristic vector of the target, and taking the time domain statistical characteristic vector as the statistical characteristic vector of the target.

In some embodiments, the statistical feature vector acquisition module is configured to: discretizing a statistical region of the target into a plurality of sub-regions; encoding historical data of a target in different sub-regions in the same time period into a region statistical matrix; and inputting the region statistical matrix into a statistical CNN, acquiring a region statistical characteristic vector of the target, and taking the region statistical characteristic vector as the statistical characteristic vector of the target.

In some embodiments, the statistical CNN includes a first statistical CNN and a second statistical CNN; the statistical feature vector acquisition module is configured to: cascading the historical sales of the target in different time periods into a time domain statistical sequence; inputting the time domain statistical sequence into a first statistical CNN to obtain a time domain statistical characteristic vector of a target; discretizing a statistical region of the target into a plurality of sub-regions; encoding historical data of a target in different sub-regions in the same time period into a region statistical matrix; inputting the region statistical matrix into a second statistical CNN to obtain a region statistical characteristic vector of the target; and taking the time domain statistical feature vector and the region statistical feature vector as the statistical feature vector of the target.

In some embodiments, the DNN is composed of a fully-connected layer and an output layer, and each neuron in the fully-connected layer performs a nonlinear activation function operation to obtain the prediction data of the target.

In some embodiments, the apparatus further comprises a parameter update module configured to: calculating the average square error of the actual data and the estimated data of the target; and transferring the average square error from an output layer of the combined neural network consisting of the content CNN, the statistical CNN and the DNN to an input layer by using a back propagation algorithm so as to iteratively update the parameters of each layer in the combined neural network.

In some embodiments, the apparatus further includes a training module configured to train a combined neural network of the content CNN, the statistical CNN, and the DNN using the content data of the sample and the historical data of the sample.

According to still another aspect of an embodiment of the present invention, there is provided a data mining apparatus including: a memory; and a processor coupled to the memory, the processor configured to perform the aforementioned data mining method based on instructions stored in the memory.

According to yet another aspect of the embodiments of the present invention, there is provided a computer-readable storage medium storing computer instructions which, when executed by a processor, implement the aforementioned data mining method.

According to the data mining method provided by the invention, the content characteristic vector of the target can be obtained through the content convolutional neural network, and the estimated data of the target can then be obtained through the deep neural network. Therefore, more accurate estimated data of the target can be efficiently obtained under the condition that historical data of the target does not need to be provided.

Other features of the present invention and advantages thereof will become apparent from the following detailed description of exemplary embodiments thereof, which proceeds with reference to the accompanying drawings.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.

Fig. 1 shows a schematic diagram of a combined neural network composed of content CNN, statistical CNN, and DNN.

Fig. 2 shows a network structure diagram of the content CNN.

Fig. 3 shows a network structure diagram of the statistical CNN.

Fig. 4 shows a network structure diagram of DNN.

FIG. 5 is a flow diagram illustrating one embodiment of a data mining method of the present invention.

FIG. 6 is a flow diagram illustrating another embodiment of a data mining method of the present invention.

Fig. 7 is a schematic structural diagram of an embodiment of the data mining device of the present invention.

Fig. 8 is a block diagram illustrating another embodiment of the data mining device of the present invention.

Fig. 9 is a block diagram showing still another embodiment of the data mining device of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. The following description of at least one exemplary embodiment is merely illustrative in nature and is in no way intended to limit the invention, its application, or uses. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

The technical scheme provided by the invention relates to CNN (Convolutional Neural Network) and DNN (Deep Neural Network). Fig. 1 shows a schematic diagram of a combined neural network composed of content CNN, statistical CNN, and DNN.

Taking a commodity as an example, the input of the content CNN is the content data of the target commodity; the content CNN analyzes the target commodity from the content perspective and extracts its content feature vector. The input of the statistical CNN is the sales data of the target commodity; through a multilayer convolutional neural network and sliding convolution windows of multiple sizes, the statistical CNN automatically extracts statistical feature vectors of the target commodity in different dimensions. The input of the DNN is a feature vector formed by splicing the content feature vector and the statistical feature vector of the target commodity. The DNN can learn high-order combination features, and a final linear combination yields the estimated data of the target commodity.

Before the combined neural network composed of the content CNN, the statistical CNN, and the DNN is used to obtain the estimated data of the target, the combined neural network composed of the content CNN, the statistical CNN, and the DNN needs to be trained by using the content data of the sample commodity and the historical data of the sample commodity. In order to enable the input training data to meet the data input requirements of the neural network, the raw data of the sample commodity needs to be preprocessed.

For the content data of a commodity, the text description or the picture of the commodity may be encoded as the content data.

For example, if the text description of the commodity is "iPadMini tablet 32 GB", the text description may be divided into 3 words, "iPadMini", "tablet", and "32 GB", and then encoded into a two-dimensional matrix as shown in Table 1 by using the text depth representation model Word2Vec. The two-dimensional matrix shown in Table 1 has 50 rows and 3 columns. Each column represents the word vector corresponding to one word, and the word vectors are concatenated in order of appearance to generate the encoded two-dimensional matrix. Each row represents one dimension (50 dimensions in total) of the word vector space.

TABLE 1
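The layout of such a matrix can be sketched as follows. This is only an illustration: `toy_word_vector` is a hypothetical stand-in that produces a fixed pseudo-random vector per word instead of querying a trained Word2Vec model, and the names `EMBED_DIM` and `encode_text` are likewise assumptions.

```python
import random

EMBED_DIM = 50  # dimension of the word-vector space, as in the example above


def toy_word_vector(word, dim=EMBED_DIM):
    """Stand-in for a trained Word2Vec lookup (hypothetical): one fixed vector per word."""
    rng = random.Random(word)  # seeding with the word makes the vector reproducible
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


def encode_text(words, dim=EMBED_DIM):
    """Return a dim x len(words) matrix whose column j is the vector of words[j]."""
    columns = [toy_word_vector(w, dim) for w in words]
    # Transpose so that each row is one embedding dimension and each column one word.
    return [[col[i] for col in columns] for i in range(dim)]


matrix = encode_text(["iPadMini", "tablet", "32 GB"])
print(len(matrix), len(matrix[0]))  # 50 3
```

With a real model, only `toy_word_vector` would change; the column-per-word layout stays the same.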

For another example, a commodity picture with three RGB channels may be scaled to 100 × 100 pixels, with the pixel value of each channel represented by an 8-bit integer in the range 0-255. Since the picture has three RGB channels and 100 × 100 pixels in total, encoding the pixel values of each channel yields an encoding matrix of 100 × 100 × 3.
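A minimal sketch of this picture encoding is given below. The source does not say how the picture is scaled, so nearest-neighbour sampling is assumed here; the nested-list image representation and the names `resize_nearest` and `encode_picture` are hypothetical.

```python
def resize_nearest(image, out_h=100, out_w=100):
    """Nearest-neighbour scaling of an H x W x 3 nested list to out_h x out_w x 3."""
    in_h, in_w = len(image), len(image[0])
    return [
        [list(image[r * in_h // out_h][c * in_w // out_w]) for c in range(out_w)]
        for r in range(out_h)
    ]


def encode_picture(image):
    """Scale to 100 x 100 and clamp every channel value to the 8-bit range 0-255."""
    scaled = resize_nearest(image)
    return [[[max(0, min(255, int(v))) for v in pixel] for pixel in row] for row in scaled]


# Hypothetical 2 x 2 RGB picture used only to exercise the functions.
tiny = [[(255, 0, 0), (0, 255, 0)],
        [(0, 0, 255), (300, -5, 128)]]
encoded = encode_picture(tiny)
shape = (len(encoded), len(encoded[0]), len(encoded[0][0]))
print(shape)  # (100, 100, 3)
```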

For the historical data of the commodities, the historical sales of the commodities in different time periods can be coded into the historical data, and the historical sales of the commodities in different subareas in the same time period can also be coded into the historical data.

For example, historical sales data for a certain commodity over the past 5 days are shown in Table 2. The data may be concatenated into a time domain statistical sequence 10, 15, 5, 4, 3.

10	15	5	4	3

TABLE 2

For another example, the map may be divided spatially into squares of equal size to establish a region statistical matrix, where each element corresponds to the sales data of the commodity in one square region, as shown for example in Table 3. Here, X indicates that the sales volume of the commodity in that region is unknown and needs to be estimated.

1	0	1	0	0
0	0	X	0	0
0	3	5	0	0
0	0	0	6	0
0	0	0	0	0

TABLE 3
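One way to build such a region statistical matrix is sketched below. The record format `(x, y, sales)`, the map origin, the cell size, and the name `build_region_matrix` are assumptions for illustration; the toy records are chosen so that the known cells match Table 3, with the unknown cell X left at 0.

```python
def build_region_matrix(records, min_x, min_y, cell_size, rows, cols):
    """Sum sales per equal-sized square cell; records are (x, y, sales) tuples."""
    grid = [[0 for _ in range(cols)] for _ in range(rows)]
    for x, y, sales in records:
        col = int((x - min_x) // cell_size)
        row = int((y - min_y) // cell_size)
        if 0 <= row < rows and 0 <= col < cols:  # ignore points outside the map
            grid[row][col] += sales
    return grid


# Hypothetical order records: (x, y, units sold) in arbitrary map coordinates.
records = [(0.5, 0.5, 1), (2.2, 0.1, 1), (1.5, 2.5, 3), (2.9, 2.1, 5), (3.4, 3.6, 6)]
region_matrix = build_region_matrix(records, 0.0, 0.0, 1.0, rows=5, cols=5)
for row in region_matrix:
    print(row)
```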

After the preprocessing of the original data is completed, the preprocessed regional statistical matrix can be input into the combined neural network so as to train the combined neural network.

Fig. 2 shows a network structure diagram of the content CNN. As can be seen from fig. 2, the content CNN is composed of a convolutional layer, a pooling layer, a fully-connected layer, and an output layer. The convolutional layer and the pooling layer learn local information of the commodity. The local feature information is then merged in the fully-connected layer, which learns global information of the commodity, and finally a content feature vector, for example a 50-dimensional vector, is output as an input of the DNN.

It will be understood by those skilled in the art that RNN (Recurrent Neural Networks) may also be used in place of content CNN when extracting content feature vectors from textual descriptions of commodities.

Fig. 3 shows a network structure diagram of the statistical CNN. As can be seen from fig. 3, the statistical CNN consists of convolutional layers, pooling layers, and an output layer. The input of the statistical CNN may be the sales data of the commodity over time (in the figure, the sales volume in each time period is represented by the color shade). During training, the actual sales of the sample commodity must be supplied together with the input: if the time domain sequence shown in Table 2 is input, for example 10, 15, 5, 4, 3, the actual sales that follow it, for example 6, must also be input; if the region statistical matrix shown in Table 3 is input, the actual sales (the value of X) of the sample commodity in the specific region at the corresponding time must be input at the same time.

In contrast to the content CNN shown in fig. 2, the statistical CNN has no fully connected layer, because a fully connected layer would destroy the locality of the statistical data. Locality here means that each extracted statistical feature relates only to the sales sequence within a certain period of time and contains no global sales information. If a fully connected layer were inserted into the statistical CNN, the statistical features would be integrated at the end of the CNN, whereas it is more reasonable to integrate them in the subsequent DNN. For example, the output of the statistical CNN may be a 10-dimensional vector that serves as an input of the DNN.

Fig. 4 shows a network structure diagram of DNN. As can be seen from fig. 4, the DNN consists of a fully connected layer and an output layer. The input to the DNN network is a one-dimensional feature vector connected by the content feature vector and the statistical feature vector, and may be, for example, a 60-dimensional feature vector formed by concatenating a 50-dimensional content feature vector and a 10-dimensional statistical feature vector.
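The concatenation and the fully connected forward pass can be sketched as follows. The hidden width, the random initial weights, and the names `dense` and `dnn_forward` are assumptions for illustration; the source fixes only the 50 + 10 = 60 input dimensions in this example.

```python
import random


def relu(x):
    return x if x > 0.0 else 0.0


def dense(inputs, weights, biases, activation=None):
    """Fully connected layer: one weight row and one bias per output neuron."""
    outputs = []
    for w_row, b in zip(weights, biases):
        z = sum(w * x for w, x in zip(w_row, inputs)) + b
        outputs.append(activation(z) if activation else z)
    return outputs


def dnn_forward(content_vec, stat_vec, params):
    """Concatenate the two feature vectors, apply a hidden layer, then a linear output."""
    features = list(content_vec) + list(stat_vec)  # 50 + 10 = 60 dimensions
    hidden = dense(features, params["w1"], params["b1"], relu)
    return dense(hidden, params["w2"], params["b2"])[0]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
HIDDEN = 8  # hypothetical hidden width
params = {
    "w1": [[rng.uniform(-0.1, 0.1) for _ in range(60)] for _ in range(HIDDEN)],
    "b1": [0.0] * HIDDEN,
    "w2": [[rng.uniform(-0.1, 0.1) for _ in range(HIDDEN)]],
    "b2": [0.0],
}
content_vec = [rng.random() for _ in range(50)]
stat_vec = [rng.random() for _ in range(10)]
estimate = dnn_forward(content_vec, stat_vec, params)  # a single floating point number
```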

Each neuron in the fully connected layer applies a nonlinear activation function such as ReLU or sigmoid to its input, which is either the content feature vector alone or the concatenated content and statistical feature vectors. The network thereby learns a complex combination function and finally outputs the estimated sales volume of the commodity, which may be a floating point number. Generally, ReLU is preferred over sigmoid because it does not have large saturation regions as sigmoid does. In the saturation regions of sigmoid the gradient is close to zero, so back propagation suffers from the vanishing gradient problem and the DNN cannot be trained effectively.
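The saturation effect can be checked numerically with a short sketch that compares the derivatives of the two activation functions. The function names are chosen for illustration, and the derivative of relu at exactly 0 is set to 0 by convention.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x):
    """Derivative of sigmoid: s * (1 - s), at most 0.25 and near 0 for large |x|."""
    s = sigmoid(x)
    return s * (1.0 - s)


def relu_grad(x):
    """Derivative of relu: 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0.0 else 0.0


for x in (0.0, 2.0, 10.0):
    print(x, sigmoid_grad(x), relu_grad(x))
```

For large positive inputs the sigmoid derivative shrinks towards zero while the relu derivative stays at 1. For negative inputs the relu derivative is 0, so relu avoids saturation only on the positive side.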

One embodiment of the data mining method provided by the present invention is described below in conjunction with FIG. 5.

FIG. 5 is a flow diagram illustrating one embodiment of a data mining method of the present invention. As shown in fig. 5, the data mining method of this embodiment includes:

Step S502, inputting the content data of the target into the content CNN, and acquiring the content characteristic vector of the target.

Three implementations of step S502 are listed below.

(1) The text description of the target is encoded into a two-dimensional matrix using a text depth representation model. And inputting the two-dimensional matrix into the first content CNN, acquiring a text characteristic vector of the target, and taking the text characteristic vector as a content characteristic vector of the target.

(2) The picture of the target is encoded into a multi-dimensional matrix according to the pixel values of the picture. The multi-dimensional matrix is input into the second content CNN to acquire the picture characteristic vector of the target, and the picture characteristic vector is taken as the content characteristic vector of the target.

(3) The text description of the target is encoded into a two-dimensional matrix using a text depth representation model. And inputting the two-dimensional matrix into the first content CNN to obtain the text characteristic vector of the target. And encoding the picture of the target into a multi-dimensional matrix according to the pixel value of the picture of the target. And inputting the multidimensional matrix into the second content CNN to obtain the picture characteristic vector of the target. And taking the text feature vector and the picture feature vector as content feature vectors of the target.

The first content CNN and the second content CNN may be obtained by the content CNN training method. Since the content data preprocessing process of the target is similar to the content data preprocessing process of the sample, the content data preprocessing process of the target is not described herein again.

Step S504, the content feature vector of the target is input into a deep neural network DNN, and estimated data of the target is obtained.

For a new product, the content feature vector can be extracted normally according to the method of this embodiment, but the statistical features cannot be obtained and can therefore be set to 0, so that the DNN estimates the sales volume from the content feature vector alone. The relevance between the target commodity and other commodities is not calculated manually; instead, the DNN is used directly for prediction. If fully trained with a large amount of sample data, the DNN can learn the sales estimates of other commodities that are similar to the new commodity (the reference features for estimation include content and statistics). Therefore, for new products, a reasonable sales volume estimate can be made from their content features.
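Setting the missing statistical features to 0 can be sketched as a small helper that keeps the DNN input length constant. The 50 and 10 dimensions follow the earlier example; the names `STAT_DIM` and `combine_features` are hypothetical.

```python
STAT_DIM = 10  # length of the statistical feature vector in the earlier example


def combine_features(content_vec, stat_vec=None, stat_dim=STAT_DIM):
    """Concatenate content and statistical features; use zeros when no history exists."""
    if stat_vec is None:
        stat_vec = [0.0] * stat_dim
    if len(stat_vec) != stat_dim:
        raise ValueError("statistical feature vector has unexpected length")
    return list(content_vec) + list(stat_vec)


new_product = combine_features([0.2] * 50)  # no sales history yet
known_product = combine_features([0.2] * 50, [1.0] * 10)
print(len(new_product), len(known_product))  # 60 60
```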

The CNN and DNN-based data mining method provided in the above embodiment can extract content feature vectors from high-dimensional content data such as texts and images of commodities, and perform related commodity analysis and new commodity sales prediction based on the content feature vectors, and compared with a conventional rule-based method, the CNN and DNN-based data mining method has the advantages of higher generalization capability, better expansion capability, quantization capability, and no influence from subjective factors, so that a more accurate new commodity sales prediction effect can be obtained. Meanwhile, the CNN is more flexible than a method for manually adjusting the statistical rules, can intelligently learn and adjust parameters, and has higher efficiency than the method for manually adjusting the statistical rules; the DNN can intelligently learn parameters such as feature scale transformation, feature combination and the like, has higher processing efficiency than manual features, and is more accurate in estimation.

The inventor also finds that the sales volume estimation method in the prior art does not consider the influence of the relevance of the regions on the sales volume of the commodity, for example, the distance between the place A and the place B is closer than that between the place A and the place C, and the popularity is more similar, so that the sales volume of the target commodity in the place A should be more related to the place B. To this end, the present invention provides another embodiment of a data mining method to solve this technical problem.

FIG. 6 is a flow diagram illustrating another embodiment of a data mining method of the present invention. As shown in fig. 6, the data mining method of this embodiment includes:

Step S602, inputting the content data of the target into the content convolutional neural network CNN, and acquiring the content feature vector of the target. A specific implementation manner may refer to step S502.

Step S604, inputting the historical data of the target into the statistical CNN, and obtaining the statistical feature vector of the target.

Three implementations of step S604 are listed below.

(1) Cascading historical data of a target in different time periods into a time domain statistical sequence; and inputting the time domain statistical sequence into the first statistical CNN, acquiring a time domain statistical feature vector of the target, and taking the time domain statistical feature vector as the statistical feature vector of the target.

For example, after the data shown in Table 2 is input into the first statistical CNN, the first statistical CNN processes it with a convolution kernel having a window size of 3, whose weights are initialized to 1/3, 1/3, and 1/3. The results of performing the convolution operation on the 5-day sales data series are shown in Table 4. This example corresponds to a 3-day sliding window that computes the 3-day average of sales. In practical applications, the convolution kernel weights can be learned automatically by an algorithm, kernels of several different sizes can be used, and the convolution operations can be stacked in multiple layers to form a deep network.

| 10 | 15 | 5 |

TABLE 4
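The 3-day sliding-window example can be sketched as follows. This is a minimal illustration rather than the implementation of the embodiment: the sales values are hypothetical (they are not the data of Table 2), and the helper name `conv1d_valid` is an assumption introduced here. Only the window size of 3 and the initial weights of 1/3 come from the description.

```python
def conv1d_valid(series, kernel):
    """Slide `kernel` over `series` without padding and return the weighted sums."""
    k = len(kernel)
    return [
        sum(w * x for w, x in zip(kernel, series[i:i + k]))
        for i in range(len(series) - k + 1)
    ]


# Hypothetical 5-day sales series (assumed values, not those of Table 2).
daily_sales = [12.0, 9.0, 15.0, 6.0, 18.0]

# Kernel with window size 3 and every weight initialized to 1/3, as in the text.
kernel = [1.0 / 3.0] * 3

# With these initial weights each output is the average of one 3-day window.
moving_average = conv1d_valid(daily_sales, kernel)
print(moving_average)  # three values, one per 3-day window
```

A 5-day series and a window of 3 give three outputs, which matches the three entries of Table 4. Once the weights are learned instead of fixed, the same operation no longer computes a plain average.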

(2) Discretizing a statistical region of the target into a plurality of sub-regions; encoding historical data of a target in different sub-regions in the same time period into a region statistical matrix; and inputting the region statistical matrix into the second statistical CNN, acquiring a region statistical characteristic vector of the target, and taking the region statistical characteristic vector as the statistical characteristic vector of the target.

For example, to estimate the sales volume of region "X" in Table 3, the convolution statistics of the 3 × 3 window around the target region may be computed. As in the time-domain case, the weights of the convolution kernel can be learned automatically by a machine learning method, and the number and size of the convolution kernels can be chosen experimentally.
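The region encoding and the 3 × 3 window statistic can be sketched as follows. This is a minimal illustration under stated assumptions: the grid size, the sales records, the uniform kernel weights of 1/9, the zero treatment of cells outside the grid, and the helper names `build_region_matrix` and `conv3x3_at` are all hypothetical and are not taken from Table 3.

```python
def build_region_matrix(records, rows, cols):
    """Accumulate (row, col, sales) records into a rows x cols region statistical matrix."""
    matrix = [[0.0] * cols for _ in range(rows)]
    for r, c, sales in records:
        matrix[r][c] += sales
    return matrix


def conv3x3_at(matrix, row, col, kernel):
    """Weighted sum of the 3x3 window centred on (row, col).

    Cells that fall outside the grid contribute 0 (an assumed padding choice).
    """
    total = 0.0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            r, c = row + dr, col + dc
            if 0 <= r < len(matrix) and 0 <= c < len(matrix[0]):
                total += kernel[dr + 1][dc + 1] * matrix[r][c]
    return total


# Hypothetical sales records for one time period: (sub-region row, sub-region column, sales).
# Sub-region (1, 1) plays the role of region "X" and has no sales of its own.
records = [
    (0, 0, 4.0), (0, 1, 2.0), (0, 2, 6.0),
    (1, 0, 8.0), (1, 0, 1.0), (1, 2, 1.0),
    (2, 0, 3.0), (2, 1, 5.0), (2, 2, 6.0),
]
region = build_region_matrix(records, rows=3, cols=3)

# Uniform initial weights (assumed); in the embodiment the weights are learned.
kernel = [[1.0 / 9.0] * 3 for _ in range(3)]

x_statistic = conv3x3_at(region, 1, 1, kernel)
print(x_statistic)
```

With uniform weights the statistic is the average sales over the window around "X", so neighbouring sub-regions contribute to the feature even when the target sub-region itself has no history.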

(3) Cascading the historical sales of the target in different time periods into a time domain statistical sequence; inputting the time domain statistical sequence into a first statistical CNN to obtain a time domain statistical characteristic vector of a target; discretizing a statistical region of the target into a plurality of sub-regions; encoding historical data of a target in different sub-regions in the same time period into a region statistical matrix; inputting the region statistical matrix into a second statistical CNN to obtain a region statistical characteristic vector of the target; and taking the time domain statistical feature vector and the region statistical feature vector as the statistical feature vector of the target.

The first statistical CNN and the second statistical CNN may be obtained by the statistical CNN training method. Because the preprocessing of the target's historical data is similar to the preprocessing of the sample historical data, it is not described again here.

Step S606, cascading the content feature vector of the target and the statistical feature vector of the target, and inputting the cascaded vector into the deep neural network DNN to obtain the estimated data of the target.
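Step S606 can be sketched as a concatenation followed by a small fully connected network. This is a minimal illustration, assuming one hidden layer with a ReLU activation and a linear output layer; the feature values, weights, biases, and the helper names `relu` and `dense` are hypothetical and stand in for the outputs of the content CNN and statistical CNN and for trained DNN parameters.

```python
def relu(x):
    """Nonlinear activation used in the hidden layer (an assumed choice)."""
    return x if x > 0.0 else 0.0


def dense(inputs, weights, biases, activation=None):
    """Fully connected layer: one output per row of `weights`."""
    outputs = []
    for row, b in zip(weights, biases):
        z = sum(w * x for w, x in zip(row, inputs)) + b
        outputs.append(activation(z) if activation else z)
    return outputs


# Hypothetical feature vectors produced by the content CNN and the statistical CNN.
content_vec = [0.5, 1.0]
stat_vec = [2.0, 0.0, 1.0]

# Cascading the two vectors: the DNN input has length 2 + 3 = 5.
dnn_input = content_vec + stat_vec

# Hypothetical parameters: a hidden layer with 2 neurons and an output layer with 1 neuron.
hidden_w = [[0.2, 0.1, 0.5, 0.0, -0.3],
            [-0.4, 0.3, 0.1, 0.2, 0.6]]
hidden_b = [0.0, 0.1]
out_w = [[1.0, 2.0]]
out_b = [0.5]

hidden = dense(dnn_input, hidden_w, hidden_b, activation=relu)
estimate = dense(hidden, out_w, out_b)[0]
print(estimate)  # estimated data of the target
```

Because the two feature vectors enter the first fully connected layer together, each hidden neuron can weight content features against statistical features, which is how the DNN learns feature combinations without manual rules.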

The above embodiment builds a more comprehensive neural network model for estimating the actual sales volume of a target commodity under target time and region conditions. Three types of features related to the target are established: content-related features, time-domain-related features, and region-related features. Input data suitable for neural network processing is extracted, processed, and encoded from these three aspects, namely content attribute association, time-domain association, and region association. Finally, the combined neural network structure of the content CNN, the statistical CNN, and the DNN processes and combines the three types of features effectively, which saves a large amount of manual feature processing work and efficiently produces an accurate estimate of the sales volume of the target commodity.

In some embodiments, after the estimated data of the target is obtained, the mean squared error between the actual data and the estimated data of the target may also be calculated. The mean squared error is then propagated from the output layer to the input layer of the combined neural network consisting of the content CNN, the statistical CNN, and the DNN by using a back propagation algorithm, so as to iteratively update the parameters of each layer in the combined neural network. The updated combined neural network can therefore estimate the data of subsequent targets more accurately.
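The error-driven update can be sketched as follows. This is a deliberately simplified illustration: a single linear layer stands in for the combined neural network, so the gradient is computed in one step, whereas in a multi-layer network the back propagation algorithm applies the chain rule layer by layer from the output layer back to the input layer. The data, learning rate, iteration count, and function names are hypothetical.

```python
def mse(predictions, targets):
    """Mean squared error between estimated and actual data."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


def predict(weights, bias, features):
    """Linear stand-in for the combined network (assumed simplification)."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def gradient_step(weights, bias, batch, targets, lr):
    """One gradient-descent update of the parameters on the mean squared error."""
    n = len(batch)
    grad_w = [0.0] * len(weights)
    grad_b = 0.0
    for features, target in zip(batch, targets):
        err = predict(weights, bias, features) - target
        for j, x in enumerate(features):
            grad_w[j] += 2.0 * err * x / n  # d(MSE)/d(w_j)
        grad_b += 2.0 * err / n             # d(MSE)/d(bias)
    new_weights = [w - lr * g for w, g in zip(weights, grad_w)]
    return new_weights, bias - lr * grad_b


# Hypothetical feature vectors and actual sales data.
batch = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
actual = [2.0, 3.0, 5.0]

weights, bias = [0.0, 0.0], 0.0
loss_before = mse([predict(weights, bias, f) for f in batch], actual)

# Iteratively update the parameters.
for _ in range(200):
    weights, bias = gradient_step(weights, bias, batch, actual, lr=0.1)

loss_after = mse([predict(weights, bias, f) for f in batch], actual)
print(loss_before, loss_after)
```

The loss after the updates is lower than the loss before them, which is the sense in which the updated network estimates subsequent data more accurately.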

A data mining device of one embodiment of the present invention is described below in conjunction with fig. 7.

Fig. 7 is a schematic structural diagram of an embodiment of the data mining device of the present invention. As shown in fig. 7, the data mining device 70 of this embodiment includes:

a content feature vector obtaining module 702, configured to input the content data of the target into the content convolutional neural network CNN, and obtain a content feature vector of the target.

an estimated data obtaining module 704, configured to input the content feature vector of the target into the deep neural network DNN, and obtain estimated data of the target.

In some embodiments, the content feature vector acquisition module 702 is configured to: encoding the text description of the target into a two-dimensional matrix by using a text depth representation model; and inputting the two-dimensional matrix into the first content CNN, acquiring a text characteristic vector of the target, and taking the text characteristic vector as a content characteristic vector of the target.

In some embodiments, the content feature vector acquisition module 702 is configured to: coding the target picture into a multidimensional matrix according to the pixel value of the target picture; and inputting the multidimensional matrix into the second content CNN, acquiring the picture characteristic vector of the target, and taking the picture characteristic vector as the content characteristic vector of the target.

In some embodiments, the content feature vector acquisition module 702 is configured to: encoding the text description of the target into a two-dimensional matrix by using a text depth representation model; inputting the two-dimensional matrix into the first content CNN to obtain a text characteristic vector of a target; coding the target picture into a multidimensional matrix according to the pixel value of the target picture; inputting the multidimensional matrix into second content CNN to obtain a picture characteristic vector of a target; and taking the text feature vector and the picture feature vector as content feature vectors of the target.
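The two encodings performed by the content feature vector acquisition module can be sketched as follows. This is a minimal illustration, assuming that the text depth representation model behaves like a word-vector lookup table; the vocabulary, the 3-dimensional vectors, the zero padding, the scaling of pixel values to [0, 1], and the function names are all hypothetical.

```python
# Hypothetical word vectors standing in for a trained text depth representation model.
embeddings = {
    "red": [0.1, 0.3, 0.5],
    "cotton": [0.7, 0.2, 0.0],
    "shirt": [0.4, 0.4, 0.9],
}
UNKNOWN = [0.0, 0.0, 0.0]  # assumed vector for padding and out-of-vocabulary words


def encode_text(description, max_words):
    """Return a max_words x dim matrix: one word vector per row, zero-padded or truncated."""
    words = description.lower().split()[:max_words]
    rows = [list(embeddings.get(w, UNKNOWN)) for w in words]
    rows += [list(UNKNOWN) for _ in range(max_words - len(rows))]
    return rows


def encode_picture(pixels):
    """Scale 8-bit RGB pixel values to [0, 1], keeping the height x width x channel layout."""
    return [[[v / 255.0 for v in pixel] for pixel in row] for row in pixels]


# Two-dimensional matrix for the text description (input of the first content CNN).
text_matrix = encode_text("Red cotton shirt", max_words=4)

# Multidimensional matrix for a hypothetical 2 x 2 RGB picture (input of the second content CNN).
picture = [[(255, 0, 0), (0, 255, 0)],
           [(0, 0, 255), (255, 255, 255)]]
picture_matrix = encode_picture(picture)

print(len(text_matrix), len(text_matrix[0]))
print(len(picture_matrix), len(picture_matrix[0]), len(picture_matrix[0][0]))
```

Fixing the number of rows of the text matrix gives every description the same input shape, which a convolutional layer with fixed kernel sizes can then process.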

In some embodiments, the apparatus 70 further includes a statistical feature vector obtaining module 703, configured to input the historical data of the target into the statistical CNN to obtain a statistical feature vector of the target. The estimated data obtaining module 704 is further configured to cascade the content feature vector of the target and the statistical feature vector of the target, and input the cascaded vector into the deep neural network DNN to obtain the estimated data of the target.

In some embodiments, the statistical feature vector acquisition module 703 is configured to: cascading historical data of a target in different time periods into a time domain statistical sequence; and inputting the time domain statistical sequence into the first statistical CNN, acquiring a time domain statistical feature vector of the target, and taking the time domain statistical feature vector as the statistical feature vector of the target.

In some embodiments, the statistical feature vector acquisition module 703 is configured to: discretizing a statistical region of the target into a plurality of sub-regions; encoding historical data of a target in different sub-regions in the same time period into a region statistical matrix; and inputting the region statistical matrix into the second statistical CNN, acquiring a region statistical characteristic vector of the target, and taking the region statistical characteristic vector as the statistical characteristic vector of the target.

In some embodiments, the statistical feature vector acquisition module 703 is configured to: cascading the historical sales of the target in different time periods into a time domain statistical sequence; inputting the time domain statistical sequence into a first statistical CNN to obtain a time domain statistical characteristic vector of a target; discretizing a statistical region of the target into a plurality of sub-regions; encoding historical data of a target in different sub-regions in the same time period into a region statistical matrix; inputting the region statistical matrix into a second statistical CNN to obtain a region statistical characteristic vector of the target; and taking the time domain statistical feature vector and the region statistical feature vector as the statistical feature vector of the target.

In some embodiments, the apparatus 70 further comprises a parameter update module 705 configured to: calculate the mean squared error between the actual data and the estimated data of the target; and propagate the mean squared error from the output layer to the input layer of the combined neural network consisting of the content CNN, the statistical CNN, and the DNN by using a back propagation algorithm, so as to iteratively update the parameters of each layer in the combined neural network.

In some embodiments, the apparatus 70 further comprises a training module 701 configured to train the combined neural network composed of the content CNN, the statistical CNN, and the DNN using the content data of the sample and the historical data of the sample.

Fig. 8 is a block diagram illustrating another embodiment of the data mining device of the present invention. As shown in fig. 8, the data mining device 80 of this embodiment includes: a memory 810 and a processor 820 coupled to the memory 810, the processor 820 being configured to perform the data mining method of any of the preceding embodiments based on instructions stored in the memory 810.

The memory 810 may include, for example, a system memory, a fixed non-volatile storage medium, and the like. The system memory stores, for example, an operating system, application programs, a boot loader (Boot Loader), and other programs.

Fig. 9 is a block diagram showing still another embodiment of the data mining device of the present invention. As shown in fig. 9, the apparatus 90 of this embodiment includes the memory 810 and the processor 820, and may further include an input/output interface 930, a network interface 940, a storage interface 950, and the like. These interfaces 930, 940, 950, the memory 810, and the processor 820 may be connected by a bus 950, for example. The input/output interface 930 provides a connection interface for input/output devices such as a display, a mouse, a keyboard, and a touch screen. The network interface 940 provides a connection interface for various networking devices. The storage interface 950 provides a connection interface for external storage devices such as an SD card and a USB flash drive.

The present invention also includes a computer readable storage medium having stored thereon computer instructions which, when executed by a processor, implement the data mining method of any of the foregoing embodiments.

As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable non-transitory storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims

1. A method of data mining, comprising:

inputting the content data of the target into a content convolution neural network CNN, and acquiring the content characteristic vector of the target: the content CNN comprises a first content CNN and a second content CNN, and text description of a target is coded into a two-dimensional matrix by using a text depth representation model; inputting the two-dimensional matrix into first content CNN to obtain a text characteristic vector of a target; coding the target picture into a multidimensional matrix according to the pixel value of the target picture; inputting the multidimensional matrix into second content CNN to obtain a picture characteristic vector of a target; taking the text feature vector and the picture feature vector as content feature vectors of a target;

inputting historical data of the target into a statistic CNN to obtain a statistic characteristic vector of the target;

and cascading the content characteristic vector of the target and the statistical characteristic vector of the target, and inputting the concatenated content characteristic vector and the statistical characteristic vector of the target into a deep neural network DNN to obtain the estimated data of the target.

2. The method of claim 1, wherein the inputting the historical data of the target into the statistical CNN, and the obtaining the statistical feature vector of the target comprises:

cascading historical data of a target in different time periods into a time domain statistical sequence;

and inputting the time domain statistical sequence into the statistical CNN to obtain a time domain statistical characteristic vector of a target, and taking the time domain statistical characteristic vector as the statistical characteristic vector of the target.

3. The method of claim 1, wherein the inputting the historical data of the target into the statistical CNN, and the obtaining the statistical feature vector of the target comprises:

discretizing a statistical region of the target into a plurality of sub-regions;

encoding historical data of a target in different sub-regions in the same time period into a region statistical matrix;

and inputting the region statistical matrix into the statistical CNN to obtain a region statistical characteristic vector of the target, and taking the region statistical characteristic vector as the statistical characteristic vector of the target.

4. The method of claim 1, wherein the statistical CNN comprises a first statistical CNN and a second statistical CNN;

the inputting the historical data of the target into the statistical CNN and the obtaining the statistical feature vector of the target comprise:

cascading the historical sales of the target in different time periods into a time domain statistical sequence;

inputting the time domain statistical sequence into a first statistical CNN to obtain a time domain statistical characteristic vector of a target;

inputting the region statistical matrix into a second statistical CNN to obtain a region statistical characteristic vector of the target;

and taking the time domain statistical feature vector and the region statistical feature vector as the statistical feature vector of the target.

5. The method of any one of claims 1 to 4, wherein the content CNN is composed of a convolutional layer, a pooling layer, a fully-connected layer, and an output layer.

6. The method of any of claims 1 to 4, wherein the statistical CNN consists of convolutional layers, pooling layers, and output layers.

7. The method of claim 1, in which the DNN is comprised of a fully-connected layer and an output layer, each neuron in the fully-connected layer performing a nonlinear activation function operation.

8. The method of claim 1, wherein the method further comprises:

calculating the mean squared error between the actual data and the estimated data of the target;

and propagating the mean squared error from an output layer to an input layer of a combined neural network consisting of the content CNN, the statistical CNN and the DNN by using a back propagation algorithm so as to iteratively update the parameters of each layer in the combined neural network.

9. The method of claim 1, wherein the method further comprises:

and training a combined neural network consisting of the content CNN, the statistical CNN and the DNN by using the content data of the sample and the historical data of the sample.

10. A data mining device, comprising:

the content characteristic vector acquisition module is configured to input the content data of the target into a content Convolutional Neural Network (CNN) and acquire the content characteristic vector of the target; the content CNN includes a first content CNN and a second content CNN; the content feature vector acquisition module is specifically configured to: encoding the text description of the target into a two-dimensional matrix by using a text depth representation model;

inputting the two-dimensional matrix into first content CNN to obtain a text characteristic vector of a target; coding the target picture into a multidimensional matrix according to the pixel value of the target picture; inputting the multidimensional matrix into second content CNN to obtain a picture characteristic vector of a target; taking the text feature vector and the picture feature vector as content feature vectors of a target;

the statistical feature vector acquisition module is configured to input the historical data of the target into the statistical CNN and acquire the statistical characteristic vector of the target;

a prediction data acquisition module configured to: and cascading the content characteristic vector of the target and the statistical characteristic vector of the target, and inputting the concatenated content characteristic vector and the statistical characteristic vector of the target into a deep neural network DNN to obtain the estimated data of the target.

11. The apparatus of claim 10, wherein the statistical feature vector acquisition module is configured to:

12. The apparatus of claim 10, wherein the statistical feature vector acquisition module is configured to:

13. The apparatus of claim 10, wherein the statistical CNN comprises a first statistical CNN and a second statistical CNN;

the statistical feature vector acquisition module is configured to:

14. The apparatus of any of claims 10 to 13, wherein the content CNN is comprised of a convolutional layer, a pooling layer, a fully-connected layer, and an output layer.

15. The apparatus of any one of claims 10-13, wherein the statistical CNN is comprised of a convolutional layer, a pooling layer, and an output layer.

16. The apparatus of claim 10, in which the DNN is comprised of a fully-connected layer and an output layer, each neuron in the fully-connected layer performing a nonlinear activation function operation.

17. The apparatus of claim 10, wherein the apparatus further comprises a parameter update module configured to:

18. The apparatus of claim 10, further comprising a training module configured to train a combined neural network of the content CNN, the statistical CNN, and the DNN using content data of samples and historical data of samples.

19. A data mining device, comprising:

a memory; and

a processor coupled to the memory, the processor configured to perform the data mining method of any of claims 1 to 9 based on instructions stored in the memory.

20. A computer-readable storage medium storing computer instructions which, when executed by a processor, implement a data mining method as claimed in any one of claims 1 to 9.