CN110807646A

CN110807646A - Data analysis method, device and computer readable storage medium

Info

Publication number: CN110807646A
Application number: CN201810881786.0A
Authority: CN
Inventors: 花志祥; 陈珊珊; 周默
Original assignee: Beijing Jingdong Century Trading Co Ltd; Beijing Jingdong Shangke Information Technology Co Ltd
Current assignee: Beijing Jingdong Century Trading Co Ltd; Beijing Jingdong Shangke Information Technology Co Ltd
Priority date: 2018-08-06
Filing date: 2018-08-06
Publication date: 2020-02-18

Abstract

The invention discloses a data analysis method, a data analysis device and a computer-readable storage medium, and relates to the field of data processing. The data analysis method comprises the following steps: sorting the categories involved in the operations of the same user by operation time to generate an operation sequence, wherein the generated operation sequence comprises an initialized vector corresponding to each category; training the word vector generation model word2vec with the operation sequence to obtain an updated vector for each category; and determining associated categories according to the distances between the vectors of different categories. The method of the embodiment treats a plurality of categories involved in user operations in a certain order as one sentence. By borrowing the idea from natural language processing of training word vectors according to the context of words, the vectors of categories can be trained according to the order in which users operate on the categories. Therefore, the accuracy of the category correlation analysis can be improved.

Description

Data analysis method, device and computer readable storage medium

Technical Field

The present invention relates to the field of data processing, and in particular, to a data analysis method, apparatus, and computer-readable storage medium.

Background

In the field of e-commerce, based on massive behavior data of users on an e-commerce platform, the relationship between commodities in categories and different categories can be explored, and further, data rules existing among the commodities and the categories can be obtained. Therefore, the method is beneficial to analyzing and calculating the similar commodities, the competitive products and the substitutes by combining specific business and scene requirements.

Disclosure of Invention

The inventor analyzes the related technology and finds that, at present, when the category correlation analysis is performed, the category correlation degree analysis is mainly performed based on which categories the user browses or purchases in a period of time. After further research, the inventor finds that the related technology does not pay attention to the sequence of browsing categories and purchasing commodities of the user, so that the accuracy of the analysis result of the related technology is not high.

The technical problem to be solved by the embodiments of the invention is how to improve the accuracy of category correlation analysis.

According to a first aspect of some embodiments of the present invention, there is provided a data analysis method comprising: sorting the categories involved in the operations of the same user by operation time to generate an operation sequence, wherein the generated operation sequence comprises an initialized vector corresponding to each category; training the word vector generation model word2vec with the operation sequence to obtain an updated vector for each category; and determining associated categories according to the distances between the vectors of different categories.

In some embodiments, training the word vector generation model word2vec with the operation sequence to obtain the updated vector for each category includes: obtaining a preset number of adjacent category vectors in the operation sequence as a training subsequence, inputting the training subsequence into the word2vec model, taking one category vector in the training subsequence as the expected output category vector, and taking the other category vectors as input category vectors; calculating the sum of the input category vectors as a projection vector; determining a conditional probability according to the product of the projection vector and a parameter vector, wherein the conditional probability is the probability of obtaining the expected output category vector when the input category vectors are used as context; and updating the parameter vector and the input category vectors, with maximizing an objective function based on the conditional probability as the training target.

In some embodiments, the data analysis method further comprises: constructing a tree structure with the categories as leaf nodes according to the ratio of the operation times of each category to the sum of the operation times of all the categories, wherein the distance from the leaf node to the root node of each category is in a negative correlation relation with the ratio corresponding to each category; the conditional probability is the product of the probabilities of all nodes on the path from the root node to the leaf node corresponding to the expected output category vector, the probability of each node on the path is determined by a logistic regression function, and the parameter of the logistic regression function is the product of the projection vector and the parameter vector.

In some embodiments, the data analysis method further comprises: acquiring an operation record of each user, wherein the operation record comprises a user identifier, operation time and a category related to the operation; and screening out the operation records of the users with the browsed categories less than the preset value so as to sort the categories related to the operation of the same user according to the operation time after screening.

In some embodiments, the data analysis method further comprises: acquiring an associated category of which the association degree with the category concerned by the user to be recommended is higher than a preset value; and recommending the associated categories to the user to be recommended.

In some embodiments, the user to be recommended is not concerned with the associated categories before recommending the associated categories to the user to be recommended.

In some embodiments, the operation is a browse operation, a collect (add-to-favorites) operation, or an add-to-cart operation.

According to a second aspect of the embodiments of the present invention, there is provided a data analysis apparatus including: an operation sequence generation module configured to sort the categories involved in the operations of the same user by operation time and generate an operation sequence, wherein the generated operation sequence comprises an initialized vector corresponding to each category; a category vector updating module configured to train the word vector generation model word2vec with the operation sequence and obtain an updated vector for each category; and an associated category determining module configured to determine associated categories according to the distances between the vectors of different categories.

In some embodiments, the category vector updating module is further configured to: obtain a preset number of adjacent category vectors in the operation sequence as a training subsequence, input the training subsequence into the word2vec model, take one category vector in the training subsequence as the expected output category vector, and take the other category vectors as input category vectors; calculate the sum of the input category vectors as a projection vector; determine a conditional probability according to the product of the projection vector and a parameter vector, wherein the conditional probability is the probability of obtaining the expected output category vector when the input category vectors are used as context; and update the parameter vector and the input category vectors, with maximizing an objective function based on the conditional probability as the training target.

In some embodiments, the data analysis device further comprises: the tree structure building module is configured to build a tree structure with the categories as leaf nodes according to the proportion of the operation times of each category in the sum of the operation times of all the categories, wherein the distance from the leaf node to the root node of each category is in a negative correlation relation with the proportion corresponding to the category; the conditional probability is the product of the probabilities of all nodes on the path from the root node to the leaf node corresponding to the expected output category vector, the probability of each node on the path is determined by a logistic regression function, and the parameter of the logistic regression function is the product of the projection vector and the parameter vector.

In some embodiments, the data analysis device further comprises: the data screening module is configured to obtain an operation record of each user, wherein the operation record comprises a user identifier, operation time and a category related to the operation; and screening out the operation records of the users with the browsed categories less than the preset value so as to sort the categories related to the operation of the same user according to the operation time after screening.

In some embodiments, the data analysis device further comprises: the category recommendation module is configured to acquire an associated category of which the association degree with the category concerned by the user to be recommended is higher than a preset value; and recommending the associated categories to the user to be recommended.

According to a third aspect of some embodiments of the present invention, there is provided a data analysis apparatus comprising: a memory; and a processor coupled to the memory, the processor configured to perform any of the foregoing data analysis methods based on instructions stored in the memory.

According to a fourth aspect of some embodiments of the present invention, there is provided a computer-readable storage medium having a computer program stored thereon, wherein the program, when executed by a processor, implements any one of the data analysis methods described above.

Some embodiments of the above invention have the following advantages or benefits: the method of the embodiment of the invention treats a plurality of categories related to user operation with a certain sequence as a sentence for processing. By using the thought of training word vectors according to the context of words in natural language processing, the vectors of categories can be trained according to the operation sequence of users on the categories. Therefore, the accuracy of the category correlation analysis can be improved.

Other features of the present invention and advantages thereof will become apparent from the following detailed description of exemplary embodiments thereof, which proceeds with reference to the accompanying drawings.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.

FIG. 1 is a schematic flow diagram of a data analysis method according to some embodiments of the invention.

Fig. 2 is a flow diagram illustrating a method for determining a class vector according to some embodiments of the invention.

FIG. 3 is a flow diagram of a tree structure construction method according to some embodiments of the invention.

FIG. 4 is a flow diagram illustrating a data cleansing method according to some embodiments of the invention.

FIG. 5 is a flow diagram of an item recommendation method according to some embodiments of the invention.

FIG. 6 is a schematic diagram of a data analysis device according to some embodiments of the invention.

Fig. 7 is a schematic structural diagram of a data analysis apparatus according to another embodiment of the present invention.

FIG. 8 is a schematic diagram of a data analysis device according to further embodiments of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. The following description of at least one exemplary embodiment is merely illustrative in nature and is in no way intended to limit the invention, its application, or uses. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

The relative arrangement of the components and steps, the numerical expressions and numerical values set forth in these embodiments do not limit the scope of the present invention unless specifically stated otherwise.

Meanwhile, it should be understood that the sizes of the respective portions shown in the drawings are not drawn in an actual proportional relationship for the convenience of description.

Techniques, methods, and apparatus known to those of ordinary skill in the relevant art may not be discussed in detail but are intended to be part of the specification where appropriate.

In all examples shown and discussed herein, any particular value should be construed as merely illustrative, and not limiting. Thus, other examples of the exemplary embodiments may have different values.

It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, further discussion thereof is not required in subsequent figures.

FIG. 1 is a schematic flow diagram of a data analysis method according to some embodiments of the invention. As shown in fig. 1, the data analysis method of this embodiment includes steps S102 to S106.

In step S102, the categories related to the operation of the same user are sorted by operation time, and an operation sequence is generated, where the generated operation sequence includes initialized vectors corresponding to each category.

The user's operations may include, for example, browsing, collecting, adding to a shopping cart, and so forth. Because some users repeatedly compare articles belonging to different categories while shopping, the operation sequence formed by these operations reflects the user's thinking well, and thereby reflects the relevance among the categories involved in the operations. An example of the sorted categories is "beer-white spirit-red wine-beer-carbonated beverage-cocktail-beer"; converting these categories into vectors yields the operation sequence.

In some embodiments, the initialized vector of each category may be determined in advance by means of one-hot encoding. The specific vector values corresponding to each category may be randomly generated.
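Step S102 can be illustrated with a minimal sketch. The function names, the `(timestamp, category)` record layout, the vector dimension and the random initialization range are all assumptions made for this example and are not specified in the text.

```python
import random


def init_category_vectors(categories, dim=4, seed=0):
    """Assign each distinct category a randomly initialized vector (assumed range)."""
    rng = random.Random(seed)
    return {
        cat: [rng.uniform(-0.5, 0.5) for _ in range(dim)]
        for cat in sorted(set(categories))
    }


def operation_sequence(operations, vectors):
    """Sort one user's (timestamp, category) operations by time.

    Returns the ordered category names and the matching sequence of vectors.
    """
    ordered = [cat for _, cat in sorted(operations, key=lambda op: op[0])]
    return ordered, [vectors[cat] for cat in ordered]


# Hypothetical operations of a single user, deliberately out of order.
ops = [(3, "red wine"), (1, "beer"), (2, "white spirit"), (4, "beer")]
vectors = init_category_vectors(cat for _, cat in ops)
ordered, sequence = operation_sequence(ops, vectors)
```

Repeated categories share one vector, so training later updates a single vector per category no matter how often it occurs in the sequence.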

In step S104, the word vector generation model word2vec is trained using the operation sequence, and an updated vector is obtained for each category.

Word2vec is implemented with a neural network and determines the word vectors of the words in a text according to their context. Word2vec is generally used to process a sentence or passage, such as a sentence, a news article, a novel, a product description, and so forth.

The inventors have found that the words in a text passage or sentence appear in a certain order. Although the categories operated by the user are not located in the sentence, the categories also have the sequence due to the influence of the operation sequence. Therefore, the sequentially arranged categories can be regarded as a sentence or a text paragraph, and the corresponding vectors are input into the word2vec model for training, so that the vectors can be optimized.

Word2vec includes the Continuous Bag-of-Words (CBOW) model and the Skip-Gram model. In the CBOW model, the context of a word w is known and the word w needs to be predicted; the word w is then a positive sample, and the other words in the corpus are negative samples. In Skip-Gram, the word w is known and the context of the word w needs to be predicted.

In some embodiments, the vectors of the categories may be updated with the CBOW model in word2vec. The label of each training sample is one of a predetermined number of adjacent category vectors in the operation sequence, and the other category vectors serve as input category vectors. For example, for "beer-white spirit-red wine-beer-carbonated beverage" browsed consecutively by a user, the vectors of "beer", "white spirit", "beer" and "carbonated beverage" may be used as the input, and the vector of "red wine" may be used as the expected output value, i.e., the label of the input data. This means that a user who browses "beer", "white spirit", "beer" and "carbonated beverage" may also browse "red wine".

In step S106, the associated categories are determined according to the distance between the vectors of the different categories. In some embodiments, different categories for which the distance between vectors is less than a preset value may be determined as associated categories.

A vector comprises a plurality of dimensions, each representing a feature, although the meaning of each dimension is abstract and is not explicitly defined by a human. The features are learned by the model through continued training. Therefore, if the vectors of two categories are close to each other, the values of the two categories are close in each dimension, so the two categories have a higher relevance.
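Step S106 can be sketched as a pairwise distance check. Euclidean distance, the function name `associated_pairs` and the toy vectors are assumptions for illustration; the text only requires that the distance between two vectors be below a preset value.

```python
import math
from itertools import combinations


def associated_pairs(vectors, max_distance):
    """Return category pairs whose vector distance is below max_distance."""
    pairs = []
    for a, b in combinations(sorted(vectors), 2):
        # math.dist computes the Euclidean distance (Python 3.8+).
        if math.dist(vectors[a], vectors[b]) < max_distance:
            pairs.append((a, b))
    return pairs


# Hypothetical trained vectors for three categories.
toy_vectors = {
    "beer": [0.0, 0.0],
    "cocktail": [0.1, 0.0],
    "diapers": [3.0, 4.0],
}
close = associated_pairs(toy_vectors, max_distance=0.5)
```

Comparing every pair is quadratic in the number of categories, which is usually acceptable because a category catalog is far smaller than a word vocabulary.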

The method of the above embodiment treats a plurality of categories related to user operations in a certain order as one sentence. By using the thought of training word vectors according to the context of words in natural language processing, the vectors of categories can be trained according to the operation sequence of users on the categories. Therefore, the accuracy of the category correlation analysis can be improved.

An embodiment of the invention for determining a class vector by word2vec is described below with reference to fig. 2.

Fig. 2 is a flow diagram illustrating a method for determining a class vector according to some embodiments of the invention. As shown in fig. 2, the class vector determination method of this embodiment includes steps S202 to S208.

In step S202, a preset number of adjacent category vectors in the operation sequence are obtained as a training subsequence and input into the word2vec model; one category vector in the training subsequence is used as the expected output category vector, and the other category vectors are used as input category vectors. These data are the training data of word2vec, and the expected output category vector is the label of the training subsequence.

In some embodiments, a window parameter of word2vec may be set, the value of the window parameter representing a preset number of retrieved neighboring item class vectors.

In some embodiments, the expected output category vector is the vector at the center of the acquired category vectors. For example, for the training subsequence "A-B-C-D-E-F-G-H-I-J-K", when the window parameter value is 5, the centrally located category F may be taken as the expected output category vector, and the five categories A, B, C, D, E before F and the five categories G, H, I, J, K after F may be taken as the input category vectors.
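The window-based selection of training data can be sketched as follows. The function name `cbow_samples` is hypothetical, and truncating the context at the two ends of the sequence is an assumption (the text does not say how boundaries are handled).

```python
def cbow_samples(sequence, window=5):
    """Build (context, target) pairs: up to `window` neighbors on each side."""
    samples = []
    for i, target in enumerate(sequence):
        left = sequence[max(0, i - window):i]
        right = sequence[i + 1:i + 1 + window]
        context = left + right
        if context:  # skip sequences that consist of a single operation
            samples.append((context, target))
    return samples


samples = cbow_samples(list("ABCDEFGHIJK"), window=5)
context_f, target_f = samples[5]
```

For the subsequence in the example above, the sample centered on F has A to E and G to K as its context, matching the description.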

In step S204, the sum of the input class vectors is calculated as a projection vector.

In step S206, a conditional probability is determined according to the product of the projection vector and the parameter vector, wherein the conditional probability is the probability of obtaining the expected output category vector when the input category vectors are used as the context. The parameter vector is a parameter of the hidden layer.

In step S208, the parameter vector and the input class vector are updated so that the objective function based on the conditional probability is maximized as a training target.

The projection vector carries information about the categories surrounding the category to be predicted. After the projection vector is multiplied by the parameter vector to perform a mapping transformation, the output probability of the node where the expected output category vector is located is, ideally, the maximum. Thus, the parameter vector and the input category vectors that make up the projection vector can be updated such that the conditional probability is maximized. The solving process may use, for example, stochastic gradient ascent, which is not described in detail herein.
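Steps S204 to S208 can be written compactly as follows. The notation is an assumption introduced here for illustration: x_c is the vector of input category c, t is the expected output category, Context(t) is the set of input categories in the training subsequence, θ stands for the parameter vectors, and η is a learning rate. The use of a log-likelihood as the objective function is likewise an assumption, following common word2vec practice; the text only requires an objective function based on the conditional probability.

```latex
\begin{aligned}
v &= \sum_{c \in \mathrm{Context}(t)} x_c \\
\mathcal{L} &= \sum_{(\mathrm{Context}(t),\, t)} \log p\bigl(t \mid \mathrm{Context}(t)\bigr) \\
\theta &\leftarrow \theta + \eta \, \frac{\partial \mathcal{L}}{\partial \theta},
\qquad
x_c \leftarrow x_c + \eta \, \frac{\partial \mathcal{L}}{\partial v}
\quad \text{for each } c \in \mathrm{Context}(t)
\end{aligned}
```

Because v is a plain sum of the input vectors, the gradient with respect to v is applied unchanged to every input category vector.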

In some embodiments, the hidden layer may be implemented by a tree structure. An embodiment of the tree structure construction method of the present invention is described below with reference to fig. 3.

FIG. 3 is a flow diagram of a tree structure construction method according to some embodiments of the invention. As shown in fig. 3, the tree structure construction method of this embodiment includes steps S302 to S304.

In step S302, the number of operations for each class in the training set is counted. The training set may be a set of operation sequences obtained from pre-collected historical operation data of one or more users.

In step S304, a tree structure with the class as a leaf node is constructed according to the ratio of the operation times of each class to the sum of the operation times of all classes, wherein the distance from the leaf node to the root node of the class is in a negative correlation with the ratio corresponding to the class.

In some embodiments, the tree structure may be implemented using a Huffman tree. First, each category is regarded as a leaf node, where the probability of the leaf node is the ratio of the number of operations on that category to the total number of operations on all categories. Then a common parent node is added for the two nodes that have no parent node and have the smallest probabilities, where the probability of the parent node is the sum of the probabilities of its child nodes, and so on until a root node is obtained. As a result, categories with high probability are closer to the root node and have shorter codes, so that search efficiency and training efficiency can be improved.

The conditional probability may be calculated by means of the tree structure. In some embodiments, the conditional probability is the product of the probabilities of the nodes on the path from the root node to the leaf node corresponding to the expected output category vector; the probability of passing through each node on the path is determined by a logistic regression function, and the argument of the logistic regression function is the product of the projection vector and the parameter vector.
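The Huffman construction described above can be sketched with a priority queue. The function name `huffman_codes`, the use of "0" for the left branch and "1" for the right branch, and the sample counts are assumptions for illustration.

```python
import heapq
from itertools import count


def huffman_codes(operation_counts):
    """Map each category to its Huffman code, built from operation counts."""
    if not operation_counts:
        return {}
    total = sum(operation_counts.values())
    tiebreak = count()  # unique counter so heap entries never compare nodes
    heap = [(n / total, next(tiebreak), cat) for cat, n in operation_counts.items()]
    heapq.heapify(heap)
    if len(heap) == 1:
        return {heap[0][2]: "0"}
    while len(heap) > 1:
        # Merge the two parentless nodes with the smallest probabilities.
        p_left, _, left = heapq.heappop(heap)
        p_right, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (p_left + p_right, next(tiebreak), (left, right)))
    codes = {}
    stack = [(heap[0][2], "")]
    while stack:
        node, code = stack.pop()
        if isinstance(node, tuple):  # internal node: (left child, right child)
            stack.append((node[0], code + "0"))
            stack.append((node[1], code + "1"))
        else:  # leaf node: a category name
            codes[node] = code
    return codes


codes = huffman_codes({"beer": 5, "red wine": 2, "white spirit": 1, "cocktail": 1})
```

In this sketch the code length equals the depth of the leaf, so the most frequently operated category ("beer" here) sits directly below the root.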

For example, the probability of passing through each node on the path can be calculated with reference to formula (1) (the formula itself is not reproduced in this text). In formula (1), n denotes a node that is passed through, v denotes the projection vector, and the remaining symbol denotes the parameter vector. n = 0 indicates that node n belongs to the left subtree of its parent node, and n = 1 indicates that node n belongs to the right subtree of its parent node.
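A minimal sketch of the path probability and of one stochastic gradient ascent step is given below. Since formula (1) is not available here, the branch convention is an assumption taken from common word2vec practice: a node with code 0 contributes σ(v·θ) and a node with code 1 contributes 1 − σ(v·θ). All function names and the learning rate are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def project(input_vectors):
    """Projection vector: element-wise sum of the input category vectors."""
    return [sum(column) for column in zip(*input_vectors)]


def path_probability(projection, path):
    """Product of node probabilities; path is a list of (theta, code) pairs."""
    p = 1.0
    for theta, code in path:
        s = sigmoid(dot(projection, theta))
        p *= s if code == 0 else 1.0 - s  # assumed convention, see lead-in
    return p


def ascent_step(input_vectors, path, lr=0.1):
    """One gradient ascent step on log p(target | context), updating in place."""
    projection = project(input_vectors)
    grad_v = [0.0] * len(projection)
    for theta, code in path:
        g = lr * (1 - code - sigmoid(dot(projection, theta)))
        for i in range(len(theta)):
            grad_v[i] += g * theta[i]       # uses theta before it is updated
            theta[i] += g * projection[i]
    for vec in input_vectors:
        for i in range(len(vec)):
            vec[i] += grad_v[i]


inputs = [[0.1, -0.2], [0.3, 0.4]]
path = [([0.0, 0.0], 0), ([0.0, 0.0], 1)]  # hypothetical root-to-leaf path
before = path_probability(project(inputs), path)
ascent_step(inputs, path)
after = path_probability(project(inputs), path)
```

With all parameter vectors initialized to zero, each node starts at probability 0.5, and a single step already raises the probability of reaching the expected leaf.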

In some embodiments, the user's operational data needs to be cleaned before updating the vectors for the categories, making the correlation analysis more accurate. An embodiment of the data cleansing method of the present invention is described below with reference to fig. 4.

FIG. 4 is a flow diagram illustrating a data cleansing method according to some embodiments of the invention. As shown in fig. 4, the data cleansing method of this embodiment includes steps S402 to S404.

In step S402, an operation record of each user is obtained, where the operation record includes a user identifier, an operation time, and a category to which the operation relates.

In step S404, the operation records of the users whose browsed categories are less than the preset value are screened out, so that the categories related to the operation of the same user are sorted according to the operation time according to the screened operation records.

For example, browsing detail data of millions of users may be randomly sampled from the browsing records of a plurality of different dates. The browsing detail data may include user identifiers, browsed category identifiers, browsing timestamps, and the like, and the granularity of the browsing timestamps may be, for example, seconds. The obtained browsing detail data are then grouped by user identifier, or by user identifier and operation date, and the browsed categories in each group are sorted by browsing timestamp. Groups in which the number of browsed categories is smaller than the preset value are removed; that is, such records are treated by default as invalid browsing and do not participate in the training process.
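The grouping, sorting and screening described above can be sketched as follows. The record layout `(user_id, timestamp, category)`, the function name `clean_and_group` and counting distinct categories per group are assumptions; the text does not state whether repeated categories count more than once.

```python
from collections import defaultdict


def clean_and_group(records, min_categories=2):
    """Group records by user, sort by time, drop groups with too few categories."""
    groups = defaultdict(list)
    for user_id, timestamp, category in records:
        groups[user_id].append((timestamp, category))
    sequences = {}
    for user_id, ops in groups.items():
        ops.sort(key=lambda op: op[0])
        categories = [cat for _, cat in ops]
        # Assumption: the preset value is compared with distinct categories.
        if len(set(categories)) >= min_categories:
            sequences[user_id] = categories
    return sequences


# Hypothetical browsing records with second-granularity timestamps.
records = [
    ("u1", 1533520802, "white spirit"),
    ("u1", 1533520800, "beer"),
    ("u1", 1533520805, "red wine"),
    ("u2", 1533520900, "beer"),
    ("u2", 1533520950, "beer"),
]
sequences = clean_and_group(records, min_categories=2)
```

To group by user and date instead, the dictionary key could be a `(user_id, date)` tuple derived from the timestamp.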

Because the vector of the categories is determined by the user through multiple operations on different categories, the method of the embodiment can screen out the data which can not well represent the relevance among the categories, so that the relevance analysis of the categories is more accurate.

After the association relationship of the categories is obtained, recommendation can be performed according to the association relationship. An embodiment of the item recommendation method of the present invention is described below with reference to fig. 5.

FIG. 5 is a flow diagram of an item recommendation method according to some embodiments of the invention. As shown in fig. 5, the item recommendation method of this embodiment includes steps S502 to S504.

In step S502, an associated category of which the degree of association with the category concerned by the user to be recommended is higher than a preset value is acquired.

The categories concerned by the user are categories that, judging from the user's historical behavior, the user may prefer or need, for example, categories that the user has purchased more than a preset number of times, collected more than a preset number of times, or whose promotion activities the user has clicked more than a preset number of times. Through the category association analysis of the foregoing embodiments, other categories associated with the categories concerned by the user, that is, categories in which the user is likely to have a potential interest, can be mined. These associated categories can thus be recommended to the user.

In step S504, the associated category is recommended to the user to be recommended.

In some embodiments, the user to be recommended is not concerned with the associated categories before recommending the associated categories to the user to be recommended. That is, the associated item recommended to the user is not originally the item of interest to the user. Therefore, through the analysis process of the invention, which users are new users of categories and users with purchase potential can be determined, so that the item recommendation efficiency can be improved.
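Steps S502 to S504, together with the restriction to categories the user does not already follow, can be sketched as follows. Using cosine similarity as the degree of association is an assumption (the text derives association from vector distance without naming a measure), and `recommend` and the toy vectors are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two vectors; 0.0 if either vector is all zeros."""
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / norm


def recommend(followed, vectors, min_association=0.8):
    """Categories associated with a followed category that are not yet followed."""
    recommended = set()
    for source in followed:
        for category, vector in vectors.items():
            if category in followed:
                continue  # only recommend categories new to this user
            if cosine(vectors[source], vector) > min_association:
                recommended.add(category)
    return sorted(recommended)


# Hypothetical trained category vectors.
trained = {
    "beer": [1.0, 0.0],
    "cocktail": [0.9, 0.1],
    "diapers": [0.0, 1.0],
}
suggestions = recommend({"beer"}, trained, min_association=0.8)
```

A category that the user already follows is never returned, even when it is strongly associated with another followed category.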

An embodiment of the data analysis apparatus of the present invention is described below with reference to fig. 6.

FIG. 6 is a schematic diagram of a data analysis device according to some embodiments of the invention. As shown in fig. 6, the data analysis device 60 of this embodiment includes: an operation sequence generating module 610 configured to sort the categories involved in the operations of the same user by operation time and generate an operation sequence, where the generated operation sequence includes an initialized vector corresponding to each category; a category vector updating module 620 configured to train the word vector generation model word2vec with the operation sequence to obtain an updated vector for each category; and an associated category determining module 630 configured to determine associated categories according to the distances between the vectors of different categories.

In some embodiments, the category vector updating module 620 is further configured to: obtain a preset number of adjacent category vectors in the operation sequence as a training subsequence, input the training subsequence into the word2vec model, take one category vector in the training subsequence as the expected output category vector, and take the other category vectors as input category vectors; calculate the sum of the input category vectors as a projection vector; determine a conditional probability according to the product of the projection vector and a parameter vector, wherein the conditional probability is the probability of obtaining the expected output category vector when the input category vectors are used as context; and update the parameter vector and the input category vectors, with maximizing an objective function based on the conditional probability as the training target.

In some embodiments, the data analysis device 60 further comprises: the tree structure building module 640 is configured to build a tree structure with the categories as leaf nodes according to the proportion of the operation times of each category in the sum of the operation times of all the categories, wherein the distance from the leaf node to the root node of each category is in a negative correlation relation with the proportion corresponding to each category; the conditional probability is the product of the probabilities of all nodes on the path from the root node to the leaf node corresponding to the expected output category vector, the probability of each node on the path is determined by a logistic regression function, and the parameter of the logistic regression function is the product of the projection vector and the parameter vector.

In some embodiments, the data analysis device 60 further comprises: the data screening module 650 is configured to obtain an operation record of each user, where the operation record includes a user identifier, operation time, and a category to which the operation relates; and screening out the operation records of the users with the browsed categories less than the preset value so as to sort the categories related to the operation of the same user according to the operation time after screening.

In some embodiments, the data analysis device 60 further comprises: the category recommending module 660 is configured to acquire the associated categories of which the degree of association with the categories concerned by the user to be recommended is higher than a preset value; and recommending the associated categories to the user to be recommended.

Fig. 7 is a schematic structural diagram of a data analysis apparatus according to another embodiment of the present invention. As shown in fig. 7, the data analysis device 70 of this embodiment includes: a memory 710 and a processor 720 coupled to the memory 710, the processor 720 being configured to perform the data analysis method of any of the foregoing embodiments based on instructions stored in the memory 710.

Memory 710 may include, for example, system memory, fixed non-volatile storage media, and the like. The system memory stores, for example, an operating system, application programs, a boot loader, and other programs.

FIG. 8 is a schematic diagram of a data analysis device according to further embodiments of the present invention. As shown in FIG. 8, the data analysis device 80 of this embodiment includes a memory 810 and a processor 820, and may further include an input/output interface 830, a network interface 840, a storage interface 850, and the like. These interfaces 830, 840, 850, the memory 810, and the processor 820 may be connected, for example, by a bus 860. The input/output interface 830 provides a connection interface for input/output devices such as a display, a mouse, a keyboard, and a touch screen. The network interface 840 provides a connection interface for various networking devices. The storage interface 850 provides a connection interface for external storage devices such as an SD card and a USB flash drive.

An embodiment of the present invention further provides a computer-readable storage medium on which a computer program is stored, wherein the computer program is configured to implement any one of the foregoing data analysis methods when executed by a processor.

As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable non-transitory storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims

1. A method of data analysis, comprising:

sorting the categories related to the operation of the same user according to operation time to generate an operation sequence, wherein the generated operation sequence comprises initialized vectors corresponding to each category;

training the word vector generation model word2vec using the operation sequence to obtain an updated vector of each category;

and determining the associated categories according to the distances between the vectors of different categories.

2. The data analysis method of claim 1, wherein the training of the word vector generation model word2vec using the operation sequence to obtain updated vectors for each category comprises:

obtaining a preset number of adjacent category vectors in an operation sequence as a training subsequence, inputting the training subsequence into the word2vec model, taking one category vector in the training subsequence as an expected output category vector, and taking the other category vectors as input category vectors;

calculating the sum of the input category vectors as a projection vector;

determining a conditional probability according to the product of the projection vector and the parameter vector, wherein the conditional probability is the conditional probability of obtaining the expected output category vector when the input category vectors are used as a context;

and updating the parameter vector and the input category vectors by taking an objective function based on the conditional probability as a training target.

3. The data analysis method of claim 2, further comprising:

constructing a tree structure with the categories as leaf nodes according to the ratio of each category's number of operations to the total number of operations on all categories, wherein the distance from each category's leaf node to the root node is negatively correlated with the ratio corresponding to that category;

the conditional probability is the product of the probabilities of all nodes on a path from the root node to the leaf node corresponding to the expected output category vector, the probability of each node on the path is determined through a logistic regression function, and the parameter of the logistic regression function is the product of the projection vector and the parameter vector.

4. A data analysis method as claimed in any one of claims 1 to 3, further comprising:

acquiring an operation record of each user, wherein the operation record comprises a user identifier, operation time and a category related to the operation;

and filtering out the operation records of users whose number of browsed categories is less than a preset value, so that, after filtering, the categories related to the operations of the same user are sorted according to operation time.

5. The data analysis method of claim 1, further comprising:

acquiring an associated category whose degree of association with a category that the user to be recommended has paid attention to is higher than a preset value;

and recommending the associated categories to the user to be recommended.

6. The data analysis method according to claim 5, wherein the user to be recommended has not paid attention to the associated category before the associated category is recommended to the user to be recommended.

7. The data analysis method of claim 1, wherein the operation is a browsing operation, a favoriting operation, or an add-to-shopping-cart operation.

8. A data analysis apparatus comprising:

the operation sequence generation module is configured to sort the categories related to the operation of the same user according to operation time and generate an operation sequence, wherein the generated operation sequence comprises initialized vectors corresponding to each category;

the category vector updating module is configured to train the word vector generation model word2vec using the operation sequence and obtain an updated vector of each category;

and the associated category determining module is configured to determine the associated categories according to the distances between the vectors of the different categories.

9. The data analysis device of claim 8, wherein the category vector updating module is further configured to: obtain a preset number of adjacent category vectors in the operation sequence as a training subsequence, input the training subsequence into the word2vec model, and take one category vector in the training subsequence as an expected output category vector and the other category vectors as input category vectors; calculate the sum of the input category vectors as a projection vector; determine a conditional probability according to the product of the projection vector and the parameter vector, wherein the conditional probability is the conditional probability of obtaining the expected output category vector when the input category vectors are used as a context; and update the parameter vector and the input category vectors by taking an objective function based on the conditional probability as a training target.

10. The data analysis device of claim 9, further comprising:

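the tree structure building module is configured to build a tree structure with the categories as leaf nodes according to the proportion of each category's number of operations in the total number of operations on all categories, wherein the distance from each category's leaf node to the root node is negatively correlated with the proportion corresponding to that category;

the conditional probability is the product of the probabilities of all nodes on a path from the root node to the leaf node corresponding to the expected output category vector, the probability of each node on the path is determined through a logistic regression function, and the parameter of the logistic regression function is the product of the projection vector and the parameter vector.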

11. The data analysis device of any one of claims 8 to 10, further comprising:

the data screening module is configured to obtain an operation record of each user, wherein the operation record comprises a user identifier, an operation time and the category related to the operation; and to filter out the operation records of users whose number of browsed categories is less than a preset value, so that, after filtering, the categories related to the operations of the same user are sorted according to operation time.

12. The data analysis device of claim 8, further comprising:

the category recommendation module is configured to acquire an associated category whose degree of association with a category that the user to be recommended has paid attention to is higher than a preset value, and to recommend the associated category to the user to be recommended.

13. The data analysis device of claim 12, wherein the user to be recommended has not paid attention to the associated category before the associated category is recommended to the user to be recommended.

14. A data analysis apparatus comprising:

a memory; and

a processor coupled to the memory, the processor configured to perform the data analysis method of any of claims 1-7 based on instructions stored in the memory.

15. A computer-readable storage medium, on which a computer program is stored which, when executed by a processor, implements the data analysis method of any one of claims 1 to 7.