CN116881659A - Product classification method, device, equipment and medium - Google Patents

Info

Publication number
CN116881659A
Authority
CN
China
Prior art keywords
product
products
matrix
order
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310877542.6A
Other languages
Chinese (zh)
Inventor
高丰
张瑜
田士福
赵海龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Industrial and Commercial Bank of China Ltd ICBC
Original Assignee
Industrial and Commercial Bank of China Ltd ICBC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Industrial and Commercial Bank of China Ltd ICBC
Priority to CN202310877542.6A
Publication of CN116881659A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/29 Graphical models, e.g. Bayesian networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G06F18/2132 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods based on discrimination criteria, e.g. discriminant analysis
    • G06F18/21322 Rendering the within-class scatter matrix non-singular
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Accounting & Taxation (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Finance (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Economics (AREA)
  • Development Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Mathematical Physics (AREA)
  • Technology Law (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The disclosure provides a product classification method, device, equipment and medium, which can be applied to the technical field of finance. The method comprises: generating an undirected graph according to product attribute data associated with N products and product identifiers of the N products, wherein N is an integer greater than or equal to 2; inputting the adjacency matrix of each order corresponding to the undirected graph and a preset embedding dimension into a network embedding model, and outputting a proximity embedding vector of each order; performing dimension reduction on the proximity embedding vector of each order to obtain a target proximity embedding vector of each order; and classifying the N products based on the target proximity embedding vector of each order to obtain a product classification result.

Description

Product classification method, device, equipment and medium
Technical Field
The present disclosure relates to the field of financial technology, and in particular, to a method, apparatus, device, and medium for classifying products.
Background
With the development of science and technology and the improvement of living standards, financial products have become highly varied, and their risks, returns, and modes of operation differ widely. This makes unified management difficult, and product classification is the first problem that product management must solve.
In the course of implementing the concepts of the present disclosure, it was found that the related art classifies products according to product attributes. As the number of products grows and the attributes associated with them become increasingly complex, classification based on product attributes alone struggles to classify products accurately in complex cases.
Disclosure of Invention
In view of the foregoing, the present disclosure provides a product classification method, apparatus, device, medium, and program product.
According to a first aspect of the present disclosure, there is provided a product classification method comprising:
generating an undirected graph according to product attribute data associated with N products and product identifiers of the N products, wherein N is an integer greater than or equal to 2;
inputting the adjacency matrix of each order corresponding to the undirected graph and a preset embedding dimension into a network embedding model, and outputting a proximity embedding vector of each order;
performing dimension reduction on the proximity embedding vector of each order to obtain a target proximity embedding vector of each order; and
classifying the N products based on the target proximity embedding vector of each order to obtain a product classification result.
According to an embodiment of the present disclosure, classifying N products based on target proximity embedding vectors of each order, to obtain a product classification result, includes:
randomly selecting a target product from the N products;
determining, for each order, the embedding space distance between the target product and the other products based on the target proximity embedding vector of that order;
determining a sub-classification result corresponding to each order according to the embedding space distance; and
determining the product classification result according to the sub-classification results.
According to an embodiment of the present disclosure, determining a sub-classification result corresponding to each order according to an embedding space distance includes:
for each order:
grouping the target product and the other products whose embedding space distance satisfies the threshold into one class to obtain the sub-classification result.
According to an embodiment of the present disclosure, determining a product classification result from a sub-classification result includes:
weighting and combining the sub-classification results of each order according to the user's requirements for the products to obtain the product classification result.
According to an embodiment of the present disclosure, an undirected graph includes nodes and edges, and product attribute data includes product base attribute data and attribute data associated with a product owner;
generating an undirected graph according to product attribute data associated with the N products and product identifications of the N products, wherein the undirected graph comprises:
generating nodes according to the product identifiers of the N products;
determining relationships among product attributes according to the product attribute data associated with the N products; and
generating edges based on the relationships among the product attributes.
According to an embodiment of the present disclosure, inputting an adjacency matrix and a preset embedding dimension of each order corresponding to an undirected graph into a network embedding model, outputting a proximity embedding vector of each order, including:
inputting the adjacency matrix of each order corresponding to the undirected graph and a preset embedding dimension into a network embedding model so as to execute the following operations:
extracting eigenvalues and eigenvectors of the adjacency matrix;
performing matrix decomposition on the adjacency matrix to obtain a first matrix and a second matrix, wherein the first matrix is a diagonal matrix of the eigenvalues, and the second matrix is a matrix composed of the eigenvectors;
selecting, from the first matrix and the second matrix respectively, a first target matrix and a second target matrix that satisfy the preset embedding dimension;
determining the proximity embedding vector based on the first target matrix and the second target matrix; and
outputting the proximity embedding vector.
According to an embodiment of the present disclosure, performing dimension reduction on a proximity embedding vector of each order to obtain a target proximity embedding vector of each order, includes:
reducing the dimension of the proximity embedding vector of each order by principal component analysis to obtain the target proximity embedding vector of each order.
A second aspect of the present disclosure provides a product classification device comprising:
the generating module is used for generating an undirected graph according to the product attribute data associated with the N products and the product identifiers of the N products, wherein N is an integer greater than or equal to 2;
the first processing module is used for inputting the adjacency matrix of each order corresponding to the undirected graph and the preset embedding dimension into the network embedding model and outputting the proximity embedding vector of each order;
the second processing module is used for reducing the dimension of the proximity embedding vector of each order to obtain a target proximity embedding vector of each order; and
the classification module is used for classifying the N products based on the target proximity embedding vector of each order to obtain a product classification result.
A third aspect of the present disclosure provides an electronic device, comprising: one or more processors; and a memory for storing one or more programs, wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the product classification method described above.
A fourth aspect of the present disclosure also provides a computer-readable storage medium having stored thereon executable instructions that, when executed by a processor, cause the processor to perform the above-described product classification method.
A fifth aspect of the present disclosure also provides a computer program product comprising a computer program which, when executed by a processor, implements the above-described product classification method.
According to the embodiments of the present disclosure, adjacency matrices of arbitrary order are input into a network embedding model so that proximity embedding vectors of arbitrary order are preserved; dimension reduction is then performed so that each product is represented as a low-dimensional vector, and strongly associated nodes lie close to one another in the embedding space. Classifying products based on the target proximity embedding vector of each order allows flexible use under different product requirements; for example, coarse-grained classification may use higher-order proximity and fine-grained classification may use lower-order proximity. The underlying structure of the product network is well preserved, and the products can be classified accurately.
Drawings
The foregoing and other objects, features and advantages of the disclosure will be more apparent from the following description of embodiments of the disclosure with reference to the accompanying drawings, in which:
FIG. 1 schematically illustrates an application scenario diagram of a product classification method, apparatus, device, medium, and program product according to an embodiment of the disclosure;
FIG. 2 schematically illustrates a flow chart of a product classification method according to an embodiment of the disclosure;
FIG. 3 schematically illustrates a flow chart of a method for classifying N products based on target proximity embedding vectors for each order to obtain a product classification result, according to an embodiment of the present disclosure;
FIG. 4 schematically illustrates a method flow diagram for generating an undirected graph based on product attribute data associated with N products and product identifications of the N products, in accordance with an embodiment of the present disclosure;
FIG. 5 schematically illustrates a block diagram of a product classification device according to an embodiment of the disclosure; and
FIG. 6 schematically illustrates a block diagram of an electronic device adapted to implement a product classification method according to an embodiment of the disclosure.
Detailed Description
Hereinafter, embodiments of the present disclosure will be described with reference to the accompanying drawings. It should be understood that the description is only exemplary and is not intended to limit the scope of the present disclosure. In the following detailed description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the embodiments of the present disclosure. It may be evident, however, that one or more embodiments may be practiced without these specific details. In addition, in the following description, descriptions of well-known structures and techniques are omitted so as not to unnecessarily obscure the concepts of the present disclosure.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. The terms "comprises," "comprising," and/or the like, as used herein, specify the presence of stated features, steps, operations, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, or components.
All terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art unless otherwise defined. It should be noted that the terms used herein should be construed to have meanings consistent with the context of the present specification and should not be construed in an idealized or overly formal manner.
Where expressions such as "at least one of A, B, and C, etc." are used, they should generally be interpreted in accordance with the meaning commonly understood by those skilled in the art (e.g., "a system having at least one of A, B, and C" shall include, but not be limited to, systems having A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B, and C together, etc.).
In the technical solution of the present disclosure, the collection, storage, use, processing, transmission, provision, disclosure, and application of the data involved (including but not limited to personal information of users) all comply with relevant laws and regulations, necessary security measures are adopted, and public order and good customs are not violated.
In the technical scheme of the embodiment of the disclosure, the authorization or consent of the user is obtained before the personal information of the user is obtained or acquired.
In the course of implementing the concepts of the present disclosure, it was found that existing product classification builds a product profile from product attributes and classifies on that basis. As the number of products grows and their attributes become increasingly complex, classification based on product attributes struggles to classify products accurately in complex cases. For example, in a recommendation system, classifying by product attributes alone limits the calculation of similarity between items: two products that share no common attribute cannot be related to each other.
Embodiments of the present disclosure provide a product classification method comprising: generating an undirected graph according to product attribute data associated with N products and product identifiers of the N products, wherein N is an integer greater than or equal to 2; inputting the adjacency matrix of each order corresponding to the undirected graph and a preset embedding dimension into a network embedding model, and outputting a proximity embedding vector of each order; performing dimension reduction on the proximity embedding vector of each order to obtain a target proximity embedding vector of each order; and classifying the N products based on the target proximity embedding vector of each order to obtain a product classification result.
Fig. 1 schematically illustrates an application scenario diagram of a product classification method, apparatus, device, medium and program product according to an embodiment of the disclosure.
As shown in fig. 1, the application scenario 100 according to this embodiment may include a first terminal device 101, a second terminal device 102, a third terminal device 103, a network 104, and a server 105. The network 104 is a medium used to provide a communication link between the first terminal device 101, the second terminal device 102, the third terminal device 103, and the server 105. The network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.
The user may interact with the server 105 through the network 104 using at least one of the first terminal device 101, the second terminal device 102, the third terminal device 103, to receive or send messages, etc. Various communication client applications, such as a shopping class application, a web browser application, a search class application, an instant messaging tool, a mailbox client, social platform software, etc. (by way of example only) may be installed on the first terminal device 101, the second terminal device 102, and the third terminal device 103.
The first terminal device 101, the second terminal device 102, the third terminal device 103 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smartphones, tablets, laptop and desktop computers, and the like.
The server 105 may be a server providing various services, such as a background management server (by way of example only) providing support for websites browsed by the user using the first terminal device 101, the second terminal device 102, and the third terminal device 103. The background management server may analyze and process the received data such as the user request, and feed back the processing result (e.g., the web page, information, or data obtained or generated according to the user request) to the terminal device.
It should be noted that the product classification method provided by the embodiments of the present disclosure may generally be performed by the server 105. Accordingly, the product classification apparatus provided by the embodiments of the present disclosure may generally be provided in the server 105. The product classification method provided by the embodiments of the present disclosure may also be performed by a server or a server cluster that is different from the server 105 and is capable of communicating with the first terminal device 101, the second terminal device 102, the third terminal device 103, and/or the server 105. Accordingly, the product classification apparatus provided by the embodiments of the present disclosure may also be provided in a server or a server cluster that is different from the server 105 and is capable of communicating with the first terminal device 101, the second terminal device 102, the third terminal device 103, and/or the server 105.
It should be understood that the number of terminal devices, networks and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
The product classification method of the disclosed embodiment will be described in detail below with reference to fig. 2 to 4 based on the scenario described in fig. 1.
Fig. 2 schematically illustrates a flow chart of a product classification method according to an embodiment of the disclosure.
As shown in fig. 2, the product classification method 200 of this embodiment includes operations S210 to S240.
In operation S210, an undirected graph is generated according to product attribute data associated with N products and product identifications of the N products, where N is an integer greater than or equal to 2.
According to embodiments of the present disclosure, the product attribute data may include product base attribute data and attribute data associated with a product owner. The undirected graph may include nodes and edges. Links between products can be established according to the product attribute data associated with the N products, and the edges of the undirected graph are generated from these links; the nodes of the undirected graph are generated from the product identifiers.
Wherein the basic attribute data may be data for characterizing an attribute name. The attribute names may be, for example, a product type, a product owner, a manner in which the product owner obtains an asset, a product offered object, a product status, a product owner type, a fixed management product asset, a floating management product asset, a product transaction asset, a product escrow value added service asset, a product asset valuation, a product launch mode, a product launch asset period, a method of obtaining an asset based on product calculations, a new product classification, a non-fixed deadline product type, a user's product ownership, a manager's expiration rights for a product ownership period, a derivative product market operation mode, a marker, a product brand, a product period number, a product trust identifier, a product trust type, a product trust form, an open period, a regular open period, other regular open periods, an open mode, a period in which the product is requested to be owned, a product trust role, a collaboration mode, a product escrow, a recruitment mode, an investment direction, an operation mode, a product risk level, a share category, a number of days in which the user obtains the relevant product, a user risk level, a structured (hierarchical) product level category, a management mode, and the like.
The attribute data associated with the product owner may be data for characterizing an attribute name of the auxiliary information. The attribute name of the auxiliary information may be, for example, a product owner group, a transaction amount of the product in a preset period of time, and the number of times the product owner repeatedly owns the product in the preset period of time.
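The graph construction of operation S210 can be illustrated with a minimal Python sketch. The disclosure does not specify how attribute relationships become edge weights, so the rule used here (weight equals the number of attributes on which two products share the same value) is an assumption, as are the function name `build_adjacency`, the attribute keys, and the product identifiers.

```python
from itertools import combinations

import numpy as np


def build_adjacency(products: dict[str, dict[str, str]]) -> tuple[list[str], np.ndarray]:
    """Return node ids and a symmetric weighted adjacency matrix.

    Assumption: the weight of the edge between two products is the number
    of attributes on which they share the same value.
    """
    ids = sorted(products)  # one node per product identifier
    index = {pid: i for i, pid in enumerate(ids)}
    adj = np.zeros((len(ids), len(ids)))
    for a, b in combinations(ids, 2):
        shared = sum(
            1
            for key, value in products[a].items()
            if products[b].get(key) == value
        )
        if shared:
            adj[index[a], index[b]] = shared
            adj[index[b], index[a]] = shared  # undirected graph: keep symmetric
    return ids, adj


# Hypothetical products with base attributes and owner-related attributes.
products = {
    "P001": {"product_type": "fund", "risk_level": "R2", "owner_group": "retail"},
    "P002": {"product_type": "fund", "risk_level": "R3", "owner_group": "retail"},
    "P003": {"product_type": "trust", "risk_level": "R5", "owner_group": "institutional"},
}
ids, adj = build_adjacency(products)
```

In this sketch P001 and P002 share two attribute values and are joined by an edge of weight 2, while P003 shares none and remains isolated at first order.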
In operation S220, the adjacency matrix and the preset embedding dimension of each order corresponding to the undirected graph are input into the network embedding model, and the proximity embedding vector of each order is output.
According to embodiments of the present disclosure, undirected graphs may be represented as an N-th order adjacency matrix based on the links established between the various products. The preset embedding dimension may be determined according to the number of products. The network embedding model may be built from proximity. Wherein any one element in the adjacency matrix may represent the weight of an edge between two nodes.
For example, if the number of products is on the order of ten thousand, the preset embedding dimension may be set to 128, and if the number of products is on the order of thousand, the preset embedding dimension may be set to 64.
According to the embodiment of the disclosure, the network embedding model can be used to preserve node proximity of arbitrary order. Based on a singular value decomposition framework, and exploiting the relationship between adjacency matrices of different orders, the eigendecomposition results of the adjacency matrices of different orders can be obtained and then converted into singular value decomposition results to obtain the node proximity embedding vectors.
A high-order adjacency matrix can represent hidden associations among products. For example, for products A, B, and C, suppose there is a strong association between A and B, a strong association between B and C, and no direct association between A and C. A first-order adjacency matrix shows no association between A and C, whereas a second-order matrix shows an association between A and C through the A-B and B-C associations; higher orders follow by analogy.
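One way to read this step is sketched below with NumPy. It is not presented as the disclosed model: the sketch assumes that the order-k adjacency matrix is the k-th power of the first-order matrix and that the embedding is formed by scaling the selected eigenvectors by the square roots of the corresponding singular values. The function name `proximity_embedding` is hypothetical.

```python
import numpy as np


def proximity_embedding(adj: np.ndarray, order: int, dim: int) -> np.ndarray:
    """Embed nodes from the order-th power of a symmetric adjacency matrix."""
    # Eigendecomposition of the first-order matrix: adj = U diag(lam) U^T.
    lam, vecs = np.linalg.eigh(adj)
    # Assumption: the order-k matrix is adj**k. It has the same eigenvectors
    # and eigenvalues lam**k, so no further decomposition is needed.
    lam_k = lam ** order
    # Keep the `dim` eigenpairs of largest magnitude (preset embedding dimension).
    top = np.argsort(-np.abs(lam_k))[:dim]
    # For a symmetric matrix the singular values are the absolute eigenvalues;
    # scale the eigenvectors by their square roots to form the embedding.
    return vecs[:, top] * np.sqrt(np.abs(lam_k[top]))


# Path graph over three products: 0 - 1 - 2.
adj = np.array([[0.0, 1.0, 0.0],
                [1.0, 0.0, 1.0],
                [0.0, 1.0, 0.0]])
embedding = proximity_embedding(adj, order=2, dim=2)
```

Because a single eigendecomposition is reused for every order, embeddings for several orders cost little more than the embedding for one.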
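The A, B, C example can be checked numerically. The sketch below assumes unit edge weights and takes the second-order matrix to be the square of the first-order adjacency matrix; both choices are illustrative.

```python
import numpy as np

# Rows and columns correspond to products A, B, C (illustrative unit weights).
first_order = np.array([
    [0, 1, 0],  # A is linked to B only
    [1, 0, 1],  # B is linked to A and C
    [0, 1, 0],  # C is linked to B only
])
# Entry (i, j) of the square counts the two-step paths from i to j.
second_order = first_order @ first_order

a, c = 0, 2
print(first_order[a, c])   # 0: no direct link between A and C
print(second_order[a, c])  # 1: one two-step path A-B-C
```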
In operation S230, the proximity embedding vector of each order is reduced in dimension to obtain a target proximity embedding vector of each order.
According to the embodiment of the disclosure, the dimension of the proximity embedding vector of each order can be reduced by using a vector dimension reduction method, and the resulting target proximity embedding vector of each order is low-dimensional, for example three-dimensional.
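A minimal sketch of this dimension reduction, using principal component analysis computed through a singular value decomposition, is given below. The function name `pca_reduce` and the random 64-dimensional input are assumptions for illustration only.

```python
import numpy as np


def pca_reduce(embeddings: np.ndarray, n_components: int = 3) -> np.ndarray:
    """Project row vectors onto their first n_components principal axes."""
    centered = embeddings - embeddings.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


# Hypothetical input: 10 products with 64-dimensional proximity embeddings.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(10, 64))
target = pca_reduce(embeddings)  # one 3-dimensional vector per product
```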
In operation S240, N products are classified based on the target proximity embedding vector of each order, resulting in a product classification result.
According to the embodiment of the disclosure, the spatial position information of each product can be determined according to the target proximity embedding vector of each order, and the product classification result can be determined according to the spatial position information of each product.
According to the embodiments of the present disclosure, adjacency matrices of arbitrary order are input into a network embedding model so that proximity embedding vectors of arbitrary order are preserved; dimension reduction is then performed so that each product is represented as a low-dimensional vector, and strongly associated nodes lie close to one another in the embedding space. Classifying products based on the target proximity embedding vector of each order allows flexible use under different product requirements; for example, coarse-grained classification may use higher-order proximity and fine-grained classification may use lower-order proximity. The underlying structure of the product network is well preserved, and the products can be classified accurately.
According to the embodiment of the disclosure, the product attribute data associated with the N products may be obtained after data processing is performed under a Hadoop big data processing platform. The data processing may be to clean the data and remove or replace the abnormal value in the data.
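The cleaning step can be illustrated on a single numeric attribute. The disclosure does not say how abnormal values are detected or what replaces them, so the interquartile-range rule and the median replacement below are assumptions, and the sketch runs locally with NumPy rather than on a Hadoop platform.

```python
import numpy as np


def replace_outliers(values: np.ndarray, k: float = 1.5) -> np.ndarray:
    """Replace values outside [Q1 - k*IQR, Q3 + k*IQR] with the median."""
    q1, q3 = np.percentile(values, [25, 75])
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    cleaned = values.astype(float)  # astype returns a copy; input is untouched
    cleaned[(values < low) | (values > high)] = np.median(values)
    return cleaned


# Hypothetical transaction amounts with one abnormal value.
amounts = np.array([10, 12, 11, 13, 1000])
cleaned = replace_outliers(amounts)
```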
Fig. 3 schematically illustrates a flow chart of a method for classifying N products based on target proximity embedding vectors of each order to obtain a product classification result according to an embodiment of the present disclosure.
As shown in fig. 3, the method 340 for classifying N products based on the target proximity embedding vector of each order to obtain the product classification result in this embodiment includes operations S341 to S344.
In operation S341, a target product is randomly selected from the N products.
In operation S342, an embedding space distance between the target product corresponding to each order and other products is determined based on the target proximity embedding vector of each order.
According to the embodiment of the disclosure, the embedding space distance can be determined according to the target proximity embedding vector corresponding to the target product and the target proximity embedding vector corresponding to each remaining other product.
In operation S343, a sub-classification result corresponding to each order is determined according to the embedding space distance.
According to the embodiment of the disclosure, for each order, the embedding space distances can be compared with a threshold and the target embedding space distances that satisfy the threshold are selected; the other products that can be grouped into one class with the target product are determined according to the target embedding space distances. The same procedure is repeated for the remaining products until all products are classified, yielding the sub-classification result. The sub-classification result may be used to characterize the classes of associated products.
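Operations S341 to S343 for a single order can be sketched as follows. The Euclidean distance, the reading of "satisfies the threshold" as "does not exceed it", and the function name `threshold_classify` are assumptions; the disclosure fixes none of them.

```python
import random

import numpy as np


def threshold_classify(vectors: np.ndarray, threshold: float, seed: int = 0) -> list[int]:
    """Group products whose distance to a randomly chosen target is within threshold."""
    rng = random.Random(seed)
    labels = [-1] * len(vectors)
    unassigned = list(range(len(vectors)))
    next_label = 0
    while unassigned:
        target = rng.choice(unassigned)  # S341: random target product
        # S342: embedding space distance from the target to each unclassified product.
        dists = np.linalg.norm(vectors[unassigned] - vectors[target], axis=1)
        # S343: products within the threshold join the target's class
        # (the target itself is at distance 0, so it is always included).
        for product, dist in zip(unassigned, dists):
            if dist <= threshold:
                labels[product] = next_label
        unassigned = [p for p in unassigned if labels[p] == -1]
        next_label += 1
    return labels


# Hypothetical three-dimensional target proximity embedding vectors.
vectors = np.array([[0.0, 0.0, 0.0], [0.1, 0.0, 0.0],
                    [5.0, 5.0, 5.0], [5.1, 5.0, 5.0]])
labels = threshold_classify(vectors, threshold=1.0)
```

The class numbers depend on the random order in which targets are drawn; for well-separated groups such as these, the grouping itself does not.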
In operation S344, a product classification result is determined according to the sub-classification result.
According to the embodiment of the disclosure, the sub-classification results of each order can be combined to obtain the product classification result. Based on the product classification result, the recommendation system can push the products of each category to the target user.
According to the embodiment of the disclosure, determining the product classification result from the sub-classification results of each order allows strongly associated products to be grouped together while retaining as much of the original information in the undirected graph as possible. Because the final classification result is determined from the classification at every order, associated products are grouped as completely as possible, which helps a product pushing system use the product classification result to push products accurately, so that the pushed products meet the user's requirements.
According to an embodiment of the present disclosure, determining a sub-classification result corresponding to each order according to the embedding space distance may include:
for each order: grouping the target product and the other products whose embedding space distance satisfies a threshold into one class to obtain the sub-classification result.
According to embodiments of the present disclosure, the threshold may be determined according to the classification accuracy actually required. For each order, the embedding space distances may be compared with the threshold, the target embedding space distances that satisfy the threshold are selected, and the target product and the other products whose embedding space distance satisfies the threshold are grouped into one class to obtain the sub-classification result.
According to an embodiment of the present disclosure, grouping the target product with the other products whose embedding space distance satisfies the threshold places products with similar product attributes in one class, which facilitates accurate classification of all products and supports accurate product classification under complex conditions.
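The threshold-based grouping described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the function name `classify_by_distance`, the use of Euclidean distance, and the reading of "satisfies the threshold" as "distance no greater than the threshold" are all assumptions.

```python
import numpy as np

def classify_by_distance(embeddings, threshold, seed=0):
    """Group products by embedding-space distance to a randomly picked target.

    embeddings: (N, k) array, one target proximity embedding vector per product.
    threshold:  assumed to be a non-negative upper bound on Euclidean distance.
    Returns an (N,) array of class labels for one order.
    """
    rng = np.random.default_rng(seed)
    n = embeddings.shape[0]
    labels = np.full(n, -1)          # -1 marks a product that is not yet classified
    next_label = 0
    while (labels == -1).any():
        unassigned = np.flatnonzero(labels == -1)
        target = rng.choice(unassigned)                  # randomly screened target product
        dist = np.linalg.norm(embeddings[unassigned] - embeddings[target], axis=1)
        members = unassigned[dist <= threshold]          # always contains the target itself
        labels[members] = next_label
        next_label += 1
    return labels
```

Running the function once per order on that order's target proximity embedding vectors yields one sub-classification result per order.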
According to an embodiment of the present disclosure, determining a product classification result from the sub-classification result may include:
weighting and combining the sub-classification results of each order according to users' requirements for the products to obtain the product classification result.
According to embodiments of the present disclosure, a user's requirements for products may be determined by collecting the user's transaction information in real time or by collecting the user's history of browsing product web pages.
According to an embodiment of the present disclosure, the sub-classification result of each order may be assigned a weight according to users' requirements for the products, and the weighted results may then be combined to obtain the product classification result.
According to an embodiment of the present disclosure, a product classification result obtained according to users' requirements for the products can improve the user experience while meeting those requirements.
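The text does not specify how the weighted combination is carried out. One possible reading, shown here purely as an assumption, is a weighted vote over pairs of products: two products end up in the same final class when the total weight of the orders that group them together reaches a cutoff. The names `combine_sub_classifications` and `vote_threshold` are hypothetical.

```python
import numpy as np

def combine_sub_classifications(sub_labels, weights, vote_threshold=0.5):
    """Merge per-order labels into one labelling by weighted pairwise voting.

    sub_labels: (orders, N) array-like, one row of class labels per order.
    weights:    one weight per order (assumed to reflect user demand).
    """
    sub_labels = np.asarray(sub_labels)
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()                    # normalise so votes sum to 1
    n = sub_labels.shape[1]
    co = np.zeros((n, n))
    for labels, w in zip(sub_labels, weights):
        co += w * (labels[:, None] == labels[None, :])   # weighted co-membership
    linked = co >= vote_threshold
    final = np.full(n, -1)
    current = 0
    for start in range(n):                               # connected components of `linked`
        if final[start] != -1:
            continue
        final[start] = current
        stack = [start]
        while stack:
            node = stack.pop()
            for nb in np.flatnonzero(linked[node]):
                if final[nb] == -1:
                    final[nb] = current
                    stack.append(nb)
        current += 1
    return final
```

Other combination rules (for example, picking the sub-classification of the single most heavily weighted order) would fit the description equally well.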
Fig. 4 schematically illustrates a method flow diagram for generating an undirected graph based on product attribute data associated with N products and product identifications of the N products, according to an embodiment of the present disclosure.
As shown in fig. 4, the method 410 of this embodiment for generating an undirected graph according to product attribute data associated with N products and product identifications of the N products may include operations S411 to S413. It should be noted that the undirected graph may include nodes and edges, and that the product attribute data includes product basic attribute data and attribute data associated with the product owner.
In operation S411, a node is generated according to product identifications of the N products.
According to an embodiment of the present disclosure, the product identifier of each product corresponds to one node of the undirected graph.
In operation S412, relationships between product attributes are determined from product attribute data associated with the N products.
In operation S413, edges are generated based on the relationships between the product attributes.
According to an embodiment of the present disclosure, products whose product attributes are associated with one another may be connected, and these connections serve as the edges of the undirected graph.
According to an embodiment of the present disclosure, the undirected graph represents the entire product network: the nodes may represent the product identifiers, and the edges may represent the connections between the product attributes of one product and those of other products. In this way as much of the original product data as possible is retained, which facilitates accurate classification of the products.
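Operations S411 to S413 can be sketched as below. The text does not define when two products' attributes count as related, so this example assumes a simple rule: two products are connected when they share at least one attribute value. The function name and the data layout (a dict mapping each product identifier to a set of attribute values) are hypothetical.

```python
import numpy as np

def build_undirected_graph(product_ids, product_attributes):
    """Return (node index, adjacency matrix) for the product network.

    product_ids:        list of product identifiers (one node each, S411).
    product_attributes: dict mapping product id -> set of attribute values.
    """
    index = {pid: i for i, pid in enumerate(product_ids)}    # S411: one node per identifier
    n = len(product_ids)
    adjacency = np.zeros((n, n), dtype=int)
    for i, a in enumerate(product_ids):
        for b in product_ids[i + 1:]:
            # S412: assumed relation rule, a shared attribute value
            if product_attributes[a] & product_attributes[b]:
                # S413: undirected edge, so the matrix stays symmetric
                adjacency[index[a], index[b]] = 1
                adjacency[index[b], index[a]] = 1
    return index, adjacency
```

The resulting symmetric 0/1 matrix is the first-order adjacency matrix that the network embedding model takes as input.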
According to an embodiment of the present disclosure, inputting an adjacency matrix and a preset embedding dimension of each order corresponding to an undirected graph into a network embedding model, and outputting a proximity embedding vector of each order may include:
inputting the adjacency matrix of each order corresponding to the undirected graph and a preset embedding dimension into a network embedding model so as to execute the following operations:
extracting eigenvalues and eigenvectors of the adjacency matrix; performing matrix decomposition on the adjacency matrix to obtain a first matrix and a second matrix, wherein the first matrix characterizes a diagonal matrix of the eigenvalues and the second matrix characterizes a matrix composed of the eigenvectors; screening, from the first matrix and the second matrix respectively, a first target matrix and a second target matrix that satisfy the preset embedding dimension; determining the proximity embedding vector based on the first target matrix and the second target matrix; and outputting the proximity embedding vector.
According to an embodiment of the present disclosure, a top-l eigen-decomposition of the adjacency matrix A may be computed using a high-order proximity function to obtain the first matrix and the second matrix. Based on this eigen-decomposition, top-d eigen-decompositions of the adjacency matrix A may be computed for dimensions from 1 up to the preset embedding dimension d. The first target matrix and the second target matrix satisfying the preset embedding dimension may be screened by computing the re-weighted eigenvalues Λ' and then sorting them in descending order of absolute value.
The higher-order proximity function is shown in the following formula (1):
$F(A) = w_1 A + w_2 A^2 + \dots + w_q A^q \quad (1)$
where q may represent the order of the network embedding sought, and $w_1, w_2, \dots, w_q$ may represent the weight of each order; the weights may be set to $w_i = 0.1^i$, i.e., higher orders receive lower weights.
The eigen-decomposition results of an arbitrary-order proximity function at different orders are correlated: if [λ, x] is an eigenpair of the matrix A, then [F(λ), x] is an eigenpair of the matrix S = F(A). The eigen-decomposition of the matrix S can therefore be obtained from that of the matrix A by replacing λ with F(λ), without performing a separate eigen-decomposition.
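The eigenvalue re-weighting can be checked numerically with a short sketch. For simplicity it computes the full eigen-decomposition of a small symmetric adjacency matrix with `numpy.linalg.eigh` rather than a truncated top-l decomposition; the function name `reweighted_eigen` is hypothetical.

```python
import numpy as np

def reweighted_eigen(adjacency, weights, d):
    """Eigenpairs of S = F(A) = w_1 A + ... + w_q A^q, obtained from those of A.

    adjacency: symmetric (N, N) matrix A.
    weights:   [w_1, ..., w_q].
    d:         preset embedding dimension (number of eigenpairs kept).
    """
    eigvals, eigvecs = np.linalg.eigh(adjacency)             # eigenpairs of A
    # F(lambda) = w_1*lambda + w_2*lambda^2 + ... ; the eigenvectors are unchanged
    f_vals = sum(w * eigvals ** (i + 1) for i, w in enumerate(weights))
    order = np.argsort(-np.abs(f_vals))[:d]                  # descending absolute value
    return f_vals[order], eigvecs[:, order]

A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
w = [0.1 ** 1, 0.1 ** 2]                                     # w_i = 0.1^i for q = 2
lam, X = reweighted_eigen(A, w, d=2)
```

Changing the order q or the weights only changes `f_vals`; the eigen-decomposition of A is computed once and reused.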
According to an embodiment of the present disclosure, the result of the eigenvalue decomposition can be converted into the result of the matrix factorization according to the singular value decomposition formula and the matrix factorization minimization objective function.
The matrix factorization minimization objective function may be represented by the following formula (2):
where $U^*, V^* \in \mathbb{R}^{N \times d}$ may represent the content and context embedding vectors, respectively, N may represent the number of nodes, i.e., the number of products, and d may represent the preset embedding dimension. The principle is to decompose the matrix S into the product of two matrices $U^*$ and $V^*$: a matrix S of size m × n is factorized into a matrix $U^*$ of size m × d and a matrix $V^*$ of size n × d, such that the values of $U^* {V^*}^{T}$ are as close as possible to the matrix S.
The singular value decomposition formula may be represented by the following formula (3):
The singular value decomposition takes the form of a product of three matrices, where $U, V \in \mathbb{R}^{N \times d}$, each column of which corresponds to a left/right singular vector, and $\Sigma \in \mathbb{R}^{d \times d}$ is a diagonal matrix of singular values arranged in descending order. Since singular values often correspond to the important information implicit in the matrix, and importance is positively correlated with the magnitude of the singular value, only the first d factors are needed to represent the matrix. Specifically, the top-d singular value decomposition of S may be expressed as $[U, \Sigma, V]$. The embedding in formula (2) can be obtained by multiplying $\Sigma$ with U and V, as shown in the following formula (4):
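Formulas (2) to (4) can be written out as follows. This is a sketch consistent with the description above, not a reproduction of the original formulas: it assumes a squared Frobenius norm for the objective and an even split of $\Sigma$ between the two factors, neither of which is stated explicitly in the text.

```latex
\begin{align}
\min_{U^*, V^*} \; & \left\lVert S - U^* {V^*}^{\top} \right\rVert_F^2 \tag{2} \\
S \; & \approx U \Sigma V^{\top} \tag{3} \\
U^* \; & = U \sqrt{\Sigma}, \qquad V^* = V \sqrt{\Sigma} \tag{4}
\end{align}
```

With this choice, $U^* {V^*}^{\top} = U \Sigma V^{\top}$, which is the best rank-d approximation of S in the Frobenius norm.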
When singular value decomposition is applied to adjacency matrices of different orders, the time and space costs are large and the approach is not feasible. To realize the matrix factorization while satisfying the singular value decomposition formula and the matrix factorization minimization objective function, the problem can be converted into an eigenvalue decomposition problem. Eigenvalue decomposition is a method of decomposing a matrix into a product of matrices formed from its eigenvalues and eigenvectors. The matrix S can be decomposed into $[\Lambda, X]$, where $\Lambda \in \mathbb{R}^{d \times d}$ is the diagonal matrix of eigenvalues, which may be arranged in descending order of absolute value, and $X \in \mathbb{R}^{N \times d}$ is the matrix of eigenvectors, one per column. As with the singular value decomposition, the eigen-decomposition may retain only the eigenvectors that satisfy the preset embedding dimension d, i.e., extract the most important features of the matrix. The eigen-decomposition yields the eigenpairs $[\Lambda(i, i), X(:, i)]$, $1 \le i \le d$.
For any symmetric matrix S, the following formulas (5) and (6) hold:
where abs(x) = |x| represents the absolute value, and sign(·) represents the sign function, i.e., sign(x) = 1 when x > 0, sign(x) = 0 when x = 0, and sign(x) = -1 when x < 0. Using the relationship between eigenvalue decomposition and singular value decomposition, the result of the singular value decomposition can be obtained from the eigenvalue decomposition by formula (5), and conversely the result of the eigenvalue decomposition can be obtained from the singular value decomposition by formula (6). Finally, the proximity embedding vector of each order $U_i^*$, $1 \le i \le r$, is determined, where r represents the order.
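The conversion from an eigen-decomposition to a singular value decomposition for a symmetric matrix can be sketched as below. It relies on the standard identity U = X, Σ = abs(Λ), V = X·sign(Λ), which is assumed here to be what formula (5) expresses; the square-root split used to form the embeddings is likewise an assumption, and both function names are hypothetical.

```python
import numpy as np

def svd_from_eigen(eigvals, eigvecs):
    """Build [U, Sigma, V] of a symmetric matrix from its eigenpairs.

    Assumed identity: U = X, Sigma = abs(Lambda), V = X * sign(Lambda).
    """
    order = np.argsort(-np.abs(eigvals))         # singular values in descending order
    lam, X = eigvals[order], eigvecs[:, order]
    U = X
    sigma = np.abs(lam)
    V = X * np.sign(lam)                         # flips columns with negative eigenvalues
    return U, sigma, V

def embedding_from_svd(U, sigma, V):
    """Split Sigma evenly between both factors (an assumed convention)."""
    root = np.sqrt(sigma)
    return U * root, V * root                    # content and context embeddings

S = np.array([[2.0, 1.0], [1.0, -3.0]])          # symmetric, one negative eigenvalue
lam, X = np.linalg.eigh(S)
U, sigma, V = svd_from_eigen(lam, X)
U_star, V_star = embedding_from_svd(U, sigma, V)
```

Keeping only the first d columns of U and V and the first d entries of sigma gives the truncated embedding of dimension d.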
According to an embodiment of the present disclosure, performing matrix decomposition can address the problem of matrix sparsity, and can achieve higher accuracy when the number of products is large.
According to an embodiment of the present disclosure, performing dimension reduction on the proximity embedding vector of each order to obtain a target proximity embedding vector of each order may include:
reducing the dimension of the proximity embedding vector of each order by using principal component analysis to obtain the target proximity embedding vector of each order.
According to an embodiment of the present disclosure, principal component analysis may be used to reduce the proximity embedding vector of each order to three dimensions to obtain the target proximity embedding vector of each order. Any known principal component analysis method may be used, and no particular limitation is imposed here.
According to an embodiment of the present disclosure, principal component analysis allows the product network to be visualized, which facilitates obtaining the product classification result according to users' requirements for the products and can improve the user experience while meeting those requirements.
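The reduction to three dimensions can be sketched with a plain NumPy implementation of principal component analysis. Since any known method may be used, this is only one option; the function name `pca_reduce` is hypothetical.

```python
import numpy as np

def pca_reduce(embedding, n_components=3):
    """Project (N, d) proximity embedding vectors onto their top principal components."""
    centered = embedding - embedding.mean(axis=0)            # PCA requires centred data
    # Rows of vt are the principal directions, ordered by decreasing variance
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

rng = np.random.default_rng(0)
order_embedding = rng.normal(size=(10, 8))                   # stand-in for one order's vectors
target_embedding = pca_reduce(order_embedding)               # shape (10, 3)
```

The three output columns can be used directly as x, y, z coordinates when plotting the product network.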
Based on the product classification method, the disclosure also provides a product classification device. The device will be described in detail below in connection with fig. 5.
Fig. 5 schematically shows a block diagram of a product classification device according to an embodiment of the disclosure.
As shown in fig. 5, the product classification apparatus 500 of this embodiment includes a generation module 510, a first processing module 520, a second processing module 530, and a classification module 540.
The generating module 510 is configured to generate an undirected graph according to product attribute data associated with N products and product identifiers of the N products, where N is an integer greater than or equal to 2. In an embodiment, the generating module 510 may be configured to perform operation S210 described above, which is not repeated here.
The first processing module 520 is configured to input the adjacency matrix of each order corresponding to the undirected graph and the preset embedding dimension into the network embedding model, and output the proximity embedding vector of each order. In an embodiment, the first processing module 520 may be configured to perform operation S220 described above, which is not repeated here.
The second processing module 530 is configured to reduce the dimension of the proximity embedding vector of each order to obtain a target proximity embedding vector of each order. In an embodiment, the second processing module 530 may be configured to perform operation S230 described above, which is not repeated here.
The classification module 540 is configured to classify the N products based on the target proximity embedding vector of each order to obtain a product classification result. In an embodiment, the classification module 540 may be configured to perform operation S240 described above, which is not repeated here.
According to an embodiment of the present disclosure, the classification module 540 may include a first screening unit, a first determination unit, a second determination unit, and a third determination unit.
The first screening unit is used for randomly screening target products from N products.
The first determining unit is used for determining, based on the target proximity embedding vector of each order, the embedding space distance between the target product and the other products corresponding to each order.
The second determining unit is used for determining a sub-classification result corresponding to each order according to the embedding space distance.
The third determining unit is used for determining a product classification result according to the sub-classification result.
According to an embodiment of the present disclosure, the generating module 510 may include a first generating subunit, a fourth determining unit, and a second generating subunit. Wherein the undirected graph includes nodes and edges, and the product attribute data includes product base attribute data and attribute data associated with a product owner.
The first generation subunit is used for generating nodes according to the product identifiers of the N products.
The fourth determining unit is used for determining the relation among the product attributes according to the product attribute data associated with the N products.
The second generation subunit is configured to generate edges based on the relationships between the product attributes.
According to an embodiment of the present disclosure, the first processing module 520 may include an input unit, an extraction unit, a decomposition unit, a second filtering unit, a fifth determination unit, and an output unit.
The input unit is used for inputting the adjacency matrix of each order corresponding to the undirected graph and the preset embedding dimension into the network embedding model.
The extraction unit is used for extracting the eigenvalues and eigenvectors of the adjacency matrix.
The decomposition unit is used for performing matrix decomposition on the adjacency matrix to obtain a first matrix and a second matrix, wherein the first matrix characterizes a diagonal matrix of the eigenvalues and the second matrix characterizes a matrix composed of the eigenvectors.
The second screening unit is used for screening a first target matrix and a second target matrix which meet the preset embedding dimension from the first matrix and the second matrix respectively.
The fifth determination unit is configured to determine a proximity embedding vector based on the first target matrix and the second target matrix.
The output unit is used for outputting the proximity embedding vector.
Any of the generation module 510, the first processing module 520, the second processing module 530, and the classification module 540 may be combined in one module to be implemented, or any of the modules may be split into a plurality of modules according to an embodiment of the present disclosure. Alternatively, at least some of the functionality of one or more of the modules may be combined with at least some of the functionality of other modules and implemented in one module. According to embodiments of the present disclosure, at least one of the generating module 510, the first processing module 520, the second processing module 530, and the classifying module 540 may be implemented at least in part as hardware circuitry, such as a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), a system on a chip, a system on a substrate, a system on a package, an Application Specific Integrated Circuit (ASIC), or may be implemented in hardware or firmware in any other reasonable way of integrating or packaging the circuitry, or in any one of or a suitable combination of three of software, hardware, and firmware. Alternatively, at least one of the generation module 510, the first processing module 520, the second processing module 530, and the classification module 540 may be at least partially implemented as a computer program module, which when executed, may perform the corresponding functions.
Fig. 6 schematically illustrates a block diagram of an electronic device adapted to implement a product classification method according to an embodiment of the disclosure.
As shown in fig. 6, an electronic device 600 according to an embodiment of the present disclosure includes a processor 601 that can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 602 or a program loaded from a storage section 608 into a Random Access Memory (RAM) 603. The processor 601 may include, for example, a general purpose microprocessor (e.g., a CPU), an instruction set processor and/or an associated chipset and/or a special purpose microprocessor (e.g., an Application Specific Integrated Circuit (ASIC)), or the like. Processor 601 may also include on-board memory for caching purposes. The processor 601 may comprise a single processing unit or a plurality of processing units for performing different actions of the method flows according to embodiments of the disclosure.
In the RAM 603, various programs and data necessary for the operation of the electronic apparatus 600 are stored. The processor 601, the ROM 602, and the RAM 603 are connected to each other through a bus 604. The processor 601 performs various operations of the method flow according to the embodiments of the present disclosure by executing programs in the ROM 602 and/or the RAM 603. Note that the program may be stored in one or more memories other than the ROM 602 and the RAM 603. The processor 601 may also perform various operations of the method flow according to embodiments of the present disclosure by executing programs stored in one or more memories.
According to an embodiment of the present disclosure, the electronic device 600 may also include an input/output (I/O) interface 605, the input/output (I/O) interface 605 also being connected to the bus 604. The electronic device 600 may also include one or more of the following components connected to the I/O interface 605: an input portion 606 including a keyboard, mouse, etc.; an output portion 607 including a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and the like, a speaker, and the like; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN card, a modem, or the like. The communication section 609 performs communication processing via a network such as the internet. The drive 610 is also connected to the I/O interface 605 as needed. Removable media 611 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is installed as needed on drive 610 so that a computer program read therefrom is installed as needed into storage section 608.
The present disclosure also provides a computer-readable storage medium that may be embodied in the apparatus/device/system described in the above embodiments; or may exist alone without being assembled into the apparatus/device/system. The computer-readable storage medium carries one or more programs which, when executed, implement methods in accordance with embodiments of the present disclosure.
According to embodiments of the present disclosure, the computer-readable storage medium may be a non-volatile computer-readable storage medium, which may include, for example, but is not limited to: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this disclosure, a computer-readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. For example, according to embodiments of the present disclosure, the computer-readable storage medium may include ROM 602 and/or RAM 603 and/or one or more memories other than ROM 602 and RAM 603 described above.
Embodiments of the present disclosure also include a computer program product comprising a computer program containing program code for performing the methods shown in the flowcharts. The program code, when executed in a computer system, causes the computer system to perform the methods provided by embodiments of the present disclosure.
The above-described functions defined in the system/apparatus of the embodiments of the present disclosure are performed when the computer program is executed by the processor 601. The systems, apparatus, modules, units, etc. described above may be implemented by computer program modules according to embodiments of the disclosure.
In one embodiment, the computer program may be based on a tangible storage medium such as an optical storage device, a magnetic storage device, or the like. In another embodiment, the computer program may also be transmitted, distributed in the form of signals over a network medium, and downloaded and installed via the communication section 609, and/or installed from the removable medium 611. The computer program may include program code that may be transmitted using any appropriate network medium, including but not limited to: wireless, wired, etc., or any suitable combination of the foregoing.
In such an embodiment, the computer program may be downloaded and installed from a network through the communication portion 609, and/or installed from the removable medium 611. The above-described functions defined in the system of the embodiments of the present disclosure are performed when the computer program is executed by the processor 601. The systems, devices, apparatus, modules, units, etc. described above may be implemented by computer program modules according to embodiments of the disclosure.
According to embodiments of the present disclosure, program code for performing computer programs provided by embodiments of the present disclosure may be written in any combination of one or more programming languages, and in particular, such computer programs may be implemented in high-level procedural and/or object-oriented programming languages, and/or assembly/machine languages. Programming languages include, but are not limited to, Java, C++, Python, C, or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, partly on a remote computing device, or entirely on the remote computing device or server. In the case of remote computing devices, the remote computing device may be connected to the user computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., connected via the Internet using an Internet service provider).
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
Those skilled in the art will appreciate that the features recited in the various embodiments of the disclosure and/or in the claims may be combined and/or integrated in a variety of ways, even if such combinations or integrations are not explicitly recited in the disclosure. In particular, the features recited in the various embodiments of the present disclosure and/or the claims may be combined and/or integrated in various ways without departing from the spirit and teachings of the present disclosure. All such combinations and/or integrations fall within the scope of the present disclosure.
The embodiments of the present disclosure are described above. However, these examples are for illustrative purposes only and are not intended to limit the scope of the present disclosure. Although the embodiments are described above separately, this does not mean that the measures in the embodiments cannot be used advantageously in combination. The scope of the disclosure is defined by the appended claims and equivalents thereof. Various alternatives and modifications can be made by those skilled in the art without departing from the scope of the disclosure, and such alternatives and modifications are intended to fall within the scope of the disclosure.

Claims (11)

1. A method of product classification comprising:
generating an undirected graph according to product attribute data associated with N products and product identifiers of the N products, wherein N is an integer greater than or equal to 2;
inputting the adjacency matrix of each order corresponding to the undirected graph and a preset embedding dimension into a network embedding model, and outputting a proximity embedding vector of each order;
performing dimension reduction on the proximity embedding vector of each order to obtain a target proximity embedding vector of each order; and
classifying the N products based on the target proximity embedding vector of each order to obtain a product classification result.
2. The method of claim 1, wherein classifying the N products based on the target proximity embedding vector of each order to obtain a product classification result, comprises:
randomly screening target products from N products;
determining, based on the target proximity embedding vector of each order, the embedding space distance between the target product and the other products corresponding to each order;
determining a sub-classification result corresponding to each order according to the embedding space distance;
and determining the product classification result according to the sub-classification result.
3. The method of claim 2, wherein the determining the sub-classification result for each order according to the embedding space distance comprises:
for each order:
grouping the target product and the other products whose embedding space distance satisfies a threshold into one class to obtain the sub-classification result.
4. The method of claim 2, wherein the determining the product classification result from the sub-classification result comprises:
and weighting and combining the sub-classification results of each order according to the requirements of the users on the products to obtain the product classification results.
5. The method of claim 1, wherein the undirected graph includes nodes and edges, the product attribute data including product base attribute data and attribute data associated with a product owner;
wherein the generating an undirected graph according to product attribute data associated with the N products and product identifications of the N products includes:
generating the nodes according to the product identifiers of the N products;
determining relationships among product attributes according to the product attribute data associated with the N products;
generating the edges based on the relationships among the product attributes.
6. The method of claim 1, wherein inputting the adjacency matrix of each order corresponding to the undirected graph and the preset embedding dimension into a network embedding model and outputting the proximity embedding vector of each order comprises:
inputting the adjacency matrix of each order corresponding to the undirected graph and the preset embedding dimension into the network embedding model so as to execute the following operations:
extracting eigenvalues and eigenvectors of the adjacency matrix;
performing matrix decomposition on the adjacency matrix to obtain a first matrix and a second matrix, wherein the first matrix characterizes a diagonal matrix of the eigenvalues and the second matrix characterizes a matrix composed of the eigenvectors;
screening a first target matrix and a second target matrix which meet the preset embedding dimension from the first matrix and the second matrix respectively;
determining the proximity embedding vector based on the first target matrix and the second target matrix;
and outputting the proximity embedded vector.
7. The method of claim 1, wherein the dimension reducing the proximity embedding vector of each order to obtain the target proximity embedding vector of each order comprises:
reducing the dimension of the proximity embedding vector of each order by using principal component analysis to obtain the target proximity embedding vector of each order.
8. A product classification apparatus comprising:
a generating module configured to generate an undirected graph according to product attribute data associated with N products and product identifiers of the N products, wherein N is an integer greater than or equal to 2;
a first processing module configured to input the adjacency matrix of each order corresponding to the undirected graph and a preset embedding dimension into a network embedding model and output a proximity embedding vector of each order;
a second processing module configured to reduce the dimension of the proximity embedding vector of each order to obtain a target proximity embedding vector of each order; and
a classification module configured to classify the N products based on the target proximity embedding vector of each order to obtain a product classification result.
9. An electronic device, comprising:
one or more processors;
storage means for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the method of any of claims 1-7.
10. A computer readable storage medium having stored thereon executable instructions which, when executed by a processor, cause the processor to perform the method according to any of claims 1-7.
11. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1 to 7.
CN202310877542.6A 2023-07-17 2023-07-17 Product classification method, device, equipment and medium Pending CN116881659A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310877542.6A CN116881659A (en) 2023-07-17 2023-07-17 Product classification method, device, equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310877542.6A CN116881659A (en) 2023-07-17 2023-07-17 Product classification method, device, equipment and medium

Publications (1)

Publication Number Publication Date
CN116881659A (en) 2023-10-13

Family

ID=88254504

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310877542.6A Pending CN116881659A (en) 2023-07-17 2023-07-17 Product classification method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN116881659A (en)

Similar Documents

Publication Publication Date Title
CN111783039B (en) Risk determination method, risk determination device, computer system and storage medium
CN112231592A (en) Network community discovery method, device, equipment and storage medium based on graph
CN109087138A (en) Data processing method and system, computer system and readable storage medium
Ben Hamza Nonextensive information-theoretic measure for image edge detection
CN111191677A (en) User characteristic data generation method and device and electronic equipment
CN116155628B (en) Network security detection method, training device, electronic equipment and medium
CN112446777A (en) Credit evaluation method, device, equipment and storage medium
CN116308704A (en) Product recommendation method, device, electronic equipment, medium and computer program product
CN113869904B (en) Suspicious data identification method, device, electronic equipment, medium and computer program
CN114493853A (en) Credit rating evaluation method, credit rating evaluation device, electronic device and storage medium
CN116881659A (en) Product classification method, device, equipment and medium
CN111291196B (en) Knowledge graph perfecting method and device, and data processing method and device
CN114139059A (en) Resource recommendation model training method, resource recommendation method and device
Bagirov et al. An Algorithm for Clustering Using L1‐Norm Based on Hyperbolic Smoothing Technique
Rougier et al. The scope of the Kalman filter for spatio‐temporal applications in environmental science
CN114897290A (en) Evolution identification method and device of business process, terminal equipment and storage medium
CN114332472A (en) Data processing method and device based on graph neural network
CN114358024A (en) Log analysis method, apparatus, device, medium, and program product
CN114067149A (en) Internet service providing method and device and computer equipment
Sumalatha et al. Rough set based decision rule generation to find behavioural patterns of customers
CN110610392A (en) Data processing method and system, computer system and computer readable storage medium
CN116501993B (en) House source data recommendation method and device
US20240121119A1 (en) Method and Apparatus for Classifying Blockchain Address
CN116450950A (en) Product combination recommendation method, device, equipment and medium
CN114707762A (en) Credit risk prediction method, device, equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination