CN117556147B - Electronic commerce data classification recommendation system and method

Info

Publication number: CN117556147B
Application number: CN202410039143.7A
Authority: CN
Inventors: 武岳巍; 付睿翎; 邢彤彤; 余振宇; 吴肇良; 冯小丽; 殷复莲
Original assignee: Communication University of China
Current assignee: Communication University of China
Priority date: 2024-01-11
Filing date: 2024-01-11
Publication date: 2024-04-12
Anticipated expiration: 2044-01-11
Also published as: CN117556147A

Abstract

The present invention provides an e-commerce data classification recommendation system and method. Through the loss-aware feature attention mechanism network LAFAMN, the supervision network therein is used to weight different input feature groups, and the weights are self-regulated according to the prediction results of the network as a whole and each sub-network, so as to effectively solve the feature imbalance problem in the e-commerce data classification recommendation process; in addition, based on the double suppression loss function regulated by the classification confidence, the loss value of the large-class easy-to-judge samples is greatly suppressed, while ensuring that the loss value of the small-class difficult-to-judge samples is not reduced at all, so as to solve the category imbalance problem in two dimensions of the difficulty of judgment and the imbalance of the number of samples. The joint use of LAFAMN and the double suppression loss function can form a complete recommendation system, solve the class imbalance classification problem in three dimensions, and provide a complete solution process for the commodity recommendation problem.

Description

E-commerce data classification recommendation system and method

Technical Field

The present invention relates to intelligent recommendation within the field of artificial intelligence technology and, more specifically, to an e-commerce data classification recommendation system and method.

Background Art

With the rapid development of the Internet, the e-commerce industry covers an ever wider range, and almost all e-commerce platforms use product recommendation systems to help customers quickly find what interests them. For an e-commerce platform, an effective recommendation algorithm can increase profit and reduce labor costs. However, because the number of products is huge, the number of display positions is small, and each user browses only a small number of products, the recommendation system faces class-imbalanced classification problems in three dimensions: imbalance in the number of samples between classes, imbalance in how difficult samples are to discriminate, and imbalance in the number of sample features.

For classification problems on class-imbalanced data, the research field generally calls the class with fewer samples the small class and the class with more samples the large class. Traditional classification methods build a classifier with the goal of minimizing the error rate. Because small-class samples are scarce and have a relatively complex feature distribution, the network tends to assign all samples to the large class, which greatly reduces generalization performance on the small class.

Therefore, analyzing why class imbalance degrades algorithm performance in e-commerce recommendation, and exploring solutions to the class imbalance problem in recommendation algorithms, has become one of the current research directions in artificial intelligence recommendation.

Summary of the Invention

In view of the multi-dimensional class imbalance problem in commodity data in current e-commerce recommendation systems, the purpose of the present invention is to provide an e-commerce data classification recommendation system and method. The attention mechanism is used to integrate and improve the traditional multi-expert learning network so as to construct the LAFAM network, and a double suppression loss function (Suppression Loss) regulated by classification confidence suppresses the loss contribution of large-class samples and easy-to-judge samples while ensuring that the loss value of small-class, difficult-to-judge samples is not reduced at all, so as to solve the class imbalance problem in existing recommendation algorithms.

The present invention provides an e-commerce data classification recommendation system, comprising:

a feature preprocessing unit, configured to rank the first training data by importance and classify its features using a preset sorting method and a preset classification method, so as to obtain feature preprocessing data of the first training data;

a model training unit, configured to train on the training data through a preset LAFAM network to obtain the output and label of each type, compute loss values from the outputs and labels, and back-propagate the weighted sum of the loss values until the loss meets a preset requirement, thereby obtaining a LAFAMN model; wherein the training data includes the first training data and the feature preprocessing data, and the LAFAM network includes a supervision network and a preset number of sub-networks; the sub-networks are trained on the feature preprocessing data, and the supervision network is trained on the first training data; the supervision network learns the proportion weights of the sub-networks, and the sub-networks complete the self-learning of sub-network weights and feature weights by dynamically adjusting the total loss; and a double suppression loss function is used in the sub-networks to handle highly imbalanced data sets;

a classification recommendation unit, configured to perform classification recommendation processing on e-commerce data through the LAFAMN model.

Optionally, the feature preprocessing unit includes:

an importance ranking unit, configured to perform primary ranking on the first training data using the preset sorting methods and to compute weighted feature importance scores from the primary ranking results, so as to obtain the final ranking of the first training data;

a feature classification unit, configured to classify the first training data according to the preset classification method.

Optionally, the preset sorting methods include PS-Smart, XGBoost, and GBDT; the preset classification method divides the first training data into dense features, sparse features, time series features, and basic features, where the basic features are those that fall into none of the other three categories after classification; and the sub-networks respectively learn from the input features of the different categories.

Optionally, the sub-networks respectively learning from the input features of the different categories includes:

feeding the input features of different categories, obtained by feature preprocessing of the first training data, into different sub-networks for learning, where every sub-network outputs a score between 0 and 1 that represents the learning result of the current sample in that sub-network; 0 means the sample is least likely to be recommended and 1 means it is most likely to be recommended;

each sub-network computing a loss value l_i, i = 1, ..., N, from the sample's label value and predicted value, where N is the total number of sub-networks; the larger the loss value l_i, the lower the corresponding sub-network's prediction confidence for the sample, and the lower its share in the final output value.

Optionally, the output of the supervision network is the share of each sub-network in the total output; and

the label used when training the supervision network is a proportion vector over the loss values l_i, obtained by applying SoftMax to the loss values; as a SoftMax output, each of its elements lies between 0 and 1 and the elements sum to 1.

A quantity derived from the loss value l_i is taken as the pre-mapping input of the SoftMax calculation; for the i-th sub-network, SoftMax is used to estimate the confidence that its prediction is successful, as given by formula (1),

where the estimated confidence is negatively correlated with the loss value l_i.

Optionally, a further set of forward parameters is computed from the loss values l_i so that the output of every sub-network is optimized; the elements of this parameter vector are given by formula (2),

where each forward parameter is positively correlated with the corresponding loss value l_i.
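As an illustration only, one pair of SoftMax weightings with the monotonic behaviour described above can be written as follows. The symbols \alpha_i and \beta_i, and the choice of -l_i and +l_i as SoftMax inputs, are assumptions made for this sketch; formulas (1) and (2) themselves may take a different form.

```latex
% Assumed notation: l_i is the loss of sub-network i (from the text);
% \alpha_i and \beta_i are hypothetical names for the two weight vectors.
\[
\alpha_i = \frac{\exp(-l_i)}{\sum_{j=1}^{N} \exp(-l_j)}, \qquad
\beta_i = \frac{\exp(l_i)}{\sum_{j=1}^{N} \exp(l_j)}, \qquad i = 1, \dots, N
\]
% Both vectors are SoftMax outputs, so 0 < \alpha_i, \beta_i < 1 and
% \sum_i \alpha_i = \sum_i \beta_i = 1.
% \alpha_i decreases as l_i grows (share in the final output);
% \beta_i increases as l_i grows (weight in the back-propagated loss).
```

Under these assumptions, a sub-network with a large loss contributes little to the combined prediction but receives a large weight in the total loss, which matches the two roles the text assigns to the loss value.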

The present invention also provides an e-commerce data classification recommendation method, which performs class-imbalanced data classification based on the e-commerce data classification recommendation system described above, comprising:

ranking the first training data by importance and classifying its features using a preset sorting method and a preset classification method, so as to obtain feature preprocessing data of the first training data;

training on the training data through a preset LAFAM network to obtain the output and label of each type, computing loss values from the outputs and labels, and back-propagating the weighted sum of the loss values until the loss meets a preset requirement, thereby obtaining a LAFAMN model; wherein the training data includes the first training data and the feature preprocessing data, and the LAFAM network includes a supervision network and a preset number of sub-networks; the sub-networks are trained on the feature preprocessing data, and the supervision network is trained on the first training data; the supervision network learns the proportion weights of the sub-networks, and the sub-networks complete the self-learning of sub-network weights and feature weights by dynamically adjusting the total loss; and a double suppression loss function is used in the sub-networks to handle highly imbalanced data sets;

performing classification recommendation processing on e-commerce data through the LAFAMN model.

The present invention further provides an electronic device, comprising:

at least one processor; and

a memory communicatively connected to the at least one processor; wherein

the memory stores a computer program executable by the at least one processor, and the computer program is executed by the at least one processor so that the at least one processor can perform the steps of the e-commerce data classification recommendation method described above.

As can be seen from the above technical solutions, the e-commerce data classification recommendation system and method provided by the present invention first use the loss-aware feature attention mechanism network LAFAMN, in which a supervision network weights the different input feature groups and the weights are self-regulated according to the prediction results of the network as a whole and of each sub-network, effectively solving the feature imbalance problem. In addition, the double suppression loss function regulated by classification confidence greatly suppresses the loss value of large-class, easy-to-judge samples while ensuring that the loss value of small-class, difficult-to-judge samples is not reduced at all, so as to solve class imbalance in two dimensions: difficulty of judgment and imbalance in the number of samples. Used together, LAFAMN and the double suppression loss function form a complete recommendation system that solves the class-imbalanced classification problem in three dimensions and provides a complete solution process for the product recommendation problem.

Brief Description of the Drawings

Other objects and results of the present invention will become clearer and easier to understand by referring to the following description in conjunction with the accompanying drawings, and as the present invention is more fully understood. In the drawings:

FIG. 1 is a schematic diagram of the logical structure of an e-commerce data classification recommendation system according to an embodiment of the present invention;

FIG. 2 is a schematic diagram of the framework of an e-commerce data classification recommendation system according to an embodiment of the present invention;

FIG. 3 is a schematic diagram of the theoretical structure of a LAFAM network according to an embodiment of the present invention;

FIG. 4 is a flow chart of an e-commerce data classification recommendation method according to an embodiment of the present invention;

FIG. 5 is a schematic diagram of the application structure of a LAFAM network in an example according to an embodiment of the present invention;

FIG. 6 is a schematic diagram of the grid search of a LAFAM network and a double suppression loss function in an example according to an embodiment of the present invention;

FIG. 7 is a schematic diagram of an electronic device according to an embodiment of the present invention.

Detailed Description

In the following description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of one or more embodiments. It is apparent, however, that these embodiments may also be practiced without these specific details. In other instances, well-known structures and devices are shown in block diagram form to facilitate the description of one or more embodiments.

To address the data class imbalance problem in existing e-commerce product recommendation schemes, the present invention provides an e-commerce data classification recommendation system and method. A network architecture that fuses the attention mechanism with a mixture of multiple experts is combined with a double suppression loss function constructed by function suppression and power suppression. The influence of different networks and different loss functions on recommendation performance is explored, and a recommendation system model for class-imbalanced data is built and applied to product recommendation in e-commerce, so as to solve problems in three dimensions: imbalance in the number of samples per class, imbalance in the difficulty of discrimination, and imbalance in features.

To better explain the technical solution of the present invention, some of the technical terms involved are briefly described below.

GBDT (Gradient Boosting Decision Tree) is an additive model based on the idea of Boosting ensemble learning. It iteratively trains a series of weak learners (usually decision trees), with each iteration attempting to correct the error of the previous one, and finally combines these weak learners into a strong learner. GBDT models are highly interpretable and perform well in practice, and are widely used in data mining, computational advertising, recommendation systems, and other fields.

PS-Smart regression: the parameter server (PS) is designed for large-scale offline and online training tasks, and SMART (Scalable Multiple Additive Regression Tree) is an iterative algorithm that implements GBDT on top of PS. PS-Smart supports training tasks with tens of billions of samples and hundreds of thousands of features, can run on thousands of nodes, and offers failover and good stability. PS-Smart also supports multiple data formats, training objectives, and evaluation objectives, can output feature importance, and includes training-acceleration optimizations such as histogram approximation.

XGBoost (eXtreme Gradient Boosting) is an ensemble algorithm that combines basis functions with weights to fit data well. Unlike GBDT, XGBoost adds a regularization term to the loss function and uses the second-order Taylor expansion of the loss function to approximate it. It is strong in parallel computing efficiency, missing-value handling, and predictive performance, performs well in preventing overfitting and improving generalization, and can solve many data science problems quickly and accurately.
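For reference, the regularized second-order objective mentioned above is commonly written as follows. The notation follows the published XGBoost formulation rather than this document: g_i and h_i are the first and second derivatives of the per-sample loss with respect to the previous prediction, f_t is the tree added at round t, T is its number of leaves, and w its leaf weights.

```latex
\[
\mathcal{L}^{(t)} \approx \sum_{i=1}^{n} \left[ g_i\, f_t(x_i) + \tfrac{1}{2}\, h_i\, f_t(x_i)^2 \right] + \Omega(f_t),
\qquad
\Omega(f) = \gamma T + \tfrac{1}{2} \lambda \lVert w \rVert^2
\]
\[
g_i = \frac{\partial\, l\big(y_i, \hat{y}_i^{(t-1)}\big)}{\partial\, \hat{y}_i^{(t-1)}},
\qquad
h_i = \frac{\partial^2 l\big(y_i, \hat{y}_i^{(t-1)}\big)}{\partial \big(\hat{y}_i^{(t-1)}\big)^2}
\]
```

Terms that do not depend on f_t are omitted, since they do not affect the choice of the new tree.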

Dense features are features whose feature vectors have mostly non-zero elements. They are typically used for data types such as images and audio. Dense features are slower to compute but carry more information, and therefore perform better in some tasks.

Sparse features are features whose feature vectors have only a few non-zero elements. They are typically used for data types such as text and recommendation-system data. Sparse features are faster to compute.

Time series features are features extracted from, or computed as statistics over, time series.
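The three definitions above can be turned into a simple grouping rule. The sketch below is only one possible reading: the function name `group_features`, the thresholds `dense_min` and `sparse_max`, the column names, and the idea of measuring the non-zero fraction per column are all assumptions made here, not values given in the source.

```python
import numpy as np

def group_features(X, names, time_cols=(), dense_min=0.5, sparse_max=0.05):
    """Assign each column to 'time_series', 'dense', 'sparse' or 'basic'.

    Thresholds are illustrative assumptions: a column is 'dense' if at least
    dense_min of its entries are non-zero, 'sparse' if at most sparse_max are,
    and 'basic' if it falls into none of the other three groups.
    """
    groups = {"dense": [], "sparse": [], "time_series": [], "basic": []}
    n_rows = X.shape[0]
    for j, name in enumerate(names):
        if name in time_cols:  # time series columns are declared by the caller
            groups["time_series"].append(name)
            continue
        nonzero_frac = np.count_nonzero(X[:, j]) / n_rows
        if nonzero_frac >= dense_min:
            groups["dense"].append(name)
        elif nonzero_frac <= sparse_max:
            groups["sparse"].append(name)
        else:
            groups["basic"].append(name)
    return groups

# Toy data with hypothetical column names.
X = np.zeros((100, 4))
X[:, 0] = 1.0             # every entry non-zero
X[:2, 1] = 1.0            # 2% non-zero
X[:20, 2] = 1.0           # 20% non-zero
X[:, 3] = np.arange(100)  # declared as a time series column below
names = ["price", "coupon_flag", "commission", "sales_7d"]
groups = group_features(X, names, time_cols={"sales_7d"})
print(groups)
```

Each resulting group would then be routed to its own sub-network, with the "basic" group collecting whatever the other three rules leave over.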

F-measure, also known as F-Score, is a statistic defined as the weighted harmonic mean of Precision and Recall. It is a common evaluation criterion in the field of IR (information retrieval) and is often used to evaluate the quality of classification models.
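The weighted harmonic mean can be computed directly from precision and recall. The sketch below uses the standard F-beta form; the function name `f_measure` and the `beta` parameter are introduced here for illustration, with `beta = 1` giving the familiar F1 score.

```python
def f_measure(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall (F-beta).

    beta > 1 weights recall more heavily, beta < 1 weights precision more.
    Returns 0.0 when the denominator is zero.
    """
    b2 = beta * beta
    denom = b2 * precision + recall
    if denom == 0.0:
        return 0.0
    return (1.0 + b2) * precision * recall / denom

f1 = f_measure(1.0, 0.5)            # balanced F1
f2 = f_measure(0.5, 1.0, beta=2.0)  # recall-weighted F2
print(f1, f2)
```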

Specific embodiments of the present invention are described in detail below with reference to the accompanying drawings.

It should be noted that the following description of exemplary embodiments is merely illustrative and in no way limits the present invention or its application or use. Techniques and devices known to those of ordinary skill in the relevant art may not be discussed in detail, but where appropriate such techniques and devices should be regarded as part of the specification.

To illustrate the e-commerce data classification recommendation system and method provided by the present invention, FIG. 1 through FIG. 6 respectively illustrate the logical structure of the system, the system architecture, the theoretical structure of the LAFAM network, the flow chart of the e-commerce data classification recommendation method, the application structure of the LAFAM network in an example, and the grid search of the LAFAM network and the double suppression loss function in an example; FIG. 7 illustrates the e-commerce data classification recommendation method of the embodiment of the present invention.

The e-commerce data classification recommendation system provided by the present invention takes the Loss-Aware Feature Attention Mechanism Network (LAFAMN) as its main body and is implemented as a classification recommendation framework combined with the double suppression loss function (Suppression Loss), so as to jointly solve problems in three dimensions: imbalance in the number of samples per class, imbalance in the difficulty of discrimination, and imbalance in features. For convenience, in the following embodiments the LAFAM network is also referred to simply as "the network".

FIG. 1 and FIG. 2 respectively show the logical structure and the system architecture of the e-commerce data classification recommendation system according to an embodiment of the present invention.

As shown in FIG. 1 and FIG. 2 together, the e-commerce data classification recommendation system 100 provided by the present invention mainly includes three parts: a feature preprocessing unit 110, a model training unit 120, and a classification unit 130. The corresponding system architecture is a general-purpose pipeline divided into three stages: feature preprocessing, LAFAMN model training, and classification.

The feature preprocessing unit 110 is configured to rank the first training data by importance and classify its features using a preset sorting method and a preset classification method, so as to obtain feature preprocessing data of the first training data;

the model training unit 120 is configured to train on the training data through a preset LAFAM network to obtain the output and label of each type, compute loss values from the outputs and labels, and back-propagate the weighted sum of the loss values until the loss meets a preset requirement, thereby obtaining a LAFAMN model;

the classification unit 130 is configured to perform classification recommendation processing on e-commerce data through the LAFAMN model.

The three parts above are described in detail below with reference to specific embodiments.

Before feature preprocessing, training data must first be collected (sample collection). In the present invention, part of the collected training data is used as the first training data for learning and training. After feature preprocessing, the first training data is input into the sub-networks of the model training unit for training, whereas the supervision network of the model training unit is trained on all of the first training data.

In addition, because the solution of the present invention mainly targets the multi-dimensional class imbalance problem in commodity data in current e-commerce recommendation systems, in the embodiment shown in FIG. 2 the collected training data includes item features, coupons, commissions, history, prices, and other data related to e-commerce commodity data.

The collected training data can be stored centrally in a sample database. Once sample collection is complete, part of the training data can be preprocessed by the feature preprocessing unit 110. The present invention uses several preset algorithms to rank and classify this part of the training data, replacing the feature selection and fusion used in traditional classification methods.

In a specific embodiment of the present invention, the feature preprocessing unit 110 includes an importance ranking unit 111 and a feature classification unit 112. The importance ranking unit 111 is configured to perform primary ranking on the first training data using the preset sorting methods and to compute weighted feature importance scores from the primary ranking results, so as to obtain the final ranking of the first training data; the feature classification unit 112 is configured to classify the first training data according to the preset classification method.

Specifically, as an example, the ranking part uses three methods, PS-Smart, XGBoost, and GBDT, to rank feature importance; the three resulting rankings are combined by weighting into a feature importance score, from which the final ranking is produced. In addition, to improve ranking efficiency, the ranking can be truncated to some extent, discarding features whose contribution is almost zero; the exact truncation position should be tuned to the specific problem. In the feature classification part, to better fit the operation of the LAFAM network, a specific embodiment of the present invention divides the features into basic features, dense features, sparse features, and time series features, and uses different sub-networks to process the different feature groups. The basic features are those that fall into none of the other three categories (dense, sparse, and time series features) after classification.
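The weighted combination and truncation step can be sketched as follows. This is a minimal illustration under stated assumptions: the per-model importance scores are taken as already computed (no PS-Smart, XGBoost, or GBDT call is made), each model's scores are normalized to sum to 1 before weighting, and the names `combine_importances`, `model_weights`, and `min_score` are hypothetical. The source does not specify the weights or the truncation position.

```python
import numpy as np

def combine_importances(scores_by_model, model_weights, min_score=1e-3):
    """Combine per-model feature importances into one ranking.

    scores_by_model: dict mapping model name -> raw importance per feature.
    model_weights:   dict mapping model name -> non-negative weight.
    Returns (combined scores, feature indices kept, best first).
    """
    n_features = len(next(iter(scores_by_model.values())))
    total = np.zeros(n_features)
    weight_sum = 0.0
    for model, scores in scores_by_model.items():
        scores = np.asarray(scores, dtype=float)
        s = scores.sum()
        normalized = scores / s if s > 0 else scores  # put models on one scale
        total += model_weights[model] * normalized
        weight_sum += model_weights[model]
    total /= weight_sum
    order = np.argsort(-total, kind="stable")  # descending importance
    # Truncation: drop features whose combined contribution is almost zero.
    kept = [int(i) for i in order if total[i] >= min_score]
    return total, kept

# Made-up importances for four features from the three rankers.
scores = {
    "ps_smart": [0.6, 0.3, 0.1, 0.0],
    "xgboost":  [0.5, 0.4, 0.1, 0.0],
    "gbdt":     [0.7, 0.2, 0.1, 0.0],
}
weights = {"ps_smart": 1.0, "xgboost": 1.0, "gbdt": 1.0}
combined, kept = combine_importances(scores, weights)
print(combined, kept)
```

With equal weights the fourth feature has a combined score of zero and is discarded; in practice the weights and `min_score` would be tuned per problem, as the text notes for the truncation position.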

The model training unit 120 trains on the training data through the preset LAFAM network. The LAFAM network includes a supervision network and a preset number of sub-networks: the sub-networks are trained on the features produced by the feature preprocessing unit 110, the supervision network is trained on all training data without feature preprocessing, and the supervision network learns the proportion weights of the sub-networks.

During training of the LAFAMN model, all the data obtained above is first input for learning. The ranked and classified feature data is input into the corresponding sub-networks, and all features without feature preprocessing are input into the supervision network, which learns the proportion weights of the sub-networks. In this way, the training scheme of the LAFAMN model solves the feature imbalance problem by assigning a suitable network to each type of feature. In addition, the present invention completes the self-learning of sub-network weights and feature weights by dynamically adjusting the total loss, so that, without discarding any feature, each feature makes its greatest contribution to the prediction result.
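The interplay between sub-network outputs, supervision-network shares, and the dynamically weighted total loss can be illustrated numerically. The sketch below is not the patent's implementation: it assumes the per-sub-network loss is the absolute gap between score and label, uses SoftMax of the negative losses for the output shares and SoftMax of the losses for the loss weights, and substitutes the share targets for the supervision network's learned output. All variable names are hypothetical.

```python
import numpy as np

def softmax(z):
    """Numerically stable SoftMax over a 1-D array."""
    z = np.asarray(z, dtype=float)
    e = np.exp(z - z.max())
    return e / e.sum()

label = 1.0                                  # the sample should be recommended
subnet_scores = np.array([0.9, 0.6, 0.2, 0.8])  # one score in [0, 1] per sub-network

# Assumed loss: gap between each sub-network's prediction and the label.
losses = np.abs(label - subnet_scores)

# Target the supervision network would be trained towards (assumption):
# lower loss -> larger share of the final output.
share_target = softmax(-losses)

# Weights for the back-propagated total loss (assumption):
# higher loss -> larger weight, so weak sub-networks learn harder.
loss_weights = softmax(losses)

final_score = float(share_target @ subnet_scores)  # combined prediction
total_loss = float(loss_weights @ losses)          # weighted sum of losses
print(share_target, loss_weights, final_score, total_loss)
```

Here the most accurate sub-network (score 0.9) receives the largest share of the final score, while the least accurate one (score 0.2) dominates the total loss, so no feature group is discarded but each contributes according to how reliable it currently is.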

In addition, to solve the class imbalance problem, the present invention uses the double suppression loss function in the appropriate sub-networks to handle highly imbalanced data sets. Used together, the LAFAM network and the double suppression loss function solve the three problems of feature imbalance, sample-count imbalance, and discrimination-difficulty imbalance.

In the classification recommendation unit, the e-commerce data is classified and recommended through the LAFAMN model.

In addition, in a specific embodiment of the present invention, after the LAFAMN model is obtained through model training, the parameters of the LAFAM network and the parameters of the double suppression loss function are further optimized by grid search: the value ranges of the parameters are combined to generate a parameter grid, the combination that gives the best accuracy and F-measure for the network's classification recommendation is selected to obtain the classification threshold, and the trained LAFAMN model is then parameter-optimized according to that threshold. Specifically, as an embodiment, in the classification recommendation process the present invention refers to products whose daily sales rank at the top of their industry and of total sales as "hot-selling products", that is, products with a brisk sales market. The LAFAMN model score is used to rank products by their likelihood of becoming hot-selling, and during classification the present invention also attends to the top-50/100/200 hit rates (HR@50, HR@100, and HR@200) used in engineering practice, aiming to maximize them.
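A grid search of this kind, together with a top-K hit rate, can be sketched in a few lines. Several points are assumptions: HR@K is read here as the fraction of the K highest-scored products that are actually hot-selling, the grid contains only a hypothetical `threshold` parameter, and accuracy on toy data stands in for the real evaluation of a retrained model.

```python
from itertools import product

def hit_rate_at_k(scores, labels, k):
    """Fraction of the k highest-scored items whose label is 1 (assumed HR@K)."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    if not ranked:
        return 0.0
    return sum(labels[i] for i in ranked) / len(ranked)

def grid_search(param_grid, evaluate):
    """Try every parameter combination and keep the one with the best score."""
    names = list(param_grid)
    best_params, best_value = None, float("-inf")
    for values in product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        value = evaluate(params)
        if value > best_value:
            best_params, best_value = params, value
    return best_params, best_value

# Toy model scores and ground-truth labels (1 = hot-selling).
scores = [0.9, 0.8, 0.3, 0.7, 0.1]
labels = [1, 0, 0, 1, 0]

def evaluate(params):
    """Accuracy of thresholding the scores; stands in for the real metric."""
    preds = [1 if s >= params["threshold"] else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

best_params, best_value = grid_search({"threshold": [0.2, 0.5, 0.75]}, evaluate)
hr2 = hit_rate_at_k(scores, labels, 2)
print(best_params, best_value, hr2)
```

In a fuller setup `evaluate` would retrain or rescore the model for each parameter combination and could return a combination of accuracy, F-measure, and HR@K rather than accuracy alone.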

FIG. 3 is a schematic diagram of the theoretical structure of the LAFAM network according to an embodiment of the present invention.

To solve the feature imbalance problem, the present invention proposes a loss-aware feature attention mechanism network, LAFAMN, whose theoretical structure is shown in FIG. 3.

In the LAFAMN shown in FIG. 3, multiple sub-networks are first set up to learn the features. The number and types of sub-networks can be adjusted to suit the problem at hand; the first three sub-networks in FIG. 3 illustrate several different sub-networks.

The input features of the first training data, grouped by category after feature preprocessing, are fed into different sub-networks for learning. Every sub-network outputs a score between 0 and 1 that represents what that network has learned about the current sample: 0 means least likely to be recommended and 1 means most likely to be recommended. Each sub-network computes a loss value from the sample's label value and its predicted value, that is, the gap between the predicted output of the i-th sub-network and the label value of the current sample, where N is the total number of sub-networks (excluding the supervision network). This loss value is not only back-propagated as a loss but, more importantly, serves as the criterion for judging output confidence. The larger the loss value, the lower the sub-network's prediction confidence for the sample, so the network should learn that sub-network more intensively and give it a larger weight in the back-propagated loss. At the same time, a larger loss value means a higher probability that the prediction is wrong, so the sub-network's share in the final output should be lower. Second, based on these properties, the present invention sets up a supervision network, shown as the bottom sub-network in FIG. 3. Its output is the share of each sub-network's output in the total output, and the label used to train it is the weight vector obtained by applying SoftMax to the loss values; the vector has one element per sub-network, N in total (excluding the supervision network).

The SoftMax function is generally used in multi-class classification to map the outputs of multiple neurons into the range (0, 1). The present invention exploits its property of mapping the predicted values of multiple classes exponentially, and uses the sub-network loss values as the pre-mapping input to the SoftMax calculation. For the i-th sub-network, SoftMax is used to estimate the confidence that its prediction succeeds, as shown in formula (1):

(1)

For the confidence, a larger loss means a lower confidence that the prediction succeeds, i.e. a larger gap between the predicted value and the label value; and the smaller the confidence, the lower the sub-network's contribution to the overall network output. The confidence is therefore negatively correlated with the loss. However, if confidence alone determined each sub-network's share of the output, then as soon as one sub-network produced a comparatively good prediction, the whole network would rely almost entirely on that one sub-network, which would make the network less robust and less stable. The present invention therefore wants every sub-network in the LAFAM network to deliver on a best-effort basis, that is, to bring the output of every sub-network to its optimum within one overall round of learning. To that end, a second set of forward (positively correlated) parameters, one per sub-network and N in total (excluding the supervision network), is calculated from the loss values as shown in formula (2):

(2)

For the forward parameters, a larger loss likewise means a lower confidence that the prediction succeeds, i.e. a larger gap between the predicted value and the label value; the larger the forward parameter, the more the sub-network is expected to contribute to the overall loss value of the network. The forward parameters are therefore positively correlated with the loss.
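The two loss-derived weightings can be illustrated with a short sketch. Formulas (1) and (2) are not legible in the source text, so the forms used here are assumptions chosen only to reproduce the stated correlations: `confidence_weights` applies SoftMax to the negated losses (negatively correlated with loss) and `forward_weights` applies SoftMax to the losses themselves (positively correlated with loss). The function names and the sample losses are hypothetical.

```python
import math


def softmax(values, temperature=1.0):
    """Numerically stable SoftMax with a temperature divisor."""
    scaled = [v / temperature for v in values]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def confidence_weights(losses, temperature=1.0):
    # Assumed form: lower loss gives a larger share of the final output.
    return softmax([-loss for loss in losses], temperature)


def forward_weights(losses, temperature=1.0):
    # Assumed form: higher loss gives a larger share of the returned loss.
    return softmax(losses, temperature)


losses = [0.04, 0.25, 0.49]  # illustrative squared errors of three sub-networks
a = confidence_weights(losses)
b = forward_weights(losses)
print(a, b)
```

With these assumed forms, the best-performing sub-network dominates the output weighting while the worst-performing one receives the strongest training signal, which matches the behavior the text describes.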

As shown in FIG. 3, the LAFAM network has three types of outputs and labels, namely:

the overall network output and the label value of the original sample;

the output of each sub-network and the label value of the original sample;

the output of the supervision network and the label value obtained by applying SoftMax to the sub-network losses.

A loss value must be calculated separately for each of these three output types. In this embodiment, each of the three loss values is calculated from the output value and the label value of the corresponding type. Once obtained, the three loss values are summed with weights and back-propagated, so that each type of network receives its own loss value and is optimized accordingly. This embodiment therefore defines a total loss function (return all). So that the back-propagated derivatives can be calculated and verified, the squared loss is used as an example; the procedure is shown in formula (3):

(3)

Here, the three weighting coefficients are parameters of the LAFAM network and can be tuned with grid search or other parameter-tuning methods; y_j is the output of the j-th sub-network; and the temperature coefficient used in the SoftMax operation adjusts the gaps between the values produced by SoftMax. The smaller the temperature coefficient, the steeper the SoftMax curve and the larger the gaps between the output values, which magnifies the differences between similar values and makes hard-to-separate samples easier to distinguish; conversely, the larger the temperature coefficient, the smoother the SoftMax curve and the smaller the gaps between the output values. In this embodiment, this coefficient is set to control how evenly the sub-network losses are weighted.
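As a rough illustration of how the three losses might be combined, the sketch below uses the squared loss throughout. It is not the patent's formula (3), which is not legible in the source text: the coefficient names `alpha`, `beta`, `gamma`, the SoftMax forms, and the way the supervision loss is computed are all assumptions made for the example.

```python
import math


def softmax(values, temperature=1.0):
    """Numerically stable SoftMax with a temperature divisor."""
    scaled = [v / temperature for v in values]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def total_loss(label, sub_outputs, supervisor_output,
               alpha, beta, gamma, temperature=1.0):
    """Weighted sum of overall, sub-network and supervision losses (assumed form)."""
    sub_losses = [(y_j - label) ** 2 for y_j in sub_outputs]
    # Assumed supervision label: SoftMax of negated sub-network losses.
    target_shares = softmax([-loss for loss in sub_losses], temperature)
    # Assumed forward weights: SoftMax of the sub-network losses.
    loss_weights = softmax(sub_losses, temperature)
    overall = sum(a * y for a, y in zip(supervisor_output, sub_outputs))
    loss_overall = (overall - label) ** 2
    loss_subs = sum(w * loss for w, loss in zip(loss_weights, sub_losses))
    loss_supervisor = sum((a - t) ** 2
                          for a, t in zip(supervisor_output, target_shares))
    return alpha * loss_overall + beta * loss_subs + gamma * loss_supervisor


perfect = total_loss(1.0, [1.0, 1.0, 1.0], [1 / 3, 1 / 3, 1 / 3], 1.0, 1.0, 1.0)
imperfect = total_loss(1.0, [0.9, 0.4, 0.7], [1 / 3, 1 / 3, 1 / 3], 1.0, 1.0, 1.0)

# A smaller temperature spreads the SoftMax outputs further apart.
sharp = softmax([0.04, 0.25, 0.49], temperature=0.5)
smooth = softmax([0.04, 0.25, 0.49], temperature=2.0)
print(perfect, imperfect, sharp, smooth)
```

The last two lines show the temperature effect described in the text: the same inputs yield a more peaked distribution at the lower temperature.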

The final output of the network is given by formula (4):

(4)

Here, the weights are the elements of the output vector produced by the supervision network. The diagram in FIG. 3 covers the whole process that the LAFAM network needs for both training and prediction. During training, the LAFAM network computes the three outputs separately, and a loss is computed for each output and back-propagated. At prediction time, however, the parameters output by the supervision network are used directly in the weighted calculation of the total output to obtain the LAFAM network output y, and none of the dash-dotted parts of FIG. 3 are executed.
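The prediction-time behavior described above, in which the supervision network's output weights the sub-network outputs, can be sketched as a plain weighted sum. The weighted-sum form is an assumption inferred from the surrounding description, and the function name and example values are hypothetical.

```python
def lafam_predict(sub_outputs, supervisor_shares):
    """Combine sub-network scores using the supervision network's shares."""
    if len(sub_outputs) != len(supervisor_shares):
        raise ValueError("one share is required per sub-network")
    return sum(a * y for a, y in zip(supervisor_shares, sub_outputs))


# Four illustrative sub-network scores and shares that sum to 1.
y = lafam_predict([0.9, 0.2, 0.6, 0.4], [0.4, 0.1, 0.3, 0.2])
print(y)
```

No loss is computed on this path, which corresponds to the dash-dotted training-only parts of the diagram being skipped.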

Existing classification schemes suffer from imbalance along two dimensions: imbalance in the number of samples between classes and imbalance in how difficult samples are to discriminate. To optimize and resolve both, the present invention further proposes a double suppression loss function (Suppression Loss).

In classification problems, data imbalance usually means that the majority class contains far more samples than the minority class. If the global loss is then computed directly in the usual way (for example with squared loss or cross-entropy loss), the training loss of the majority-class samples dominates. In the scenario of intelligent recommendation of e-commerce products, hot-selling products form the minority class, and finding them is the main goal. To address the sample-count imbalance, a function is used to assign a lower loss contribution to samples that are closer to the majority class, while keeping the loss of samples closer to the minority class as unattenuated as possible, so that the minority class can be learned and optimized more effectively.

The same idea can be applied to the imbalance in discrimination difficulty. The larger the gap between a sample's predicted output and its label value, the farther the prediction is from the label, the higher the error rate, and the lower the prediction confidence. Such samples are usually few in number, yet they are both the focus and the difficulty of prediction. It is therefore necessary to raise the contribution to the total loss of samples with a large gap between predicted output and label value, while suppressing the contribution of high-confidence samples.

In summary, the double suppression loss function proposed in this embodiment consists of three parts. The first part uses a function to suppress the loss contribution of majority-class samples; the second part uses the same function with different parameters to suppress the contribution of easy-to-classify samples; and the last part suppresses the contribution of easy-to-classify samples by using a high-power function to widen the output gap. Its form is shown in formula (5):

(5)

Here, the terms of formula (5) are the final prediction output of the network; the label value of the original sample, which is 0 or 1; and the distance between the sample's predicted output and its label value, referred to in this embodiment as the prediction deviation. The remaining terms are the parameters of the double suppression loss function, which can be set with parameter-tuning methods such as grid search in light of the actual data set and problem.

In a specific application, the LAFAMN model proposed by the present invention calculates the probability that a product becomes a hot-selling product. To evaluate the model's effectiveness on the regression problem, four evaluation metrics are used: root mean squared error (RMSE), mean absolute error (MAE), weighted mean absolute percentage error (WMAPE), and the top-50/100/200 hit rate (HR@50/100/200). The first three assess the model from the regression side, and the last assesses it in terms of the accuracy of the predicted product ranking. On the classification problem, three evaluation metrics are used: Precision, Accuracy, and F-measure.

The RMSE metric better reflects the influence of extreme values on the error. It is calculated as shown in formula (6):

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^2} \tag{6}$$

where $n$ is the number of samples, $\hat{y}_i$ is the model's predicted output, and $y_i$ is the true label value.

The MAE metric shows the regression error very intuitively. It is calculated as shown in formula (7):

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|\hat{y}_i - y_i\right| \tag{7}$$

The WMAPE metric is less affected by extreme values and by individual samples, and reflects the overall prediction quality of the network more evenly. It is calculated as shown in formula (8):

$$\mathrm{WMAPE} = \frac{\sum_{i=1}^{n}\left|\hat{y}_i - y_i\right|}{\sum_{i=1}^{n}\left|y_i\right|} \tag{8}$$
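The three regression metrics can be computed with a few lines of plain Python. This sketch uses the standard definitions of RMSE, MAE and WMAPE; the function names and the sample values are illustrative and do not come from the patent's experiments.

```python
import math


def rmse(predictions, labels):
    """Root mean squared error."""
    n = len(labels)
    return math.sqrt(sum((p - y) ** 2 for p, y in zip(predictions, labels)) / n)


def mae(predictions, labels):
    """Mean absolute error."""
    n = len(labels)
    return sum(abs(p - y) for p, y in zip(predictions, labels)) / n


def wmape(predictions, labels):
    """Total absolute error divided by the total absolute label value."""
    total_error = sum(abs(p - y) for p, y in zip(predictions, labels))
    return total_error / sum(abs(y) for y in labels)


predictions = [0.2, 0.5, 0.9]
labels = [0.0, 0.5, 1.0]
print(rmse(predictions, labels), mae(predictions, labels), wmape(predictions, labels))
```

Note that WMAPE is undefined when every label is zero, so a real evaluation would need to guard against that case.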

The HR metric reflects the accuracy of the model's recommendations, that is, whether the items a user needs are included among the items the model recommends. It is calculated as shown in formula (9):

$$\mathrm{HR@}K = \frac{\mathrm{NumberOfHits@}K}{\left|GT\right|} \tag{9}$$

where $\left|GT\right|$ is the size of the complete test set, and $\mathrm{NumberOfHits@}K$ is the sum, over all users, of the number of items in each user's top-K list that belong to the test set.
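A minimal sketch of the hit-rate calculation follows, using the common definition in which hits in each user's top-K list are summed and divided by the total number of test-set items. The data layout (dictionaries keyed by user) and all names are assumptions for illustration.

```python
def hit_rate_at_k(recommended, relevant, k):
    """Hits in every user's top-k list divided by the total number of test items."""
    hits = 0
    total = 0
    for user, relevant_items in relevant.items():
        top_k = recommended.get(user, [])[:k]
        hits += sum(1 for item in top_k if item in relevant_items)
        total += len(relevant_items)
    return hits / total if total else 0.0


# Hypothetical ranked recommendations and test-set items per user.
recommended = {"u1": ["a", "b", "c"], "u2": ["d", "e", "f"]}
relevant = {"u1": {"b"}, "u2": {"x", "d"}}
hr_at_2 = hit_rate_at_k(recommended, relevant, k=2)
print(hr_at_2)
```

Here two of the three test items appear in a top-2 list, so the hit rate is two thirds.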

The Precision metric effectively reflects how precisely the minority class is identified. It is calculated as shown in formula (10):

$$\mathrm{Precision} = \frac{TP}{TP + FP} \tag{10}$$

where $TP$ is the number of samples predicted positive that are actually positive, $FN$ is the number predicted negative that are actually positive, $FP$ is the number predicted positive that are actually negative, and $TN$ is the number predicted negative that are actually negative.

The Accuracy metric reflects global accuracy and helps ensure that the network does not make one-sided predictions. It is calculated as shown in formula (11):

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + FN + FP + TN} \tag{11}$$

The F-measure metric measures the overall performance of an algorithm on class-imbalanced classification problems. It is calculated as shown in formulas (12) and (13):

$$\mathrm{Recall} = \frac{TP}{TP + FN} \tag{12}$$

$$F\text{-measure} = \frac{\left(1 + \beta^2\right) \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\beta^2 \cdot \mathrm{Precision} + \mathrm{Recall}} \tag{13}$$

where $\beta$ is a coefficient that adjusts the relative weight of Precision and Recall, and is usually set to 1.
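The classification metrics follow directly from the four confusion-matrix counts. The sketch below uses the standard definitions; the counts are made up for illustration and the function name is hypothetical.

```python
def classification_metrics(tp, fn, fp, tn, beta=1.0):
    """Return precision, recall, accuracy and F-measure from confusion counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    f_measure = ((1 + beta ** 2) * precision * recall
                 / (beta ** 2 * precision + recall))
    return precision, recall, accuracy, f_measure


# Illustrative counts for an imbalanced test set of 100 samples.
precision, recall, accuracy, f1 = classification_metrics(tp=8, fn=2, fp=4, tn=86)
print(precision, recall, accuracy, f1)
```

The example shows why accuracy alone can mislead on imbalanced data: accuracy is 0.94 while precision on the minority class is only about 0.67.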

For RMSE, MAE and WMAPE, smaller values indicate a better recommendation system. For HR@50/100/200, Precision, Accuracy and F-measure, larger values indicate a better classification recommendation system.

As shown in FIG. 4, the present invention further provides an e-commerce data classification recommendation method, which performs classification recommendation of e-commerce data based on the e-commerce data classification recommendation system 100 described above, comprising:

S1: performing importance ranking and feature classification on the first training data using a preset ranking method and a preset classification method, to obtain feature preprocessing data of the first training data;

S2: training on the training data with a preset LAFAM network to obtain each type of output and label, calculating loss values from the outputs and labels, and back-propagating the weighted sum of the loss values until the loss values meet a preset requirement, thereby obtaining the LAFAMN model; wherein the training data comprises the first training data and the feature preprocessing data, and the LAFAM network comprises a supervision network and a preset number of sub-networks; the sub-networks are trained on the feature preprocessing data, and the supervision network is trained on the first training data; the supervision network learns the share weights of the sub-networks, and the sub-networks complete self-learning of the sub-network weights and feature weights by dynamically adjusting the total loss; and the double suppression loss function is used in the sub-networks to handle highly imbalanced data sets;

S3: classifying and recommending e-commerce data using the LAFAMN model.

The above e-commerce data classification recommendation method is the implementation method corresponding to the e-commerce data classification recommendation system described earlier. For its specific execution steps, refer to the specific embodiments of that system; they are not repeated here.

As the above embodiments show, the loss-aware feature attention mechanism network (LAFAMN) proposed by the present invention targets the multi-dimensional class imbalance found in product data in current e-commerce recommendation systems, namely feature imbalance, imbalance in discrimination difficulty, and sample-count imbalance, and uses an attention mechanism to integrate and improve the traditional multi-expert learning network. As a complement, a double suppression loss function (Suppression Loss) regulated by classification confidence is also proposed; it suppresses the loss contribution of majority-class and easy-to-judge samples while ensuring that the loss of minority-class, hard-to-judge samples is not reduced at all. Used jointly, the LAFAM network and the double suppression loss function form a complete recommendation system that addresses class-imbalanced classification along all three dimensions. Experiments show that the proposed e-commerce data classification recommendation system and method improve markedly on existing recommendation algorithms in both regression and classification metrics.

The e-commerce data classification recommendation system provided by the present invention is described in more detail below through a specific application embodiment.

Regarding construction of the sample database, this embodiment uses an e-commerce data set from an e-commerce company for experimental verification, referred to below as the Tmall data set. The data set occupies 10.7 GB in total and contains 51,134,193 rows, with 453 raw features. Because many features have missing or incorrectly filled values, the experiments in this embodiment use the 286 features that are already trusted by the system algorithm (features validated for more than one year, with a system fill-in accuracy above 90%).

Because the final output of the LAFAMN model is a continuous value within its output range, this embodiment first treats the task as a regression problem for experimental verification, using RMSE, MAE, WMAPE and HR@50/100/200 to verify the results and the score ranking produced by the network. From a practical standpoint, however, whether a product is or is not a hot-selling product is a yes-or-no judgment. To make the prediction problem practical, the present invention therefore also converts the ranking regression problem into a classification problem for learning and verification. After the sales volumes in the data set are normalized, 0.15 is chosen as the threshold for binary classification. After classification, class 0 (majority class, non-hot-selling products) contains 70,184,766 samples and class 1 (minority class, hot-selling products) contains 20,483 samples, an imbalance ratio of about 3426:1.
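The thresholding step and the reported imbalance ratio can be checked with a small sketch. The 0.15 threshold and the class counts come from the text above; whether a value exactly at the threshold counts as hot-selling is not stated, so the `>=` comparison is an assumption, as are the function name and the sample sales values.

```python
def label_hot_selling(normalized_sales, threshold=0.15):
    """Assign 1 (hot-selling) at or above the threshold, else 0 (assumed rule)."""
    return [1 if sales >= threshold else 0 for sales in normalized_sales]


labels = label_hot_selling([0.02, 0.15, 0.40, 0.149])

# Class counts as reported in the text.
majority_count = 70_184_766
minority_count = 20_483
imbalance_ratio = majority_count / minority_count
print(labels, round(imbalance_ratio))
```

Dividing the two reported counts gives roughly 3426.5, consistent with the stated ratio of about 3426:1.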

In this embodiment, the 286 features of the Tmall data set are divided into basic features, price features, discount features and time-series features. The price features are 41 features such as the absolute price of the current product, the average price within its leaf category, the average price of competing products, the historical lowest price and its corresponding sales volume, and the historical highest price and its corresponding sales volume. The discount features are 28 features such as the discount offered on the current product, the historical average discount, the average discount of competing products, the historical highest discount and its corresponding sales volume, and the historical lowest discount and its corresponding sales volume. The time-series features are 10 features such as the product's daily price, daily sales volume and daily product rating over 30 days. The remaining 207 features are basic features. Together the four categories cover all 286 features; each feature belongs to exactly one category, with no overlap or duplication between categories.

Accordingly, the LAFAM network structure used in this example is shown in FIG. 5. Its network parameters correspond one-to-one with the parameter names in FIG. 3, and it shows the case N = 4, that is, four sub-networks and one supervision network. In FIG. 5, the outermost solid box encloses the training model and the dash-dotted box encloses the prediction model. Features are fed into the four sub-networks by category. The full feature set is learned with the Deep & Cross Network (DCN), which is common in recommendation engineering. DCN consists of a Deep part and a Cross part: the Deep part is a fully connected network that extracts deep feature characteristics, and the Cross part performs feature-crossing calculations that extract broad feature characteristics. DCN accommodates both vertical-category recommendation and exploration of broad interests, and it runs efficiently, so it is used as the learning network for the full feature set. The price features and the discount features each use a simple fully connected neural network, which can extract high-dimensional features of the samples for recommendation prediction. The time-series features are learned with a gated recurrent unit (GRU) network, which has a simple structure and overcomes the long-term dependency problem of RNNs.

In this embodiment, the performance of the LAFAM network on the regression problem is verified using the squared loss as the network's basic back-propagated loss function, and is compared with that of other traditional models (such as GBDT, DCN and MoE). The experimental results are shown in Table 1:

Table 1. Comparison of the performance of different networks on the regression problem

Looking first at Table 1 as a whole, the LAFAM network has a clear advantage on all of HR@50/100/200. On RMSE, MAE and WMAPE it shows no clear disadvantage relative to the other networks, and it is the best on MAE and WMAPE. Its RMSE result is somewhat weaker because, in a highly imbalanced data set, the majority-class samples contribute a large share of the error. The results therefore indicate that, when the task is treated as a regression problem, the LAFAM network effectively mitigates the loss of prediction accuracy caused by feature imbalance and class imbalance, and has a hit-rate advantage over the other traditional networks.

Next, the three hit-rate metrics are analyzed in detail. Comparing different metrics for the same algorithm (reading the table horizontally), from HR@200 to HR@100 to HR@50, that is, from the broadest to the narrowest cut, the accuracy of GBDT falls steadily while that of the LAFAM network rises steadily. In other words, the LAFAM network is better at predicting the very top of the ranking, that is, at pinpointing minority-class samples. Comparing different algorithms on the same metric (reading the table vertically), the LAFAM network is 8.4% higher than the second-best MoE network on HR@200, 10.4% higher on HR@100, and 34.8% higher on HR@50. The narrower the range, the better LAFAMN performs relative to the other networks. It has a clear advantage in predicting minority-class samples and can carry out regression ranking and recommendation well under class imbalance.

The classification verification of the LAFAM network and the double suppression loss function (Suppression Loss) is described next.

First, the network parameters are determined by grid search. The grid-search optimization of some of the parameters is shown in FIG. 6, where the black dots mark the selected parameter values. The parameters finally used are listed in Table 2. The table shows only the optimal parameters of the proposed LAFAM network and the double suppression loss function (Suppression Loss); the other networks and loss functions were tuned in the same way to obtain their optimal settings.

Table 2. Summary of parameter settings

The squared loss (Square), focal loss (Focal Loss), shrinkage loss (Shrinkage Loss) and double suppression loss (Suppression Loss) are each verified under different networks. The comparison networks are MLP, GRU, DCN, MoE and LAFAM. The first four are the sub-networks used within the LAFAM network, so they are tested individually; besides serving as baselines, this amounts to an ablation experiment showing that the joint network improves on each of its sub-networks. The experimental results are shown in Table 3. Each block of the table compares the four loss functions under the same network, and corresponding positions across blocks compare different networks under the same loss function.

Table 3. Comparison of experimental results for different loss functions in different networks

Table 3 shows that, on the network side, the LAFAM network improves more markedly than the other networks on data sets with a high degree of imbalance. On the 1:20 data set, all three of its metrics are the best, on average about 10% higher than the other networks, while on the 1:5 and 1:10 data sets it is on average only about 7% below the best network. The LAFAM network therefore performs stably and with a clear advantage on data sets with class imbalance and feature imbalance. On the loss-function side, the double suppression loss function (Suppression Loss) clearly performs best in every network, because it handles both the imbalance in class counts and the imbalance in sample discrimination confidence well. On average it improves on the other loss functions by about 15%, and this holds across several networks and several degrees of imbalance, so its adaptability is also good. The experimental data indicate that the stability of the double suppression loss function is acceptable.

As shown in FIG. 7, the present invention further provides an electronic device, the electronic device comprising:

at least one processor; and

a memory communicatively connected to the at least one processor; wherein

the memory stores a computer program executable by the at least one processor, and the computer program is executed by the at least one processor so that the at least one processor can perform the steps of the e-commerce data classification recommendation method described above.

Those skilled in the art will appreciate that the structure shown in FIG. 7 does not limit the electronic device 1, which may include fewer or more components than shown, combine certain components, or arrange the components differently.

For example, although not shown, the electronic device 1 may also include a power source (such as a battery) that supplies power to the various components. Preferably, the power source is logically connected to the at least one processor 10 through a power management device, so that charging management, discharging management, power consumption management, and similar functions are implemented through the power management device. The power source may also include any of one or more DC or AC power sources, a recharging device, a power failure detection circuit, a power converter or inverter, a power status indicator, and similar components. The electronic device 1 may also include various sensors, a Bluetooth module, a Wi-Fi module, and the like, which are not described in detail here.

Furthermore, the electronic device 1 may also include a network interface. Optionally, the network interface may include a wired interface and/or a wireless interface (such as a Wi-Fi interface or a Bluetooth interface), and is generally used to establish a communication connection between the electronic device 1 and other electronic devices.

Optionally, the electronic device 1 may further include a user interface. The user interface may be a display or an input unit (such as a keyboard); optionally, it may also be a standard wired or wireless interface. In some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like. The display may also be referred to as a display screen or display unit, and is used to display information processed in the electronic device 1 and to display a visual user interface.

It should be understood that the embodiment is for illustration only, and the scope of the patent application is not limited by this structure.

The e-commerce data classification recommendation program 12 stored in the memory 11 of the electronic device 1 is a combination of multiple instructions which, when run on the processor 10, can implement the following:

Training and learning are performed on the training data through a preset LAFAM network to obtain each type of output and label; loss values are calculated from the outputs and labels; and the weighted sum of the loss values is back-propagated until the loss value meets a preset requirement, thereby obtaining the LAFAMN model. The training data includes the first training data and the feature preprocessing data, and the LAFAM network includes a supervisory network and a preset number of sub-networks. The sub-networks are trained on the feature preprocessing data, and the supervisory network is trained on the first training data. The supervisory network learns the proportional weights of the sub-networks, and the sub-networks complete the self-learning of the sub-network weights and feature weights by dynamically adjusting the total loss. The double suppression loss function is used in the sub-networks to handle highly imbalanced datasets.

Specifically, for the way in which the processor 10 implements the above instructions, reference may be made to the description of the relevant steps in the embodiment corresponding to FIG. 4, which is not repeated here.
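The training flow described above can be sketched roughly as follows for a single sample. This is a minimal illustration under stated assumptions, not the implementation of the invention: the SoftMax input is assumed to be the negated square loss of each sub-network (so that a lower loss gives a larger weight), the sub-network term is assumed to be an unweighted sum, and the names supervisor_targets, final_output, and total_loss as well as the coefficients alpha, beta, and gamma are hypothetical stand-ins for formulas (1), (3), and (4).

```python
import math


def softmax(values, temperature=1.0):
    """Numerically stable SoftMax with a temperature coefficient."""
    scaled = [v / temperature for v in values]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def square_loss(pred, label):
    return (pred - label) ** 2


def supervisor_targets(sub_outputs, label, temperature=1.0):
    """Per-sub-network losses and the label vector for the supervisory network.

    Assumption: the losses are negated before SoftMax, so the resulting
    weights are negatively correlated with the loss values.
    """
    losses = [square_loss(p, label) for p in sub_outputs]
    targets = softmax([-l for l in losses], temperature)
    return losses, targets


def final_output(sub_outputs, weights):
    """Overall output as the weighted sum of the sub-network outputs."""
    return sum(w * p for w, p in zip(weights, sub_outputs))


def total_loss(sub_outputs, supervisor_weights, label,
               alpha=1.0, beta=1.0, gamma=1.0, temperature=1.0):
    """Weighted sum of the three loss types (overall, sub-network, supervisory).

    alpha, beta, and gamma are hypothetical weighting coefficients.
    """
    losses, targets = supervisor_targets(sub_outputs, label, temperature)
    overall = square_loss(final_output(sub_outputs, supervisor_weights), label)
    sub_term = sum(losses)
    sup_term = sum((w - t) ** 2 for w, t in zip(supervisor_weights, targets))
    return alpha * overall + beta * sub_term + gamma * sup_term


if __name__ == "__main__":
    outputs = [0.9, 0.6, 0.2, 0.7]  # scores from four sub-networks, label 1
    losses, targets = supervisor_targets(outputs, 1.0)
    print(targets, final_output(outputs, targets))
```

In this sketch a smaller temperature sharpens the weight vector toward the sub-network with the lowest loss, which is one way to read the role of the temperature coefficient in balancing the sub-network losses.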

Furthermore, if the modules/units integrated in the electronic device 1 are implemented in the form of software functional units and sold or used as an independent product, they may be stored in a computer-readable storage medium. The computer-readable medium may include any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, or a read-only memory (ROM).

The e-commerce data classification recommendation system and method proposed by the present invention have been described above by way of example with reference to the accompanying drawings. However, those skilled in the art should understand that various improvements may be made to the system and method without departing from the content of the present invention. Therefore, the protection scope of the present invention should be determined by the content of the appended claims.

Claims

1. An e-commerce data classification recommendation system, characterized by comprising:

A feature preprocessing unit, used to sort the first training data by importance and classify the features by a preset sorting method and a preset classification method, so as to obtain feature preprocessing data of the first training data;

A model training unit, used to perform learning and training on the training data through a preset Loss Aware Feature Attention Mechanism Network (LAFAMN) to obtain each type of output and label, calculate loss values from the outputs and labels, and back-propagate the weighted sum of the loss values until the loss value meets a preset requirement, thereby obtaining the LAFAMN model; wherein the training data includes the first training data and the feature preprocessing data, and the LAFAMN includes a supervisory network and a preset number of sub-networks; the sub-networks are used to train on the feature preprocessing data, and the supervisory network is used to train on the first training data; the supervisory network learns the proportional weights of the sub-networks, and the sub-networks complete the self-learning of the sub-network weights and feature weights by dynamically adjusting the total loss; and the double suppression loss function is used in the sub-networks to handle highly imbalanced datasets;

A classification recommendation unit, used for performing classification recommendation processing on e-commerce data through the LAFAMN model;

The outputs and labels of the LAFAMN include:

the overall output of the network, together with the label value of the original sample;

the output of each sub-network, together with the label value of the original sample;

the output of the supervisory network, together with the label value obtained by applying SoftMax to the sub-network losses;

wherein the loss values of the three types of output are recorded separately and combined into a total loss function, and the process of calculating the total loss function using the square loss function is shown in formula (3):

(3)

wherein the coefficients in formula (3) are parameters of LAFAMN; y_j is the output of the j-th sub-network; and the temperature coefficient used in the SoftMax operation is set so as to control the weighted balance of the sub-network loss values;

The final output of LAFAMN is shown in formula (4):

(4)

wherein the coefficients in formula (4) are the parameters in the output vector given by the supervisory network;

The double suppression loss function includes three parts, among which:

the first part suppresses the loss contribution of large-class samples by means of a function;

the second part suppresses the contribution of easy-to-classify samples by means of the same function with different parameters;

the last part suppresses the contribution of easy-to-classify samples by enlarging the output gap through a high-power function, as shown in formula (5):

(5)

wherein formula (5) is expressed in terms of the final prediction output of LAFAMN, the label value of the original sample (which takes the value 0 or 1), and the distance between the sample prediction output value and the label value; the remaining symbols are parameters of the double suppression loss function.

2. The e-commerce data classification recommendation system according to claim 1, wherein the feature preprocessing unit comprises:

An importance ranking unit, configured to perform a primary ranking process on the first training data by a preset ranking method, and weight the primary ranking process result to calculate a feature importance score, so as to obtain a final ranking result of the first training data;

The feature classification unit is used to classify the first training data according to a preset classification method.

3. The e-commerce data classification recommendation system according to claim 2, characterized in that:

The preset sorting methods include PS-smart, XGBoost, and GBDT;

The first training data is divided into dense features, sparse features, time series features and basic features by the preset classification method; wherein the basic features are other features that are not included in the three categories of dense features, sparse features and time series features after classification;

The sub-networks perform learning and training processing on input features of different categories respectively.

4. The e-commerce data classification recommendation system according to claim 3, wherein the sub-networks perform learning and training processing on input features of different categories respectively, including:

Input features of different categories after feature preprocessing of the first training data are respectively put into different sub-networks for learning, and all sub-networks output a score between 0 and 1, which represents the learning result of the current sample in the current sub-network; 0 means the sample is least likely to be recommended and 1 means the sample is most likely to be recommended;

Each sub-network calculates a loss value from the sample's label value and its predicted value, giving one loss value per sub-network up to the total number of sub-networks; the larger the loss value, the lower the corresponding sub-network's prediction confidence for the sample, and the lower its share in the final calculation of the output value.

5. The e-commerce data classification recommendation system according to claim 4, characterized in that:

The output of the supervisory network is the proportion of all sub-networks in the total output; and

The label used in training the supervisory network is the weight vector obtained by performing a SoftMax calculation on the sub-network loss values l_i;

a quantity derived from the loss values is taken as the pre-mapping input to the SoftMax calculation over the sub-networks, and SoftMax is used to estimate, for each sub-network, the confidence that its prediction succeeds, as shown in formula (1):

(1)

wherein the confidence and the loss value are negatively correlated.

6. The e-commerce data classification recommendation system according to claim 5, further comprising:

Another set of forward parameters is calculated from the loss values so that the output of each sub-network is optimal, the element values of the parameter vector being given by formula (2):

(2)

wherein the forward parameter and the loss value are positively correlated.

7. An e-commerce data classification recommendation method, based on the e-commerce data classification recommendation system according to any one of claims 1 to 6, comprising:

Sorting the first training data by importance and classifying its features by a preset sorting method and a preset classification method to obtain feature preprocessing data of the first training data;

Performing learning and training on the training data through the preset LAFAMN to obtain each type of output and label, calculating loss values from the outputs and labels, and back-propagating the weighted sum of the loss values until the loss value meets a preset requirement, thereby obtaining the LAFAMN model; wherein the training data includes the first training data and the feature preprocessing data, and the LAFAMN includes a supervisory network and a preset number of sub-networks; the sub-networks are used to train on the feature preprocessing data, and the supervisory network is used to train on the first training data; the supervisory network learns the proportional weights of the sub-networks, and the sub-networks complete the self-learning of the sub-network weights and feature weights by dynamically adjusting the total loss; and the double suppression loss function is used in the sub-networks to handle highly imbalanced datasets;

The LAFAMN model is used to perform classification and recommendation processing on e-commerce data.

8. An electronic device, characterized in that the electronic device comprises:

at least one processor; and,

a memory communicatively connected to the at least one processor; wherein,

The memory stores a computer program that can be executed by the at least one processor, and the computer program is executed by the at least one processor so that the at least one processor can execute the steps in the e-commerce data classification recommendation method as described in claim 7.