CN114529351A - Commodity category prediction method, device, equipment and storage medium

Info

Publication number: CN114529351A
Application number: CN202210240176.9A
Authority: CN
Inventors: 薛睿蓉; 王成; 陈承泽
Original assignee: Shanghai Weimeng Enterprise Development Co ltd
Current assignee: Shanghai Weimeng Enterprise Development Co ltd
Priority date: 2022-03-10
Filing date: 2022-03-10
Publication date: 2022-05-24

Abstract

The application discloses a commodity category prediction method, apparatus, device and storage medium. The method comprises the following steps: inputting preset commodity information into a pre-training model for processing to obtain the commodity category corresponding to the preset commodity information, the pre-training model being an existing, already-trained model used for predicting commodity categories; performing distribution statistics on the commodity categories corresponding to the preset commodity information to obtain a category distribution result, and performing distribution alignment on the preset commodity information according to that result; and fine-tuning the pre-training model with the distribution-aligned preset commodity information to obtain a target prediction model, which is used to predict the commodity category of commodity information to be predicted. In this way, the pre-training model trained on a large-scale corpus is transferred to the target prediction model, which avoids extensive manual labeling and improves the efficiency and accuracy of commodity category prediction.

Description

Commodity category prediction method, device, equipment and storage medium

Technical Field

The invention relates to the technical field of artificial intelligence, in particular to a commodity category prediction method, a commodity category prediction device, commodity category prediction equipment and a storage medium.

Background

Big data is ubiquitous, and data maintenance is especially important on rapidly developing e-commerce platforms. Handling the large number of commodity categories is the most pressing part of that work. Because each e-commerce platform has its own commodity categories, adaptively sorting all commodities into categories and organizing multi-level categories for different platforms is essential to improving platform competitiveness and working efficiency. Most existing commodity category management schemes combine direct manual labeling with model fine-tuning. However, category labeling of commodities is complex, manual labeling is costly and inefficient, and its accuracy cannot be guaranteed.

Therefore, how to provide an efficient and accurate method for predicting the category of a commodity is a technical problem to be solved urgently by those skilled in the art.

Disclosure of Invention

In view of this, the present invention provides a method, an apparatus, a device and a storage medium for predicting a commodity category, which can avoid a large number of manual labeling processes and improve the efficiency and accuracy of predicting the commodity category. The specific scheme is as follows:

A first aspect of the present application provides a method for predicting a category of a commodity, including:

inputting preset commodity information into a pre-training model for processing to obtain a commodity category corresponding to the preset commodity information; the pre-training model is an existing, already-trained model used for predicting the commodity category;

carrying out distribution statistics on the commodity categories corresponding to the preset commodity information to obtain corresponding category distribution results, and carrying out distribution alignment processing on the preset commodity information according to the category distribution results;

and carrying out model fine adjustment on the pre-training model by using the preset commodity information after distribution alignment to obtain a target prediction model, so as to predict the commodity category of the commodity information to be predicted by using the target prediction model.

Optionally, before inputting the preset commodity information into the pre-training model for processing, the method further includes:

acquiring third-party commodity information and corresponding commodity categories;

and carrying out data cleaning on the third-party commodity information and the corresponding commodity category in an active learning mode so as to carry out model fine adjustment on the pre-training model by using the cleaned data.

Optionally, performing data cleaning on the third-party commodity information and the corresponding commodity category in an active learning manner includes:

training a classifier for each commodity category using the third-party commodity information in that commodity category;

and predicting the commodity category of the corresponding third-party commodity information by using the trained classifier, and deleting the third-party commodity information whose confidence is lower than a first preset threshold.

Optionally, before the training of the classifier for the commodity category of the corresponding category by respectively using the third-party commodity information in each commodity category, the method further includes:

and dividing the third-party commodity information and the corresponding commodity categories into a training set, a testing set and a verification set, and screening the third-party commodity information from the training set to train the classifiers of the commodity categories of the corresponding categories.

Optionally, the method for predicting the category of the commodity further includes:

and verifying the model effect of the trained classifier of each commodity category by five-fold cross-validation.

Optionally, performing data cleaning on the third-party commodity information and the corresponding commodity category in an active learning manner includes:

dividing the third-party commodity information into samples according to commodity category by using a query function, to obtain positive samples and negative samples corresponding to each commodity category; the positive samples comprise third-party commodity information whose commodity category is the corresponding category, and the negative samples comprise third-party commodity information whose commodity category is another category;

and training a classifier for each commodity category using the corresponding positive samples and negative samples, and, during training, moving third-party commodity information in the negative samples whose confidence is greater than a second preset threshold into the positive samples and continuing training until the classifier converges.

Optionally, the performing distribution statistics on the commodity categories corresponding to the preset commodity information includes:

and determining sampling commodity information from the preset commodity information, and verifying the commodity category of the sampling commodity information so as to perform distribution statistics on the commodity category corresponding to the sampling commodity information after verification.

A second aspect of the present application provides a commodity category prediction apparatus including:

a preprocessing module, configured to input preset commodity information into a pre-training model for processing to obtain a commodity category corresponding to the preset commodity information; the pre-training model is an existing, already-trained model used for predicting the commodity category;

the distribution alignment module is used for carrying out distribution statistics on the commodity categories corresponding to the preset commodity information to obtain corresponding category distribution results and carrying out distribution alignment processing on the preset commodity information according to the category distribution results;

and the first fine tuning module is used for carrying out model fine tuning on the pre-training model by using the preset commodity information after distribution alignment to obtain a target prediction model so as to predict the commodity category of the commodity information to be predicted by using the target prediction model.

A third aspect of the application provides an electronic device comprising a processor and a memory; wherein the memory is used for storing a computer program which is loaded and executed by the processor to realize the aforementioned commodity category prediction method.

A fourth aspect of the present application provides a computer-readable storage medium, in which computer-executable instructions are stored, and when the computer-executable instructions are loaded and executed by a processor, the method for predicting the categories of the aforementioned goods is implemented.

In the present application, preset commodity information is first input into a pre-training model for processing to obtain the commodity category corresponding to the preset commodity information, the pre-training model being an existing, already-trained model used for predicting commodity categories. Distribution statistics are then performed on the commodity categories corresponding to the preset commodity information to obtain a category distribution result, and distribution alignment is performed on the preset commodity information according to that result. Finally, the pre-training model is fine-tuned with the distribution-aligned preset commodity information to obtain a target prediction model, which is used to predict the commodity category of commodity information to be predicted. In this way, the pre-training model trained on a large-scale corpus is transferred to the target prediction model, that is, commodity category information is transferred to the target prediction model, which avoids extensive manual labeling and improves the efficiency and accuracy of commodity category prediction.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the provided drawings without creative efforts.

FIG. 1 is a flow chart of a method for predicting a category of a commodity according to the present application;

FIG. 2 is a schematic diagram of a specific method for predicting merchandise categories according to the present application;

FIG. 3 is a diagram of a specific data cleansing process provided herein;

FIG. 4 is a schematic structural diagram of a device for predicting commodity categories according to the present application;

FIG. 5 is a structural diagram of an electronic device for predicting commodity categories according to the present application.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

Most of the existing commodity category management schemes combine direct manual labeling and fine tuning models together, however, the category labeling of commodities is complex, the manual labeling cost is high, the efficiency is low, and meanwhile, the accuracy cannot be guaranteed. Aiming at the technical defects, the commodity category prediction scheme is provided, a large number of manual labeling processes can be avoided, and commodity category prediction efficiency and accuracy are improved.

Fig. 1 is a flowchart of a method for predicting a commodity category according to an embodiment of the present disclosure. Referring to fig. 1, the method for predicting the categories of goods includes:

S11: inputting preset commodity information into a pre-training model for processing to obtain a commodity category corresponding to the preset commodity information; the pre-training model is an existing, already-trained model used for predicting the commodity category.

In this embodiment, preset commodity information is input into a pre-training model for processing to obtain the commodity category corresponding to the preset commodity information. The pre-training model is an existing, already-trained model used for predicting commodity categories. Specifically, a pre-training model (PTM) is a model trained on a large amount of text produced in everyday human life, so that it learns the probability distribution of each character or word appearing in that text and thereby models the distribution of the text. Because the corpus itself serves as the training label for the language model, the language model can be trained on large-scale corpora almost without limit. This large-scale corpus gives the pre-training model strong capabilities, and downstream category management tasks perform better when built on the pre-training model, that is, on a language model trained on a large-scale corpus. In addition, various classification and regression NLP tasks can be attached downstream of the pre-training model.

Further, to ensure the transfer effect, the pre-training model is not only an existing open-source model trained on a large-scale public corpus; it also needs to be fine-tuned on third-party public commodity category data. Fine-tuning is the process of applying a pre-trained model to one's own data set and adapting its parameters to that data set. Specifically, in this embodiment, the third-party commodity information and the corresponding commodity categories are obtained first, and are then cleaned in an active learning manner, so that the pre-training model is fine-tuned with the cleaned data. Public corpora suffer from common NLP problems such as inconsistent labeling and ambiguous commodity names. Therefore, after the third-party data is obtained, the labeled corpus is cleaned using the idea of active learning.

This embodiment provides two schemes for data cleaning. In the first scheme, the third-party commodity information and the corresponding commodity categories are divided into a training set, a test set and a validation set (train, test and valid), where valid is fixed and train and test are selected according to the scheme. Third-party commodity information is screened from the training set to train the classifier for each commodity category. A classifier is then trained for each commodity category using the third-party commodity information in that category. Finally, the trained classifier predicts the commodity category of the corresponding third-party commodity information, and third-party commodity information whose confidence is lower than a first preset threshold is deleted. Through active learning, untrustworthy samples are removed during model training and the commodity categories labeled by the model are iterated continuously, which improves the accuracy of commodity category prediction.
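The confidence-based deletion step of the first scheme can be illustrated with a minimal Python sketch. It is not the implementation of the application: the names `clean_by_confidence`, `keyword_classifier` and `first_threshold` are hypothetical, and each per-category classifier is assumed to be a callable that maps commodity text to a confidence in [0, 1]. A toy keyword classifier stands in for a trained one.

```python
from typing import Callable, Dict, List, Tuple

Sample = Tuple[str, str]  # (commodity text, labelled category)


def clean_by_confidence(
    samples: List[Sample],
    classifiers: Dict[str, Callable[[str], float]],
    first_threshold: float,
) -> List[Sample]:
    """Keep only samples whose labelled category is confirmed with enough confidence."""
    kept = []
    for text, category in samples:
        # Assumed interface: the classifier of the labelled category returns a confidence.
        confidence = classifiers[category](text)
        if confidence >= first_threshold:
            kept.append((text, category))
    return kept


def keyword_classifier(keyword: str) -> Callable[[str], float]:
    """Toy stand-in for a trained classifier: 1.0 if the keyword occurs, else 0.1."""
    return lambda text: 1.0 if keyword in text.lower() else 0.1


classifiers = {"shoes": keyword_classifier("shoe"), "phones": keyword_classifier("phone")}
samples = [
    ("running shoe size 42", "shoes"),
    ("phone case", "shoes"),  # labelled as shoes, but the shoes classifier is not confident
    ("5G phone 128GB", "phones"),
]
cleaned = clean_by_confidence(samples, classifiers, first_threshold=0.5)
```

In this toy run the second sample is dropped because the classifier of its labelled category scores it below the threshold; the threshold value itself is an illustrative choice.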

In addition, to ensure the cleaning effect, the trained classifier of each commodity category can be validated by five-fold cross-validation. In five-fold cross-validation, the labeled data is divided into five parts; each part is used in turn as the validation set, and the remaining four parts are used as the training set. The model is trained on the training set and its predictions are evaluated on the validation set. Observing the results on five validation sets reflects the model's real performance more reliably than a single result.
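As an illustration of the five-fold split described above, the following Python sketch generates the five (training, validation) index partitions. The function name `five_fold_splits` is hypothetical, and the sketch assumes the data has already been shuffled if a random partition is wanted.

```python
from typing import Iterator, List, Tuple


def five_fold_splits(
    n_samples: int, n_folds: int = 5
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, validation_indices) pairs, one pair per fold."""
    indices = list(range(n_samples))
    # Fold i takes every n_folds-th index starting at i, so fold sizes differ by at most one.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for i, validation in enumerate(folds):
        # The other four folds together form the training set for this round.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


splits = list(five_fold_splits(10))
```

Each sample index appears in exactly one validation set, so every labeled item is used once for validation and four times for training.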

In the second scheme, a query function is used to divide the third-party commodity information into samples according to commodity category, yielding positive samples and negative samples for each commodity category. The query function is the query function used in active learning. The machine learning model is updated by incremental learning or retraining, so that manually labeled data is incorporated into the model and the model effect improves. The positive samples comprise third-party commodity information whose commodity category is the corresponding category, and the negative samples comprise third-party commodity information whose commodity category is another category. A classifier is then trained for each commodity category using the positive and negative samples, and, during training, third-party commodity information in the negative samples whose confidence is greater than a second preset threshold is moved into the positive samples and training continues until the classifier converges. That is, positive samples P (positive) are first randomly selected, and a small amount of data U (unlabelled) is then selected from the remaining data as negative samples. A classifier is then trained and used to predict the data that was not selected; data with higher confidence is added to P, and the rest is returned to U. This process is repeated until convergence. See FIG. 3 for details.
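The iterative positive/unlabelled procedure can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the application's implementation: `pu_clean` and `train_toy_classifier` are hypothetical names, the token-overlap scorer is a toy stand-in for a real classifier, and convergence is approximated as a round in which nothing is promoted.

```python
import random
from typing import Callable, List, Tuple


def train_toy_classifier(positives: List[str], negatives: List[str]) -> Callable[[str], float]:
    """Toy classifier (assumption): score = share of tokens seen only in positives."""
    pos_tokens = {t for text in positives for t in text.split()}
    neg_tokens = {t for text in negatives for t in text.split()}
    distinctive = pos_tokens - neg_tokens

    def score(text: str) -> float:
        tokens = text.split()
        return sum(t in distinctive for t in tokens) / len(tokens) if tokens else 0.0

    return score


def pu_clean(
    positives: List[str],
    unlabelled: List[str],
    second_threshold: float,
    n_negatives: int,
    seed: int = 0,
    max_rounds: int = 20,
) -> Tuple[List[str], List[str]]:
    """Repeatedly promote confident unlabelled items into the positive set."""
    rng = random.Random(seed)
    positives, unlabelled = list(positives), list(unlabelled)
    for _ in range(max_rounds):
        if len(unlabelled) <= n_negatives:
            break
        # A small random part of U serves as negative samples for this round.
        negatives = rng.sample(unlabelled, n_negatives)
        score = train_toy_classifier(positives, negatives)
        # Predict the part of U that was not selected as negatives.
        candidates = [t for t in unlabelled if t not in negatives]
        promoted = [t for t in candidates if score(t) > second_threshold]
        if not promoted:
            break  # nothing moved: treated here as convergence
        positives.extend(promoted)
        unlabelled = [t for t in unlabelled if t not in promoted]
    return positives, unlabelled


seed_positives = ["red running shoe", "leather shoe"]
pool = ["blue running shoe", "phone case", "usb cable", "laptop bag"]
final_positives, final_unlabelled = pu_clean(seed_positives, pool, second_threshold=0.5, n_negatives=1)
```

Which items are promoted depends on the random negatives drawn in each round, so a real pipeline would likely run more rounds and use a stricter convergence test than this sketch does.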

S12: and carrying out distribution statistics on the commodity categories corresponding to the preset commodity information to obtain corresponding category distribution results, and carrying out distribution alignment processing on the preset commodity information according to the category distribution results.

In this embodiment, distribution statistics are performed on the commodity categories corresponding to the preset commodity information to obtain a category distribution result, and distribution alignment is performed on the preset commodity information according to that result. To improve processing efficiency, distribution statistics only need to cover a small part of the commodity information and categories. Therefore, before the statistics are computed, a small portion of all commodity category data is randomly sampled and manually verified, and only the category distribution of the manually verified data is counted. All commodity data is then aligned to that distribution. Specifically, sampled commodity information is determined from the preset commodity information, and the commodity categories of the sampled commodity information are verified, so that distribution statistics are performed on the commodity categories corresponding to the verified sampled commodity information. Sampling here means extracting a portion of the data. Because the full data set is often too large for the available manpower or computing power, a portion of the data has to be extracted for manual labeling or model training. There are many sampling methods, and the choice depends mainly on the purpose of sampling. In practice, the amount of sampled data is usually determined by the number of categories and the available manpower, which is not limited in this embodiment.
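The sample-verify-count procedure can be sketched in Python as follows. The names are hypothetical: `sampled_category_distribution` is an illustrative function, and `verify` stands in for the manual check, modelled here as a callable that returns the corrected category for a sampled item.

```python
import random
from collections import Counter
from typing import Callable, Dict, List, Tuple

Item = Tuple[str, str]  # (commodity text, category predicted by the pre-training model)


def sampled_category_distribution(
    items: List[Item],
    sample_size: int,
    verify: Callable[[str, str], str],
    seed: int = 0,
) -> Dict[str, float]:
    """Randomly sample items, verify their categories, and return category shares."""
    rng = random.Random(seed)
    sample = rng.sample(items, min(sample_size, len(items)))
    # `verify` is an assumed stand-in for manual verification of each sampled item.
    verified = [verify(text, predicted) for text, predicted in sample]
    counts = Counter(verified)
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}


items = [(f"shoe {i}", "shoes") for i in range(8)] + [(f"phone {i}", "phones") for i in range(2)]
# Here the verifier accepts every prediction and the sample covers all items.
distribution = sampled_category_distribution(items, sample_size=10, verify=lambda text, predicted: predicted)
```

Only the verified sample contributes to the counted distribution, which is what keeps the manual effort small relative to the full data set.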

It should be noted that distribution alignment means down-sampling each category of the data to be aligned so that the resulting category distribution approximates the desired distribution. In practice, however, the distribution of commodity categories is long-tailed, and categories with too little data make it hard for the model to learn the relevant information during training. Therefore, tail categories do not need to be reduced.
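One possible reading of distribution alignment with unreduced tail categories is sketched below in Python. The quota rule is an assumption made for illustration and is not stated in the application: head categories are down-sampled to the largest size at which their relative shares match the target, while categories with fewer than `tail_min_count` items, or with no target share, are kept whole. The names `align_distribution` and `tail_min_count` are hypothetical.

```python
import random
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

Item = Tuple[str, str]  # (commodity text, predicted category)


def align_distribution(
    items: List[Item],
    target: Dict[str, float],
    tail_min_count: int,
    seed: int = 0,
) -> List[Item]:
    """Down-sample head categories toward the target shares; keep tail categories whole."""
    rng = random.Random(seed)
    by_category = defaultdict(list)
    for item in items:
        by_category[item[1]].append(item)

    # Head categories: enough data and a known target share; only these are reduced.
    head = [
        c for c, group in by_category.items()
        if len(group) >= tail_min_count and target.get(c, 0.0) > 0.0
    ]
    # Largest scale at which every head category can still supply its target share.
    scale = min((len(by_category[c]) / target[c] for c in head), default=0.0)

    aligned = []
    for category, group in by_category.items():
        if category in head:
            quota = min(len(group), round(scale * target[category]))
            aligned.extend(rng.sample(group, quota))
        else:
            aligned.extend(group)  # tail or unknown categories are not reduced
    return aligned


items = (
    [(f"shoe {i}", "shoes") for i in range(60)]
    + [(f"phone {i}", "phones") for i in range(30)]
    + [(f"piano {i}", "pianos") for i in range(2)]
)
aligned = align_distribution(items, target={"shoes": 0.5, "phones": 0.5}, tail_min_count=10)
aligned_counts = Counter(category for _, category in aligned)
```

In this toy run the 60 shoe items are reduced to 30 to match the 30 phone items, while the two piano items are left untouched as a tail category.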

S13: and carrying out model fine adjustment on the pre-training model by using the preset commodity information after distribution alignment to obtain a target prediction model, so as to predict the commodity category of the commodity information to be predicted by using the target prediction model.

In this embodiment, the pre-training model is fine-tuned with the distribution-aligned preset commodity information to obtain a target prediction model, which is used to predict the commodity category of commodity information to be predicted. The pre-training language model is fine-tuned again with the aligned data, and the above steps are repeated until convergence. At this point, transfer learning is complete: labor cost is reduced to a minimum, the data does not need to be labeled directly by large numbers of people, and the model's performance transfers well to the data. Put simply, transfer learning transfers the parameters of a trained model to a new model to help train the new model. Because most data or tasks are related, transfer learning allows the learned model parameters (which can also be understood as the knowledge learned by the model) to be shared with the new model in some way, which accelerates and optimizes the new model's learning so that it does not have to learn from scratch as most networks do.
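The repetition of steps S11 to S13 until convergence can be sketched as a small Python driver loop. The sketch assumes hypothetical callables `align` and `fine_tune` supplied by the caller, and it treats "the predicted categories no longer change between rounds" as the convergence test, which is an assumption made for illustration rather than a criterion given in the application.

```python
from typing import Callable, List, Optional, Tuple

Model = Callable[[str], str]  # maps commodity text to a predicted category
Labelled = List[Tuple[str, str]]


def transfer_until_stable(
    model: Model,
    texts: List[str],
    align: Callable[[Labelled], Labelled],
    fine_tune: Callable[[Model, Labelled], Model],
    max_rounds: int = 10,
) -> Model:
    """Repeat predict -> align -> fine-tune until predictions stop changing."""
    previous: Optional[Labelled] = None
    for _ in range(max_rounds):
        labelled = [(text, model(text)) for text in texts]  # S11: predict categories
        if labelled == previous:
            break  # assumed convergence test: labels unchanged since last round
        aligned = align(labelled)           # S12: distribution alignment
        model = fine_tune(model, aligned)   # S13: fine-tune on aligned data
        previous = labelled
    return model


def toy_model(text: str) -> str:
    """Toy stand-in for the pre-training model."""
    return "shoes" if "shoe" in text else "other"


fine_tune_calls = []


def toy_fine_tune(model: Model, data: Labelled) -> Model:
    """Toy stand-in that records the call and returns the model unchanged."""
    fine_tune_calls.append(len(data))
    return model


final_model = transfer_until_stable(
    toy_model, ["red shoe", "usb cable"], align=lambda data: data, fine_tune=toy_fine_tune
)
```

Because the toy fine-tuning step returns the model unchanged, the second round reproduces the same labels and the loop stops after a single fine-tuning call.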

Therefore, in the embodiment of the present application, preset commodity information is first input into a pre-training model for processing to obtain the commodity category corresponding to the preset commodity information, the pre-training model being an existing, already-trained model used for predicting commodity categories. Distribution statistics are then performed on the commodity categories corresponding to the preset commodity information to obtain a category distribution result, and distribution alignment is performed on the preset commodity information according to that result. Finally, the pre-training model is fine-tuned with the distribution-aligned preset commodity information to obtain a target prediction model, which is used to predict the commodity category of commodity information to be predicted. In this way, the pre-training model trained on a large-scale corpus is transferred to the target prediction model, that is, commodity category information is transferred to the target prediction model, which avoids extensive manual labeling and improves the efficiency and accuracy of commodity category prediction.

Referring to fig. 4, the embodiment of the present application further discloses a device for predicting commodity categories, which includes:

the preprocessing module 11 is configured to input preset commodity information into a pre-training model for processing to obtain a commodity category corresponding to the preset commodity information; the pre-training model is an existing, already-trained model used for predicting the commodity category;

the distribution alignment module 12 is configured to perform distribution statistics on the commodity categories corresponding to the preset commodity information to obtain corresponding category distribution results, and perform distribution alignment processing on the preset commodity information according to the category distribution results;

the first fine tuning module 13 is configured to perform model fine tuning on the pre-training model by using the preset commodity information after distribution alignment to obtain a target prediction model, so as to predict the commodity category of the commodity information to be predicted by using the target prediction model.

In some embodiments, the commodity category prediction device further includes:

the acquisition module is used for acquiring the third-party commodity information and the corresponding commodity category;

the cleaning module is used for cleaning data of the third-party commodity information and the corresponding commodity category in an active learning mode;

and the second fine tuning module is used for carrying out model fine tuning on the pre-training model by using the cleaned data.

In some embodiments, the cleaning module specifically includes a first cleaning submodule and a second cleaning submodule, where the first cleaning submodule includes:

the first data dividing unit is used for dividing the third-party commodity information and the corresponding commodity categories into a training set, a testing set and a verification set, and screening the third-party commodity information from the training set to train the classifiers of the commodity categories of the corresponding types;

the first training unit is used for training the classifier of the commodity category of the corresponding category by respectively utilizing the third-party commodity information in each commodity category;

the prediction unit is used for predicting the commodity category of the corresponding third-party commodity information by using the trained classifier;

the deleting unit is used for deleting the third-party commodity information with the confidence coefficient smaller than the first preset threshold value;

the verification unit is used for verifying the model effect of the trained classifier of each commodity category in a five-fold cross verification mode;

the second cleaning submodule includes:

the second data dividing unit is used for carrying out sample division on the third-party commodity information according to the commodity category by utilizing the query function to respectively obtain a positive sample and a negative sample corresponding to each commodity category; the positive sample comprises third-party commodity information which is consistent in commodity category and is of a corresponding type, and the negative sample comprises third-party commodity information which is of a commodity category and is of another type;

and the second training unit is used for training the classifiers of the commodity categories of the corresponding types by respectively utilizing the positive samples and the negative samples, and moving the third-party commodity information with the confidence level larger than a second preset threshold value in the negative samples in the training process into the positive samples to continue training until the classifiers are converged.

In some specific embodiments, the distribution alignment module is further specifically configured to determine sampled commodity information from the preset commodity information, and check a commodity category of the sampled commodity information, so as to perform distribution statistics on a commodity category corresponding to the checked sampled commodity information.

Further, the embodiment of the application also provides electronic equipment. FIG. 5 is a block diagram illustrating an electronic device 20 according to an exemplary embodiment, and the contents of the diagram should not be construed as limiting the scope of use of the present application in any way.

Fig. 5 is a schematic structural diagram of an electronic device 20 according to an embodiment of the present disclosure. The electronic device 20 may specifically include: at least one processor 21, at least one memory 22, a power supply 23, a communication interface 24, an input output interface 25, and a communication bus 26. The memory 22 is configured to store a computer program, and the computer program is loaded and executed by the processor 21 to implement the relevant steps in the commodity category prediction method disclosed in any one of the foregoing embodiments.

In this embodiment, the power supply 23 is configured to provide an operating voltage for each hardware device on the electronic device 20; the communication interface 24 can create a data transmission channel between the electronic device 20 and an external device, and a communication protocol followed by the communication interface is any communication protocol applicable to the technical solution of the present application, and is not specifically limited herein; the input/output interface 25 is configured to obtain external input data or output data to the outside, and a specific interface type thereof may be selected according to specific application requirements, which is not specifically limited herein.

In addition, the memory 22 serves as a carrier for resource storage and may be a read-only memory, a random access memory, a magnetic disk, an optical disk or the like. The resources stored on it may include an operating system 221, a computer program 222, data 223 and so on, and the storage may be transient or permanent.

The operating system 221 is used for managing and controlling each hardware device and the computer program 222 on the electronic device 20, so as to realize the operation and processing of the mass data 223 in the memory 22 by the processor 21, and may be Windows Server, Netware, Unix, Linux, and the like. The computer program 222 may further include a computer program that can be used to perform other specific tasks in addition to the computer program that can be used to perform the commodity category prediction method disclosed in any of the foregoing embodiments and executed by the electronic device 20. Data 223 may include sample data collected by electronic device 20.

Further, an embodiment of the present application further discloses a storage medium, where a computer program is stored, and when the computer program is loaded and executed by a processor, the steps of the method for predicting the categories of the commodities disclosed in any of the foregoing embodiments are implemented.

The embodiments are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be cross-referenced. Because the device disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is brief, and the relevant details can be found in the description of the method.

Finally, it should also be noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an …" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.

The above detailed description is provided for the method, apparatus, device and storage medium for predicting the category of goods provided by the present invention, and the principle and implementation of the present invention are explained in this document by applying specific examples, and the description of the above examples is only used to help understanding the method and core idea of the present invention; meanwhile, for a person skilled in the art, according to the idea of the present invention, there may be variations in the specific embodiments and the application scope, and in summary, the content of the present specification should not be construed as a limitation to the present invention.

Claims

1. A method for predicting a category of a commodity, comprising:

2. The method for predicting the commodity category according to claim 1, wherein before inputting the preset commodity information into the pre-training model for processing, the method further comprises:

3. The method for predicting the commodity category according to claim 2, wherein the data cleaning of the third-party commodity information and the corresponding commodity category by the active learning method comprises:

4. The method of claim 3, wherein before training the classifier for the category of commodities of the corresponding category using the third-party commodity information in each category of commodities, the method further comprises:

5. The method for predicting categories of commodities as claimed in claim 3, further comprising:

6. The method for predicting the commodity category according to claim 2, wherein the data cleaning of the third-party commodity information and the corresponding commodity category by the active learning method comprises:

7. The method according to any one of claims 1 to 6, wherein the performing distribution statistics on the commodity category corresponding to the preset commodity information includes:

8. A commodity category prediction device, comprising:

9. An electronic device, comprising a processor and a memory; wherein the memory is for storing a computer program that is loaded and executed by the processor to implement the commodity category prediction method of any one of claims 1 to 7.

10. A computer-readable storage medium storing computer-executable instructions which, when loaded and executed by a processor, carry out a method of commodity category prediction according to any one of claims 1 to 7.