CN113806537A - Commodity category classification method and device, equipment, medium and product thereof - Google Patents


Info

Publication number
CN113806537A
CN113806537A
Authority
CN
China
Prior art keywords
training
layer
text
classification
text feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111075426.XA
Other languages
Chinese (zh)
Inventor
叶朝鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Huaduo Network Technology Co Ltd
Original Assignee
Guangzhou Huaduo Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Huaduo Network Technology Co Ltd filed Critical Guangzhou Huaduo Network Technology Co Ltd
Priority to CN202111075426.XA priority Critical patent/CN113806537A/en
Publication of CN113806537A publication Critical patent/CN113806537A/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F 16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/084 Backpropagation, e.g. using gradient descent

Abstract

The application discloses a commodity category classification method and a corresponding device, equipment, medium and product. The method comprises the following steps: acquiring a title text corresponding to a commodity object; and calling a text feature extraction model to extract text feature information from the title text. In the training process of the text feature extraction model, the hierarchical structure of the category tree corresponding to the commodity classification is trained layer by layer with the same training sample, and during the training of each layer the weight parameters of the text feature extraction model are corrected with the actual loss value of the current layer until the model is trained to a convergence state, where the actual loss value is obtained by fusing the loss function value of the current layer with the actual loss values of the previously trained layers. Classification is then performed based on the text feature information, and the classification attribute of the commodity object is marked with the classification result, which comprises the category labels of all levels having hierarchical membership in the category tree. The model of the present application can be efficiently trained to convergence.

Description

Commodity category classification method and device, equipment, medium and product thereof
Technical Field
The present application relates to the field of e-commerce information technologies, and in particular, to a method for classifying categories of commodities, and a corresponding apparatus, computer device, computer-readable storage medium, and computer program product.
Background
An electronic commerce platform carries a large number of related commodities, whose quantity can reach the thousands or even tens of thousands; such commodities can be efficiently organized by means of multi-level categories. Among multi-level categories, a sub-category usually belongs to a parent category, and the levels expand layer by layer to form a category tree. Considering that the deeper the hierarchy of the category tree, the greater the maintenance overhead, a category tree typically includes three or four levels and generally no more than five. On the data level, the category tree organizes the massive commodity objects of the e-commerce platform with its multi-level classification structure, which is convenient for maintenance operations such as addition, query and update.
The title text of a commodity can serve as a classification basis: deep semantic feature information of the title text can be extracted by a deep semantic model based on natural language processing technology, and the corresponding commodity object can be classified according to that feature information. However, as the number of levels in the category tree increases, the number of leaf categories grows geometrically; some leaf categories correspond to many commodity objects while others correspond to few, that is, the commodity objects are not uniformly distributed across the leaf categories. This presents difficulties for the training of deep-semantic neural network models. As is well known, training a neural network model depends on a large number of labeled training samples; if the full path from the top of the category tree to a leaf category is trained with the title texts of commodity objects, then, because some leaf categories have few training samples, the model easily overfits, its predictions become inaccurate, and the model is unusable.
Therefore, in view of the above situation, there is a need to improve the training process of deep semantic models for classification of category trees according to the title text of goods, so as to ensure that the trained models can predict the classification result more accurately, thereby providing a goods object intelligent classification service for e-commerce platforms.
Disclosure of Invention
A primary object of the present application is to solve at least one of the above problems and provide a method for classifying categories of commodities, and a corresponding apparatus, computer device, computer-readable storage medium, and computer program product, so as to realize intelligent classification of commodity objects.
In order to meet various purposes of the application, the following technical scheme is adopted in the application:
a method for classifying categories of commodities, adapted to one of the objects of the present application, comprises the steps of:
acquiring a title text corresponding to the commodity object;
calling a text feature extraction model to extract text feature information from the title text; in the training process of the text feature extraction model, the hierarchical structure of the category tree corresponding to the commodity classification is trained layer by layer with the same training sample, and during the training of each layer the weight parameters of the text feature extraction model are corrected with the actual loss value of the current layer until the model is trained to a convergence state, the actual loss value being obtained by fusing the loss function value of the current layer with the actual loss values of the previously trained layers;
and classifying based on the text feature information, and marking the classification attribute of the commodity object with the classification result, wherein the classification result comprises the category labels of all levels having hierarchical membership in the category tree.
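At a high level, the claimed steps can be summarized by the following hedged sketch; the feature extractor, classifier and label dictionary are caller-supplied stand-ins, not the patent's trained models:

```python
# Illustrative pipeline for the claimed method. `extract_features`,
# `classify` and `label_dict` are hypothetical stand-in parameters.

def classify_commodity(title_text, extract_features, classify, label_dict):
    features = extract_features(title_text)   # step 2: text feature information
    label_ids = classify(features)            # one category label per tree level
    # step 3: mark the commodity's classification attribute with the
    # category labels of every level of the category tree
    return [label_dict[i] for i in label_ids]
```

Any concrete model pair that yields per-level label ids can be plugged into this skeleton.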
In a further embodiment, the training process of the text feature extraction model includes the following steps:
creating a plurality of training tasks to perform training for respective layers in the category tree;
inputting the same training sample for each training task to start training;
and controlling each training task to transmit the actual loss value of its training layer from the top layer to the bottom layer according to the hierarchical structure of the category tree, so that the corresponding layer fuses the actual loss values to realize the weight parameter correction of the text feature extraction model in the corresponding training task.
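The three steps above can be sketched as a simple orchestration loop, one training task per category-tree layer, with the same sample fed to all tasks and the actual losses handed down from the top layer to the bottom; the task internals are stubbed, and all names here are illustrative:

```python
# Hypothetical orchestration of the layer-by-layer training tasks.

def run_training_tasks(sample, tasks):
    """tasks: callables ordered top layer -> bottom layer; each takes the
    training sample plus the actual losses of the layers trained before it
    and returns its own actual loss value."""
    handed_down = []                      # actual losses, top -> bottom
    for task in tasks:                    # one task per layer of the tree
        actual = task(sample, list(handed_down))
        handed_down.append(actual)        # transmitted to the deeper layers
    return handed_down
```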
In a further embodiment, the training process of the text feature extraction model includes the steps of training for each layer in the classification structure of the category tree:
extracting text characteristic information from the training sample;
inputting the text characteristic information into a classification model for classification to obtain a classification result corresponding to the current layer;
weighting and summing the loss function value of the current layer and the respective actual loss values of all previously trained layers to obtain the actual loss value of the current layer;
and back-propagating the actual loss value of the current layer to correct the weight parameters of the text feature extraction model, thereby realizing the gradient update.
In a further embodiment, invoking a text feature extraction model to extract text feature information from the title text comprises the following steps:
constructing an embedded vector corresponding to each participle in the title text;
splicing a left vector and a right vector representing the context semantics of each embedded vector to construct a middle feature vector;
and performing pooling operation on the intermediate feature vector to obtain a vector representing the semantics of the title text as the text feature information.
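A dependency-free sketch of the three steps above follows; note that here each embedding's "left/right context" is approximated by its neighbouring embeddings as a stand-in, whereas an actual TextRCNN produces these vectors with recurrent left and right hidden states:

```python
# Simplified feature extraction: embed, concatenate left/right context,
# then element-wise max-pool into one sentence vector.

def extract_text_features(tokens, embed, dim):
    embedded = [embed(t) for t in tokens]
    zeros = [0.0] * dim                   # padding at sentence boundaries
    intermediate = []
    for i, e in enumerate(embedded):
        left = embedded[i - 1] if i > 0 else zeros
        right = embedded[i + 1] if i < len(embedded) - 1 else zeros
        intermediate.append(left + e + right)   # concatenation -> 3*dim
    # max pooling over token positions yields the text feature vector
    return [max(col) for col in zip(*intermediate)]
```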
In a further embodiment, the step of marking the classification attribute of the commodity object with the classification result comprises the following steps:
determining a plurality of category labels according to the classification result;
querying a preset dictionary to obtain category texts corresponding to the plurality of category labels;
and assigning the category text as the classification attribute of the commodity object.
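These three tagging steps amount to a dictionary lookup plus an attribute assignment; a minimal sketch, with invented label ids and category texts:

```python
# Hypothetical preset dictionary mapping category labels to category texts.
CATEGORY_DICT = {101: "Apparel", 205: "Menswear", 309: "T-Shirts"}

def tag_commodity(commodity, label_ids, dictionary=CATEGORY_DICT):
    texts = [dictionary[i] for i in label_ids]       # query preset dictionary
    commodity["classification"] = " > ".join(texts)  # assign as attribute
    return commodity
```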
In a preferred embodiment, in the training process of the text feature extraction model, the training process of each layer is triggered to run in a multi-task manner.
An apparatus for classifying a category of a commodity, adapted to one of the objects of the present application, includes a title acquisition module, a feature extraction module and a classification marking module. The title acquisition module is used for acquiring a title text corresponding to the commodity object. The feature extraction module is used for calling a text feature extraction model to extract text feature information from the title text; in the training process of the text feature extraction model, the hierarchical structure of the category tree corresponding to the commodity classification is trained layer by layer with the same training sample, and during the training of each layer the weight parameters of the text feature extraction model are corrected with the actual loss value of the current layer until the model is trained to a convergence state, the actual loss value being obtained by fusing the loss function value of the current layer with the actual loss values of the previously trained layers. The classification marking module is used for classifying based on the text feature information and marking the classification attribute of the commodity object with the classification result, which comprises the category labels of all levels having hierarchical membership in the category tree.
In a further embodiment, the training process of the text feature extraction model operates in the following structure: a task creation module for creating a plurality of training tasks to perform training for each layer in the category tree; a training starting module for inputting the same training sample into each training task to start training; and a propagation control module for controlling each training task to transmit the actual loss value of its training layer from the top layer to the bottom layer according to the hierarchical structure of the category tree, so that the corresponding layer fuses the actual loss values to realize the weight parameter correction of the text feature extraction model in the corresponding training task.
In a further embodiment, the training process of the text feature extraction model includes a run structure trained for each layer in the classification structure of the category tree: the feature extraction example module is used for extracting text feature information from the training sample; the classification example module is used for inputting the text characteristic information into a classification model for classification to obtain a classification result corresponding to the current layer; the loss superposition example module is used for weighting and summing the loss function value of the current layer and the respective actual loss values of all the previous training layers to obtain the actual loss value of the current layer; and the correction module is used for utilizing the actual loss value of the current layer to reversely propagate and correct the weight parameter of the text feature extraction model to realize gradient updating.
In a further embodiment, the extracting text feature information from the title text by the feature extracting module using a text feature extracting model includes: the vector construction submodule is used for constructing an embedded vector corresponding to each participle in the title text; the semantic citation submodule is used for splicing a left vector and a right vector representing the context semantics of each embedded vector to construct a middle feature vector; and the pooling abstraction sub-module is used for performing pooling operation on the intermediate feature vector to obtain a vector representing the semantics of the title text as the text feature information.
In a further embodiment, the classification tagging module comprises: the label conversion submodule is used for determining a plurality of category labels according to the classification result; the dictionary query submodule is used for querying a preset dictionary to obtain category texts corresponding to the plurality of category labels; and the object assignment submodule is used for assigning the category text to be the classification attribute of the commodity object.
In a preferred embodiment, in the training process of the text feature extraction model, the training process of each layer is triggered to run in a multi-task manner.
A computer device adapted for one of the purposes of the present application comprises a central processing unit and a memory, the central processing unit being configured to invoke the execution of a computer program stored in the memory to perform the steps of the method for classification of categories of goods as described in the present application.
A computer-readable storage medium, which stores in the form of computer-readable instructions a computer program implemented according to the method for classifying categories of goods, the computer program, when invoked by a computer, performing the steps comprised by the method.
A computer program product, provided to adapt to another object of the present application, comprises computer programs/instructions which, when executed by a processor, implement the steps of the method described in any of the embodiments of the present application.
Compared with the prior art, the application has the following advantages:
according to the scheme of the present application, because the training stage trains layer by layer against the hierarchical structure of the category tree, when the text feature extraction model trains the next layer of the category tree, the actual loss values used for the gradient updates of the previously trained layers are fused, on the basis of the loss function value of the current layer, to perform the parameter correction. By referencing the actual loss values of all previously trained layers, the text feature extraction model inherits in sequence the loss information corresponding to each layer of the category tree, and finally obtains unified representation learning over the hierarchical structure of the whole category tree, achieving the classification capability.
In the training process, it can be understood that the supervision label of the same training sample is a path formed by the category labels of each level in the category tree, and these category labels have hierarchical membership. Training layer by layer therefore continuously narrows the range of the data distribution of the next layer, so that when the last layer, i.e. the leaf categories, is trained, the loss function can converge rapidly even if the leaf categories have few training samples, and a model with strong learning capability is trained to serve the correct classification of commodity objects. Accordingly, the training efficiency is high, the training cost is low, and the training effect is good.
Therefore, after the text feature extraction model is trained, the classification can be carried out according to the title texts corresponding to the commodity objects.
Drawings
The foregoing and/or additional aspects and advantages of the present application will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
FIG. 1 is a schematic flow chart diagram of an exemplary embodiment of a method for classifying categories of goods according to the present application;
FIG. 2 is a schematic diagram of a network architecture for implementing the method for classifying categories of goods according to the present application;
FIG. 3 is a schematic flow chart of a training process of a text feature extraction model in an embodiment of the present application;
FIG. 4 is a flow chart illustrating the operation of a single training task in an embodiment of the present application;
FIG. 5 is a schematic workflow diagram of a text feature extraction model based on a TextRCNN model in an embodiment of the present application;
FIG. 6 is a diagram illustrating a network structure of a TextRCNN used in an embodiment of the present application;
FIG. 7 is a flowchart illustrating a process of completing classification tagging according to a classification result according to the present application;
fig. 8 is a functional block diagram of the commodity category classification device of the present application;
fig. 9 is a schematic structural diagram of a computer device used in the present application.
Detailed Description
Reference will now be made in detail to embodiments of the present application, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are exemplary only for the purpose of explaining the present application and are not to be construed as limiting the present application.
As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element or intervening elements may also be present. Further, "connected" or "coupled" as used herein may include wirelessly connected or wirelessly coupled. As used herein, the term "and/or" includes all or any element and all combinations of one or more of the associated listed items.
It will be understood by those within the art that, unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the prior art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
As will be appreciated by those skilled in the art, "client," "terminal," and "terminal device" as used herein include both devices that are wireless signal receivers, which are devices having only wireless signal receivers without transmit capability, and devices that are receive and transmit hardware, which have receive and transmit hardware capable of two-way communication over a two-way communication link. Such a device may include: cellular or other communication devices such as personal computers, tablets, etc. having single or multi-line displays or cellular or other communication devices without multi-line displays; PCS (Personal Communications Service), which may combine voice, data processing, facsimile and/or data communication capabilities; a PDA (Personal Digital Assistant), which may include a radio frequency receiver, a pager, internet/intranet access, a web browser, a notepad, a calendar and/or a GPS (Global Positioning System) receiver; a conventional laptop and/or palmtop computer or other device having and/or including a radio frequency receiver. As used herein, a "client," "terminal device" can be portable, transportable, installed in a vehicle (aeronautical, maritime, and/or land-based), or situated and/or configured to operate locally and/or in a distributed fashion at any other location(s) on earth and/or in space. The "client", "terminal Device" used herein may also be a communication terminal, a web terminal, a music/video playing terminal, such as a PDA, an MID (Mobile Internet Device) and/or a Mobile phone with music/video playing function, and may also be a smart tv, a set-top box, and the like.
The hardware referred to by the names "server", "client", "service node", etc. is essentially an electronic device with the performance of a personal computer, and is a hardware device having necessary components disclosed by the von neumann principle such as a central processing unit (including an arithmetic unit and a controller), a memory, an input device, an output device, etc., a computer program is stored in the memory, and the central processing unit calls a program stored in an external memory into the internal memory to run, executes instructions in the program, and interacts with the input and output devices, thereby completing a specific function.
It should be noted that the concept of "server" as referred to in this application can be extended to the case of a server cluster. According to the network deployment principle understood by those skilled in the art, the servers should be logically divided, and in physical space, the servers may be independent from each other but can be called through an interface, or may be integrated into one physical computer or a set of computer clusters. Those skilled in the art will appreciate this variation and should not be so limited as to restrict the implementation of the network deployment of the present application.
One or more technical features of the present application, unless expressly specified otherwise, may be deployed to a server for implementation by a client remotely invoking an online service interface provided by a capture server for access, or may be deployed directly and run on the client for access.
Unless specified in clear text, the neural network model referred to or possibly referred to in the application can be deployed in a remote server and used for remote call at a client, and can also be deployed in a client with qualified equipment capability for direct call.
Various data referred to in the present application may be stored in a server remotely or in a local terminal device unless specified in the clear text, as long as the data is suitable for being called by the technical solution of the present application.
The person skilled in the art will know this: although the various methods of the present application are described based on the same concept so as to be common to each other, they may be independently performed unless otherwise specified. In the same way, for each embodiment disclosed in the present application, it is proposed based on the same inventive concept, and therefore, concepts of the same expression and concepts of which expressions are different but are appropriately changed only for convenience should be equally understood.
The embodiments to be disclosed herein can be flexibly constructed by cross-linking related technical features of the embodiments unless the mutual exclusion relationship between the related technical features is stated in the clear text, as long as the combination does not depart from the inventive spirit of the present application and can meet the needs of the prior art or solve the deficiencies of the prior art. Those skilled in the art will appreciate variations therefrom.
The commodity category classification method of the present application can be programmed into a computer program product and deployed to run in a client and/or a server, so that the client can access an open interface after the computer program product runs in the form of a webpage program or application program, and human-computer interaction with the processes of the computer program product is realized through a graphical user interface.
Referring to fig. 1 and 2, in an exemplary embodiment, the method is implemented by the network architecture shown in fig. 2, and includes the following steps:
step S1100, acquiring a title text corresponding to the commodity object:
One application scenario of the present application is an e-commerce platform based on independent stations. Each independent station is a merchant instance of the e-commerce platform with an independent access domain name, and the actual owner of the independent station is responsible for publishing and updating commodities.
When a merchant instance of an independent station brings a commodity online, the e-commerce platform constructs a corresponding commodity object for data storage after acquiring the information related to the commodity. The information of the commodity object mainly comprises text information and picture information; the text information includes the title information, content information and attribute information of the commodity object, wherein the title information is used for summarizing the commodity, the content information is used for showing the commodity details, the attribute information is used for describing the characteristics of the commodity, and so on.
In order to implement the technical scheme of the application, an abstract text of the commodity object can be collected. The abstract text mainly adopts the title information of the commodity object, and can be enhanced with one or more items of its attribute information if necessary. Generally, the abstract text may be assembled according to preset number and content requirements; for example, it may be specified that the abstract text includes the title information of the commodity object and the attribute information of all of its attribute items. Of course, those skilled in the art can flexibly adjust this process on the above basis.
Finally, the abstract text is taken as the title text for the classification judgment of the commodity object.
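The assembly of the abstract text described above can be sketched as follows; the field names and the attribute-count parameter are illustrative assumptions, not fields defined by the patent:

```python
# Hedged sketch: build the title text from a commodity object's title
# information, optionally enhanced with its attribute information.

def build_title_text(commodity, max_attrs=None):
    parts = [commodity["title"]]
    items = list(commodity.get("attributes", {}).items())
    if max_attrs is not None:
        items = items[:max_attrs]        # preset-number requirement
    parts += [f"{k}: {v}" for k, v in items]
    return " ".join(parts)
```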
Step S1200, calling a text feature extraction model to extract text feature information from the title text; in the training process of the text feature extraction model, the hierarchical structure of the category tree corresponding to the commodity classification is trained layer by layer with the same training sample, and during the training of each layer the weight parameters of the text feature extraction model are corrected with the actual loss value of the current layer until the model is trained to a convergence state, the actual loss value being obtained by fusing the loss function value of the current layer with the actual loss values of the previously trained layers:
as shown in fig. 2, the text feature extraction model vectorizes the title text, and then performs deep semantic feature extraction to obtain corresponding text feature information. The text feature extraction model is trained in advance to reach a convergence state. The text feature extraction model does not need to be specific, and includes but is not limited to common deep semantic learning-based network models for extracting text feature information, such as Bert and TextRCNN, which are popular at present. Although the present application will be exemplified by TextRCNN, the scope of coverage of the inventive spirit of the present application should not be construed as being limited thereto.
It can be understood that the text feature extraction model implements the classification capability for commodity objects based on the hierarchical structure of the category tree, and is implemented by training the network architecture shown in fig. 2; the key point of that architecture lies in training the text feature extraction model to acquire the representation learning required for classifying the title text. Therefore, the emphasis of training the network architecture is to train the text feature extraction model to learn the corresponding ability.
In order to make the text feature extraction model learn the capability of representation learning, the method can be implemented by training the hierarchical structure of the category tree layer by layer.
In this embodiment, the basic principle of training the text feature extraction model is to use the same training sample as input data of the text feature extraction model, perform layer-by-layer training from the top layer in the hierarchical structure of the category tree, extract corresponding text feature information, and then classify the text feature information by the classification model to obtain a classification result.
For the training task corresponding to each layer of the category tree, the corresponding loss function value of the current layer is calculated according to the classification result. The classification model can obtain the loss function value of the current layer by calculating with the cross-entropy loss function, whose formula is known to those skilled in the art:
$L_{CE} = -\sum_{c} y_c \log(p_c)$
since the cross entropy function is known to those skilled in the art, its explanation is omitted.
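For the common single-label case, where y is a one-hot vector, the formula above reduces to the negative log of the probability assigned to the true category; a minimal illustration:

```python
import math

def cross_entropy(probs, true_index):
    """Single-label cross-entropy: -log(p_true), the special case of the
    formula above when y is one-hot."""
    return -math.log(probs[true_index])
```

A confident correct prediction (p_true near 1) yields a loss near 0, while the loss grows without bound as p_true approaches 0.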
Starting from the training task corresponding to the top layer of the category tree, the training tasks are organized in hierarchical order toward the bottom layer, so that each training task corresponds to one layer. It can be understood that every layer is trained on the same training sample, and the cross-entropy loss function value of each layer is calculated accordingly.
In the training process of each layer, the actual loss value used for the gradient update is a total loss value obtained by fusing the actual loss values of all previously trained layers with the cross-entropy loss function value of the current layer. The actual loss value required for back propagation at each current layer is calculated as:
Loss_t = Σ_{i=1}^{t} w_i · L_i
where w_i is the weight parameter corresponding to each layer and L_i is the actual loss value corresponding to each layer. The actual loss value of the current layer t has not yet been calculated, so the cross-entropy loss function value of the current layer is used in its place; the current layer's loss function value is thereby fused with the actual loss values of all layers trained before it.
According to this formula, when the current layer calculates the actual loss value required for its gradient update, it takes a weighted sum that references the actual loss values of the layers trained before it. For example, when the current layer is the third layer, the weighted sum references the actual loss value of the first (top) layer, the actual loss value of the second layer, and the third layer's own cross-entropy loss function value. For the first layer, which has no previously trained layer, the actual loss value is simply its cross-entropy loss function value. The second layer, the fourth layer, and every other nth layer are handled in the same way as the third layer.
Therefore, as training deepens layer by layer along the category tree hierarchy, the actual loss value of each subsequent layer is always constrained by the actual loss values of all previously trained layers, so that when the weight parameters of the text feature extraction model are corrected through back propagation, the actual losses of the whole hierarchy from the top layer to the bottom layer are effectively integrated. The network architecture is trained iteratively in this way until it reaches a convergence state, completing training and yielding the text feature extraction model required by the present application, which then possesses the deep semantic representation learning capability needed to classify the title text. The text feature extraction model can thus extract the text feature information required for classification from the title text.
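The layer-by-layer loss fusion described above can be illustrated with a small sketch. The weight values here are hypothetical, since the patent does not fix them:

```python
def fuse_layer_loss(ce_current, prev_actual_losses, weights):
    """Actual loss of the current layer: weighted sum of the actual
    loss values of all previously trained layers plus the current
    layer's own cross-entropy loss, which stands in for its term."""
    terms = prev_actual_losses + [ce_current]
    assert len(weights) == len(terms)
    return sum(w * l for w, l in zip(weights, terms))

# Layer 1 (top): no previous layer, actual loss equals cross entropy.
l1 = fuse_layer_loss(0.9, [], [1.0])
# Layer 2 fuses layer 1's actual loss with its own cross entropy.
l2 = fuse_layer_loss(0.8, [l1], [0.5, 0.5])
# Layer 3 fuses the actual losses of layers 1 and 2.
l3 = fuse_layer_loss(0.7, [l1, l2], [0.3, 0.3, 0.4])
```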
Step S1300, classifying based on the text characteristic information, and marking the classification attribute of the commodity object by using a classification result, wherein the classification result comprises category labels of all levels with a hierarchy membership in the category tree:
after the text feature extraction model of the network architecture, trained to a convergence state, obtains the corresponding text feature information from the title text, the classification model can classify that information to produce a corresponding classification result. The classification result comprises probability scores for the category labels at every level of the category tree hierarchy; the maximum score among the category labels at each level is selected, and the per-level maxima together form a classification path with hierarchical membership, completing the classification.
Since the classification path comprises a plurality of category labels forming a hierarchical membership, the classification attribute of the commodity object corresponding to the title text can be marked with the category texts corresponding to those category labels, finishing the classification marking process.
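Decoding the per-level maxima into a classification path can be sketched as follows; the scores and tree shape are hypothetical:

```python
import numpy as np

def decode_classification_path(level_scores):
    """Pick the highest-scoring category label at each level of the
    category tree, yielding a top-down classification path."""
    return [int(np.argmax(scores)) for scores in level_scores]

# Hypothetical probability scores for a three-level category tree.
scores = [
    np.array([0.1, 0.8, 0.1]),       # level 1: label 1 wins
    np.array([0.2, 0.2, 0.5, 0.1]),  # level 2: label 2 wins
    np.array([0.6, 0.4]),            # level 3: label 0 wins
]
path = decode_classification_path(scores)  # [1, 2, 0]
```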
Because the training stage trains the text feature extraction model layer by layer over the hierarchical structure of the category tree, when the model is trained on the next layer of the category tree, the actual loss values obtained by the previously trained layers are fused with the loss function value of the current layer to compute the actual loss value used for the gradient update and parameter correction. By referencing the actual loss values of all previously trained layers, the text feature extraction model successively inherits the loss information of each layer of the category tree and ultimately performs unified representation learning over the whole hierarchical structure to achieve its classification capability.
It can be understood that, in the training process, the supervision label of a given training sample is the path formed by the category labels at each level of the category tree, and these category labels have hierarchical membership. Training layer by layer therefore continuously narrows the range of the data distribution at each subsequent layer, so that when the last layer, i.e. the leaf categories, is trained, the loss function can converge rapidly even if the leaf-category training samples are few, yielding a model with strong learning capability that serves the correct classification of commodity objects. The training efficiency is thus high, the training cost low, and the training effect good.
Therefore, after the text feature extraction model is trained, the classification can be carried out according to the title texts corresponding to the commodity objects.
Referring to fig. 3, in a further embodiment, the training process of the text feature extraction model includes the following steps:
step S2100, creating a plurality of training tasks to perform training for each layer in the category tree:
the category tree generally has three, four or five layers. Adapted to that number of layers, a corresponding plurality of training tasks can be created in a multi-task training mode, so that each training task is responsible for training one layer.
Step S2200, inputting the same training sample for each training task to start training:
in order to train the plurality of training tasks on the same training sample, in each iteration of the training process the same training sample is retrieved from the training data set and delivered to all the training tasks simultaneously, so that each training task trains on that same sample.
Step S2300, controlling each training task to transmit the actual loss value of each training layer from the top layer to the bottom layer according to the hierarchical structure of the category tree, so that the corresponding layer fuses the actual loss value to realize the weight parameter correction of the text feature model in the corresponding training task:
although this embodiment trains with multiple tasks, the text feature extraction model instance of each task needs to obtain, according to the hierarchical membership, the actual loss values of the previously trained levels in order to implement its gradient update; in essence, therefore, a serial relationship exists among the training tasks. During training, the training tasks corresponding to the category tree are controlled to train layer by layer from the top layer to the bottom layer. The current layer takes the weighted sum of its own cross-entropy loss function value and the actual loss values of the layers trained before it, obtains the total loss value as the actual loss value of the current layer, performs a gradient update on the network architecture, and passes its actual loss value down to the training tasks of the layers below it so that they can reference it in turn. Nested in this way, the actual loss value used for the gradient update at each layer is constrained by the actual loss values of the layers above it; the weight correction of the text feature extraction model is performed on this basis, accomplishing the training of the current layer.
This embodiment constructs a multi-task training mechanism: based on the same training sample, the text feature extraction model is trained on the corresponding level within each level's training task, and the actual loss values to be referenced by the levels below the current level are passed among the training tasks in a serial relationship. This multi-task collaborative training mechanism can greatly improve training efficiency and make the model converge faster.
Referring to fig. 4, a further embodiment mainly discloses the training process of the text feature extraction model within each training task; accordingly, the step of training for each layer in the classification structure of the category tree includes:
step S3100, extracting text feature information from the training samples:
for each training task, extracting corresponding text feature information from the training sample by using the text feature extraction model.
Step S3200, inputting the text characteristic information into a classification model for classification, and obtaining a classification result corresponding to the current layer:
it can be seen that each training task in effect instantiates the network architecture shown in fig. 2; in the same way, therefore, the classification model can classify according to the text feature information extracted from the training sample to obtain the corresponding classification result. For the content of the classification result, refer to the foregoing description.
Step S3300, performing weighted summation on the loss function value of the current layer and the respective actual loss values of all previous training layers to obtain an actual loss value of the current layer:
A previous training layer is defined relative to the current layer: it is a layer above the current layer in the category tree whose corresponding training task has already been trained and has therefore produced an actual loss value. After the current layer is classified by the classification model, the current layer's loss function value can be calculated with the cross-entropy function. Then, as described in the foregoing embodiments, the loss function value of the current layer and the actual loss values of all layers trained before it are weighted and summed to obtain the actual loss value of the current layer.
Step S3400, utilizing the actual loss value of the current layer to reversely propagate and correct the weight parameters of the text feature extraction model to realize gradient updating:
as mentioned above, the actual loss value of the current layer is used to perform a gradient update on the network architecture, back-propagating to correct the weight parameters of the text feature extraction model, so that the whole network gradually converges over the usual many iterations.
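Steps S3100 to S3400, run across the stacked training tasks, can be sketched as one training iteration. Here `extract`, `classify`, `ce_loss` and `backprop` are hypothetical stand-ins for the model components, and the per-level weights are illustrative:

```python
def train_iteration(sample, num_levels, extract, classify, ce_loss,
                    level_weights, backprop):
    """One iteration over the category tree: train each level's task
    top-down on the same sample, passing actual losses downward."""
    actual_losses = []
    for level in range(num_levels):
        feats = extract(sample)                           # S3100
        logits = classify(level, feats)                   # S3200
        ce = ce_loss(logits, sample["labels"][level])
        terms = actual_losses + [ce]                      # S3300: fuse
        w = level_weights[level]
        actual = sum(wi * li for wi, li in zip(w, terms))
        backprop(level, actual)                           # S3400: update
        actual_losses.append(actual)                      # pass downward
    return actual_losses
```

With stub components, the call shape is `train_iteration(sample, 3, extract, classify, ce_loss, weights, backprop)`, returning one fused actual loss per level.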
This embodiment discloses the specific execution process within each training task. As can be seen, the multiple training tasks can be realized with the same business logic; the principle is simple, the development cost low, and the implementation easy.
Referring to fig. 5, in a further embodiment, the text feature extraction model is exemplified by TextRCNN, and the step S1200 of invoking the text feature extraction model to extract text feature information from the title text includes the following steps:
Step S1210, constructing an embedded vector corresponding to each participle in the title text:
as shown in FIG. 6, the text feature extraction model based on the TextRCNN structure comprises a convolutional layer for encoding, followed by a pooling layer and then the output.
For a title text, format preprocessing can be performed first, followed by word segmentation and then vectorization, to obtain the embedded vector e(w_i) corresponding to each word segmentation. The embedded vector is the representation of the corresponding word segmentation, and can be determined via the mapping relation of a preset dictionary, converting the word segmentation from text into a vector.
Step S1220, splicing the left vector and the right vector representing the context semantics to each embedded vector, constructing an intermediate feature vector:
according to the principle of TextRCNN, when encoding the embedded vector of each participle, the left context vector c_l(w_i) and the right context vector c_r(w_i) are spliced to it. After the left vector, the embedded vector and the right vector are concatenated, the concatenation is further subjected to a linear transformation, yielding the intermediate feature vector corresponding to each participle.
Step S1230, performing pooling operation on the intermediate feature vector to obtain a vector representing the semantics of the title text as the text feature information:
further, a pooling operation is performed on the intermediate feature vectors, mapping them into a single vector representing the semantics of the title text; encoding is thereby completed, and this vector serves as the text feature information.
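The three encoding steps (S1210 to S1230) can be sketched with NumPy. The weights below are random and every dimension is hypothetical, so this only illustrates the data flow of TextRCNN-style encoding, not a trained model:

```python
import numpy as np

rng = np.random.default_rng(0)

def textrcnn_encode(emb, ctx_dim=4, out_dim=8):
    """Build left/right context vectors per token, concatenate
    [c_l; e; c_r], apply a linear map with tanh, then max-pool
    over tokens into one text-level feature vector."""
    n, d = emb.shape
    W_l = rng.standard_normal((ctx_dim, ctx_dim))
    W_sl = rng.standard_normal((ctx_dim, d))
    W_r = rng.standard_normal((ctx_dim, ctx_dim))
    W_sr = rng.standard_normal((ctx_dim, d))
    c_l = np.zeros((n, ctx_dim))
    c_r = np.zeros((n, ctx_dim))
    for i in range(1, n):              # left context, left-to-right
        c_l[i] = np.tanh(W_l @ c_l[i - 1] + W_sl @ emb[i - 1])
    for i in range(n - 2, -1, -1):     # right context, right-to-left
        c_r[i] = np.tanh(W_r @ c_r[i + 1] + W_sr @ emb[i + 1])
    x = np.concatenate([c_l, emb, c_r], axis=1)   # [c_l; e; c_r]
    W = rng.standard_normal((out_dim, x.shape[1]))
    y = np.tanh(x @ W.T)               # linear transform + tanh
    return y.max(axis=0)               # max-pooling over tokens

feats = textrcnn_encode(rng.standard_normal((5, 6)))  # 5 tokens, dim 6
```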
This embodiment further provides a practical network structure for the text feature extraction model and its encoding process, helping readers deepen their understanding of the inventive concept of the present application.
Referring to fig. 7, in a further embodiment, the step S1300 of marking the classification attribute of the commodity object with the classification result includes the following steps:
step S1310, determining a plurality of category labels according to the classification result:
as described above, when the network architecture of the present application is used in the production phase, the classification result obtained by the classification model includes probability scores for the classification labels mapped to the hierarchical structure of the category tree. A classification path can be determined from these probability scores: a corresponding classification label is determined at each level, and the classification labels have hierarchical membership. The corresponding classification labels can thus be determined from the probability scores in the classification result.
Step S1320, inquiring a preset dictionary to obtain category texts corresponding to the plurality of category labels;
a dictionary may be constructed in advance to store the mapping between each category label in the category tree hierarchy and a numerical value, i.e. a classification label in the classification result of the classification model; the mapping relation between each category label and its classification label is thereby fixed in the dictionary. Using this dictionary, the category text corresponding to each classification label of the classification path can be looked up.
Step S1330, assigning the category text to the classification attribute of the commodity object:
each commodity object is associated with a classification attribute; the classification attribute of the commodity object corresponding to the title text is assigned the plurality of category texts obtained by the query, completing the classification of the commodity object.
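Steps S1310 to S1330 can be sketched together. The per-level dictionaries and category texts below are invented for illustration:

```python
# Hypothetical per-level dictionaries mapping classification label
# indices back to category texts.
label_dicts = [
    {0: "Apparel", 1: "Electronics"},
    {0: "Phones", 1: "Laptops"},
    {0: "Gaming Laptops", 1: "Ultrabooks"},
]

def label_path_to_texts(path, dicts):
    """Query each level's dictionary for the category text of the
    predicted classification label along the classification path."""
    return [dicts[level][label] for level, label in enumerate(path)]

# Assign the queried category texts to the commodity object's
# classification attribute.
product = {"title": "14-inch thin-and-light ultrabook", "categories": None}
product["categories"] = label_path_to_texts([1, 1, 1], label_dicts)
# product["categories"] == ["Electronics", "Laptops", "Ultrabooks"]
```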
This embodiment provides the process of classifying and marking commodity objects according to the classification result. As can be seen, the process is fully automatic and can greatly improve the automatic classification efficiency of an e-commerce platform.
Referring to fig. 8, a commodity category classification device adapted to one of the objectives of the present application is a functional implementation of the commodity category classification method of the present application. The device includes: a title acquisition module 1100, a feature extraction module 1200, and a classification tagging module 1300. The title acquisition module 1100 is configured to acquire the title text corresponding to a commodity object. The feature extraction module 1200 is configured to invoke a text feature extraction model to extract text feature information from the title text; in the training process of the text feature extraction model, the hierarchical structure of the category tree corresponding to commodity classification is trained layer by layer with the same training sample, and during the training of each layer, the weight parameters of the text feature extraction model are corrected with the actual loss value of the current layer until the model is trained to a convergence state, the actual loss value being obtained by fusing the loss function value of the current layer with the actual loss values of the previously trained layers. The classification tagging module 1300 is configured to classify based on the text feature information and mark the classification attribute of the commodity object with the classification result, where the classification result comprises the category labels of each level having hierarchical membership in the category tree.
In a further embodiment, the training process of the text feature extraction model operates in the following structure: a task creation module for creating a plurality of training tasks to perform training for each layer in the category tree; the training starting module inputs the same training sample for each training task to start training; and the propagation control module is used for controlling each training task to transmit the actual loss value of each training layer from the top layer to the bottom layer according to the hierarchical structure of the category tree so as to enable the corresponding layer to fuse the actual loss value to realize the weight parameter correction of the text feature model in the corresponding training task.
In a further embodiment, the training process of the text feature extraction model includes a run structure trained for each layer in the classification structure of the category tree: the feature extraction example module is used for extracting text feature information from the training sample; the classification example module is used for inputting the text characteristic information into a classification model for classification to obtain a classification result corresponding to the current layer; the loss superposition example module is used for weighting and summing the loss function value of the current layer and the respective actual loss values of all the previous training layers to obtain the actual loss value of the current layer; and the correction module is used for utilizing the actual loss value of the current layer to reversely propagate and correct the weight parameter of the text feature extraction model to realize gradient updating.
In a further embodiment, the feature extraction module 1200 invokes a text feature extraction model to extract text feature information from the title text, including: the vector construction submodule is used for constructing an embedded vector corresponding to each participle in the title text; the semantic citation submodule is used for splicing a left vector and a right vector representing the context semantics of each embedded vector to construct a middle feature vector; and the pooling abstraction sub-module is used for performing pooling operation on the intermediate feature vector to obtain a vector representing the semantics of the title text as the text feature information.
In a further embodiment, the categorical tagging module 1300 comprises: the label conversion submodule is used for determining a plurality of category labels according to the classification result; the dictionary query submodule is used for querying a preset dictionary to obtain category texts corresponding to the plurality of category labels; and the object assignment submodule is used for assigning the category text to be the classification attribute of the commodity object.
In a preferred embodiment, in the training process of the text feature extraction model, the training process of each layer is triggered to run in a multi-task manner.
In order to solve the technical problem, an embodiment of the present application further provides a computer device. As shown in fig. 9, the internal structure of the computer device is schematically illustrated. The computer device includes a processor, a computer-readable storage medium, a memory, and a network interface connected by a system bus. The computer readable storage medium of the computer device stores an operating system, a database and computer readable instructions, the database can store control information sequences, and the computer readable instructions, when executed by the processor, can make the processor implement a commodity category classification method. The processor of the computer device is used for providing calculation and control capability and supporting the operation of the whole computer device. The memory of the computer device may have stored therein computer readable instructions that, when executed by the processor, may cause the processor to perform the method for classifying categories of items of the present application. The network interface of the computer device is used for connecting and communicating with the terminal. Those skilled in the art will appreciate that the architecture shown in fig. 9 is merely a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computing devices to which the disclosed aspects apply, as particular computing devices may include more or less components than those shown, or may combine certain components, or have a different arrangement of components.
In this embodiment, the processor is configured to execute the specific functions of each module and its sub-modules in fig. 8, and the memory stores the program codes and various data required for executing those modules or sub-modules. The network interface is used for data transmission to and from a user terminal or a server. The memory in this embodiment stores the program codes and data necessary for executing all modules/sub-modules of the commodity category classification device of the present application, and the server can call them to execute the functions of all sub-modules.
The present application also provides a storage medium storing computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the steps of the method for classifying categories of items of any of the embodiments of the present application.
The present application also provides a computer program product comprising computer programs/instructions which, when executed by one or more processors, implement the steps of the method as described in any of the embodiments of the present application.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments of the present application can be implemented by a computer program, which can be stored in a computer-readable storage medium, and when the computer program is executed, the processes of the embodiments of the methods can be included. The storage medium may be a computer-readable storage medium such as a magnetic disk, an optical disk, a Read-Only Memory (ROM), or a Random Access Memory (RAM).
In summary, by training layer by layer over the multi-layer structure of the category tree, the present application reduces the number of training samples the text feature extraction model needs for representation learning, lowers the training cost, improves training efficiency, and improves the representation learning capability the text feature extraction model requires to classify commodity categories.
Those of skill in the art will appreciate that the various operations, methods, steps in the processes, acts, or solutions discussed in this application can be interchanged, modified, combined, or eliminated. Further, other steps, measures, or schemes in various operations, methods, or flows that have been discussed in this application can be alternated, altered, rearranged, broken down, combined, or deleted. Further, steps, measures, schemes in the prior art having various operations, methods, procedures disclosed in the present application may also be alternated, modified, rearranged, decomposed, combined, or deleted.
The foregoing is only a partial embodiment of the present application. It should be noted that those skilled in the art can make several improvements and modifications without departing from the principle of the present application, and these improvements and modifications should also be regarded as falling within the protection scope of the present application.

Claims (10)

1. A method for classifying categories of commodities is characterized by comprising the following steps:
acquiring a title text corresponding to the commodity object;
calling a text feature extraction model to extract text feature information from the title text; in the training process of the text feature extraction model, a hierarchical structure of a category tree corresponding to commodity classification is trained layer by layer with the same training sample, and during the training of each layer, a weight parameter of the text feature extraction model is corrected with an actual loss value of the current layer until the text feature extraction model is trained to a convergence state, wherein the actual loss value is obtained by fusing a loss function value of the current layer with the actual loss values of the previously trained layers;
and classifying based on the text characteristic information, and marking the classification attribute of the commodity object by using a classification result, wherein the classification result comprises category labels of all levels with hierarchy membership in the category tree.
2. The method for classifying categories of commodities as claimed in claim 1, wherein said training process of said text feature extraction model comprises the steps of:
creating a plurality of training tasks to perform training for respective layers in the category tree;
inputting the same training sample for each training task to start training;
and controlling each training task to transmit the actual loss value of each training layer from the top layer to the bottom layer according to the hierarchical structure of the category tree so that the corresponding layer fuses the actual loss value to realize the weight parameter correction of the text feature model in the corresponding training task.
3. The method for classifying commodity categories according to claim 1, wherein the training process of the text feature extraction model includes the steps of training for each layer in the classification structure of the category tree:
extracting text characteristic information from the training sample;
inputting the text characteristic information into a classification model for classification to obtain a classification result corresponding to the current layer;
weighting and summing the loss function value of the current layer and the respective actual loss values of all the previous training layers to obtain the actual loss value of the current layer;
and (4) reversely propagating and correcting the weight parameters of the text feature extraction model by using the actual loss value of the current layer to realize gradient updating.
4. The method for classifying the categories of commodities as claimed in claim 1, wherein a text feature extraction model is invoked to extract text feature information from said heading text, comprising the steps of:
constructing an embedded vector corresponding to each participle in the title text;
splicing a left vector and a right vector representing the context semantics of each embedded vector to construct a middle feature vector;
and performing pooling operation on the intermediate feature vector to obtain a vector representing the semantics of the title text as the text feature information.
5. The method for classifying merchandise items according to any one of claims 1 to 4, wherein the step of marking the classification attribute of the merchandise object with the classification result comprises the steps of:
determining a plurality of category labels according to the classification result;
querying a preset dictionary to obtain category texts corresponding to the plurality of category labels;
and assigning the category text as the classification attribute of the commodity object.
6. The method for classifying commodity categories according to any one of claims 1 to 4, wherein in the training process of the text feature extraction model, the training process of each layer is triggered to run in a multitask manner.
7. An article category classification device, comprising:
the title acquisition module is used for acquiring a title text corresponding to the commodity object;
the feature extraction module is used for calling a text feature extraction model to extract text feature information from the title text; in the training process of the text feature extraction model, a hierarchical structure of a category tree corresponding to commodity classification is trained layer by layer with the same training sample, and during the training of each layer, a weight parameter of the text feature extraction model is corrected with an actual loss value of the current layer until the text feature extraction model is trained to a convergence state, wherein the actual loss value is obtained by fusing a loss function value of the current layer with the actual loss values of the previously trained layers;
and the classification marking module is used for classifying based on the text characteristic information and marking the classification attribute of the commodity object according to a classification result, wherein the classification result comprises the category labels of all levels with the level membership in the category tree.
8. A computer device comprising a central processor and a memory, characterized in that the central processor is adapted to invoke execution of a computer program stored in the memory to perform the steps of the method according to any one of claims 1 to 7.
9. A computer-readable storage medium, characterized in that it stores, in the form of computer-readable instructions, a computer program implemented according to the method of any one of claims 1 to 7, which, when invoked by a computer, performs the steps comprised by the corresponding method.
10. A computer program product comprising computer program/instructions, characterized in that the computer program/instructions, when executed by a processor, implement the steps of the method as claimed in any one of claims 1 to 7.
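The classification result in the claims comprises the category labels of all levels having hierarchical membership in the category tree. A minimal sketch of recovering that label chain from a predicted leaf category is given below; the tree encoding (a parent-to-children dict) and the function name `label_path` are illustrative assumptions, not part of the patent:

```python
def label_path(category_tree, leaf):
    """Return the chain of category labels from the root level down to the
    predicted leaf, i.e. the labels of all levels the leaf belongs to.
    `category_tree` maps each parent label to a list of child labels."""
    parents = {child: parent
               for parent, children in category_tree.items()
               for child in children}
    path = [leaf]
    while path[-1] in parents:          # climb until the root is reached
        path.append(parents[path[-1]])
    return list(reversed(path))          # root first, leaf last
```

For a commodity title classified into a leaf such as "Shirts", the marked attributes would then be the full chain, e.g. Apparel → Menswear → Shirts.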
CN202111075426.XA 2021-09-14 2021-09-14 Commodity category classification method and device, equipment, medium and product thereof Pending CN113806537A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111075426.XA CN113806537A (en) 2021-09-14 2021-09-14 Commodity category classification method and device, equipment, medium and product thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111075426.XA CN113806537A (en) 2021-09-14 2021-09-14 Commodity category classification method and device, equipment, medium and product thereof

Publications (1)

Publication Number Publication Date
CN113806537A true CN113806537A (en) 2021-12-17

Family

ID=78895263

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111075426.XA Pending CN113806537A (en) 2021-09-14 2021-09-14 Commodity category classification method and device, equipment, medium and product thereof

Country Status (1)

Country Link
CN (1) CN113806537A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114548323A (en) * 2022-04-18 2022-05-27 阿里巴巴(中国)有限公司 Commodity classification method, equipment and computer storage medium


Similar Documents

Publication Publication Date Title
CN113792786A (en) Automatic commodity object classification method and device, equipment, medium and product thereof
CN110619050B (en) Intention recognition method and device
CN110413769A (en) Scene classification method, device, storage medium and its electronic equipment
CN115731425A (en) Commodity classification method, commodity classification device, commodity classification equipment and commodity classification medium
CN115545832A (en) Commodity search recommendation method and device, equipment and medium thereof
CN110866042A (en) Intelligent table query method and device and computer readable storage medium
CN114238524B (en) Satellite frequency-orbit data information extraction method based on enhanced sample model
CN114663155A (en) Advertisement putting and selecting method and device, equipment, medium and product thereof
CN113962773A (en) Same-style commodity polymerization method and device, equipment, medium and product thereof
CN113962224A (en) Named entity recognition method and device, equipment, medium and product thereof
CN116521906A (en) Meta description generation method, device, equipment and medium thereof
CN114626926A (en) Commodity search category identification method and device, equipment, medium and product thereof
CN114428845A (en) Intelligent customer service automatic response method and device, equipment, medium and product thereof
CN111143534A (en) Method and device for extracting brand name based on artificial intelligence and storage medium
CN114265921A (en) Question-answer knowledge base construction method and device, equipment, medium and product thereof
CN110472063A (en) Social media data processing method, model training method and relevant apparatus
CN113806537A (en) Commodity category classification method and device, equipment, medium and product thereof
CN114065033A (en) Training method of graph neural network model for recommending Web service combination
CN116823404A (en) Commodity combination recommendation method, device, equipment and medium thereof
CN113806536B (en) Text classification method and device, equipment, medium and product thereof
CN116521843A (en) Intelligent customer service method facing user, device, equipment and medium thereof
CN115563280A (en) Commodity label labeling method and device, equipment and medium thereof
CN115018548A (en) Advertisement case prediction method and device, equipment, medium and product thereof
CN113449808B (en) Multi-source image-text information classification method and corresponding device, equipment and medium
CN115129804A (en) Address association method, device, equipment, medium and product

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination