WO2019205319A1 - Commodity information format processing method and apparatus, and computer device and storage medium - Google Patents


Info

Publication number: WO2019205319A1
Authority: WIPO (PCT)
Prior art keywords: training, words, commodity, layer, word
Application number: PCT/CN2018/097082
Other languages: French (fr), Chinese (zh)
Inventors: 金鑫, 杨雨芬, 赵媛媛
Original assignee: 平安科技(深圳)有限公司
Application filed by: 平安科技(深圳)有限公司
Publication of: WO2019205319A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00: Handling natural language data
    • G06F 40/20: Natural language analysis
    • G06F 40/279: Recognition of textual entities
    • G06F 40/289: Phrasal analysis, e.g. finite state techniques or chunking
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • G06N 3/08: Learning methods

Definitions

  • The present application relates to a commodity information format processing method and apparatus, and to a computer device and storage medium.
  • Customs declaration forms involve a wide variety of commodity information. Although the declaration form specifies a uniform declaration format, different applicants fill in the commodity information in different orders and forms, so the commodity information in customs declarations ends up in inconsistent formats. Interpreting commodity information in these varying formats consumes a great deal of customs officers' time and effort, and also hinders the supervision of customs import and export business. To unify the commodity information format, the traditional approach has developers match key information against specific templates and proprietary dictionaries. However, this approach requires developers to build a variety of templates and collect many proper-noun libraries to construct those dictionaries, so unifying the commodity information format is inefficient.
  • A commodity information format processing method is provided.
  • The commodity information format processing method includes: acquiring commodity information, the commodity information including a plurality of commodity items; performing word segmentation on the content of a commodity item to obtain a plurality of words; acquiring weight vectors corresponding to the plurality of words obtained by training a word vector model, and generating a weight matrix from the weight vectors corresponding to the plurality of words; acquiring codes corresponding to the plurality of words of the commodity item, and inputting the codes of the plurality of words into a trained multi-layer recurrent neural network; and performing, by the trained multi-layer recurrent neural network, an operation based on the codes of the plurality of words and the weight matrix, and outputting a description of the commodity item in a preset format.
  • A commodity information format processing apparatus includes: an information acquisition module configured to acquire commodity information, the commodity information including a plurality of commodity items; a word segmentation module configured to perform word segmentation on the content of a commodity item to obtain a plurality of words; a weight matrix generation module configured to acquire weight vectors corresponding to the plurality of words obtained by training a word vector model, and to generate a weight matrix from the weight vectors; and a format unification module configured to acquire codes corresponding to the plurality of words of the commodity item, input the codes into a trained multi-layer recurrent neural network, perform an operation in the trained multi-layer recurrent neural network based on the codes and the weight matrix, and output a description of the commodity item in a preset format.
  • A computer device includes a memory and one or more processors; the memory stores computer-readable instructions which, when executed by the one or more processors, cause them to perform the following steps: acquiring commodity information, the commodity information including a plurality of commodity items; performing word segmentation on the content of a commodity item to obtain a plurality of words; acquiring weight vectors corresponding to the plurality of words obtained by training a word vector model, and generating a weight matrix from the weight vectors; acquiring codes corresponding to the plurality of words of the commodity item, and inputting the codes into a trained multi-layer recurrent neural network; and performing, by the trained multi-layer recurrent neural network, an operation based on the codes of the plurality of words and the weight matrix, and outputting a description of the commodity item in a preset format.
  • One or more non-transitory computer-readable storage media store computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the steps of: acquiring commodity information, the commodity information including a plurality of commodity items; performing word segmentation on the content of a commodity item to obtain a plurality of words; acquiring weight vectors corresponding to the plurality of words obtained by training a word vector model, and generating a weight matrix from the weight vectors; acquiring codes corresponding to the plurality of words of the commodity item, and inputting the codes into a trained multi-layer recurrent neural network; and performing, by the trained multi-layer recurrent neural network, an operation based on the codes of the plurality of words and the weight matrix, and outputting a description of the commodity item in a preset format.
  • FIG. 1 is an application scenario diagram of a method for processing a commodity information format according to one or more embodiments
  • FIG. 2 is a schematic flow chart of a method for processing a commodity information format according to one or more embodiments
  • FIG. 3 is an expanded view in time of a 2-layer recurrent neural network in accordance with one or more embodiments
  • FIG. 4 is an expanded view in time of a 4-layer recurrent neural network in accordance with one or more embodiments
  • FIG. 5 is an expanded view in time of a 6-layer recurrent neural network in accordance with one or more embodiments
  • FIG. 6 is a flow diagram of the word vector model training and multi-layer recurrent neural network training steps in accordance with one or more embodiments
  • FIG. 7 is a block diagram of a commodity information format processing apparatus in accordance with one or more embodiments.
  • FIG. 8 is a block diagram of a computer device in accordance with one or more embodiments.
  • the commodity information format processing method provided by the present application can be applied to an application environment as shown in FIG. 1.
  • the terminal 102 communicates with the server 104 via a network.
  • the terminal 102 can be, but is not limited to, computer equipment such as various personal computers, notebook computers, smart phones, and tablet computers.
  • the server 104 can be implemented by a separate server or a server cluster composed of multiple servers.
  • The terminal 102 uploads a product file to the server 104.
  • a plurality of product information is recorded in the product file, and the product information includes a plurality of product items.
  • The server 104 performs word segmentation on the detailed description of each commodity item.
  • A trained word vector model and a trained multi-layer recurrent neural network are pre-stored in the server 104.
  • The server 104 acquires the weight vectors corresponding to the plurality of words obtained by training the word vector model, and generates a weight matrix from those weight vectors.
  • The server 104 acquires the codes corresponding to the plurality of words of the commodity item and inputs them into the trained multi-layer recurrent neural network.
  • The network performs an operation based on the codes of the plurality of words and the weight matrix, and outputs a description of the commodity item in a preset format. In this way, original commodity information in a variety of formats is converted into descriptions in one uniform format.
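The end-to-end flow just described (segment, look up weight vectors, encode, feed the network) can be sketched as below. This is an illustrative outline only; every function name is hypothetical, the toy dictionaries stand in for a trained word vector model and training vocabulary, and the final network step is omitted.

```python
# Illustrative sketch of the server-side pipeline; names are assumptions,
# not the patent's API.

def segment(description):
    """Stand-in for the word segmentation step (whitespace split here)."""
    return description.split()

def build_weight_matrix(words, word_vectors):
    """Look up the trained weight vector of each word, in description order."""
    return [word_vectors[w] for w in words]

def encode(words, vocabulary):
    """Map each word to its integer code from the training vocabulary."""
    return [vocabulary[w] for w in words]

# Toy data standing in for trained-model outputs.
word_vectors = {"hard": [0.1, 0.2], "disk": [0.3, 0.4]}
vocabulary = {"hard": 1, "disk": 2}

words = segment("hard disk")
matrix = build_weight_matrix(words, word_vectors)
codes = encode(words, vocabulary)
# codes and matrix would then be fed to the trained multi-layer
# recurrent neural network, which emits the preset-format description.
```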
  • a method for processing a commodity information format is provided.
  • the method is applied to the server in FIG. 1 as an example, and includes the following steps:
  • Step 202 Acquire item information, where the item information includes a plurality of item items.
  • Step 204 Perform word segmentation on the content of the commodity item to obtain a plurality of words.
  • The server receives product files uploaded by multiple terminals.
  • A variety of commodity information is recorded in each product file.
  • The commodity information includes a plurality of commodity items, and each commodity item has specific content, i.e. a detailed information description.
  • The specific content of the same commodity item may differ between declarations.
  • The server performs word segmentation on the detailed description of each commodity item. For example, the server splits one detailed description of the item "hard disk" into "hard disk", "capacity", "128", "GB", "cache", "capacity", "32", "MB", obtaining a plurality of words.
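A minimal stand-in for the segmentation step above: the patent does not specify a segmentation algorithm (real systems typically use a dictionary- or statistics-based segmenter), but splitting on whitespace and separating digit runs from letter runs reproduces the "128"/"GB" style of split in the example.

```python
import re

def segment_item(content):
    """Toy word segmentation: split letter runs and digit runs,
    mirroring how "128GB" becomes "128", "GB" in the example."""
    return re.findall(r"[A-Za-z]+|\d+", content)

tokens = segment_item("hard disk capacity 128GB cache capacity 32MB")
```

Note that a real segmenter would also keep multi-word terms such as "hard disk" together, which a character-class regex cannot do.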
  • Step 206: Acquire the weight vectors corresponding to the plurality of words obtained by training the word vector model, and generate a weight matrix from the weight vectors.
  • The hidden layer includes a forward estimation layer and a backward estimation layer; it may also be called a bidirectional-estimation hidden layer.
  • The first hidden layer includes a first forward estimation layer and a first backward estimation layer,
  • the second hidden layer includes a second forward estimation layer and a second backward estimation layer,
  • the third hidden layer includes a third forward estimation layer and a third backward estimation layer, and so on.
  • A corresponding weight matrix is set between the input layer and the first hidden layer, that is, corresponding weight matrices are set between the input layer and the first forward estimation layer and between the input layer and the first backward estimation layer.
  • If the weight matrices corresponding to the first forward and backward estimation layers were initialized to random vectors, the multi-layer recurrent neural network might converge poorly and its output might not meet requirements.
  • The server therefore uses the weight matrix corresponding to the plurality of words in the commodity item as the weight matrix between the input layer and the first hidden layer of the multi-layer recurrent neural network.
  • This weight matrix is obtained by training the word vector model.
  • The weight vectors accurately reflect the vector of each word in the commodity item, which effectively improves the convergence of the multi-layer recurrent neural network and thereby the accuracy of its output.
  • The weight matrices corresponding to the first forward estimation layer and the first backward estimation layer differ from each other.
  • The server obtains the weight vector of each word in the description order of the words in the commodity item; the weight vector of each word may be a vector array.
  • From these weight vectors the server generates a forward-calculated weight matrix corresponding to the plurality of words.
  • The server may also obtain the weight vector of each word in the reverse description order of the words in the commodity item, thereby generating a backward-calculated weight matrix corresponding to the plurality of words.
  • The forward-calculated weight matrix is the weight matrix between the input layer and the first forward estimation layer of the multi-layer recurrent neural network.
  • The backward-calculated weight matrix is the weight matrix between the input layer and the first backward estimation layer.
  • For example, the server can generate the forward-calculated weight matrix in the order "hard disk", "capacity", "128", "GB", "cache", "capacity", "32", "MB", and generate the backward-calculated weight matrix in the order "MB", "32", "capacity", "cache", "GB", "128", "capacity", "hard disk".
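The forward- and backward-calculated weight matrices above amount to stacking each word's weight vector in description order and in reversed order. A sketch with toy 2-dimensional vectors (a trained word vector model would supply the real values):

```python
# Toy weight vectors standing in for the output of the word vector model.
word_vectors = {
    "hard disk": [0.1, 0.2],
    "capacity": [0.3, 0.4],
    "128": [0.5, 0.6],
    "GB": [0.7, 0.8],
}
words = ["hard disk", "capacity", "128", "GB"]  # description order

# Forward-calculated matrix: input layer -> first forward estimation layer.
forward_matrix = [word_vectors[w] for w in words]

# Backward-calculated matrix: input layer -> first backward estimation layer.
backward_matrix = [word_vectors[w] for w in reversed(words)]
```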
  • Step 208: Acquire the codes corresponding to the plurality of words of the commodity item, and input the codes into the trained multi-layer recurrent neural network.
  • Step 210: Perform an operation in the trained multi-layer recurrent neural network based on the codes of the plurality of words and the weight matrix, and output a description of the commodity item in a preset format.
  • The hidden layers of the multi-layer recurrent neural network may number 2, 4 or 6.
  • Each hidden layer includes a forward estimation layer and a backward estimation layer.
  • In the figures, Relu denotes the activation function,
  • Lstm denotes the long short-term memory (LSTM) unit,
  • Softmax denotes the classification function,
  • and w* denotes a weight matrix, where * is a positive integer index.
  • Each forward estimation layer and each backward estimation layer is assigned a corresponding initial weight matrix: for example, w2 and w5 in FIG. 3; w3, w5, w6 and w8 in FIG. 4; and w3, w5, w7, w8, w10 and w12 in FIG. 5.
  • The multi-layer recurrent neural network can be trained in advance.
  • It can be trained with a mapping file corresponding to the commodity information; the mapping file records, for the training words of each commodity item, both the original description and the description in the preset format. The network can thereby output the original description of the words in an item in the preset format.
  • Since the multi-layer recurrent neural network only accepts numerical input, the server also generates a corresponding training vocabulary from the training words during training.
  • The training vocabulary records the code corresponding to each training word. After the server segments a commodity item into words, it can query the training vocabulary for the code of each word.
  • The server invokes the trained multi-layer recurrent neural network and inputs the codes of the words of the commodity item to its input layer.
  • The input layer activates the weight matrix of the first forward estimation layer and the weight matrix of the first backward estimation layer through an activation function, and the operation starts in combination with the initial weight matrices of the first forward and backward estimation layers. No information flows between the forward estimation layer and the backward estimation layer.
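The training vocabulary described above is just a word-to-integer mapping. A minimal sketch, with the reservation of code 0 for the padding/preset character being an assumption of this sketch rather than something the patent specifies:

```python
def build_training_vocabulary(training_words):
    """Assign an integer code to each distinct training word.
    Code 0 is reserved here (an assumption) for the preset/padding character."""
    vocab = {"<pad>": 0}
    for w in training_words:
        if w not in vocab:
            vocab[w] = len(vocab)
    return vocab

vocab = build_training_vocabulary(
    ["hard disk", "capacity", "128", "GB", "capacity"])
codes = [vocab[w] for w in ["hard disk", "capacity", "128", "GB"]]
```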
  • The trained 4-layer recurrent neural network is described as an example.
  • the multiple words entered in the input layer may be "hard disk”, “capacity”, “128”, “GB”, “cache”, “capacity”, “32”, “MB”.
  • w1 is the weight matrix of the first forward estimation layer,
  • and w3 is the initial weight matrix of the first forward estimation layer.
  • The first forward estimation layer outputs the forward-calculated weight matrix w3 (this w3 differs from the initial w3; the same label is reused for brevity) and the weight matrix w4 corresponding to the second forward estimation layer.
  • w2 is the weight matrix of the first backward estimation layer,
  • and w6 is the initial weight matrix of the first backward estimation layer.
  • The first backward estimation layer outputs the backward-calculated weight matrix w6 (this w6 differs from the initial w6; the same label is reused for brevity) and the weight matrix w7 corresponding to the second backward estimation layer. The computation proceeds layer by layer in this way until the output layer outputs, through the classification function, the description of each word in the preset format in turn.
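The bidirectional recurrence just described can be sketched as follows. This is a toy stand-in: a single scalar-weighted recurrence replaces the real LSTM cell, and the per-step concatenation of forward and backward states (with no information flow between the two passes) is the point being illustrated.

```python
# Toy bidirectional layer; cell() is a stand-in for an LSTM unit.

def cell(x, h, w):
    """Toy recurrence: weighted elementwise sum of input and prior state."""
    return [w * (xi + hi) for xi, hi in zip(x, h)]

def bidirectional_layer(inputs, w_fwd, w_bwd, dim=2):
    h = [0.0] * dim
    forward = []
    for x in inputs:                 # left-to-right pass
        h = cell(x, h, w_fwd)
        forward.append(h)
    h = [0.0] * dim
    backward = []
    for x in reversed(inputs):       # right-to-left pass, independent state
        h = cell(x, h, w_bwd)
        backward.append(h)
    backward.reverse()               # realign with time order
    # Concatenate per time step; this would feed the next hidden layer.
    return [f + b for f, b in zip(forward, backward)]

outputs = bidirectional_layer([[1.0, 0.0], [0.0, 1.0]], 0.5, 0.5)
```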
  • the item is "hard disk” and the original information is "Seagate/ST500LT012
  • The server performs word segmentation on the content of the commodity item to obtain a plurality of words corresponding to the item.
  • The server acquires the corresponding weight vectors for the plurality of words and then generates the weight matrix corresponding to them. Because the weight vector of each word is obtained by training the word vector model, it accurately reflects each word's vector, which effectively improves the convergence of the multi-layer recurrent neural network and thereby the accuracy of its output.
  • The server inputs the codes of the plurality of words of the commodity item into the trained multi-layer recurrent neural network, which performs an operation using the codes and the weight matrix and outputs a description of the commodity item in the preset format. Because the network has been trained, every word in the commodity item can be output as a description in the preset format. The whole process requires neither developing a variety of templates nor building proprietary dictionaries; all kinds of commodity information can be output in the required uniform format, which improves the efficiency of unifying the commodity information format.
  • The method further comprises steps of word vector model training and multi-layer recurrent neural network training. As shown in FIG. 6, these include the following:
  • Step 602: Acquire a training set corresponding to the commodity information, the training set including a plurality of commodity items and the training words corresponding to each commodity item.
  • Step 604: Count the number of training words in each of the commodity items, and mark the maximum word count as the longest input parameter.
  • Step 606: Train the word vector model using the longest input parameter and the training words, obtaining the weight vector corresponding to each training word.
  • Step 608: Train the multi-layer recurrent neural network using the longest input parameter and the weight vectors corresponding to the training words, obtaining the trained multi-layer recurrent neural network.
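Step 604 reduces to a maximum over per-item word counts. A sketch, with the training-set shape (a mapping from commodity item to its training words) being an assumption of this sketch:

```python
def longest_input_parameter(training_set):
    """Step 604: the maximum number of training words over all
    commodity items becomes the longest input parameter."""
    return max(len(words) for words in training_set.values())

training_set = {
    "hard disk": ["hard disk", "capacity", "128", "GB"],
    "cpu": ["frequency", "3", "GHz"],
}
longest = longest_input_parameter(training_set)
```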
  • a large number of sample files are stored in the database.
  • the corresponding product information is recorded in the sample file.
  • The server marks a specific proportion of the commodity information recorded in the sample files as training data.
  • The word vector model and the multi-layer recurrent neural network can be trained in advance with this training data.
  • The training data can be derived from existing commodity information.
  • The training data includes commodity data and detailed descriptions.
  • The server performs word segmentation on the detailed description of each commodity item to obtain a plurality of words.
  • The server then preprocesses the words, for example cleaning the data and unifying the output format. For instance, the server cleans erroneous data, cleaning "128GD" to "128".
  • The server unifies the capitalization of English descriptions, for example unifying "SEAGATE", "Seagate" and "SEagate" into "SEAGATE".
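The capitalization-unification step above is a simple normalization pass; a minimal sketch:

```python
def normalize(word):
    """Unify the capitalization of an English description, as in the
    "SEAGATE"/"Seagate"/"SEagate" example."""
    return word.upper()

brands = ["SEAGATE", "Seagate", "SEagate"]
unified = {normalize(b) for b in brands}
```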
  • The server uses the preprocessed words as training words, and generates the training set from the plurality of commodity items and the training words corresponding to each item.
  • Different commodity items have different numbers of training words.
  • The trained word vector model and the trained multi-layer recurrent neural network are intended to be universal.
  • A longest input parameter and a longest output parameter are therefore set for both the word vector model and the multi-layer recurrent neural network.
  • The longest input parameter has the same value as the longest output parameter.
  • The server counts the number of training words in each commodity item, and marks the maximum of these counts as the longest input parameter.
  • The server adds a corresponding number of preset characters to each commodity item according to its word count and the longest input parameter.
  • The preset characters may be characters that cannot conflict with the commodity information, such as null characters.
  • For example, if the longest input parameter is 100,
  • the corresponding longest output parameter is also 100; if a commodity item contains 30 words, the server adds 70 preset characters to it.
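The padding step above (30 words plus 70 preset characters to reach 100) can be sketched directly; the empty string standing in for the "null character" is an assumption:

```python
PAD = ""  # stand-in for a preset character that cannot clash with real data

def pad_item(words, longest_input=100):
    """Pad a commodity item's word list with preset characters up to
    the longest input parameter (e.g. 30 words -> add 70 pads)."""
    return words + [PAD] * (longest_input - len(words))

padded = pad_item(["hard disk", "capacity", "128", "GB"], longest_input=6)
```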
  • The server trains the word vector model with the training words plus the preset characters added to reach the longest input parameter, thereby obtaining a weight vector for each training word and for the preset characters.
  • The word vector model may adopt the Skip-Gram model, i.e. a neural network structure comprising an input vector, a hidden layer, and an output layer.
  • The output layer of the model produces the final result, which is a probability distribution.
  • This probability distribution is not useful to the multi-layer recurrent neural network. Therefore, in this embodiment, only the input vector and hidden-layer structure of the model are used: the weight vectors of the words are output by the hidden layer, and the computation does not continue through the output layer.
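Why only the hidden layer is needed: with a one-hot input, multiplying the input by the hidden weight matrix reduces to selecting one row of that matrix, and that row is the word's weight vector. The softmax output layer (the probability distribution) never has to be evaluated. A toy sketch with assumed 2-dimensional vectors:

```python
# Toy hidden weight matrix of a Skip-Gram model: one row per vocabulary
# word (4 words, 2-dimensional vectors).
hidden_weights = [
    [0.1, 0.2],
    [0.3, 0.4],
    [0.5, 0.6],
    [0.7, 0.8],
]

def word_vector(word_code):
    """One-hot input times the hidden matrix == row lookup."""
    return hidden_weights[word_code]

vec = word_vector(2)
```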
  • Since the word vector model and the multi-layer recurrent neural network only accept numerical input, the server also generates a corresponding training vocabulary from the training words during training. In consideration of the longest input parameter, some preset characters are also recorded in the training vocabulary.
  • The training vocabulary records the code corresponding to each training word.
  • The server generates the input vectors of the word vector model from the codes of the training words, performs the computation through the hidden layer, and outputs a corresponding training weight matrix.
  • The training weight matrix includes the weight vectors corresponding to the plurality of training words and the preset characters.
  • The server calls the multi-layer recurrent neural network, obtains the codes of the training words and the preset characters according to the longest input parameter, and inputs them into the network for training.
  • Because each training word's weight vector was obtained by training the word vector model, it reflects the vector state of each training word more accurately, which effectively improves the convergence of the multi-layer recurrent neural network and thereby the accuracy of its training.
  • The number of words in each commodity item is brought up to the longest input parameter, i.e. every commodity item has the same number of words, which makes the trained word vector model and the trained multi-layer recurrent neural network universal. There is no need to train multiple models, which effectively reduces the developers' workload.
  • Training the word vector model with the longest input parameter and the training words to obtain the weight vectors corresponding to the training words includes: obtaining a corpus corresponding to the commodity information, the corpus including a plurality of corpus entries and some preset characters; training the word vector model with the corpus to obtain a corpus weight matrix, the corpus weight matrix including a plurality of corpus weight vectors; using the preset characters to increase the number of training words of each commodity item to the longest input parameter; for each commodity item after its word count is increased, selecting from the corpus weight matrix the corpus weight vectors corresponding to its training words and to the one or more preset characters, and marking them as the input vectors corresponding to the training words; and loading the plurality of input vectors and training through the hidden layer of the word vector model to obtain the training weight matrix.
  • The training weight matrix includes the weight vectors corresponding to the plurality of training words and the preset characters.
  • the server can also optimize the training process of the word vector model.
  • The server may crawl multiple websites for corpus articles corresponding to the commodity information and preprocess them, including word segmentation, cleaning, and unifying the description format.
  • The server builds the corpus from the preprocessed text.
  • In consideration of the longest input parameter, the corpus may also include some preset characters.
  • The server encodes each corpus entry and the preset characters in the corpus to obtain the corresponding corpus input vectors.
  • The server feeds the corpus input vectors to the input layer of the word vector model and trains through the hidden layer to obtain the corpus weight matrix.
  • The corpus weight matrix includes a plurality of corpus weight vectors.
  • The server increases the word count of each commodity item to the longest input parameter.
  • The server selects from the corpus weight matrix the corpus weight vectors corresponding to the training words and the one or more preset characters, and marks them as the input vectors corresponding to the training words.
  • The word vector model loads the plurality of input vectors and is trained through its hidden layer to obtain the training weight matrix corresponding to the plurality of training words and the preset characters.
  • A mapping file corresponding to the commodity information is pre-stored in the server; the mapping file records the original description and the preset-format description of the training words of each commodity item. For example, the item may be "hard disk" with the original information "Seagate/ST500LT012
  • The server uses the preset characters to increase the number of training words of each commodity item to the longest input parameter, so that every commodity item contains the same number of words.
  • The server separately obtains the weight vectors corresponding to the training words and the preset characters of each commodity item, and then generates the training weight matrix corresponding to each item.
  • With reference to the above embodiment, the server may generate a forward-calculated training weight matrix and a backward-calculated training weight matrix for each commodity item.
  • The server obtains the codes corresponding to the words and the preset characters of each commodity item, inputs the codes to the input layer of the multi-layer recurrent neural network, sets the forward-calculated training weight matrix as the weight matrix of the first forward estimation layer, and sets the backward-calculated training weight matrix as the weight matrix of the first backward estimation layer.
  • The initial weight matrix of each forward estimation layer and each backward estimation layer in the hidden layers is initialized.
  • The server trains the multi-layer recurrent neural network to output the descriptions of the training words in the commodity item in the preset format.
  • The weight matrix of the first forward estimation layer of the multi-layer recurrent network may be sized to 100 (the longest input parameter), and likewise the weight matrix of the first backward estimation layer; that is, each training word and preset character in the commodity item is assigned a corresponding weight vector during the training loop.
  • The multi-layer recurrent network accordingly outputs 100 results, namely the descriptions of the training words in the preset format. The output for a preset character can itself be a preset character, which has no impact on the training result.
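Since outputs at preset-character positions are themselves preset characters, presenting the unified description is just a filtering step. A sketch (the `<pad>` token and the dropping of pad outputs before display are assumptions of this sketch):

```python
PAD = "<pad>"  # stand-in for the preset character

def strip_padding(outputs):
    """Drop preset-character outputs from the network's fixed-length
    (e.g. 100-slot) result before presenting the unified description."""
    return [o for o in outputs if o != PAD]

description = strip_padding(["SEAGATE", "500", "GB", PAD, PAD])
```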
  • After the multi-layer recurrent neural network is trained with the longest input parameter, the trained network can accommodate diversified commodity information.
  • If instead a mapping table set a corresponding output format for every training word, the original description of each entry in a commodity item would correspond one-to-one to an output description; then, if two commodity items were the same but their original information differed, their output formats could not be unified.
  • By training through the multi-layer recurrent network, the original descriptions in each commodity item are not mapped one-to-one to output descriptions, yet each commodity item is guaranteed to be output in the preset uniform format.
  • the multi-layered cyclic neural network includes a plurality of hidden layers; a training word, a preset character, and a corresponding weight vector matrix in the commodity item after the vocabulary quantity is increased, through the multilayer cyclic neural network
  • the training includes: assigning a random vector to each hidden layer as an initial weight matrix of the hidden layer; and training corresponding to the commodity item after the input layer and the first hidden layer are set to increase the vocabulary quantity according to the longest input parameter; Weight matrix; the code corresponding to the training word of the commodity item after increasing the vocabulary quantity and the code corresponding to the preset character are input to the input layer of the multi-layer cyclic neural network; the multi-layer hidden layer is trained by using the initial weight matrix and the training weight matrix
  • the output layer outputs a description, in the preset format, of the plurality of training words in the commodity item.
  • each layer of hidden layers needs to be initialized.
  • Each layer of hidden layers may include a forward estimation layer and a backward estimation layer.
  • the forward estimation layer and the backward estimation layer of each hidden layer need to be initialized.
  • if the initial weight matrices corresponding to the forward estimation layer and the backward estimation layer of each hidden layer are initialized to 0, the generalization ability of the multi-layer cyclic neural network trained in this way is limited; if more types of commodity information appear in the future, retraining may be necessary.
  • the server assigns a random vector to the forward estimation layer and the backward estimation layer of each hidden layer as its initial weight matrix.
  • the random vector may be an array of a preset length, for example 200 or 300 dimensions.
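The initialization described above can be sketched as follows. This is a minimal illustration assuming NumPy; representing each initial weight matrix as a single 1-D random vector of the preset length (200 dimensions here) is a simplification, and the function and key names are hypothetical.

```python
import numpy as np

def init_hidden_layers(num_layers, dim=200, seed=0):
    """Assign a random vector of a preset length (e.g. 200 dimensions)
    to the forward and backward estimation layer of each hidden layer
    as its initial weight matrix (simplified here to a 1-D vector)."""
    rng = np.random.default_rng(seed)
    layers = []
    for _ in range(num_layers):
        layers.append({
            "forward_init": rng.standard_normal(dim),   # forward estimation layer
            "backward_init": rng.standard_normal(dim),  # backward estimation layer
        })
    return layers

layers = init_hidden_layers(num_layers=2, dim=200)
```

Random (rather than zero) initialization breaks the symmetry between the two estimation layers, which is what the text credits for the improved generalization ability.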
  • the server sets, in the input layer and the first hidden layer, the training weight matrix corresponding to the commodity item after the vocabulary quantity is increased, and inputs the codes corresponding to the training words of the commodity item after the vocabulary quantity is increased, together with the codes corresponding to the preset characters, to the input layer of the multi-layer cyclic neural network.
  • training is then performed through the multiple hidden layers by using the initial weight matrix and the training weight matrix, and the output layer outputs the description, in the preset format, of the plurality of training words in the commodity item.
  • because each hidden layer is configured with a random vector as its initial weight matrix at initialization, the generalization capability of the multi-layer cyclic neural network can be effectively improved, and the network can be applied to more diverse commodity information in the future.
  • the vocabulary quantity corresponding to each commodity item is the same, so the trained word vector model and the trained multi-layer cyclic neural network are general-purpose; there is no need to train multiple models, which effectively reduces the workload of developers.
  • the method further includes: acquiring the numbers of sample files corresponding to a plurality of training sets; acquiring a verification set that includes the words of a plurality of commodity items; using the verification set to verify the preset-format descriptions of the commodity items output by the plurality of training sets after training; and, when the verification accuracy reaches a threshold, marking the number of sample files that first reaches the threshold as the number of sample files for maximum batch training.
  • a multi-layer cyclic neural network can perform batch training on the training words in multiple samples. If the number of sample files per batch is too small, the network cannot learn the diversity of commodity information present in the sample files; if it is too large, the network cannot accurately memorize the diversified commodity information, and performance suffers. Therefore, when training a multi-layer cyclic neural network, the number of sample files for maximum batch training needs to be determined.
  • the server may separately acquire a plurality of sample files to generate a training set.
  • training is performed through the word vector model and the multi-layer cyclic neural network, and an output result corresponding to each number of sample files is obtained.
  • the server can also use the commodity information in other sample files to generate a verification set in advance.
  • the verification set includes words corresponding to multiple item items.
  • the server compares the output result corresponding to the number of sample files with the words in the verification set, thereby obtaining the accuracy corresponding to the number of sample files.
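The comparison against the verification set can be sketched as below. This is an assumption-laden illustration: exact-match scoring between output descriptions and verification words is one plausible reading of "compares", and the sample strings are invented.

```python
def accuracy(outputs, verification):
    """Compare the output results with the words in the verification set
    to obtain the accuracy for a given number of sample files
    (exact-match scoring is an assumption)."""
    matches = sum(1 for out, ref in zip(outputs, verification) if out == ref)
    return matches / len(verification)

# hypothetical outputs vs. verification-set entries
acc = accuracy(["128 GB", "32 MB", "64 GB"], ["128 GB", "32 MB", "16 GB"])
```

Here two of three outputs match, giving an accuracy of 2/3 for that sample-file count.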
  • the server can mark the number of sample files when the threshold is reached for the first time as the number of sample files for the maximum batch training. Further, the server can also draw a corresponding curve by using different sample file numbers and their corresponding accuracy. There may be fluctuations in the curve.
  • it is further determined whether the ratio of the difference between adjacent numbers of sample files whose accuracy reaches the threshold is less than or equal to a preset ratio; if so, the number of sample files that first satisfies this condition is marked as the number of sample files for maximum batch training.
  • for example, the numbers of sample files whose accuracy reaches the threshold include S1, S2, S3 and S4, where S1 < S2 < S3 < S4.
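The marking rule described above — take the first (smallest) sample-file count whose verification accuracy reaches the threshold — can be sketched as follows. The accuracy figures are invented for illustration, and the function name is hypothetical.

```python
def max_batch_sample_count(accuracy_by_count, threshold):
    """Return the first (smallest) number of sample files whose
    verification accuracy reaches the threshold; this count is marked
    as the number of sample files for maximum batch training."""
    for count in sorted(accuracy_by_count):
        if accuracy_by_count[count] >= threshold:
            return count
    return None  # no count reached the threshold

# hypothetical accuracies measured on the verification set
acc_by_n = {100: 0.72, 200: 0.81, 400: 0.91, 800: 0.93, 1600: 0.90}
best = max_batch_sample_count(acc_by_n, threshold=0.90)
```

With these invented numbers, 400 is the first count reaching the 0.90 threshold, so it would be marked as the maximum-batch sample-file count.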
  • although the steps in the flowcharts of FIGS. 2 and 6 are displayed sequentially as indicated by the arrows, these steps are not necessarily performed in that order. Unless explicitly stated herein, the execution order of these steps is not strictly limited, and the steps may be performed in other orders. Moreover, at least some of the steps in FIGS. 2 and 6 may include a plurality of sub-steps or stages, which are not necessarily performed at the same moment but may be executed at different moments; their execution order is likewise not necessarily sequential, and they may be performed in turn or alternately with at least a portion of the sub-steps or stages of other steps.
  • a commodity information format processing apparatus is provided, including: an information acquisition module 702, configured to acquire commodity information, the commodity information including a plurality of commodity items; a word segmentation processing module 704, configured to perform word segmentation on the content of the commodity items to obtain a plurality of words; a weight matrix generating module 706, configured to acquire the weight vectors corresponding to the plurality of words obtained by training the word vector model, and generate a weight matrix from the weight vectors corresponding to the plurality of words; and a format unification module 708, configured to obtain the codes corresponding to the plurality of words of a commodity item, input the codes of the plurality of words into the trained multi-layer cyclic neural network, and, through the trained multi-layer cyclic neural network, perform an operation based on the codes of the plurality of words and the weight matrix to output a description of the preset format corresponding to the commodity item.
  • the apparatus further includes: a first training module 710, configured to acquire a training set corresponding to the commodity information, the training set including a plurality of commodity items and a plurality of training words corresponding to the commodity items; count the vocabulary quantity of the training words of the plurality of commodity items and mark the maximum vocabulary quantity as the longest input parameter; and train the word vector model with the longest input parameter and the training words to obtain the weight vectors corresponding to the training words; and a second training module 712, configured to train the multi-layer cyclic neural network by using the longest input parameter and the weight vectors corresponding to the training words, to obtain the trained multi-layer cyclic neural network.
  • the first training module 710 is further configured to: obtain a corpus corresponding to the commodity information, the corpus including a plurality of corpus words and some preset characters; train the word vector model with the corpus to obtain a corpus weight matrix, the corpus weight matrix including a plurality of corpus weight vectors; increase the vocabulary quantity of the training words of the plurality of commodity items to the same number as the longest input parameter by using the preset characters; select, according to the commodity items after the vocabulary quantity is increased and the corpus weight matrix, the corpus weight vectors corresponding to the training words and one or more preset characters, marked as the input vectors corresponding to the training words; and load the plurality of input vectors through the word vector model and train through the hidden layer of the word vector model to obtain a training weight matrix, the training weight matrix including the weight vectors corresponding to the plurality of training words and the preset characters.
  • the second training module 712 is further configured to: obtain a mapping file corresponding to the commodity information, the mapping file recording the original descriptions of the plurality of training words in the commodity items and the descriptions in the preset format; increase the vocabulary quantity of the training words of the commodity items to the same number as the longest input parameter; generate, from the training weight matrix corresponding to the commodity items, the weight vectors corresponding to the training words and the preset characters; and train the training words, the preset characters and the corresponding weight vector matrix of the commodity items after the vocabulary quantity is increased through the multi-layer cyclic neural network, to output the descriptions, in the preset format, of the plurality of training words in the commodity items.
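The step of increasing each commodity item's vocabulary quantity to the longest input parameter by appending preset characters can be sketched as below. The `"<PAD>"` token name and the sample word lists are assumptions for illustration.

```python
def pad_to_longest(items, pad_token="<PAD>"):
    """Increase the vocabulary quantity of each commodity item's training
    words to the longest input parameter by appending preset characters."""
    longest = max(len(words) for words in items)  # the longest input parameter
    padded = [words + [pad_token] * (longest - len(words)) for words in items]
    return padded, longest

# hypothetical training words for two commodity items of unequal length
items = [["hard disk", "capacity", "128", "GB"],
         ["hard disk", "capacity", "128", "GB", "cache", "32", "MB"]]
padded, longest = pad_to_longest(items)
```

After padding, every commodity item has the same vocabulary quantity, which is what lets a single trained model handle all of them.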
  • the second training module 712 is further configured to: allocate a random vector to each hidden layer as the initial weight matrix of that hidden layer; set, in the input layer and the first hidden layer according to the longest input parameter, the training weight matrix corresponding to the commodity item after the vocabulary quantity is increased; input the codes corresponding to the training words of the commodity item after the vocabulary quantity is increased and the codes corresponding to the preset characters to the input layer of the multi-layer cyclic neural network; and train the hidden layers by using the initial weight matrix and the training weight matrix, so that the output layer outputs the descriptions, in the preset format, of the plurality of training words in the commodity item.
  • the second training module 712 is further configured to: acquire the numbers of sample files corresponding to a plurality of training sets; acquire a verification set including the words of a plurality of commodity items; verify, by using the verification set, the preset-format descriptions output by the plurality of training sets after training; and, when the verification accuracy reaches a threshold, mark the number of sample files that first reaches the threshold as the number of sample files for maximum batch training.
  • each module of the above-described commodity information format processing apparatus may be implemented in whole or in part by software, hardware, or a combination thereof.
  • Each of the above modules may be embedded in or independent of the processor in the computer device, or may be stored in a memory in the computer device in a software form, so that the processor invokes the operations corresponding to the above modules.
  • a computer device is provided, which may be a server, and whose internal structure diagram may be as shown in FIG. 8.
  • the computer device includes a processor, memory, network interface, and database connected by a system bus.
  • the processor of the computer device is used to provide computing and control capabilities.
  • the memory of the computer device includes a non-volatile storage medium and an internal memory.
  • the non-volatile storage medium stores an operating system, computer readable instructions, and a database.
  • the internal memory provides an environment for operation of an operating system and computer readable instructions in a non-volatile storage medium.
  • the non-volatile storage medium can be a computer-readable non-volatile storage medium.
  • the database of the computer device is used to store commodity files as well as sample files and the like.
  • the network interface of the computer device is used to communicate with an external server via a network connection.
  • the computer readable instructions are executed by the processor to implement a commodity information format processing method.
  • FIG. 8 is only a block diagram of a part of the structure related to the solution of the present application, and does not constitute a limitation of the computer device to which the solution of the present application is applied.
  • the specific computer device may include more or fewer components than shown in the figure, or combine certain components, or have a different arrangement of components.

Abstract

A commodity information format processing method, comprising: obtaining commodity information, the commodity information comprising a plurality of commodity items; performing word segmentation on the content of the commodity items to obtain a plurality of words; obtaining weight vectors corresponding to the plurality of words obtained by training a word vector model, and generating a weight matrix from the weight vectors corresponding to the plurality of words; obtaining codes corresponding to the plurality of words of the commodity items, and inputting the codes of the plurality of words into a trained multi-layer cyclic neural network; and performing an operation by means of the trained multi-layer cyclic neural network based on the codes of the plurality of words and the weight matrix, and outputting a description of a preset format corresponding to the commodity items.

Description

Commodity information format processing method, apparatus, computer device and storage medium
This application claims priority to Chinese Patent Application No. 2018103807519, filed with the Chinese Patent Office on April 25, 2018 and entitled "Commodity Information Format Processing Method, Apparatus, Computer Device and Storage Medium", the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to a commodity information format processing method, apparatus, computer device and storage medium.
Background
Customs declarations involve a wide variety of commodity information. Although a unified declaration format is specified for declarations, different declarants fill in the commodity information in different orders and forms, so the format of the commodity information in declarations is not uniform. Large numbers of customs officers must interpret commodity information in different formats, which is time-consuming and laborious, and also hinders risk control of customs import and export business. To unify the commodity information format effectively, in the traditional approach developers typically use specific templates together with proprietary dictionaries to match key information and unify the format of the commodity information. However, this approach requires developers to develop multiple templates and collect multiple proper-noun libraries to build proprietary dictionaries, resulting in low efficiency in unifying the commodity information format.
Summary
According to various embodiments disclosed in the present application, a commodity information format processing method, apparatus, computer device and storage medium are provided.
A commodity information format processing method includes: acquiring commodity information, the commodity information including a plurality of commodity items; performing word segmentation on the content of the commodity items to obtain a plurality of words; acquiring weight vectors corresponding to the plurality of words obtained by training a word vector model, and generating a weight matrix from the weight vectors corresponding to the plurality of words; acquiring codes corresponding to the plurality of words of a commodity item, and inputting the codes of the plurality of words into a trained multi-layer cyclic neural network; and performing an operation, through the trained multi-layer cyclic neural network, based on the codes of the plurality of words and the weight matrix, and outputting a description of the preset format corresponding to the commodity item.
A commodity information format processing apparatus includes: an information acquisition module, configured to acquire commodity information, the commodity information including a plurality of commodity items; a word segmentation processing module, configured to perform word segmentation on the content of the commodity items to obtain a plurality of words; a weight matrix generating module, configured to acquire the weight vectors corresponding to the plurality of words obtained by training a word vector model, and generate a weight matrix from the weight vectors corresponding to the plurality of words; and a format unification module, configured to acquire the codes corresponding to the plurality of words of a commodity item, input the codes of the plurality of words into a trained multi-layer cyclic neural network, and, through the trained multi-layer cyclic neural network, perform an operation based on the codes of the plurality of words and the weight matrix to output a description of the preset format corresponding to the commodity item.
A computer device includes a memory and one or more processors, the memory storing computer readable instructions that, when executed by the one or more processors, cause the one or more processors to perform the following steps: acquiring commodity information, the commodity information including a plurality of commodity items; performing word segmentation on the content of the commodity items to obtain a plurality of words; acquiring the weight vectors corresponding to the plurality of words obtained by training a word vector model, and generating a weight matrix from the weight vectors corresponding to the plurality of words; acquiring the codes corresponding to the plurality of words of a commodity item, and inputting the codes of the plurality of words into a trained multi-layer cyclic neural network; and performing an operation, through the trained multi-layer cyclic neural network, based on the codes of the plurality of words and the weight matrix, and outputting a description of the preset format corresponding to the commodity item.
One or more non-volatile computer readable storage media storing computer readable instructions that, when executed by one or more processors, cause the one or more processors to perform the following steps: acquiring commodity information, the commodity information including a plurality of commodity items; performing word segmentation on the content of the commodity items to obtain a plurality of words; acquiring the weight vectors corresponding to the plurality of words obtained by training a word vector model, and generating a weight matrix from the weight vectors corresponding to the plurality of words; acquiring the codes corresponding to the plurality of words of a commodity item, and inputting the codes of the plurality of words into a trained multi-layer cyclic neural network; and performing an operation, through the trained multi-layer cyclic neural network, based on the codes of the plurality of words and the weight matrix, and outputting a description of the preset format corresponding to the commodity item.
Details of one or more embodiments of the present application are set forth in the accompanying drawings and the description below. Other features and advantages of the present application will become apparent from the specification, the drawings and the claims.
Brief Description of the Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings required in the embodiments are briefly described below. Obviously, the drawings in the following description are only some embodiments of the present application, and those of ordinary skill in the art can obtain other drawings from these drawings without creative effort.
FIG. 1 is an application scenario diagram of a commodity information format processing method according to one or more embodiments;

FIG. 2 is a schematic flowchart of a commodity information format processing method according to one or more embodiments;

FIG. 3 is an expanded view over time of a 2-layer cyclic neural network according to one or more embodiments;

FIG. 4 is an expanded view over time of a 4-layer cyclic neural network according to one or more embodiments;

FIG. 5 is an expanded view over time of a 6-layer cyclic neural network according to one or more embodiments;

FIG. 6 is a schematic flowchart of the steps of word vector model training and multi-layer cyclic neural network training according to one or more embodiments;

FIG. 7 is a block diagram of a commodity information format processing apparatus according to one or more embodiments;

FIG. 8 is a block diagram of a computer device according to one or more embodiments.
Detailed Description
In order to make the objectives, technical solutions and advantages of the present application clearer, the present application is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the application and are not intended to limit it.
The commodity information format processing method provided by the present application can be applied to the application environment shown in FIG. 1, in which a terminal 102 communicates with a server 104 through a network. The terminal 102 can be, but is not limited to, various personal computers, notebook computers, smartphones, tablet computers and other computer devices; the server 104 can be implemented as an independent server or as a server cluster composed of multiple servers. The terminal 102 uploads commodity files to the server 104. A commodity file records a variety of commodity information, and the commodity information includes a plurality of commodity items. The server 104 performs word segmentation on the detailed information description of each commodity item. The trained word vector model and the trained multi-layer cyclic neural network are stored in the server 104 in advance. The server 104 acquires the weight vectors corresponding to the plurality of words obtained by training the word vector model, and generates a weight matrix from the weight vectors corresponding to the plurality of words. The server 104 acquires the codes corresponding to the plurality of words of a commodity item and inputs the codes of the plurality of words into the trained multi-layer cyclic neural network. Through the trained multi-layer cyclic neural network, an operation is performed based on the codes of the plurality of words and the weight matrix, and a description of the preset format corresponding to the commodity item is output. In this way, original commodity information in a variety of different formats can be converted into descriptions in a unified format.
In one embodiment, as shown in FIG. 2, a commodity information format processing method is provided. The method is described by taking its application to the server in FIG. 1 as an example, and includes the following steps:
Step 202: acquire commodity information, the commodity information including a plurality of commodity items.
Step 204: perform word segmentation on the content of the commodity items to obtain a plurality of words.
The server receives commodity files uploaded by multiple terminals. A commodity file records a variety of commodity information. The commodity information includes a plurality of commodity items, and each commodity item includes specific content, that is, a detailed information description. When the commodity files come from different users, the specific content of the same commodity item can differ. For example, when the commodity item is "hard disk", the corresponding content may be described as "hard disk capacity 128GB cache capacity 32MB" or simply as "128GB 32MB". The server performs word segmentation on the detailed information description of each commodity item. For example, the server splits one of the detailed descriptions of the commodity item "hard disk" into "hard disk", "capacity", "128", "GB", "cache", "capacity", "32", "MB", obtaining a plurality of words.
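A real system would use a proper segmenter (the original text is Chinese, where a tool such as jieba is commonly used); as a simplified stand-in, the splitting of a description into words and numbers can be sketched with a regular expression over the English rendering:

```python
import re

def segment(description):
    """Simplified word segmentation: split a product description into
    alphabetic words and digit runs. A stand-in for a real segmenter,
    not the patent's actual segmentation method."""
    return re.findall(r"[A-Za-z]+|\d+", description)

tokens = segment("hard disk capacity 128GB cache capacity 32MB")
```

Note this sketch splits "hard disk" into two tokens and separates "128GB" into "128" and "GB"; a dictionary-based segmenter would keep multi-character terms together as in the example above.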
Step 206: acquire the weight vectors corresponding to the plurality of words obtained by training the word vector model, and generate a weight matrix from the weight vectors corresponding to the plurality of words.
A multi-layer cyclic neural network can contain multiple hidden layers. A hidden layer includes a forward estimation layer and a backward estimation layer, which can also be called a bidirectional hidden layer. The first hidden layer includes a first forward estimation layer and a first backward estimation layer, the second hidden layer includes a second forward estimation layer and a second backward estimation layer, the third hidden layer includes a third forward estimation layer and a third backward estimation layer, and so on. Corresponding weight matrices are set between the input layer and the first hidden layer, that is, between the input layer and the first forward estimation layer and between the input layer and the first backward estimation layer, respectively. In the conventional approach, the weight matrices corresponding to the first forward estimation layer and the first backward estimation layer are both initialized to random vectors, but this may cause the multi-layer cyclic neural network to converge poorly, so that the output results cannot meet the requirements.
In this embodiment, the server uses the weight matrix corresponding to the plurality of words in the commodity item as the weight matrix between the input layer and the first hidden layer in the multi-layer cyclic neural network. This weight matrix is obtained by training the word vector model. The weight vectors reflect the vector of each word in the commodity item, effectively improving the convergence efficiency of the multi-layer cyclic neural network and thereby the accuracy of the output.
The weight matrices corresponding to the first forward estimation layer and the first backward estimation layer differ from each other. The server can obtain the weight vector corresponding to each word in the order in which the words are described in the commodity item; the weight vector corresponding to each word can be a vector array. Using the weight vectors corresponding to the plurality of words, the server generates a forward weight matrix corresponding to the plurality of words. The server can also obtain the weight vector of each word in the reverse of the description order of the plurality of words in the commodity item, thereby generating a backward weight matrix corresponding to the plurality of words. The forward weight matrix is the weight matrix between the input layer and the first forward estimation layer in the multi-layer cyclic neural network; the backward weight matrix is the weight matrix between the input layer and the first backward estimation layer.
Continuing with the commodity item "hard disk" as an example, the server can generate the forward weight matrix in the order "hard disk", "capacity", "128", "GB", "cache", "capacity", "32", "MB", and the backward weight matrix in the order "MB", "32", "capacity", "cache", "GB", "128", "capacity", "hard disk".
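Stacking the per-word weight vectors in the description order and in the reverse order can be sketched as below. The 3-dimensional vectors and the shortened word list are invented for illustration; a real word vector model would produce much higher-dimensional vectors.

```python
import numpy as np

# hypothetical 3-dimensional weight vectors per word (from the word vector model)
vectors = {"hard disk": [0.1, 0.2, 0.3],
           "capacity":  [0.4, 0.5, 0.6],
           "128":       [0.7, 0.8, 0.9]}
order = ["hard disk", "capacity", "128"]  # description order in the commodity item

# forward weight matrix: rows follow the description order
forward = np.array([vectors[w] for w in order])
# backward weight matrix: rows follow the reversed description order
backward = np.array([vectors[w] for w in reversed(order)])
```

The backward matrix is simply the forward matrix with its rows reversed, matching the two traversal directions of the first bidirectional hidden layer.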
Step 208: obtain the codes corresponding to the words of the commodity item, and input those codes into the trained multi-layer recurrent neural network.
Step 210: through the trained multi-layer recurrent neural network, perform computation based on the word codes and the weight matrix, and output a description of the commodity item in the preset format.
The hidden layers of the multi-layer recurrent neural network may number 2, 4, 6, and so on. Each hidden layer includes a forward layer and a backward layer. Figures 3 to 5 show 2-layer, 4-layer, and 6-layer recurrent neural networks unrolled over time, where ReLU denotes the activation function, LSTM denotes a long short-term memory unit, Softmax denotes the classification function, and w* (* a positive integer) denotes a weight matrix. As the unrolled diagrams show, each forward layer and each backward layer is assigned a corresponding initial weight matrix: for example, w2 and w5 in Figure 3; w3, w5, w6, and w8 in Figure 4; and w3, w5, w7, w8, w10, and w12 in Figure 5.
The multi-layer recurrent neural network may be pre-trained. During training it can use a mapping file associated with the commodity information; the mapping file records the original description and the preset-format description of each training word of a commodity item, so that the original descriptions of a commodity item's words can be output in the preset format. Because the multi-layer recurrent neural network accepts only numerical input, the server also generates a training vocabulary from the training words during training. The training vocabulary contains the code corresponding to each training word. After the server segments a commodity item into words, it can look up the code of each word in this training vocabulary.
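A minimal sketch of the training vocabulary and the lookup step (the words and code values are illustrative assumptions, not taken from the patent):

```python
# Training vocabulary: each training word maps to an integer code.
# Codes and words here are illustrative only.
vocab = {"<pad>": 0, "hard disk": 1, "capacity": 2, "128": 3,
         "GB": 4, "cache": 5, "32": 6, "MB": 7}

def encode(words, vocab):
    """Map each segmented word of a commodity item to its integer code,
    since the network accepts only numerical input."""
    return [vocab[w] for w in words]

item_words = ["hard disk", "capacity", "128", "GB",
              "cache", "capacity", "32", "MB"]
codes = encode(item_words, vocab)
print(codes)
```

The resulting code sequence is what Step 208 feeds to the network's input layer.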
The server invokes the trained multi-layer recurrent neural network and feeds the codes of the commodity item's words to the network's input layer. Through the activation function, the input layer activates the weight matrix of the first forward layer and the weight matrix of the first backward layer, and computation begins with these matrices combined with the initial weight matrices of the first forward layer and the first backward layer. There is no information flow between the forward layers and the backward layers.
Take a trained 4-layer recurrent neural network as an example. The words fed to the input layer may be "hard disk", "capacity", "128", "GB", "cache", "capacity", "32", "MB". As shown in Figure 4, w1 is the weight matrix of the first forward layer and w3 its initial weight matrix; after the LSTM computation, the network outputs the updated forward weight matrix w3 (this w3 differs from the initial w3; the same label is reused for brevity) and the weight matrix w4 of the second forward layer. Likewise, w2 is the weight matrix of the first backward layer and w6 its initial weight matrix; after the LSTM computation, the network outputs the updated backward weight matrix w6 (again differing from the initial w6, with the label reused for brevity) and the weight matrix w7 of the second backward layer. The recurrence continues in this way until the output layer, through the classification function, outputs the preset-format description of each word in turn.
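The bidirectional recurrence can be sketched as follows. This is a toy stand-in, not the patent's LSTM: the recurrence is a simple weighted sum, but it preserves the key structure described above — the forward layer scans the code sequence left to right, the backward layer scans it right to left with no information flow between the two, and the output at each step combines both directions:

```python
def toy_scan(codes, direction=1):
    """Accumulate a running state over the sequence in one direction.
    The update rule is a toy stand-in for the LSTM computation."""
    state, states = 0.0, []
    seq = codes if direction == 1 else list(reversed(codes))
    for c in seq:
        state = 0.5 * state + c
        states.append(state)
    # Re-align backward states with the original word positions.
    return states if direction == 1 else list(reversed(states))

codes = [1, 2, 3, 4]
forward_states = toy_scan(codes, direction=1)
backward_states = toy_scan(codes, direction=-1)
# The output at each position sees context from both directions.
outputs = [f + b for f, b in zip(forward_states, backward_states)]
print(outputs)
```

In the real network, each combined state would then pass through the classification function to yield the preset-format description of that word.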
For example, for the commodity item "hard disk" with original information "Seagate/ST500LT012|003SDM1", after the multi-layer recurrent neural network computation the output can be in the following unified format:
"BRAND:SEAGATE,TYPE:HDD,SIZE:500,CACHE:NaN,PRODUCT_NO:ST500LT012,RPM:NaN". Because every word of the commodity item is described in the preset format, original commodity information in many different formats can be converted into a unified-format description. A database is deployed on the server; after the server formats the commodity files, it stores the uniformly formatted commodity files in the database.
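The unified description above is a flat list of key:value pairs, so it can be parsed into structured fields before being stored in the database. A small sketch (the parser itself is illustrative; only the field names come from the example in the text):

```python
def parse_description(desc):
    """Split a 'KEY:VALUE,KEY:VALUE,...' preset-format description
    into a dictionary of fields."""
    fields = {}
    for pair in desc.split(","):
        key, _, value = pair.partition(":")
        fields[key] = value
    return fields

desc = ("BRAND:SEAGATE,TYPE:HDD,SIZE:500,CACHE:NaN,"
        "PRODUCT_NO:ST500LT012,RPM:NaN")
fields = parse_description(desc)
print(fields["BRAND"], fields["PRODUCT_NO"])
```

A value of "NaN" marks an attribute (such as cache size or RPM) that was absent from the original information, while keeping the field set identical across commodity items.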
In this embodiment, when the original descriptions in commodity information need to be unified in format, the server performs word segmentation on the content of a commodity item to obtain the item's words. The server obtains the corresponding weight vectors for these words and then generates the corresponding weight matrix. Because each word's weight vector is obtained by training the word vector model, it accurately reflects the word's vector representation, effectively improving the convergence of the multi-layer recurrent neural network and thereby the accuracy of the output. The server inputs the codes of the commodity item's words into the trained multi-layer recurrent neural network, which computes on the word codes and the weight matrix and outputs a description of the commodity item in the preset format. Because the network has been trained, every word of the commodity item can be output as a preset-format description. The whole process requires neither developing multiple templates nor building proprietary dictionaries; commodity information of many types can be output in the required unified format, improving the efficiency of commodity information format unification.
In one embodiment, the method further includes steps of training the word vector model and training the multi-layer recurrent neural network. As shown in Figure 6, these include the following:
Step 602: obtain a training set corresponding to the commodity information; the training set includes a plurality of commodity items and the training words corresponding to each commodity item.
Step 604: count the number of training words in each commodity item, and mark the maximum count as the maximum input length parameter.
Step 606: train the word vector model using the maximum input length parameter and the training words, obtaining the weight vector corresponding to each training word.
Step 608: train the multi-layer recurrent neural network using the maximum input length parameter and the training words' weight vectors, obtaining the trained multi-layer recurrent neural network.
A large number of sample files, each recording commodity information, are stored in the database. A specified proportion of the commodity information recorded in the sample files is marked as training data, with which the word vector model and the multi-layer recurrent neural network can be trained in advance. Training data may come from existing commodity information and includes commodity items and their detailed descriptions. The server segments the detailed description of each commodity item into words and preprocesses them, including data cleaning and unifying the output format. For example, the server cleans erroneous data, turning "128GD" into "128", and unifies letter case in English descriptions, so that "SEAGATE", "Seagate", and "SEagate" all become "SEAGATE". The preprocessed words serve as training words, and the server generates the training set from the commodity items and their corresponding training words.
Different commodity items have different numbers of training words. To fix the model structure of the word vector model and the multi-layer recurrent neural network, so that the trained models are general-purpose, this embodiment sets both a maximum input length parameter and a maximum output length parameter for the two models; the two parameters have equal values. The server counts the number of training words in each commodity item and marks the largest count as the maximum input length. For a commodity item with fewer words than the maximum input length, the server appends a corresponding number of preset characters, i.e. characters that do not conflict with commodity information, such as null characters. For example, suppose the maximum input length is 100 (and the maximum output length likewise 100) and a commodity item has 30 words; the server then appends 70 preset characters to that item.
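The padding step can be sketched as follows (the placeholder token `<pad>` and the length of 10 are illustrative assumptions; the text uses a maximum input length of 100):

```python
MAX_INPUT_LEN = 10   # illustrative; the embodiment's example uses 100

def pad_item(words, max_len, pad="<pad>"):
    """Append preset characters until the commodity item has exactly
    max_len words, so every item has the same input length."""
    return words + [pad] * (max_len - len(words))

item = ["hard disk", "capacity", "128"]
padded = pad_item(item, MAX_INPUT_LEN)
print(len(padded))
```

After padding, every commodity item presents the same number of positions to both the word vector model and the recurrent network.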
The server trains the word vector model on the training words together with the preset characters appended to reach the maximum input length, thereby obtaining a weight vector for each training word and preset character. The word vector model may be a Skip-Gram model, i.e. a neural network comprising an input layer, a hidden layer, and an output layer. Conventionally, the model's final result is produced by its output layer, and that final result is a probability distribution. Such a probability distribution is not suitable as input to the multi-layer recurrent neural network. Therefore, this embodiment uses only the input and hidden layers of the model: the hidden layer outputs the words' weight vectors, and the output-layer computation is omitted.
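The reason only the input-to-hidden weights are needed can be shown in a few lines: with a one-hot input vector, the hidden layer's output is exactly the row of the input-to-hidden weight matrix indexed by the word, i.e. the word's weight (embedding) vector. The matrix values below are illustrative:

```python
# Illustrative input-to-hidden weight matrix of a Skip-Gram model:
# one row per vocabulary word, one column per hidden dimension.
hidden_weights = [
    [0.1, 0.2, 0.3],   # row 0: e.g. "hard disk"
    [0.4, 0.5, 0.6],   # row 1: e.g. "capacity"
    [0.7, 0.8, 0.9],   # row 2: e.g. "128"
]

def hidden_output(word_index, weights):
    """One-hot input times the weight matrix selects the indexed row."""
    one_hot = [1.0 if i == word_index else 0.0
               for i in range(len(weights))]
    dims = len(weights[0])
    return [sum(one_hot[i] * weights[i][d] for i in range(len(weights)))
            for d in range(dims)]

vec = hidden_output(1, hidden_weights)
print(vec)
```

This is why the embodiment can discard the Skip-Gram output layer entirely: the hidden-layer weights already contain the per-word weight vectors the recurrent network needs.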
Because the word vector model and the multi-layer recurrent neural network accept only numerical input, the server also generates a training vocabulary from the training words during training. Given the maximum input length, the training vocabulary also records some preset characters. The vocabulary contains the code corresponding to each training word. The server generates the word vector model's input vectors from the training words' codes, computes through the hidden layer, and outputs the corresponding training weight matrix, which includes the weight vectors of the training words and preset characters. The server then invokes the multi-layer recurrent neural network, obtains the codes of the training words and preset characters according to the maximum input length, and inputs them into the network for training.
During training, because each training word's weight vector was obtained by training the word vector model, the vector state of each training word is reflected more accurately, effectively improving the convergence of the multi-layer recurrent neural network and thereby the accuracy of its training. Setting the maximum input length brings every commodity item's word count up to the same number as the maximum length parameter, i.e. every commodity item has the same word count, so the trained word vector model and the trained multi-layer recurrent neural network are general-purpose. There is no need to train multiple models, which effectively reduces developers' workload.
In one embodiment, training the word vector model using the maximum input length parameter and the training words to obtain the training words' weight vectors includes: obtaining a corpus corresponding to the commodity information, the corpus including a plurality of corpus words, among them some preset characters; training the word vector model on the corpus words to obtain a corpus weight matrix comprising a plurality of corpus weight vectors; using preset characters to bring the training-word count of each commodity item up to the maximum input length; for each commodity item so extended, selecting from the corpus weight matrix the corpus weight vectors corresponding to its training words and to one or more preset characters, and marking them as the training words' input vectors; and loading the input vectors into the word vector model and training through its hidden layer to obtain a training weight matrix comprising the weight vectors of the training words and preset characters.
To further improve the convergence of the multi-layer recurrent neural network, and thus the accuracy of its training, the server may also optimize the training of the word vector model. Specifically, the server may crawl multiple websites for corpus articles related to the commodity information and preprocess them, including word segmentation, cleaning, and unifying the description format. The server builds a corpus from the preprocessed corpus words; given the maximum input length, the corpus may also include some preset characters. The server encodes each corpus word and preset character in the corpus to obtain the corresponding corpus input vectors, feeds them to the input layer of the word vector model, and trains through the hidden layer to obtain a corpus weight matrix comprising a plurality of corpus weight vectors.
The server brings each commodity item's word count up to the maximum input length. From the corpus weight matrix it selects the corpus weight vectors corresponding to the training words and to one or more preset characters, marking them as the training words' input vectors. The word vector model loads these input vectors and, through its hidden layer, is trained to obtain the training weight matrix for the training words and preset characters.
In one embodiment, training the multi-layer recurrent neural network using the maximum input length parameter, the training words, and the training words' weight vectors includes: obtaining a mapping file corresponding to the commodity information, which records the original descriptions and preset-format descriptions of the training words of each commodity item; using preset characters to bring each commodity item's training-word count up to the maximum input length; generating, from the weight vectors of the training words and preset characters, the training weight matrix corresponding to each commodity item; and training the multi-layer recurrent neural network on the extended commodity items' training words, preset characters, and corresponding weight matrices, outputting the preset-format descriptions of each commodity item's training words.
The server stores in advance a mapping file corresponding to the commodity information; the mapping file records the original descriptions and preset-format descriptions of the training words of each commodity item. For example, for the commodity item "hard disk" with original information "Seagate/ST500LT012|003SDM1", after the multi-layer recurrent neural network computation the output can be in the following unified format:
"BRAND:SEAGATE,TYPE:HDD,SIZE:500,CACHE:NaN,PRODUCT_NO:ST500LT012,RPM:NaN". Because every word of the commodity item is described in the preset format, original commodity information in many different formats can be converted into a unified-format description.
As in the preceding embodiments, the server uses preset characters to bring each commodity item's training-word count up to the maximum input length, so that every commodity item has the same number of words. Using the training weight matrix obtained through the word vector model in the preceding embodiments, the server obtains the weight vectors of each commodity item's training words and preset characters, and then generates the training weight matrix corresponding to each commodity item. In particular, the server may generate, as described above, each commodity item's forward training weight matrix and backward training weight matrix.
Also as in the preceding embodiments, the server obtains the codes of each commodity item's words and preset characters and inputs them to the input layer of the multi-layer recurrent neural network, sets the forward training weight matrix as the weight matrix of the first forward layer and the backward training weight matrix as the weight matrix of the first backward layer, and initializes the initial weight matrices of the forward layers and backward layers of every hidden layer. After initialization, the server trains the multi-layer recurrent neural network and outputs the preset-format descriptions of each commodity item's training words.
For example, with a maximum input length of 100, 100 weight matrices may be set for the first forward layer of the multi-layer recurrent network and 100 for the first backward layer; that is, every training word and preset character of a commodity item is assigned a corresponding weight matrix during recurrent training. The multi-layer recurrent network likewise outputs 100 results, i.e. descriptions in the training words' preset format. The output for a preset character may itself be a preset character, which does not affect the training results. Training the network with the maximum input length enables the trained multi-layer recurrent neural network to handle diversified commodity information.
With conventional template matching, a mapping table assigns each training word a corresponding output format, so each original description in a commodity item corresponds one-to-one with an output description. If two commodity items are the same but their original information differs, their output formats still cannot be unified. In this embodiment, by contrast, training through the multi-layer recurrent network means the original descriptions and output descriptions of a commodity item are not in one-to-one correspondence; instead, every commodity item is guaranteed to be output in the preset unified format.
In one embodiment, the multi-layer recurrent neural network includes a plurality of hidden layers, and training the network on the extended commodity items' training words, preset characters, and corresponding weight matrices includes: assigning each hidden layer a random vector as its initial weight matrix; setting, according to the maximum input length, the training weight matrices corresponding to the extended commodity items between the input layer and the first hidden layer; inputting the codes of the extended commodity items' training words and preset characters to the input layer of the multi-layer recurrent neural network; and training the hidden layers with the initial weight matrices and the training weight matrices, with the output layer outputting the preset-format descriptions of the commodity items' training words.
When the server trains the multi-layer recurrent neural network on the training words, each hidden layer must be initialized. Each hidden layer may include a forward layer and a backward layer, and both need initialization. Conventionally, the initial weight matrices of every hidden layer's forward and backward layers are initialized to zero, but a network trained this way has limited generalization ability: if commodity information in more formats arrives in the future, retraining may be necessary.
In this embodiment, at initialization the server instead assigns each hidden layer's forward and backward layers random vectors as initial weight matrices. A random vector may be an array of preset length, for example 200 or 300 dimensions. After initialization, the server sets the training weight matrices corresponding to the extended commodity items between the input layer and the first hidden layer, and inputs the codes of the extended commodity items' training words and preset characters to the network's input layer. As in the preceding embodiments, the hidden layers are trained with the initial weight matrices and the training weight matrices, and the output layer outputs the preset-format descriptions of the commodity items' training words.
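The random initialization can be sketched as follows (the range [-0.1, 0.1] and the fixed seed are illustrative assumptions; the text specifies only that each layer receives a random vector of preset length, e.g. 200 or 300 dimensions):

```python
import random

def init_hidden_layers(num_layers, dim=200, seed=42):
    """Assign every hidden layer's forward and backward layers a
    random initial vector instead of zeros, to aid generalization."""
    rng = random.Random(seed)
    layers = []
    for _ in range(num_layers):
        layers.append({
            "forward":  [rng.uniform(-0.1, 0.1) for _ in range(dim)],
            "backward": [rng.uniform(-0.1, 0.1) for _ in range(dim)],
        })
    return layers

layers = init_hidden_layers(num_layers=4, dim=200)
print(len(layers), len(layers[0]["forward"]))
```

Each layer's forward and backward vectors are drawn independently, so the two directions start from different initial states, unlike the zero-initialization described as conventional above.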
Because each hidden layer is configured with a random vector as its initial weight matrix, the generalization ability of the multi-layer recurrent neural network is effectively improved, allowing it to handle more diversified commodity information in the future. Moreover, setting the maximum input length gives every commodity item the same word count, making the trained word vector model and the trained multi-layer recurrent neural network general-purpose. There is no need to train multiple models, which effectively reduces developers' workload.
In one embodiment, the method further includes: obtaining the sample-file counts corresponding to a plurality of training sets; obtaining a validation set that includes words of a plurality of commodity items; using the validation set to verify the preset-format outputs produced after training on each training set; and, when verification accuracy reaches a threshold, marking the sample-file count that first reaches the threshold as the maximum batch-training sample-file count.
The multi-layer recurrent neural network can batch-train on the training words of multiple samples. If too few sample files are batch-trained, the network cannot learn the diversity of the commodity information in the sample files. If too many are batch-trained, the network cannot accurately memorize the diversified commodity information, and performance also suffers. Therefore, when training the multi-layer recurrent neural network, the maximum batch-training sample-file count must be determined.
In this embodiment, the server may generate training sets from different numbers of sample files and train through the word vector model and the multi-layer recurrent neural network, obtaining the output corresponding to each sample-file count. The server may also generate a validation set in advance from commodity information in other sample files; the validation set includes the words corresponding to a plurality of commodity items. The server compares the output for each sample-file count against the words in the validation set, obtaining the accuracy corresponding to each sample-file count.
When accuracy reaches the threshold, the server may mark the sample-file count at which the threshold is first reached as the maximum batch-training sample-file count. Further, the server may plot a curve of accuracy against sample-file count; the curve may fluctuate. When the curve's accuracy reaches the threshold, the server checks whether the difference ratios among the sample-file counts at the threshold are less than or equal to a preset ratio; if so, the first count satisfying this is marked as the maximum batch-training count. For example, suppose the sample-file counts whose accuracy reaches the threshold are S1, S2, S3, S4 with S1 < S2 < S3 < S4, and the preset ratio is 2%: if (S2-S1)/S1 ≤ 2%, (S3-S1)/S1 ≤ 2%, and (S4-S1)/S1 ≤ 2%, then S1 is marked as the maximum batch-training sample-file count. Batch-training with this maximum count enables the multi-layer recurrent neural network to effectively learn the diversity of commodity information, improving its generalization ability.
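The S1-selection rule can be sketched as follows. This is a simplified reading of the example: among sample-file counts whose validation accuracy reaches the threshold, the smallest count is chosen when every larger qualifying count lies within the 2% difference ratio of it. The accuracy figures and counts are invented for illustration:

```python
def pick_batch_size(acc_by_count, threshold=0.95, max_ratio=0.02):
    """Return the smallest sample-file count reaching the accuracy
    threshold, provided all larger qualifying counts are within
    max_ratio of it; otherwise return None (simplified rule)."""
    qualifying = sorted(c for c, a in acc_by_count.items()
                        if a >= threshold)
    if not qualifying:
        return None
    s1 = qualifying[0]
    if all((c - s1) / s1 <= max_ratio for c in qualifying[1:]):
        return s1
    return None

# Illustrative accuracy measurements per sample-file count.
accuracy = {1000: 0.93, 2000: 0.96, 2010: 0.95, 2030: 0.96}
best = pick_batch_size(accuracy)
print(best)
```

Here 1000 files fall below the threshold, while 2000, 2010, and 2030 all reach it and differ from 2000 by at most 1.5%, so 2000 plays the role of S1.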
应该理解的是，虽然图2与图6的流程图中的各个步骤按照箭头的指示依次显示，但是这些步骤并不是必然按照箭头指示的顺序依次执行。除非本文中有明确的说明，这些步骤的执行并没有严格的顺序限制，这些步骤可以以其它的顺序执行。而且，图2与图6中的至少一部分步骤可以包括多个子步骤或者多个阶段，这些子步骤或者阶段并不必然是在同一时刻执行完成，而是可以在不同的时刻执行，这些子步骤或者阶段的执行顺序也不必然是依次进行，而是可以与其它步骤或者其它步骤的子步骤或者阶段的至少一部分轮流或者交替地执行。It should be understood that although the steps in the flowcharts of FIG. 2 and FIG. 6 are displayed sequentially as indicated by the arrows, these steps are not necessarily performed in that order. Unless explicitly stated herein, there is no strict ordering restriction on the execution of these steps, and they may be performed in other orders. Moreover, at least some of the steps in FIG. 2 and FIG. 6 may include a plurality of sub-steps or stages, which are not necessarily completed at the same time but may be executed at different times; the order of execution of these sub-steps or stages is also not necessarily sequential, and they may be performed in turn or alternately with other steps or with at least a portion of the sub-steps or stages of other steps.
在一个实施例中，如图7所示，提供了一种商品信息格式处理装置，包括：信息获取模块702，用于获取商品信息，商品信息包括多个商品项；分词处理模块704，用于对商品项的内容进行分词处理，得到多个词；权重矩阵生成模块706，用于获取通过词向量模型训练得到的多个词对应的权重向量，利用多个词对应的权重向量生成权重矩阵；及格式统一化模块708，用于获取商品项的多个词对应的编码，将多个词的编码输入至训练后的多层循环神经网络；通过训练后的多层循环神经网络，基于多个词的编码以及所述权重矩阵进行运算，输出商品项对应的预设格式的描述。In an embodiment, as shown in FIG. 7, a commodity information format processing apparatus is provided, including: an information acquisition module 702, configured to acquire commodity information, the commodity information including a plurality of commodity items; a word segmentation processing module 704, configured to perform word segmentation on the content of a commodity item to obtain a plurality of words; a weight matrix generating module 706, configured to acquire weight vectors corresponding to the plurality of words obtained by training a word vector model, and to generate a weight matrix from the weight vectors corresponding to the plurality of words; and a format unification module 708, configured to acquire codes corresponding to the plurality of words of the commodity item, input the codes of the plurality of words into a trained multi-layer cyclic neural network, perform an operation through the trained multi-layer cyclic neural network based on the codes of the plurality of words and the weight matrix, and output a description of the commodity item in a preset format.
在一个实施例中，该装置还包括：第一训练模块710，用于获取与商品信息对应的训练集，训练集中包括多个商品项以及商品项对应的多个训练词；统计多个商品项中训练词的词汇数量，将最大词汇数量标记为最长输入参数；利用最长输入参数以及训练词对词向量模型进行训练，得到训练词对应的权重向量；及第二训练模块712，用于利用最长输入参数以及训练词对应的权重向量对多层循环神经网络进行训练，得到训练后的多层循环神经网络。In an embodiment, the apparatus further includes: a first training module 710, configured to acquire a training set corresponding to the commodity information, the training set including a plurality of commodity items and a plurality of training words corresponding to the commodity items; count the vocabulary sizes of the training words in the plurality of commodity items and mark the maximum vocabulary size as the longest input parameter; and train the word vector model with the longest input parameter and the training words to obtain the weight vectors corresponding to the training words; and a second training module 712, configured to train the multi-layer cyclic neural network with the longest input parameter and the weight vectors corresponding to the training words, to obtain the trained multi-layer cyclic neural network.
在一个实施例中，第一训练模块710还用于获取与商品信息对应的语料库，语料库中包括多个语料词；语料词中包括部分预设字符；利用语料词对词向量模型进行训练，得到语料权重矩阵；语料权重矩阵包括多个语料权重向量；利用预设字符将多个商品项的训练词的词汇数量增加至与最长输入参数相同的数量；根据增加词汇数量后的商品项，在语料权重矩阵中选择训练词以及一个或多个预设字符对应的语料权重向量，标记为训练词对应的输入向量；通过词向量模型加载多个输入向量，通过词向量模型的隐含层进行训练得到训练权重矩阵，训练权重矩阵包括多个训练词以及预设字符对应的权重向量。In an embodiment, the first training module 710 is further configured to: acquire a corpus corresponding to the commodity information, the corpus including a plurality of corpus words, some of which are preset characters; train the word vector model with the corpus words to obtain a corpus weight matrix, the corpus weight matrix including a plurality of corpus weight vectors; use the preset characters to increase the vocabulary size of the training words of each commodity item to the same number as the longest input parameter; for each commodity item with its vocabulary size thus increased, select from the corpus weight matrix the corpus weight vectors corresponding to the training words and to the one or more preset characters, and mark them as the input vectors corresponding to the training words; and load the plurality of input vectors through the word vector model and train through the hidden layer of the word vector model to obtain a training weight matrix, the training weight matrix including the weight vectors corresponding to the plurality of training words and the preset characters.
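The padding and weight-vector lookup described above can be sketched as follows. This is an illustrative simplification, not the patent's implementation: the corpus weight matrix is modeled as a plain dictionary from token to vector, and the `<pad>` token stands in for the preset character.

```python
import numpy as np

PAD = "<pad>"  # stands in for the "preset character"; the name is illustrative

def pad_and_embed(words, longest_input, corpus_weights):
    """Pad a commodity item's training words to the longest input parameter
    with the preset character, then look up each token's corpus weight
    vector to form the item's input vectors."""
    padded = words + [PAD] * (longest_input - len(words))
    return np.stack([corpus_weights[w] for w in padded])

corpus_weights = {"red": np.array([0.1, 0.2]),
                  "apple": np.array([0.3, 0.4]),
                  PAD: np.zeros(2)}
input_vectors = pad_and_embed(["red", "apple"], longest_input=4,
                              corpus_weights=corpus_weights)
print(input_vectors.shape)  # (4, 2)
```

Padding every item to the same length is what allows the items to be stacked into a single training weight matrix.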
在一个实施例中，第二训练模块712还用于获取商品信息对应的映射文件，映射文件中记录了商品项中多个训练词的原始描述与预设格式的描述；利用预设字符将多个商品项的训练词的词汇数量增加至与最长输入参数相同的数量；将训练词以及预设字符对应的权重向量生成与商品项对应的训练权重矩阵；将增加词汇数量后的商品项中的训练词、预设字符以及对应的权重向量矩阵，通过多层循环神经网络进行训练，输出商品项中多个训练词预设格式的描述。In an embodiment, the second training module 712 is further configured to: acquire a mapping file corresponding to the commodity information, the mapping file recording the original descriptions of the plurality of training words in the commodity items and their descriptions in the preset format; use the preset characters to increase the vocabulary size of the training words of each commodity item to the same number as the longest input parameter; generate, from the weight vectors corresponding to the training words and the preset characters, a training weight matrix corresponding to each commodity item; and train the training words, the preset characters and the corresponding weight vector matrices of the commodity items with increased vocabulary sizes through the multi-layer cyclic neural network, to output descriptions of the plurality of training words in the commodity items in the preset format.
在一个实施例中，第二训练模块712还用于向每层隐含层分配随机向量作为隐含层的初始权重矩阵；根据所述最长输入参数在输入层与第一层隐含层设置与增加词汇数量后的商品项相对应的训练权重矩阵；将增加词汇数量后的商品项的训练词所对应的编码以及预设字符对应的编码输入至多层循环神经网络的输入层；通过多层隐含层利用初始权重矩阵以及训练权重矩阵进行训练，使得输出层输出商品项中多个训练词预设格式的描述。In an embodiment, the second training module 712 is further configured to: assign a random vector to each hidden layer as the initial weight matrix of that hidden layer; set, according to the longest input parameter, the training weight matrix corresponding to the commodity item with increased vocabulary size between the input layer and the first hidden layer; input the codes corresponding to the training words of the commodity item with increased vocabulary size and the codes corresponding to the preset characters into the input layer of the multi-layer cyclic neural network; and train through the plurality of hidden layers using the initial weight matrices and the training weight matrix, so that the output layer outputs descriptions of the plurality of training words in the commodity item in the preset format.
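The forward structure described above can be illustrated with a minimal NumPy sketch of a stacked recurrent network: the embedding rows play the role of the training weight matrix between the input layer and the first hidden layer, and each hidden layer starts from randomly initialized weights. All dimensions and names are illustrative assumptions, and no training step is shown.

```python
import numpy as np

rng = np.random.default_rng(0)
seq_len, embed_dim, hidden_dim, n_layers = 4, 2, 3, 2

# One embedding vector per (padded) token code, standing in for the training
# weight matrix set between the input layer and the first hidden layer.
token_embeddings = rng.normal(size=(seq_len, embed_dim))

# Each hidden layer is assigned random initial weight matrices.
W_in = [rng.normal(size=((embed_dim if l == 0 else hidden_dim), hidden_dim))
        for l in range(n_layers)]
W_rec = [rng.normal(size=(hidden_dim, hidden_dim)) for _ in range(n_layers)]

def forward(sequence):
    """One forward pass of a stacked (multi-layer) recurrent network."""
    h = [np.zeros(hidden_dim) for _ in range(n_layers)]
    for x in sequence:              # step through the padded token sequence
        layer_in = x
        for l in range(n_layers):   # propagate upward through the stacked layers
            h[l] = np.tanh(layer_in @ W_in[l] + h[l] @ W_rec[l])
            layer_in = h[l]
    return h[-1]                    # the top-layer state would feed the output layer

print(forward(token_embeddings).shape)  # (3,)
```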
在一个实施例中，第二训练模块712还用于获取多个训练集对应的样本文件数量；获取验证集，验证集中包括多个商品项的词；利用验证集对多个训练集在通过训练后输出的商品项的预设格式进行验证；当验证的准确度达到阈值时，将初次达到阈值对应的样本文件数量标记为最大批量训练的样本文件数量。In an embodiment, the second training module 712 is further configured to: acquire the sample file counts corresponding to a plurality of training sets; acquire a verification set, the verification set including the words of a plurality of commodity items; verify, using the verification set, the preset-format descriptions of the commodity items output by the plurality of training sets after training; and, when the accuracy of the verification reaches a threshold, mark the sample file count at which the threshold is first reached as the sample file count for maximum batch training.
关于商品信息格式处理装置的具体限定可以参见上文中对于商品信息格式处理方法的限定，在此不再赘述。上述商品信息格式处理装置中的各个模块可全部或部分通过软件、硬件及其组合来实现。上述各模块可以硬件形式内嵌于或独立于计算机设备中的处理器中，也可以以软件形式存储于计算机设备中的存储器中，以便于处理器调用执行以上各个模块对应的操作。For specific limitations on the commodity information format processing apparatus, reference may be made to the above limitations on the commodity information format processing method, and details are not repeated here. Each module in the above commodity information format processing apparatus may be implemented in whole or in part by software, hardware, or a combination thereof. Each of the above modules may be embedded in, or independent of, the processor of a computer device in hardware form, or may be stored in software form in the memory of the computer device, so that the processor can invoke and execute the operations corresponding to each of the above modules.
在一个实施例中，提供了一种计算机设备，该计算机设备可以是服务器，其内部结构图可以如图8所示。该计算机设备包括通过系统总线连接的处理器、存储器、网络接口和数据库。其中，该计算机设备的处理器用于提供计算和控制能力。该计算机设备的存储器包括非易失性存储介质、内存储器。该非易失性存储介质存储有操作系统、计算机可读指令和数据库。该内存储器为非易失性存储介质中的操作系统和计算机可读指令的运行提供环境。非易失性存储介质可以是计算机可读非易失性存储介质。该计算机设备的数据库用于存储商品文件以及样本文件等。该计算机设备的网络接口用于与外部的服务器通过网络连接通信。该计算机可读指令被处理器执行时以实现一种商品信息格式处理方法。本领域技术人员可以理解，图8中示出的结构，仅仅是与本申请方案相关的部分结构的框图，并不构成对本申请方案所应用于其上的计算机设备的限定，具体的计算机设备可以包括比图中所示更多或更少的部件，或者组合某些部件，或者具有不同的部件布置。In one embodiment, a computer device is provided. The computer device may be a server, and its internal structure may be as shown in FIG. 8. The computer device includes a processor, a memory, a network interface and a database connected through a system bus. The processor of the computer device is used to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer readable instructions and a database. The internal memory provides an environment for the operation of the operating system and the computer readable instructions in the non-volatile storage medium. The non-volatile storage medium may be a computer-readable non-volatile storage medium. The database of the computer device is used to store commodity files, sample files and the like. The network interface of the computer device is used to communicate with an external server through a network connection. When executed by the processor, the computer readable instructions implement a commodity information format processing method. Those skilled in the art will understand that the structure shown in FIG. 8 is only a block diagram of part of the structure related to the solution of the present application and does not constitute a limitation on the computer devices to which the solution of the present application is applied; a specific computer device may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.
本领域普通技术人员可以理解实现上述实施例方法中的全部或部分流程，是可以通过计算机可读指令来指令相关的硬件来完成，所述的计算机可读指令可存储于一非易失性计算机可读取存储介质中，该计算机可读指令在执行时，可包括如上述各方法的实施例的流程。其中，本申请所提供的各实施例中所使用的对存储器、存储、数据库或其它介质的任何引用，均可包括非易失性和/或易失性存储器。Those of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments may be completed by computer readable instructions instructing the relevant hardware. The computer readable instructions may be stored in a non-volatile computer-readable storage medium, and when executed, may include the processes of the embodiments of the methods described above. Any reference to memory, storage, a database or other media used in the embodiments provided in this application may include non-volatile and/or volatile memory.
以上实施例的各技术特征可以进行任意的组合，为使描述简洁，未对上述实施例中的各个技术特征所有可能的组合都进行描述，然而，只要这些技术特征的组合不存在矛盾，都应当认为是本说明书记载的范围。The technical features of the above embodiments may be combined arbitrarily. For brevity of description, not all possible combinations of the technical features in the above embodiments are described; however, as long as a combination of these technical features involves no contradiction, it should be considered to be within the scope of this specification.
以上所述实施例仅表达了本申请的几种实施方式,其描述较为具体和详细,但并不能因此而理解为对发明专利范围的限制。应当指出的是,对于本领域的普通技术人员来说,在不脱离本申请构思的前提下,还可以做出若干变形和改进,这些都属于本申请的保护范围。因此,本申请专利的保护范围应以所附权利要求为准。The above-mentioned embodiments are merely illustrative of several embodiments of the present application, and the description thereof is more specific and detailed, but is not to be construed as limiting the scope of the invention. It should be noted that a number of variations and modifications may be made by those skilled in the art without departing from the spirit and scope of the present application. Therefore, the scope of the invention should be determined by the appended claims.

Claims (20)

  1. 一种商品信息格式处理方法,包括:A method for processing a commodity information format, comprising:
    获取商品信息,所述商品信息包括多个商品项;Obtaining commodity information, the commodity information including a plurality of commodity items;
    对所述商品项的内容进行分词处理,得到多个词;Performing word segmentation on the content of the commodity item to obtain a plurality of words;
    获取通过词向量模型训练得到的多个词对应的权重向量,利用多个词对应的权重向量生成权重矩阵;Obtaining a weight vector corresponding to the plurality of words trained by the word vector model, and generating a weight matrix by using a weight vector corresponding to the plurality of words;
    获取所述商品项的多个词对应的编码,将多个词的编码输入至训练后的多层循环神经网络;及Obtaining a code corresponding to the plurality of words of the product item, and inputting the code of the plurality of words into the trained multi-layer circulating neural network; and
    通过所述训练后的多层循环神经网络,基于所述多个词的编码以及所述权重矩阵进行运算,输出所述商品项对应的预设格式的描述。And performing, by the trained multi-layer cyclic neural network, an operation based on the encoding of the plurality of words and the weight matrix, and outputting a description of a preset format corresponding to the commodity item.
  2. 根据权利要求1所述的方法,其特征在于,所述方法还包括:The method of claim 1 further comprising:
    获取与商品信息对应的训练集,所述训练集中包括多个商品项以及商品项对应的多个训练词;Obtaining a training set corresponding to the commodity information, where the training set includes a plurality of commodity items and a plurality of training words corresponding to the commodity items;
    统计多个商品项中训练词的词汇数量,将最大词汇数量标记为最长输入参数;Counting the number of vocabulary of training words in multiple commodity items, marking the maximum vocabulary quantity as the longest input parameter;
    利用所述最长输入参数以及所述训练词,对词向量模型进行训练,得到所述训练词对应的权重向量;及Using the longest input parameter and the training word, training the word vector model to obtain a weight vector corresponding to the training word; and
    利用所述最长输入参数以及所述训练词对应的权重向量对多层循环神经网络进行训练,得到训练后的多层循环神经网络。The multi-layer cyclic neural network is trained by using the longest input parameter and the weight vector corresponding to the training word to obtain a trained multi-layer cyclic neural network.
  3. 根据权利要求2所述的方法,其特征在于,所述利用所述最长输入参数以及所述训练词,对词向量模型进行训练,得到所述训练词对应的权重向量包括:The method according to claim 2, wherein the training the word vector model by using the longest input parameter and the training word, and obtaining the weight vector corresponding to the training word comprises:
    获取与商品信息对应的语料库,所述语料库中包括多个语料词;所述语料词中包括部分预设字符;Obtaining a corpus corresponding to the commodity information, where the corpus includes a plurality of corpus words; the corpus includes some preset characters;
    利用所述语料词对词向量模型进行训练,得到语料权重矩阵;所述语料权重矩阵包括多个语料权重向量;Using the corpus to train the word vector model to obtain a corpus weight matrix; the corpus weight matrix includes a plurality of corpus weight vectors;
    利用预设字符将多个商品项的训练词的词汇数量增加至与所述最长输入参数相同的数量;Increasing the number of words of the training words of the plurality of commodity items to the same number as the longest input parameter by using a preset character;
    根据增加词汇数量后的商品项,在所述语料权重矩阵中选择训练词以及一个或多个预设字符对应的语料权重向量,标记为训练词对应的输入向量;及Selecting a training word and a corpus weight vector corresponding to one or more preset characters in the corpus weight matrix according to the product item after increasing the vocabulary quantity, and marking the input vector corresponding to the training word;
    通过所述词向量模型加载多个输入向量,通过所述词向量模型的隐含层进行训练得到训练权重矩阵,所述训练权重矩阵包括多个训练词以及预设字符对应的权重向量。A plurality of input vectors are loaded by the word vector model, and trained by the hidden layer of the word vector model to obtain a training weight matrix, where the training weight matrix includes a plurality of training words and a weight vector corresponding to the preset characters.
  4. 根据权利要求2所述的方法，其特征在于，所述利用所述最长输入参数、所述训练词以及所述训练词对应的权重向量对多层循环神经网络进行训练，得到训练后的多层循环神经网络包括：The method according to claim 2, wherein the training of the multi-layer cyclic neural network by using the longest input parameter, the training words and the weight vectors corresponding to the training words, to obtain the trained multi-layer cyclic neural network, comprises:
    获取所述商品信息对应的映射文件,所述映射文件中记录了商品项中多个训练词的原始描述与预设格式的描述;Obtaining a mapping file corresponding to the commodity information, where the mapping file records a description of the original description and a preset format of the plurality of training words in the commodity item;
    利用预设字符将多个商品项的训练词的词汇数量增加至与所述最长输入参数相同的数量;Increasing the number of words of the training words of the plurality of commodity items to the same number as the longest input parameter by using a preset character;
    将所述训练词以及预设字符对应的权重向量生成与商品项对应的训练权重矩阵;及Generating, by the training word and the weight vector corresponding to the preset character, a training weight matrix corresponding to the commodity item; and
    将增加词汇数量后的商品项中的训练词、预设字符以及对应的权重向量矩阵，通过所述多层循环神经网络进行训练，输出商品项中多个训练词预设格式的描述。Training the training words, the preset characters and the corresponding weight vector matrices in the commodity items with increased vocabulary sizes through the multi-layer cyclic neural network, to output descriptions of the plurality of training words in the commodity items in the preset format.
  5. 根据权利要求4所述的方法，其特征在于，所述多层循环神经网络包括多个隐含层；所述将增加词汇数量后的商品项中的训练词、预设字符以及对应的权重向量矩阵，通过所述多层循环神经网络进行训练包括：The method according to claim 4, wherein the multi-layer cyclic neural network comprises a plurality of hidden layers, and the training of the training words, the preset characters and the corresponding weight vector matrices in the commodity items with increased vocabulary sizes through the multi-layer cyclic neural network comprises:
    向每层隐含层分配随机向量作为隐含层的初始权重矩阵;Assigning a random vector to each hidden layer as the initial weight matrix of the hidden layer;
    根据所述最长输入参数在所述输入层与第一层隐含层设置与增加词汇数量后的商品项相对应的训练权重矩阵;And setting, according to the longest input parameter, a training weight matrix corresponding to the commodity item after increasing the vocabulary quantity in the input layer and the first layer hidden layer;
    将增加词汇数量后的商品项的训练词所对应的编码以及预设字符对应的编码输入至所述多层循环神经网络的输入层;及Transmitting the code corresponding to the training word of the commodity item after increasing the vocabulary quantity and the code corresponding to the preset character to the input layer of the multi-layer cyclic neural network; and
    通过多层隐含层利用所述初始权重矩阵以及训练权重矩阵进行训练,使得输出层输出商品项中多个训练词预设格式的描述。The initial weight matrix and the training weight matrix are trained by the multi-layer hidden layer, so that the output layer outputs a description of the preset format of the plurality of training words in the commodity item.
  6. 根据权利要求2所述的方法,其特征在于,所述方法还包括:The method of claim 2, wherein the method further comprises:
    获取多个训练集对应的样本文件数量;Obtain the number of sample files corresponding to multiple training sets;
    获取验证集,所述验证集中包括多个商品项的词;Obtaining a verification set, the verification set including words of a plurality of commodity items;
    利用验证集对多个训练集在通过训练后输出的商品项的预设格式进行验证；及Verifying, by using the verification set, the preset format of the commodity items output by the plurality of training sets after training; and
    当验证的准确度达到阈值时,将初次达到所述阈值对应的样本文件数量标记为最大批量训练的样本文件数量。When the accuracy of the verification reaches the threshold, the number of sample files corresponding to the threshold for the first time is marked as the number of sample files of the maximum batch training.
  7. 一种商品信息格式处理装置,包括:A commodity information format processing device includes:
    信息获取模块,用于获取商品信息,所述商品信息包括多个商品项;An information obtaining module, configured to acquire commodity information, where the commodity information includes a plurality of commodity items;
    分词处理模块,用于对所述商品项的内容进行分词处理,得到多个词;a word segmentation processing module, configured to perform word segmentation on the content of the commodity item to obtain a plurality of words;
    权重矩阵生成模块,用于获取通过词向量模型训练得到的多个词对应的权重向量,利用多个词对应的权重向量生成权重矩阵;及a weight matrix generating module, configured to acquire a weight vector corresponding to a plurality of words trained by the word vector model, and generate a weight matrix by using a weight vector corresponding to the plurality of words; and
    格式统一化模块,用于获取所述商品项的多个词对应的编码,将多个词的编码输入至训练后的多层循环神经网络;通过所述训练后的多层循环神经网络,基于所述多个词的编码以及所述权重矩阵进行运算,输出所述商品项对应的预设格式的描述。a format unification module, configured to acquire a code corresponding to a plurality of words of the commodity item, input a code of the plurality of words into the trained multi-layer cyclic neural network; and pass the trained multi-layer cyclic neural network, based on The encoding of the plurality of words and the weight matrix are operated to output a description of a preset format corresponding to the commodity item.
  8. 根据权利要求7所述的装置,其特征在于,所述装置还包括:The device according to claim 7, wherein the device further comprises:
    第一训练模块，用于获取与商品信息对应的训练集，所述训练集中包括多个商品项以及商品项对应的多个训练词；统计多个商品项中训练词的词汇数量，将最大词汇数量标记为最长输入参数；利用所述最长输入参数以及所述训练词对词向量模型进行训练，得到所述训练词对应的权重向量；及a first training module, configured to acquire a training set corresponding to the commodity information, the training set including a plurality of commodity items and a plurality of training words corresponding to the commodity items; count the vocabulary sizes of the training words in the plurality of commodity items and mark the maximum vocabulary size as the longest input parameter; and train the word vector model with the longest input parameter and the training words to obtain the weight vectors corresponding to the training words; and
    第二训练模块,用于利用所述最长输入参数以及所述训练词对应的权重向量对多层循环神经网络进行训练,得到训练后的多层循环神经网络。The second training module is configured to train the multi-layer cyclic neural network by using the longest input parameter and the weight vector corresponding to the training word to obtain a trained multi-layer cyclic neural network.
  9. 一种计算机设备，包括存储器及一个或多个处理器，所述存储器中储存有计算机可读指令，所述计算机可读指令被所述一个或多个处理器执行时，使得所述一个或多个处理器执行以下步骤：A computer device, comprising a memory and one or more processors, the memory storing computer readable instructions that, when executed by the one or more processors, cause the one or more processors to perform the following steps:
    获取商品信息,所述商品信息包括多个商品项;Obtaining commodity information, the commodity information including a plurality of commodity items;
    对所述商品项的内容进行分词处理,得到多个词;Performing word segmentation on the content of the commodity item to obtain a plurality of words;
    获取通过词向量模型训练得到的多个词对应的权重向量,利用多个词对应的权重向量生成权重矩阵;Obtaining a weight vector corresponding to the plurality of words trained by the word vector model, and generating a weight matrix by using a weight vector corresponding to the plurality of words;
    获取所述商品项的多个词对应的编码,将多个词的编码输入至训练后的多层循环神经网络;及Obtaining a code corresponding to the plurality of words of the product item, and inputting the code of the plurality of words into the trained multi-layer circulating neural network; and
    通过所述训练后的多层循环神经网络,基于所述多个词的编码以及所述权重矩阵进行运算,输出所述商品项对应的预设格式的描述。And performing, by the trained multi-layer cyclic neural network, an operation based on the encoding of the plurality of words and the weight matrix, and outputting a description of a preset format corresponding to the commodity item.
  10. 根据权利要求9所述的计算机设备,其特征在于,所述计算机可读指令被所述一个或多个处理器执行时,使得所述一个或多个处理器还执行以下步骤:The computer apparatus according to claim 9, wherein said computer readable instructions are executed by said one or more processors such that said one or more processors further perform the following steps:
    获取与商品信息对应的训练集,所述训练集中包括多个商品项以及商品项对应的多个训练词;Obtaining a training set corresponding to the commodity information, where the training set includes a plurality of commodity items and a plurality of training words corresponding to the commodity items;
    统计多个商品项中训练词的词汇数量,将最大词汇数量标记为最长输入参数;Counting the number of vocabulary of training words in multiple commodity items, marking the maximum vocabulary quantity as the longest input parameter;
    利用所述最长输入参数以及所述训练词,对词向量模型进行训练,得到所述训练词对应的权重向量;及Using the longest input parameter and the training word, training the word vector model to obtain a weight vector corresponding to the training word; and
    利用所述最长输入参数以及所述训练词对应的权重向量对多层循环神经网络进行训练,得到训练后的多层循环神经网络。The multi-layer cyclic neural network is trained by using the longest input parameter and the weight vector corresponding to the training word to obtain a trained multi-layer cyclic neural network.
  11. 根据权利要求10所述的计算机设备,其特征在于,所述计算机可读指令被所述一个或多个处理器执行时,使得所述一个或多个处理器还执行以下步骤:The computer apparatus according to claim 10, wherein said computer readable instructions are executed by said one or more processors such that said one or more processors further perform the following steps:
    获取与商品信息对应的语料库,所述语料库中包括多个语料词;所述语料词中包括部分预设字符;Obtaining a corpus corresponding to the commodity information, where the corpus includes a plurality of corpus words; the corpus includes some preset characters;
    利用所述语料词对词向量模型进行训练,得到语料权重矩阵;所述语料权重矩阵包括多个语料权重向量;Using the corpus to train the word vector model to obtain a corpus weight matrix; the corpus weight matrix includes a plurality of corpus weight vectors;
    利用预设字符将多个商品项的训练词的词汇数量增加至与所述最长输入参数相同的数量；Increasing the number of words of the training words of the plurality of commodity items to the same number as the longest input parameter by using a preset character;
    根据增加词汇数量后的商品项,在所述语料权重矩阵中选择训练词以及一个或多个预设字符对应的语料权重向量,标记为训练词对应的输入向量;及Selecting a training word and a corpus weight vector corresponding to one or more preset characters in the corpus weight matrix according to the product item after increasing the vocabulary quantity, and marking the input vector corresponding to the training word;
    通过所述词向量模型加载多个输入向量,通过所述词向量模型的隐含层进行训练得到训练权重矩阵,所述训练权重矩阵包括多个训练词以及预设字符对应的权重向量。A plurality of input vectors are loaded by the word vector model, and trained by the hidden layer of the word vector model to obtain a training weight matrix, where the training weight matrix includes a plurality of training words and a weight vector corresponding to the preset characters.
  12. 根据权利要求10所述的计算机设备,其特征在于,所述计算机可读指令被所述一个或多个处理器执行时,使得所述一个或多个处理器还执行以下步骤:The computer apparatus according to claim 10, wherein said computer readable instructions are executed by said one or more processors such that said one or more processors further perform the following steps:
    获取所述商品信息对应的映射文件,所述映射文件中记录了商品项中多个训练词的原始描述与预设格式的描述;Obtaining a mapping file corresponding to the commodity information, where the mapping file records a description of the original description and a preset format of the plurality of training words in the commodity item;
    利用预设字符将多个商品项的训练词的词汇数量增加至与所述最长输入参数相同的数量;Increasing the number of words of the training words of the plurality of commodity items to the same number as the longest input parameter by using a preset character;
    将所述训练词以及预设字符对应的权重向量生成与商品项对应的训练权重矩阵;及Generating, by the training word and the weight vector corresponding to the preset character, a training weight matrix corresponding to the commodity item; and
    将增加词汇数量后的商品项中的训练词、预设字符以及对应的权重向量矩阵，通过所述多层循环神经网络进行训练，输出商品项中多个训练词预设格式的描述。Training the training words, the preset characters and the corresponding weight vector matrices in the commodity items with increased vocabulary sizes through the multi-layer cyclic neural network, to output descriptions of the plurality of training words in the commodity items in the preset format.
  13. 根据权利要求12所述的计算机设备，其特征在于，所述多层循环神经网络包括多个隐含层；所述计算机可读指令被所述一个或多个处理器执行时，使得所述一个或多个处理器还执行以下步骤：The computer device according to claim 12, wherein the multi-layer cyclic neural network comprises a plurality of hidden layers, and wherein the computer readable instructions, when executed by the one or more processors, cause the one or more processors to further perform the following steps:
    向每层隐含层分配随机向量作为隐含层的初始权重矩阵;Assigning a random vector to each hidden layer as the initial weight matrix of the hidden layer;
    根据所述最长输入参数在所述输入层与第一层隐含层设置与增加词汇数量后的商品项相对应的训练权重矩阵;And setting, according to the longest input parameter, a training weight matrix corresponding to the commodity item after increasing the vocabulary quantity in the input layer and the first layer hidden layer;
    将增加词汇数量后的商品项的训练词所对应的编码以及预设字符对应的编码输入至所述多层循环神经网络的输入层;及Transmitting the code corresponding to the training word of the commodity item after increasing the vocabulary quantity and the code corresponding to the preset character to the input layer of the multi-layer cyclic neural network; and
    通过多层隐含层利用所述初始权重矩阵以及训练权重矩阵进行训练,使得输出层输出商品项中多个训练词预设格式的描述。The initial weight matrix and the training weight matrix are trained by the multi-layer hidden layer, so that the output layer outputs a description of the preset format of the plurality of training words in the commodity item.
  14. 根据权利要求10所述的计算机设备,其特征在于,所述计算机可读指令被所述一个或多个处理器执行时,使得所述一个或多个处理器还执行以下步骤:The computer apparatus according to claim 10, wherein said computer readable instructions are executed by said one or more processors such that said one or more processors further perform the following steps:
    获取多个训练集对应的样本文件数量;Obtain the number of sample files corresponding to multiple training sets;
    获取验证集,所述验证集中包括多个商品项的词;Obtaining a verification set, the verification set including words of a plurality of commodity items;
    利用验证集对多个训练集在通过训练后输出的商品项的预设格式进行验证；及Verifying, by using the verification set, the preset format of the commodity items output by the plurality of training sets after training; and
    当验证的准确度达到阈值时,将初次达到所述阈值对应的样本文件数量标记为最大批量训练的样本文件数量。When the accuracy of the verification reaches the threshold, the number of sample files corresponding to the threshold for the first time is marked as the number of sample files of the maximum batch training.
  15. 一个或多个存储有计算机可读指令的非易失性计算机可读存储介质,所述计算机可读指令被一个或多个处理器执行时,使得所述一个或多个处理器执行以下步骤:One or more non-transitory computer readable storage mediums storing computer readable instructions, when executed by one or more processors, cause the one or more processors to perform the following steps:
    获取商品信息,所述商品信息包括多个商品项;Obtaining commodity information, the commodity information including a plurality of commodity items;
    对所述商品项的内容进行分词处理,得到多个词;Performing word segmentation on the content of the commodity item to obtain a plurality of words;
    获取通过词向量模型训练得到的多个词对应的权重向量,利用多个词对应的权重向量生成权重矩阵;Obtaining a weight vector corresponding to the plurality of words trained by the word vector model, and generating a weight matrix by using a weight vector corresponding to the plurality of words;
    获取所述商品项的多个词对应的编码,将多个词的编码输入至训练后的多层循环神经网络;及Obtaining a code corresponding to the plurality of words of the product item, and inputting the code of the plurality of words into the trained multi-layer circulating neural network; and
    通过所述训练后的多层循环神经网络,基于所述多个词的编码以及所述权重矩阵进行运算,输出所述商品项对应的预设格式的描述。And performing, by the trained multi-layer cyclic neural network, an operation based on the encoding of the plurality of words and the weight matrix, and outputting a description of a preset format corresponding to the commodity item.
  16. The storage medium according to claim 15, wherein the computer-readable instructions, when executed by the one or more processors, further cause the one or more processors to perform the following steps:
    acquiring a training set corresponding to the commodity information, the training set comprising a plurality of commodity items and a plurality of training words corresponding to the commodity items;
    counting the number of training words in each of the plurality of commodity items, and marking the maximum word count as the longest input parameter;
    training the word vector model using the longest input parameter and the training words to obtain the weight vectors corresponding to the training words; and
    training a multi-layer recurrent neural network using the longest input parameter and the weight vectors corresponding to the training words to obtain the trained multi-layer recurrent neural network.
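[Editor's illustration] The "longest input parameter" in claim 16 is simply the maximum per-item word count over the training set, used as the fixed input length for both models. A minimal sketch, with the item representation (a list of word lists) assumed for illustration:

```python
def longest_input_parameter(training_items):
    """Claim 16: count the training words of each commodity item
    and take the maximum count as the longest input parameter."""
    return max(len(words) for words in training_items)

# Example: two commodity items with 3 and 2 training words.
items = [["men", "cotton", "shirt"], ["blue", "jeans"]]
```

Both the word vector model and the recurrent network would then be built with this fixed length, with shorter items padded up to it (claim 17).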
  17. The storage medium according to claim 16, wherein the computer-readable instructions, when executed by the one or more processors, further cause the one or more processors to perform the following steps:
    acquiring a corpus corresponding to the commodity information, the corpus comprising a plurality of corpus words, the corpus words including some preset characters;
    training the word vector model with the corpus words to obtain a corpus weight matrix, the corpus weight matrix comprising a plurality of corpus weight vectors;
    padding the training words of each of the plurality of commodity items with preset characters until the word count equals the longest input parameter;
    selecting, from the corpus weight matrix according to the padded commodity item, the corpus weight vectors corresponding to the training words and one or more preset characters, and marking them as the input vectors corresponding to the training words; and
    loading the plurality of input vectors into the word vector model and training through the hidden layer of the word vector model to obtain a training weight matrix, the training weight matrix comprising the weight vectors corresponding to the plurality of training words and the preset characters.
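[Editor's illustration] The padding and vector-selection steps of claim 17 can be sketched as follows. The `"<PAD>"` token and the dictionary-shaped corpus weight matrix are assumptions for illustration; the claim only requires some preset character present in the corpus:

```python
PAD = "<PAD>"  # hypothetical preset character included in the corpus

def pad_item(words, max_len, pad=PAD):
    """Claim 17: pad an item's training words with the preset
    character up to the longest input parameter."""
    return words + [pad] * (max_len - len(words))

def select_input_vectors(words, corpus_weights):
    """Select each word's (or preset character's) corpus weight
    vector; these become the input vectors for further training."""
    return [corpus_weights[w] for w in words]
```

Because the preset character itself has a trained corpus weight vector, padded positions contribute well-defined inputs rather than missing values.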
  18. The storage medium according to claim 16, wherein the computer-readable instructions, when executed by the one or more processors, further cause the one or more processors to perform the following steps:
    acquiring a mapping file corresponding to the commodity information, the mapping file recording the original descriptions and the preset-format descriptions of a plurality of training words in a commodity item;
    padding the training words of each of the plurality of commodity items with preset characters until the word count equals the longest input parameter;
    generating, from the weight vectors corresponding to the training words and the preset characters, a training weight matrix corresponding to the commodity item; and
    inputting the training words, the preset characters, and the corresponding weight matrix of the padded commodity item into the multi-layer recurrent neural network for training, and outputting descriptions of the plurality of training words of the commodity item in the preset format.
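[Editor's illustration] The mapping file of claim 18 pairs original descriptions with preset-format descriptions, supplying the supervision targets for training. A minimal sketch, where the mapping entries and the pass-through fallback for unmapped words are assumptions:

```python
# Toy mapping "file": original description -> preset-format description.
MAPPING = {"100% cotton": "material:cotton", "size L": "size:L"}

def to_preset_format(training_words, mapping=MAPPING):
    """Claim 18: translate each training word's original description
    into the preset format; unmapped words pass through unchanged."""
    return [mapping.get(w, w) for w in training_words]
```

During training these mapped descriptions play the role of target outputs, so the network learns to emit the preset format directly at inference time.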
  19. The storage medium according to claim 18, wherein the multi-layer recurrent neural network comprises a plurality of hidden layers, and the computer-readable instructions, when executed by the one or more processors, further cause the one or more processors to perform the following steps:
    assigning a random vector to each hidden layer as the initial weight matrix of that hidden layer;
    setting, according to the longest input parameter, a training weight matrix corresponding to the padded commodity item between the input layer and the first hidden layer;
    inputting the codes corresponding to the training words and the preset characters of the padded commodity item into the input layer of the multi-layer recurrent neural network; and
    training through the plurality of hidden layers using the initial weight matrices and the training weight matrix, so that the output layer outputs descriptions of the plurality of training words of the commodity item in the preset format.
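[Editor's illustration] The per-layer random initialization and stacked-hidden-layer forward pass of claim 19 can be sketched with plain Python. This is a deliberately simplified model: each "hidden layer" here is a single weight vector applied elementwise with a tanh nonlinearity, which stands in for the full recurrent weight matrices the claim describes.

```python
import math
import random

def init_hidden_layers(n_layers, size, seed=0):
    """Claim 19: assign a random vector to every hidden layer as its
    initial weights (one weight vector per layer in this sketch)."""
    rng = random.Random(seed)  # seeded for reproducibility
    return [[rng.uniform(-0.1, 0.1) for _ in range(size)]
            for _ in range(n_layers)]

def forward(codes, layers):
    """Toy forward pass: each hidden layer scales its input
    elementwise and applies tanh, feeding the next layer."""
    h = [float(c) for c in codes]
    for weights in layers:
        h = [math.tanh(x * w) for x, w in zip(h, weights)]
    return h
```

In the claimed system the training weight matrix from claim 17/18 sits between the input layer and the first hidden layer, while the randomly initialized matrices above are refined during training.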
  20. The storage medium according to claim 16, wherein the computer-readable instructions, when executed by the one or more processors, further cause the one or more processors to perform the following steps:
    acquiring the numbers of sample files corresponding to a plurality of training sets;
    acquiring a validation set, the validation set comprising words of a plurality of commodity items;
    using the validation set to validate the preset-format output produced after training on each of the plurality of training sets; and
    when the validation accuracy reaches a threshold, marking the number of sample files at which the threshold is first reached as the maximum batch-training sample file count.
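[Editor's illustration] Claim 20 amounts to scanning (sample-file-count, validation-accuracy) pairs and recording the first count at which accuracy reaches the threshold. A minimal sketch, with the pair-list input shape assumed for illustration:

```python
def max_batch_size(results, threshold):
    """Claim 20: return the sample file count at which validation
    accuracy first reaches the threshold, or None if it never does."""
    for n_files, accuracy in results:
        if accuracy >= threshold:
            return n_files
    return None
```

The returned count then caps the batch size used for subsequent training runs.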
PCT/CN2018/097082 2018-04-25 2018-07-25 Commodity information format processing method and apparatus, and computer device and storage medium WO2019205319A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810380751.9 2018-04-25
CN201810380751.9A CN108563782B (en) 2018-04-25 2018-04-25 Commodity information format processing method and device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
WO2019205319A1

Family

ID=63536706

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/097082 WO2019205319A1 (en) 2018-04-25 2018-07-25 Commodity information format processing method and apparatus, and computer device and storage medium

Country Status (2)

Country Link
CN (1) CN108563782B (en)
WO (1) WO2019205319A1 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111710386A (en) * 2020-04-30 2020-09-25 上海数创医疗科技有限公司 Quality control system for electrocardiogram diagnosis report
CN113076755A (en) * 2021-04-14 2021-07-06 京东数字科技控股股份有限公司 Keyword extraction method, device, equipment and storage medium
CN113592512A (en) * 2021-07-22 2021-11-02 上海普洛斯普新数字科技有限公司 Online commodity identity uniqueness identification and confirmation system
CN113762998A (en) * 2020-07-31 2021-12-07 北京沃东天骏信息技术有限公司 Category analysis method, device, equipment and storage medium

Families Citing this family (9)

Publication number Priority date Publication date Assignee Title
CN109493931A (en) * 2018-10-25 2019-03-19 平安科技(深圳)有限公司 A kind of coding method of patient file, server and computer readable storage medium
CN109767326A (en) * 2018-12-20 2019-05-17 平安科技(深圳)有限公司 Suspicious transaction reporting generation method, device, computer equipment and storage medium
CN111858838A (en) * 2019-04-04 2020-10-30 拉扎斯网络科技(上海)有限公司 Menu calibration method and device, electronic equipment and nonvolatile storage medium
CN110135463A (en) * 2019-04-18 2019-08-16 微梦创科网络科技(中国)有限公司 A kind of commodity method for pushing and device
CN110245557B (en) * 2019-05-07 2023-12-22 平安科技(深圳)有限公司 Picture processing method, device, computer equipment and storage medium
CN110458638B (en) * 2019-06-26 2023-08-15 平安科技(深圳)有限公司 Commodity recommendation method and device
CN112001768A (en) * 2020-07-10 2020-11-27 苏宁云计算有限公司 E-commerce platform shop opening method and device based on robot process automation
CN112966681B (en) * 2021-04-12 2022-05-10 深圳市秦丝科技有限公司 Method, equipment and storage medium for intelligent recognition, filing and retrieval of commodity photographing
CN113570427A (en) * 2021-07-22 2021-10-29 上海普洛斯普新数字科技有限公司 System for extracting and identifying on-line or system commodity characteristic information

Citations (5)

Publication number Priority date Publication date Assignee Title
CN103294798A (en) * 2013-05-27 2013-09-11 北京尚友通达信息技术有限公司 Automatic merchandise classifying method on the basis of binary word segmentation and support vector machine
US8892488B2 (en) * 2011-06-01 2014-11-18 Nec Laboratories America, Inc. Document classification with weighted supervised n-gram embedding
CN106294568A (en) * 2016-07-27 2017-01-04 北京明朝万达科技股份有限公司 A kind of Chinese Text Categorization rule generating method based on BP network and system
CN106326346A (en) * 2016-08-06 2017-01-11 上海高欣计算机系统有限公司 Text classification method and terminal device
CN106569998A (en) * 2016-10-27 2017-04-19 浙江大学 Text named entity recognition method based on Bi-LSTM, CNN and CRF

Family Cites Families (4)

Publication number Priority date Publication date Assignee Title
GB201611380D0 (en) * 2016-06-30 2016-08-17 Microsoft Technology Licensing Llc Artificial neural network with side input for language modelling and prediction
CN106682220A (en) * 2017-01-04 2017-05-17 华南理工大学 Online traditional Chinese medicine text named entity identifying method based on deep learning
CN107832326B (en) * 2017-09-18 2021-06-08 北京大学 Natural language question-answering method based on deep convolutional neural network
CN107818080A (en) * 2017-09-22 2018-03-20 新译信息科技(北京)有限公司 Term recognition methods and device



Also Published As

Publication number Publication date
CN108563782B (en) 2023-04-18
CN108563782A (en) 2018-09-21

Similar Documents

Publication Publication Date Title
WO2019205319A1 (en) Commodity information format processing method and apparatus, and computer device and storage medium
WO2019205318A1 (en) Public opinion information classification method and apparatus, computer device, and storage medium
US11755885B2 (en) Joint learning of local and global features for entity linking via neural networks
US11030997B2 (en) Slim embedding layers for recurrent neural language models
CN111353076B (en) Method for training cross-modal retrieval model, cross-modal retrieval method and related device
US20210216577A1 (en) Reader-retriever approach for question answering
WO2021179570A1 (en) Sequence labeling method and apparatus, and computer device and storage medium
CN110019793A (en) A kind of text semantic coding method and device
CN108604311B (en) Enhanced neural network with hierarchical external memory
US11586838B2 (en) End-to-end fuzzy entity matching
CN109299479A (en) Translation memory is incorporated to the method for neural machine translation by door control mechanism
CN112699215B (en) Grading prediction method and system based on capsule network and interactive attention mechanism
US20230281390A1 (en) Systems and methods for enhanced review comprehension using domain-specific knowledgebases
CN112949758A (en) Response model training method, response method, device, equipment and storage medium
CN111667069A (en) Pre-training model compression method and device and electronic equipment
CN108268629A (en) Image Description Methods and device, equipment, medium, program based on keyword
CN113761188A (en) Text label determination method and device, computer equipment and storage medium
US11520762B2 (en) Performing fine-grained question type classification
WO2023061107A1 (en) Language translation method and apparatus based on layer prediction, and device and medium
EP4064038B1 (en) Automated generation and integration of an optimized regular expression
US20230042327A1 (en) Self-supervised learning with model augmentation
US20230196067A1 (en) Optimal knowledge distillation scheme
CN114692012A (en) Electronic government affair recommendation method based on Bert neural collaborative filtering
CN113837216A (en) Data classification method, training method, device, medium and electronic equipment
CN114065771A (en) Pre-training language processing method and device

Legal Events

Date Code Title Description
121: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 18916721; Country of ref document: EP; Kind code of ref document: A1)
NENP: non-entry into the national phase (Ref country code: DE)
32PN: public notification in the EP bulletin as address of the addressee cannot be established (Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205N DATED 16/12/2020))
122: PCT application non-entry in European phase (Ref document number: 18916721; Country of ref document: EP; Kind code of ref document: A1)