CN115374845A - Commodity information reasoning method and device - Google Patents

Commodity information reasoning method and device

Info

Publication number: CN115374845A
Application number: CN202210946464.6A
Authority: CN (China)
Prior art keywords: commodity, information, model, target, training
Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis)
Inventors: 浦洁, 陈玄
Original and current assignee: Bank of China Financial Technology Co Ltd
Other languages: Chinese (zh)
Application filed by Bank of China Financial Technology Co Ltd
Priority to CN202210946464.6A
Publication of CN115374845A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00: Computing arrangements using knowledge-based models
    • G06N5/04: Inference or reasoning models

Abstract

The invention provides a commodity information reasoning method and device, relating to the technical field of artificial intelligence. The method comprises the following steps: acquiring input information of a reasoning model according to the target attribute of a target commodity and each commodity information text in an information base; and inputting the input information into the reasoning model to obtain the associated commodity information text of the target commodity output by the reasoning model, the associated commodity information text being the commodity information text in the information base that is associated with the target commodity. The reasoning model is obtained by training a pre-trained language representation (BERT) model based on the target attributes of sample commodities in a commodity library, the commodity information texts in the information library, and the associated commodity information texts of the sample commodities. The method enables the reasoning model obtained from BERT training to be applicable to various commodities in the commodity knowledge field, gives it a degree of generality and universality, and effectively improves the accuracy of commodity reasoning results.

Description

Commodity information reasoning method and device
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a commodity information reasoning method and device.
Background
In an era of explosive growth and redundancy of information, text is the most common and most easily stored form of data, and it is widely used in the field of e-commerce, where there are now tens of thousands of commodities and accompanying descriptions. How to mine and classify this massive amount of commodity information, and how to quickly identify and bring new commodities to market, are crucial problems in the e-commerce field.
In the prior art, text reasoning is usually adopted to establish an association relation between commodities and commodity information, from which the commodity information corresponding to a commodity is deduced. One example is ESIM (Enhanced Sequential Inference Model, a model for natural language inference). ESIM uses a Bi-LSTM (Bi-directional Long Short-Term Memory) network as an encoder, converts the two texts into vector representations and lets them interact, concatenates the vectors containing the different pieces of information as input for inference inside the model, and outputs a judgment result after reasoning, thereby deducing the commodity information of a commodity.
Although the ESIM model can infer the relevance between a commodity and commodity information, it learns only from corpus data within a specific range, so it is sensitive to data in that range, generalizes poorly, and exhibits weak inference performance, leading to inaccurate commodity reasoning results.
Disclosure of Invention
The invention provides a commodity information reasoning method and a commodity information reasoning device, which are used for overcoming the defect in the prior art that the poor reasoning performance of the ESIM model makes commodity reasoning results inaccurate, thereby improving the accuracy of commodity reasoning results.
The invention provides a commodity information reasoning method, which comprises the following steps:
acquiring input information of a reasoning model according to the target attribute of the target commodity and each commodity information text in the information base;
inputting the input information into the inference model to obtain a related commodity information text of the target commodity output by the inference model; the associated commodity information text is a commodity information text associated with the target commodity in the information base;
the inference model is obtained by training a pre-training language representation BERT model based on target attributes of sample commodities in a commodity library, commodity information texts in the information library and associated commodity information texts of the sample commodities.
According to the commodity information reasoning method provided by the invention, the specific training steps of the reasoning model comprise:
extracting a first number of first sample commodities from the commodity library;
according to the target attribute of the first sample commodity, each commodity information text in the information base, the associated commodity information text of the first sample commodity, a first training strategy and a second training strategy, performing deep pre-training on the parameters of the BERT model; wherein the first training strategy is determined based on a Masked Language Model (MLM) task, and the second training strategy is determined based on a Next Sentence Prediction (NSP) task;
extracting a second number of second sample commodities from the commodity library;
based on the target attribute of the second sample commodity, the commodity information texts in the information base and the associated commodity information texts of the second sample commodity, fine-tuning the parameters of the deeply pre-trained BERT model;
and acquiring the inference model according to the fine-tuned, deeply pre-trained BERT model.
According to the commodity information reasoning method provided by the invention, the BERT model comprises a first BERT model and a second BERT model;
the deep pre-training of the parameters of the BERT model comprises:
respectively carrying out deep pre-training on the parameters of the first BERT model and the parameters of the second BERT model;
the fine-tuning of the parameters of the deeply pre-trained BERT model comprises:
respectively fine-tuning the parameters of the deeply pre-trained first BERT model and the parameters of the deeply pre-trained second BERT model;
the acquiring of the inference model according to the fine-tuned, deeply pre-trained BERT model comprises:
selecting the model with the optimal performance from the fine-tuned, deeply pre-trained first BERT model and the fine-tuned, deeply pre-trained second BERT model;
and acquiring the inference model according to the model with the optimal performance.
According to the commodity information inference method provided by the invention, the input information is input into the inference model to obtain the associated commodity information text of the target commodity output by the inference model, and the method comprises the following steps:
inputting the input information into the reasoning model to obtain the association degree between the target commodity and each commodity information text;
and taking the commodity information text with the greatest association degree as the associated commodity information text of the target commodity.
According to the commodity information inference method provided by the invention, the input information of the inference model is obtained according to the target attribute of the target commodity and each commodity information text in the information base, and the method comprises the following steps:
performing feature extraction on the target attribute of the target commodity and on each commodity information text in the information base, respectively, to obtain a first feature vector of the target commodity and second feature vectors of the commodity information texts;
determining whether a length of each of the first feature vector and the second feature vector is greater than a maximum input text length of the inference model;
shaping the feature vector with the length larger than the maximum input text length; wherein the length of the shaped feature vector is less than or equal to the maximum input text length;
and acquiring the input information of the inference model according to the shaped first feature vector and the shaped second feature vectors.
According to the commodity information inference method provided by the invention, the input information of the inference model is obtained according to the target attribute of the target commodity and each commodity information text in the information base, and the method comprises the following steps:
preprocessing the target attribute of the target commodity and the information text of each commodity;
the preprocessing comprises word coding, position coding and sentence coding;
and acquiring the input information according to the target attribute of the preprocessed target commodity and the information text of each preprocessed commodity.
According to the commodity information reasoning method provided by the invention, the associated commodity information text of the target commodity comprises one or more combinations of commodity categories, commodity applicable scenes, commodity applicable objects and commodity key attributes.
The present invention also provides a commodity information inference device, including:
the processing module is used for acquiring input information of the reasoning model according to the target attribute of the target commodity and each commodity information text in the information base;
the reasoning module is used for inputting the input information into the reasoning model to obtain the associated commodity information text of the target commodity output by the reasoning model; the associated commodity information text is a commodity information text associated with the target commodity in the information base;
the inference model is obtained by training a pre-training language representation BERT model based on target attributes of sample commodities in a commodity library, commodity information texts in the information library and associated commodity information texts of the sample commodities.
The invention also provides an electronic device, which comprises a memory, a processor and a computer program which is stored on the memory and can run on the processor, wherein when the processor executes the program, the commodity information reasoning method is realized.
The present invention also provides a non-transitory computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the merchandise information inference method as described in any one of the above.
The invention also provides a computer program product comprising a computer program, wherein the computer program is used for realizing the commodity information reasoning method when being executed by a processor.
According to the commodity information reasoning method and device, the BERT model is trained on the sample commodities in the commodity library, the commodity information texts in the information library, and the associated commodity information texts of the sample commodities, so that a reasoning model adapted to commodity-field knowledge can be constructed. The model fuses the prior general-domain knowledge of BERT with the ability to accurately learn commodity information in the e-commerce field, so the same reasoning model can be used across different e-commerce scenes and markets. Reasoning about a commodity's associated commodity information text can be carried out from the target attribute of the target commodity, making the reasoning model applicable to various commodities in the commodity knowledge field, giving it a degree of generality and universality and producing more accurate commodity reasoning results.
Drawings
In order to more clearly illustrate the technical solutions of the present invention or the prior art, the drawings needed for the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.
FIG. 1 is a schematic flow chart of a merchandise information inference method provided by the present invention;
FIG. 2 is a schematic structural diagram of a reasoning model in the commodity information reasoning method provided by the present invention;
FIG. 3 is a second schematic flow chart of the merchandise information inference method provided by the present invention;
FIG. 4 is a schematic diagram of a data preprocessing result in the commodity information inference method provided by the present invention;
FIG. 5 is a third schematic flow chart of a merchandise information inference method provided by the present invention;
FIG. 6 is a schematic structural diagram of a merchandise information inference device provided by the present invention;
fig. 7 is a schematic structural diagram of an electronic device provided by the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In the prior art, an ESIM model is generally adopted for text reasoning, but the problem of inaccurate commodity reasoning result due to poor reasoning performance exists.
In addition, some researchers have proposed embedding models that generate word vectors or text vectors, such as word2vec (word to vector, a model for generating word vectors) or seq2vec (sequence to vector, a model for generating text-sequence vectors), to perform text similarity calculation and thereby realize information inference. The word2vec and seq2vec models set up a text prediction task through a neural network and finally extract a text representation. The vectors extracted by word2vec or seq2vec are used directly to calculate the relevance between texts; specifically, the relation between texts is expressed through a distance measure. Other scholars have proposed two-tower models for calculating text similarity to realize information inference. A two-tower model trains two identical neural networks on the two text segments simultaneously, extracts and combines their separate vector representations, feeds the combined representation into a downstream network structure for training, and finally outputs a similarity measure between the texts. Common two-tower models include DSSM (Deep Structured Semantic Model) and variants such as SENet (Squeeze-and-Excitation Networks). However, both of the above approaches are representation-based algorithms: only one vector is generated for each text before the calculation is performed, so interaction between the texts is lacking and an accurate commodity information inference result cannot be obtained.
In view of the above problems, this embodiment provides a commodity information inference method that trains a BERT (Bidirectional Encoder Representations from Transformers) model on the sample commodities in a commodity library, the commodity information texts in an information library, and the associated commodity information texts of the sample commodities, so as to construct an inference model possessing both general-domain knowledge and commodity-domain knowledge; the inference model learns the association relationship between a target commodity and each commodity information text in the information library and then outputs the associated commodity information text of the target commodity.
It should be noted that the main execution body of the method is a commodity information inference device, and the device may be an electronic device, a component in an electronic device, an integrated circuit, or a chip. The electronic device may be a mobile electronic device or a non-mobile electronic device. For example, the mobile electronic device may be a mobile phone, a tablet computer, a notebook computer, a palm computer, a vehicle-mounted electronic device, a wearable device, or the like, and the non-mobile electronic device may be a server, a network attached storage, a personal computer, or the like, which the present invention does not specifically limit. The device can implement the commodity information method under the PyTorch framework based on Python 3.6, and PyInstaller is used to package the inference model and the project.
The commodity information inference method of the present invention is described below with reference to fig. 1 to 5.
As shown in fig. 1, which is one of the flow diagrams of the commodity information inference method according to the embodiment of the present application, the method includes the following steps:
step 101, acquiring input information of an inference model according to target attributes of target commodities and information texts of commodities in an information base;
the target product is all products that need to perform the product information inference, such as novels, videos, movies, financing products, and home appliances, and this embodiment does not specifically limit this. The commodity information reasoning method provided by the embodiment can be suitable for information reasoning scenes of various types of commodities.
The target attribute includes, but is not limited to, one or more of a model number, a logo, and a name, which is not specifically limited in this embodiment.
The commodity information text is used for describing commodity information; the commodity information text includes, but is not limited to, commodity key attribute information, a commodity category, and a commodity usage scenario, which is not specifically limited in this embodiment.
The content included in the target attribute may be the same as or different from the content included in the commodity key attribute information.
Optionally, after the target attribute of the target product and the product information texts in the information base are obtained, the target attribute of the target product and the product information texts in the information base may be directly used as input information of the inference model, or one or more data processes may be performed on the target attribute of the target product and the product information texts in the information base, respectively, and the processed target attribute of the target product and the processed product information texts may be used as input information.
The data processing includes, but is not limited to, feature extraction, data encoding, word embedding vector acquisition, and data shaping, which is not specifically limited in this embodiment.
Word embedding (i.e., Embedding) is the deep-learning method of converting words into vectors, and the resulting vectors are called word embedding vectors or token vectors. Character embedding is analogous, except that the embedded object is a single character.
Step 102, inputting the input information into the inference model to obtain a related commodity information text of the target commodity output by the inference model; the associated commodity information text is a commodity information text associated with the target commodity in the information base; the inference model is obtained by training a pre-training language representation BERT model based on target attributes of sample commodities in a commodity library, commodity information texts in the information library and associated commodity information texts of the sample commodities.
The BERT model is a pre-trained language model proposed by Google. Pre-training means first training uniformly on a large-scale text corpus and then applying the pre-trained model to downstream tasks according to the actual application scenario; at that point the middle and bottom layers of the model that downstream tasks have in common are already well trained, and when the model must handle a specific downstream task, the task's sample data can be used to train the corresponding model, which greatly accelerates convergence. The BERT model can therefore be used to extract embedding vectors of texts in downstream tasks or be applied directly to classification tasks, with the pre-trained model fine-tuned on a new training set to obtain a deep-learning model meeting the specified functional requirements.
As shown in fig. 2, the BERT model is implemented with multiple Transformer layers as encoders; each layer produces different information, and in downstream tasks different numbers of layers are selected for information extraction according to actual needs.
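To make the layer-selection idea concrete, the following is a minimal sketch, assuming the HuggingFace transformers library and the bert-base-chinese checkpoint (neither is mandated by this embodiment), of how the hidden states of individual encoder layers can be read from a BERT model:

```python
# Minimal sketch: reading per-layer hidden states from a BERT encoder.
# The transformers library and checkpoint name are assumptions for
# illustration; the embodiment does not prescribe an implementation.
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
model = BertModel.from_pretrained("bert-base-chinese", output_hidden_states=True)
model.eval()

inputs = tokenizer("target commodity attribute text", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# outputs.hidden_states: embedding-layer output plus one tensor per encoder
# layer; a downstream task picks however many layers it needs.
last_layer = outputs.hidden_states[-1]        # (batch, seq_len, hidden_size)
second_to_last = outputs.hidden_states[-2]
cls_vector = last_layer[:, 0, :]              # [CLS] representation
```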
As shown in fig. 3, before step 102 is executed, the inference model needs to be trained, and the specific training steps include:
Firstly, raw data are acquired: the target attributes of the sample commodities in the commodity library and the commodity information texts in the information library are converted into input information, and the associated commodity information texts of the sample commodities in the raw data are used as labels, so as to construct a sample data set;
then, the sample data set is divided into a training data set, a verification data set, and a test data set, as sketched below. The training data set is the data on which the inference model is trained, so that the model learns its parameters from its predictions on this set and iterates toward the overall optimum; it occupies the largest share of the data set, with the exact proportion set according to actual requirements, e.g. 80% of the sample data set. The verification data set is used to check the training result during training; it does not participate in training, but the model is adjusted for subsequent training according to its performance on this set. The test data set is used to test the generalization performance of the model after training has fully finished; this part of the data is completely independent of training, and its samples appear in neither the training set nor the verification set.
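As a minimal sketch of this division (the 80/10/10 ratio is illustrative; the embodiment only fixes that the training set takes the largest share, e.g. 80%):

```python
# Sketch: splitting the sample data set into training / verification / test sets.
import random

def split_dataset(samples, train_ratio=0.8, val_ratio=0.1, seed=42):
    samples = samples[:]                          # copy; leave the caller's list intact
    random.Random(seed).shuffle(samples)
    n_train = int(len(samples) * train_ratio)
    n_val = int(len(samples) * val_ratio)
    train = samples[:n_train]
    val = samples[n_train:n_train + n_val]
    test = samples[n_train + n_val:]              # fully held out from training
    return train, val, test
```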
In the training process, the BERT model may be trained as a whole on the training data set to obtain an inference model that can accurately infer the corresponding associated commodity information text from the target attribute of the target commodity; alternatively, one part of the training data is extracted to pre-train the BERT model, and another part is then extracted to train again on top of the pre-trained model, likewise yielding an inference model that can accurately infer the corresponding associated commodity information text for the target commodity.
Also in the training process, the verification data set and the test data set are used to evaluate the reasoning performance and effect of the trained inference model; once the two stages of parameter-tuning optimization and effect evaluation are passed, the inference model can be deployed online or used to meet given experimental requirements.
In summary, in this embodiment, the applicable scene, applicable object, and key attributes of each commodity are extracted from massive commodity information, and inference learning of general commodity knowledge is performed on the basis of the BERT pre-trained model to obtain an inference model for commodity information.
After the inference model is obtained, input information determined by the target commodity and each commodity information text in the information base can be input into the inference model so as to quickly and accurately obtain the associated commodity information text of the target commodity.
In this embodiment, the BERT model is trained on the sample commodities in the commodity library, the commodity information texts in the information library, and the associated commodity information texts of the sample commodities, so that a reasoning model adapted to commodity-field knowledge can be constructed. The model fuses the prior general-domain knowledge of BERT with the ability to accurately learn commodity information in the e-commerce field, so the same reasoning model can be used across different e-commerce scenes. Reasoning about a commodity's associated commodity information text can be carried out from the target attribute of the target commodity, making the reasoning model applicable to various commodities in the commodity knowledge field, giving it a degree of generality and universality and producing more accurate commodity reasoning results.
In some embodiments, the specific training step of the inference model comprises:
extracting a first number of first sample commodities from the commodity library;
according to the target attribute of the first sample commodity, the commodity information texts in the information base, the associated commodity information texts of the first sample commodity, a first training strategy and a second training strategy, deep pre-training is carried out on the parameters of the BERT model; wherein the first training strategy is determined based on a Masked Language Model (MLM) task, and the second training strategy is determined based on a Next Sentence Prediction (NSP) task;
extracting a second number of second sample commodities from the commodity library;
based on the target attribute of the second sample commodity, the commodity information texts in the information base and the associated commodity information texts of the second sample commodity, fine-tuning the parameters of the deeply pre-trained BERT model;
and acquiring the inference model according to the fine-tuned, deeply pre-trained BERT model.
The first number and the second number may be the same or different and are set according to actual requirements; for example, the first number is 32 million and the second number is 8 million.
The MLM (Masked Language Model) strategy randomly replaces tokens in each training sequence with a mask token such as [MASK] at a probability of 15%, and then sets the training task of predicting the original word at each [MASK] position, in order to strengthen the inference model's understanding of context.
The NSP (Next Sentence Prediction) strategy is used to predict whether two sentences are adjacent. For a sentence A and a sentence B, the training sample is input into the model, which judges whether sentence B is the next sentence of sentence A, so that the inference model acquires the ability to learn inter-sentence relevance.
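The masking step of the MLM strategy can be sketched as follows. The 15% probability comes from the description above; the 80/10/10 replacement scheme inside the masked positions is BERT's standard recipe and is an assumption here, as are the helper's name and signature:

```python
# Sketch: 15% random masking for the MLM training strategy.
import copy
import random

def mask_tokens(token_ids, mask_id, vocab_size, mask_prob=0.15):
    """Return (masked_input, labels); labels are -100 where nothing is predicted."""
    input_ids = copy.copy(token_ids)
    labels = [-100] * len(token_ids)
    for i, tid in enumerate(token_ids):
        if random.random() < mask_prob:
            labels[i] = tid                        # the model must recover this token
            r = random.random()
            if r < 0.8:
                input_ids[i] = mask_id             # 80%: replace with [MASK]
            elif r < 0.9:
                input_ids[i] = random.randrange(vocab_size)  # 10%: random token
            # remaining 10%: keep the original token unchanged
    return input_ids, labels
```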
Optionally, an original data set is obtained first; it contains a large amount of sample data, for example 210 GB in total.
Then, the commodities, the commodity information texts (including but not limited to commodity categories, commodity usage scenes, and commodity key attribute information), and the association relations between commodities and commodity information texts are parsed out of the original data set, and triples (commodity target attribute, commodity information text, entity relation) are screened, processed, and stored. A number of triples are obtained after processing; the specific number can be set according to actual requirements, for example 40 million triples.
The triple information is then converted into an input form suitable for the BERT model, yielding the sample data set.
Since the BERT model is trained on a large volume of general text, when transferring to a downstream task one must consider whether additional pre-training on the task's specific field is needed, so that the data distribution of the model better matches the downstream task. The corpus in the sample data set is large in scale and consists entirely of commodity information from the e-commerce field, whereas the acquired BERT model was pre-trained on large general corpora, so the two corpora differ considerably. Therefore, in order to obtain an inference model that possesses e-commerce-field knowledge, generalizes well, and can quickly and accurately output the associated commodity information text of a target commodity, commodity triple information of a first data volume is first extracted, and the BERT model is deeply pre-trained on the MLM and NSP tasks to obtain a pre-trained model grounded in the commodity field.
Text inference training is then performed with the corpus in the commodity triple information of a second data volume: following the BERT fine-tuning method, the output of the model's last layer is used as the representation vector and a dynamic learning rate is applied, and the deeply pre-trained BERT model is fine-tuned again to obtain an inference model that can accurately output commodity information inference results.
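A minimal fine-tuning sketch under the assumptions that the HuggingFace transformers library is used, that association is cast as two-class classification over sentence pairs, and that a deeply pre-trained checkpoint and an encoded data loader already exist (the checkpoint path, hyperparameters, and batch keys are illustrative):

```python
# Sketch: fine-tuning the deeply pre-trained BERT model with the last-layer
# [CLS] representation and a dynamic (warmup + linear decay) learning rate.
import torch
from torch.optim import AdamW
from transformers import BertForSequenceClassification, get_linear_schedule_with_warmup

model = BertForSequenceClassification.from_pretrained(
    "path/to/deeply-pretrained-bert",   # hypothetical checkpoint path
    num_labels=2,                       # associated / not associated (assumed labels)
)
optimizer = AdamW(model.parameters(), lr=2e-5)
scheduler = get_linear_schedule_with_warmup(
    optimizer, num_warmup_steps=1000, num_training_steps=100_000
)

model.train()
for batch in train_loader:              # train_loader: assumed DataLoader of encoded triples
    outputs = model(
        input_ids=batch["input_ids"],
        token_type_ids=batch["token_type_ids"],
        attention_mask=batch["attention_mask"],
        labels=batch["labels"],
    )
    outputs.loss.backward()
    optimizer.step()
    scheduler.step()                    # the "dynamic learning rate"
    optimizer.zero_grad()
```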
In the process of fine-tuning the deeply pre-trained BERT model, multiple training strategies can be adopted, and the BERT model with the best performance is taken as the final inference model. The training batch size and the maximum number of iterations differ between strategies, and this embodiment places no specific limitation on them. For example, the training batch size of the first training strategy is 1000 with a maximum of 1000 iterations, while that of the second training strategy is 1200 with a maximum of 950 iterations.
In the training process, the choice of learning rate is crucial; since each layer of the BERT model can carry a different learning rate, selecting suitable per-layer rates is a key factor in the final model performance, as sketched below. The learning rate may be chosen according to actual requirements or selected adaptively by an optimization algorithm such as a genetic algorithm, which this embodiment does not specifically limit.
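One way to realize per-layer learning rates is layer-wise learning-rate decay; the following sketch reuses the model from the fine-tuning sketch above, and the decay factor 0.95 is illustrative rather than taken from this embodiment:

```python
# Sketch: a distinct learning rate for each BERT encoder layer.
from torch.optim import AdamW

def layerwise_param_groups(model, base_lr=2e-5, decay=0.95, num_layers=12):
    groups = []
    for i in range(num_layers):
        # layers closer to the output keep a higher learning rate
        lr = base_lr * (decay ** (num_layers - 1 - i))
        groups.append({"params": list(model.bert.encoder.layer[i].parameters()),
                       "lr": lr})
    return groups

# Encoder layers only; embeddings and the classifier head would get their own
# groups in a fuller setup.
optimizer = AdamW(layerwise_param_groups(model), lr=2e-5)
```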
In this embodiment, the BERT model is trained in a multi-stage, multi-strategy manner on the sample commodities in the commodity library, so that the trained inference model not only possesses knowledge of the commodity information field but can also quickly and accurately output the associated commodity information text of the target commodity; it therefore generalizes well and effectively improves the accuracy of commodity information inference.
Generalization performance refers to a model's predictive ability on unseen samples; the stronger the generalization ability, the more accurate the model's predictions on new, unseen data.
In some embodiments, the BERT model comprises a first BERT model and a second BERT model;
the deep pre-training of parameters of the BERT model comprises:
respectively carrying out deep pre-training on the parameters of the first BERT model and the parameters of the second BERT model;
the fine-tuning of the parameters of the deeply pre-trained BERT model comprises:
respectively fine-tuning the parameters of the deeply pre-trained first BERT model and the parameters of the deeply pre-trained second BERT model;
the acquiring of the inference model according to the fine-tuned, deeply pre-trained BERT model comprises:
selecting the model with the optimal performance from the fine-tuned, deeply pre-trained first BERT model and the fine-tuned, deeply pre-trained second BERT model;
and acquiring the inference model according to the model with the optimal model performance.
The first BERT model is the original BERT model, and the second BERT model is a derivative model of the original BERT model, including but not limited to BERT-MRC (BERT for Machine Reading Comprehension, a machine reading comprehension model based on the pre-trained language representation model) and BERT-MRC-WWM (BERT-MRC pre-trained with Whole Word Masking), which this embodiment does not specifically limit.
The original BERT model may be a Chinese BERT model (i.e., BERT-base-Chinese) trained on a large Chinese corpus.
During training, the BERT-base-Chinese, BERT-MRC, and BERT-MRC-WWM models are loaded to complete the deep pre-training, fine-tuning, and other related optimization in the training process, thereby realizing the learning of general commodity knowledge.
Optionally, the step of specifically training the inference model includes:
Firstly, so that the inference model will have good commodity information inference performance, the first BERT model and the second BERT model are deeply pre-trained according to the target attributes of the first sample commodities, the commodity information texts in the information base, the associated commodity information texts of the first sample commodities (i.e., the commodity triple information of the first data volume), the first training strategy, and the second training strategy, yielding a deeply pre-trained first BERT model and a deeply pre-trained second BERT model. Then, according to the target attributes of the second sample commodities, the commodity information texts in the information base, and the associated commodity information texts of the second sample commodities (i.e., the commodity triple information of the second data volume), the deeply pre-trained first BERT model and the deeply pre-trained second BERT model are each fine-tuned, yielding a fine-tuned, deeply pre-trained first BERT model and a fine-tuned, deeply pre-trained second BERT model.
Then, model performance evaluation indexes are computed for the fine-tuned, deeply pre-trained first BERT model and the fine-tuned, deeply pre-trained second BERT model, and according to these indexes the model with the optimal performance is selected from the two as the inference model.
The model performance evaluation indexes include, but are not limited to, accuracy and recall, which this embodiment does not specifically limit.
In this embodiment, deep pre-training and fine-tuning are performed on the first BERT model and the second BERT model respectively, and the model with the optimal performance is selected from the fine-tuned, deeply pre-trained first BERT model and the fine-tuned, deeply pre-trained second BERT model, so as to obtain the inference model with the best performance and thereby improve the accuracy of commodity information inference.
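A sketch of the selection step, assuming precision and recall as concrete evaluation indexes combined through F1; the candidate names, the evaluate helper, and val_loader are hypothetical stand-ins for this embodiment's evaluation step:

```python
# Sketch: evaluating fine-tuned candidates on the verification set and
# keeping the best-performing one as the inference model.
def precision_recall(predictions, labels):
    tp = sum(1 for p, y in zip(predictions, labels) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(predictions, labels) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(predictions, labels) if p == 0 and y == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

best_model, best_f1 = None, -1.0
for candidate in [finetuned_bert, finetuned_bert_mrc, finetuned_bert_mrc_wwm]:  # hypothetical
    preds, labels = evaluate(candidate, val_loader)     # assumed helper returning lists
    p, r = precision_recall(preds, labels)
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    if f1 > best_f1:
        best_model, best_f1 = candidate, f1
```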
In some embodiments, the inputting the input information into the inference model in step 102 to obtain the associated product information text of the target product output by the inference model includes:
inputting the input information into the reasoning model to obtain the association degree between the target commodity and each commodity information text;
and taking the commodity information text with the greatest association degree as the associated commodity information text of the target commodity.
Optionally, after the input information of the inference model is acquired, the inference model may be loaded to process the input information and obtain the association degree between the target commodity and each commodity information text.
Then, the association degrees between the target commodity and the commodity information texts are sorted, and the commodity information text with the greatest association degree with the target commodity is selected from the information base as the associated commodity information text of the target commodity.
According to the embodiment, the associated commodity information text of the target commodity can be quickly and accurately determined according to the association degree between the target commodity and each commodity information text, and the commodity information can be accurately inferred for the user.
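A sketch of this ranking step, assuming a fine-tuned BertForSequenceClassification whose positive-class probability serves as the association degree (the function name and the meaning of label 1 are assumptions):

```python
# Sketch: score the target commodity against every commodity information text
# and return the text with the greatest association degree.
import torch

@torch.no_grad()
def most_associated_text(model, tokenizer, target_attribute, info_texts, max_len=512):
    model.eval()
    best_text, best_score = None, -1.0
    for text in info_texts:
        enc = tokenizer(
            target_attribute, text,       # sentence pair: attribute vs. info text
            truncation=True, max_length=max_len, return_tensors="pt",
        )
        logits = model(**enc).logits
        score = torch.softmax(logits, dim=-1)[0, 1].item()  # P(associated), assumed label 1
        if score > best_score:
            best_text, best_score = text, score
    return best_text, best_score
```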
In some embodiments, the obtaining input information of the inference model according to the target attribute of the target product and the product information texts in the information base in step 101 includes:
performing feature extraction on the target attribute of the target commodity and on each commodity information text in the information base, respectively, to obtain a first feature vector of the target commodity and second feature vectors of the commodity information texts;
determining whether a length of each of the first feature vector and the second feature vector is greater than a maximum input text length of the inference model;
shaping the feature vector with the length larger than the maximum input text length; wherein the length of the shaped feature vector is less than or equal to the maximum input text length;
and acquiring the input information of the inference model according to the shaped first feature vector and the shaped second feature vectors.
Optionally, the maximum text sequence length of the BERT model is 512. To ensure the validity of the input information and thereby improve the validity of commodity information inference, after the target attribute of the target commodity and each commodity information text in the information base are obtained, feature extraction may be performed on the target commodity and on each commodity information text in the information base, respectively, to obtain the first feature vector of the target commodity and the second feature vectors of the commodity information texts. The feature extraction includes, but is not limited to, word vector extraction and semantic feature vector extraction.
Then, it is judged whether the first feature vector of the target commodity and the second feature vectors of the commodity information texts in the information base exceed the maximum input text length of the inference model; any feature vector exceeding that length is shaped, so that the shaped first feature vector and the shaped second feature vectors are all no longer than the maximum input text length.
The shaped first feature vector and the shaped second feature vectors are then taken as the input information of the inference model, so that the associated commodity information text of the target commodity can be inferred accurately.
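A minimal sketch of the shaping step, using head truncation while preserving the final [SEP] token; the embodiment does not fix a particular truncation scheme, so this is one plausible choice:

```python
# Sketch: shape a token sequence that exceeds the maximum input text length.
def shape_sequence(token_ids, max_len=512):
    if len(token_ids) <= max_len:
        return token_ids
    # keep the trailing [SEP] so the [CLS] ... [SEP] structure survives
    return token_ids[: max_len - 1] + [token_ids[-1]]
```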
In some embodiments, the obtaining input information of the inference model according to the target attribute of the target product and the product information texts in the information base in step 101 includes:
preprocessing the target attribute of the target commodity and the information text of each commodity;
the preprocessing comprises word coding, position coding and sentence coding;
and acquiring the input information according to the target attribute of the preprocessed target commodity and the information text of each preprocessed commodity.
For the commodity information inference scenario, the input information is a (commodity target attribute, commodity information text, entity relation) triple, i.e., it contains two text sequences. Sentence-encoding (i.e., Segment Embedding) input must therefore be constructed when building the data set, with 0 and 1 used respectively to distinguish the target commodity from the commodity information text. Position encoding (i.e., Position Embedding), by contrast, need not be specified as input in any scenario, because Position Embedding is a parameter the BERT model initializes during training, and it is already randomly initialized in the implementation.
Token Embedding converts each word into a fixed-dimension vector.
Therefore, in the commodity information inference task, it is only necessary to construct the Token encoding sequence of the texts in the triple, adding the start marker [CLS] at the head position and the sentence separator [SEP] between the sentences, as the input information. In addition, a corresponding Segment Embedding vector must be generated according to the lengths of the target attribute of the target commodity (e.g., the name) and of the commodity information text.
As shown in fig. 4, the target attribute of the target product and the information text of each product in the information library may be preprocessed by Position Embedding, segment Embedding, token Embedding, etc. to obtain the input information of the inference model.
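With the HuggingFace transformers library (an assumption; fig. 4 does not tie the preprocessing to a library), the Token and Segment encodings for a (target attribute, commodity information text) pair can be produced as follows, with Position Embedding handled inside the model as noted above:

```python
# Sketch: building paired-text inputs with token and segment encodings.
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
enc = tokenizer(
    "target commodity name",               # target attribute, e.g. the name
    "candidate commodity information text",
    return_tensors="pt",
)
# enc["input_ids"]      -> [CLS] attribute tokens [SEP] info-text tokens [SEP]
# enc["token_type_ids"] -> 0 for the first segment, 1 for the second (Segment Embedding)
# enc["attention_mask"] -> 1 for real tokens, 0 for padding
```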
In this embodiment, input information better suited to the BERT model is obtained by performing encoding preprocessing on the target commodity and each commodity information text in the information base, improving the efficiency, accuracy, and reliability of commodity information inference.
In some embodiments, the associated commodity information text of the target commodity includes one or more combinations of commodity categories, commodity applicable scenes, commodity applicable objects and commodity key attributes.
The key attributes of the commodity include, but are not limited to, one or more of a commodity use value, commodity selling point information, and commodity description information, which is not specifically limited in this embodiment.
Correspondingly, each commodity information text also comprises one or more combinations of commodity categories, commodity use scenes and commodity key attribute information.
In this embodiment, the relation between the target commodity and common-sense commodity knowledge can be inferred quickly and accurately according to the inference model, obtaining the associated common-sense knowledge of the target commodity.
As shown in fig. 5, a complete flow diagram of the commodity information inference method provided in this embodiment includes the specific steps of:
step 501, data processing; specifically, an original data set is obtained first, and the original data set is processed to obtain a sample data set suitable for the input specification of the BERT model; computing the triple information in the sample data set to carry out preprocessing such as Position Embedding, segment Embedding, token Embedding and the like so as to obtain input information of the inference model; and dividing the sample data set to obtain a training data set, a verification data set and a test data set.
Step 502, training the inference model; specifically, 32 million triples of corpus data are extracted from the training data set, and the loaded BERT model or models are again deeply pre-trained on the MLM task and the NSP task, yielding a deeply pre-trained model grounded in the commodity field.
Text inference training is then performed on the remaining triple corpus data in the training data set: following the BERT fine-tuning method, the output of the model's last layer is used as the representation vector and training is run with a dynamic learning rate; the models are trained under several different training batch sizes, the corresponding loss curves and evaluation indexes are observed, and the fine-tuned model with the optimal performance is selected as the final inference model.
Step 503, applying the inference model; specifically, the input target attribute of a single target commodity and the unknown commodity information texts are converted into the inference model's input format and fed into the trained inference model, which outputs the relation inference results between the target commodity and each commodity information text, thereby obtaining the commodity information text associated with the target commodity.
At the level of principle, the commodity information inference method of this embodiment converts the commodity text to be learned into input vectors based on deep-learning algorithms from the field of natural language processing, updates the model parameters through forward and backward passes inside the neural network, and predicts the commodity information text associated with the target commodity by computing over the input vectors and parameters.
At a summary level, the commodity information inference method of this embodiment obtains an inference model about commodity information by learning the relations between massive numbers of commodities and their applicable scenes, applicable objects, and key attributes. With this inference model, relation inference can be carried out from the commodity target attribute information and candidate commodity information texts, helping to identify and mine whether a commodity carries certain information value.
In summary, this embodiment combines the BERT pre-trained model with the field of text event reasoning: the BERT pre-trained model learns from massive commodity information so that it can be fine-tuned, and by combining fine-tuning and related methods with the commodity corpus, an inference model better suited to the commodity field is trained as an improvement on the existing BERT model. At the same time, the general-domain knowledge previously learned by the BERT pre-trained model is retained, so the model's generalization ability is improved while its accuracy is preserved, i.e., information inference and judgment on unknown commodities is improved. In addition, the technology has clear development specifications: the built-in model method can be extended and updated while the performance of the technology remains unchanged.
The text reasoning here is event-logic-based text reasoning: it predicts knowledge or descriptions related to an input text by computing over that text, where the event logic graph is a logic knowledge base describing the evolution rules between events, used to express sequential, causal, conditional, and hypernym-hyponym relations between events.
The following describes the commodity information inference device provided by the present invention, and the commodity information inference device described below and the commodity information inference method described above may be referred to in correspondence with each other.
As shown in fig. 6, the present embodiment provides a commodity information inference apparatus, which includes a processing module 601 and an inference module 602, wherein:
the processing module 601 is configured to obtain input information of the inference model according to a target attribute of the target product and information texts of each product in the information base;
the inference module 602 is configured to input the input information into the inference model, so as to obtain a relevant product information text of the target product output by the inference model; the associated commodity information text is a commodity information text associated with the target commodity in the information base;
the inference model is obtained by training a pre-training language representation BERT model based on target attributes of sample commodities in a commodity library, commodity information texts in the information library and associated commodity information texts of the sample commodities.
In this embodiment, the BERT model is trained on the sample commodities in the commodity library, the commodity information texts in the information library, and the associated commodity information texts of the sample commodities, so that a reasoning model adapted to commodity-field knowledge can be constructed. The model fuses the prior general-domain knowledge of BERT with the ability to accurately learn commodity information in the e-commerce field, so the same reasoning model can be used across different e-commerce scenes. Reasoning about a commodity's associated commodity information text can be carried out from the target attribute of the target commodity, making the reasoning model applicable to various commodities in the commodity knowledge field, giving it a degree of generality and universality and producing more accurate commodity reasoning results.
In some embodiments, the system further comprises a training module, specifically configured to:
extracting a first number of first sample commodities from the commodity library;
according to the target attribute of the first sample commodity, the commodity information texts in the information base, the associated commodity information texts of the first sample commodity, a first training strategy and a second training strategy, deep pre-training is carried out on the parameters of the BERT model; wherein the first training strategy is determined based on a Masked Language Model (MLM) task, and the second training strategy is determined based on a Next Sentence Prediction (NSP) task;
extracting a second number of second sample commodities from the commodity library;
based on the target attribute of the second sample commodity, the commodity information texts in the information base and the associated commodity information texts of the second sample commodity, fine-tuning the parameters of the deeply pre-trained BERT model;
and acquiring the inference model according to the fine-tuned, deeply pre-trained BERT model.
In some embodiments, the BERT model comprises a first BERT model and a second BERT model;
a training module further to:
respectively carrying out deep pre-training on the parameters of the first BERT model and the parameters of the second BERT model;
respectively fine-tuning the parameters of the deeply pre-trained first BERT model and the parameters of the deeply pre-trained second BERT model;
selecting the model with the optimal performance from the fine-tuned, deeply pre-trained first BERT model and the fine-tuned, deeply pre-trained second BERT model;
and acquiring the inference model according to the model with the optimal model performance.
In some embodiments, the inference module is specifically configured to:
inputting the input information into the reasoning model to obtain the association degree between the target commodity and each commodity information text; and taking the commodity information text with the greatest association degree as the associated commodity information text of the target commodity.
In some embodiments, the processing module is specifically configured to:
performing feature extraction on the target attribute of the target commodity and on each commodity information text in the information base, respectively, to obtain a first feature vector of the target commodity and second feature vectors of the commodity information texts; determining whether the length of each of the first feature vector and the second feature vectors is greater than the maximum input text length of the inference model; shaping any feature vector whose length is greater than the maximum input text length, wherein the length of the shaped feature vector is less than or equal to the maximum input text length; and acquiring the input information of the inference model according to the shaped first feature vector and the shaped second feature vectors.
In some embodiments, the processing module is further configured to:
preprocessing the target attribute of the target commodity and the information text of each commodity; the preprocessing comprises word coding, position coding and sentence coding; and acquiring the input information according to the target attribute of the preprocessed target commodity and the information text of each preprocessed commodity.
In some embodiments, the associated commodity information text of the target commodity includes one or more combinations of commodity categories, commodity applicable scenes, commodity applicable objects and commodity key attributes.
Fig. 7 illustrates a physical structure diagram of an electronic device. As shown in fig. 7, the electronic device may include: a processor (processor) 701, a communication interface (Communications Interface) 702, a memory (memory) 703 and a communication bus 704, wherein the processor 701, the communication interface 702 and the memory 703 communicate with each other via the communication bus 704. The processor 701 may invoke logic instructions in the memory 703 to perform the commodity information inference method, which comprises: acquiring input information of a reasoning model according to the target attribute of the target commodity and each commodity information text in the information base; inputting the input information into the inference model to obtain the associated commodity information text of the target commodity output by the inference model; the associated commodity information text is a commodity information text associated with the target commodity in the information base; the inference model is obtained by training a pre-training language representation BERT model based on target attributes of sample commodities in a commodity library, commodity information texts in the information library and associated commodity information texts of the sample commodities.
In addition, the logic instructions in the memory 703 may be implemented in the form of software functional units and, when sold or used as a stand-alone product, stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
In another aspect, the present invention also provides a computer program product. The computer program product includes a computer program that can be stored on a non-transitory computer-readable storage medium; when the computer program is executed by a processor, the computer can perform the commodity information inference method provided by the above methods, the method including: acquiring input information of an inference model according to the target attribute of a target commodity and each commodity information text in an information base; inputting the input information into the inference model to obtain an associated commodity information text of the target commodity output by the inference model; the associated commodity information text is a commodity information text associated with the target commodity in the information base; and the inference model is obtained by training a pre-training language representation BERT model based on target attributes of sample commodities in a commodity library, commodity information texts in the information base and associated commodity information texts of the sample commodities.
In yet another aspect, the present invention also provides a non-transitory computer-readable storage medium having a computer program stored thereon; when executed by a processor, the computer program implements the commodity information inference method provided by the above methods, the method including: acquiring input information of an inference model according to the target attribute of a target commodity and each commodity information text in an information base; inputting the input information into the inference model to obtain an associated commodity information text of the target commodity output by the inference model; the associated commodity information text is a commodity information text associated with the target commodity in the information base; and the inference model is obtained by training a pre-training language representation BERT model based on target attributes of sample commodities in a commodity library, commodity information texts in the information base and associated commodity information texts of the sample commodities.
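As a supplementary illustration of the inference flow restated above, the following minimal sketch pairs the target attribute with every commodity information text, scores the association degree with a fine-tuned BERT classifier, and returns the text with the maximum association degree; the transformers library, the checkpoint path and the two-class head are assumptions for illustration only.

    import torch
    from transformers import BertTokenizer, BertForSequenceClassification

    tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
    # Hypothetical path to a model trained as described in this disclosure.
    model = BertForSequenceClassification.from_pretrained(
        "path/to/fine-tuned-inference-model", num_labels=2)
    model.eval()

    def associated_commodity_text(target_attr, info_texts):
        scores = []
        with torch.no_grad():
            for text in info_texts:
                enc = tokenizer(target_attr, text, truncation=True,
                                max_length=512, return_tensors="pt")
                logits = model(**enc).logits
                # Probability of the "associated" class as the association degree.
                scores.append(torch.softmax(logits, dim=-1)[0, 1].item())
        # The text with the maximum association degree is the associated text.
        return info_texts[scores.index(max(scores))]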
The above-described embodiments of the apparatus are merely illustrative; the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, that is, they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. Those of ordinary skill in the art can understand and implement the embodiments without creative effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, or by hardware alone. Based on this understanding, the above technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as a ROM/RAM, a magnetic disk or an optical disk, and which includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute the methods described in the embodiments or in some parts of the embodiments.
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solution of the present invention, not to limit it. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be equivalently replaced, and that such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A commodity information inference method is characterized by comprising the following steps:
acquiring input information of an inference model according to the target attribute of a target commodity and each commodity information text in an information base;
inputting the input information into the inference model to obtain a related commodity information text of the target commodity output by the inference model; the associated commodity information text is a commodity information text associated with the target commodity in the information base;
the inference model is obtained by training a pre-training language representation BERT model based on target attributes of sample commodities in a commodity library, commodity information texts in the information base and associated commodity information texts of the sample commodities.
2. The commodity information inference method according to claim 1, wherein the training of the inference model specifically comprises:
extracting a first number of first sample commodities from the commodity library;
deeply pre-training the parameters of the BERT model according to the target attribute of the first sample commodity, the commodity information texts in the information base, the associated commodity information texts of the first sample commodity, a first training strategy and a second training strategy; wherein the first training strategy is determined based on a Masked Language Model (MLM) task, and the second training strategy is determined based on a Next Sentence Prediction (NSP) task;
extracting a second number of second sample commodities from the commodity library;
fine-tuning the parameters of the deeply pre-trained BERT model based on the target attribute of the second sample commodity, the commodity information texts in the information base and the associated commodity information texts of the second sample commodity;
and acquiring the inference model according to the fine-tuned, deeply pre-trained BERT model.
3. The commodity information inference method of claim 2, wherein the BERT model includes a first BERT model and a second BERT model;
the deep pre-training of parameters of the BERT model comprises:
respectively carrying out deep pre-training on the parameters of the first BERT model and the parameters of the second BERT model;
the fine-tuning of the parameters of the deeply pre-trained BERT model comprises:
respectively fine-tuning the parameters of the deeply pre-trained first BERT model and the parameters of the deeply pre-trained second BERT model;
the obtaining of the inference model according to the fine-tuned, deeply pre-trained BERT model comprises:
selecting the model with optimal performance from the fine-tuned, deeply pre-trained first BERT model and the fine-tuned, deeply pre-trained second BERT model;
and acquiring the inference model according to the model with optimal performance.
4. The commodity information inference method according to any one of claims 1 to 3, wherein the inputting the input information into the inference model to obtain the associated commodity information text of the target commodity output by the inference model comprises:
inputting the input information into the inference model to obtain the association degree between the target commodity and each commodity information text;
and taking the commodity information text with the maximum association degree as the associated commodity information text of the target commodity.
5. The commodity information inference method according to any one of claims 1 to 3, wherein the obtaining input information of the inference model according to the target attribute of the target commodity and each commodity information text in the information base comprises:
respectively performing feature extraction on the target attribute of the target commodity and each commodity information text in the information base to obtain a first feature vector of the target commodity and a second feature vector of each commodity information text;
determining whether the length of each of the first feature vector and the second feature vector is greater than the maximum input text length of the inference model;
shaping any feature vector whose length is greater than the maximum input text length; wherein the length of the shaped feature vector is less than or equal to the maximum input text length;
and acquiring the input information of the inference model according to the shaped first feature vector and the shaped second feature vector.
6. The commodity information inference method according to any one of claims 1 to 3, wherein the obtaining input information of the inference model according to the target attribute of the target commodity and each commodity information text in the information base comprises:
preprocessing the target attribute of the target commodity and each commodity information text; wherein the preprocessing comprises word coding, position coding and sentence coding;
and acquiring the input information according to the preprocessed target attribute and each preprocessed commodity information text.
7. The commodity information inference method according to any one of claims 1 to 3, wherein the associated commodity information text of the target commodity includes one of, or a combination of, a commodity category, a commodity applicable scene, a commodity applicable object and a commodity key attribute.
8. A commodity information inference apparatus characterized by comprising:
a processing module for acquiring input information of an inference model according to the target attribute of a target commodity and each commodity information text in an information base;
an inference module for inputting the input information into the inference model to obtain an associated commodity information text of the target commodity output by the inference model; the associated commodity information text is a commodity information text associated with the target commodity in the information base;
wherein the inference model is obtained by training a pre-training language representation BERT model based on target attributes of sample commodities in a commodity library, commodity information texts in the information base and associated commodity information texts of the sample commodities.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, implements the commodity information inference method according to any one of claims 1 to 7.
10. A non-transitory computer-readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the commodity information inference method according to any one of claims 1 to 7.
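To make the two-stage training recited in claims 2 and 3 concrete, the following is a minimal sketch of one deep pre-training step that combines the MLM task (first training strategy) and the NSP task (second training strategy) on a pair formed by a sample commodity's target attribute and its associated commodity information text; the transformers library, the "bert-base-chinese" checkpoint and the hyperparameters are illustrative assumptions, not part of the claims.

    import torch
    from transformers import BertTokenizer, BertForPreTraining

    tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
    model = BertForPreTraining.from_pretrained("bert-base-chinese")
    optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

    # Sentence A = target attribute of a sample commodity,
    # sentence B = its associated commodity information text.
    enc = tokenizer("sample commodity target attribute",
                    "associated information text", return_tensors="pt")

    # MLM task: mask one token and predict it; -100 marks ignored positions.
    mlm_labels = torch.full_like(enc["input_ids"], -100)
    mlm_labels[0, 2] = enc["input_ids"][0, 2]        # predict only position 2
    enc["input_ids"][0, 2] = tokenizer.mask_token_id

    # NSP task: label 0 means sentence B truly follows (is associated with) A.
    outputs = model(**enc, labels=mlm_labels,
                    next_sentence_label=torch.tensor([0]))
    outputs.loss.backward()  # combined MLM + NSP loss
    optimizer.step()

Fine-tuning in the second stage would load the pre-trained weights into a sequence-classification head and continue training on the second number of sample commodities; the selection step of claim 3 then keeps whichever of the two fine-tuned BERT variants performs better on held-out data.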
CN202210946464.6A 2022-08-08 2022-08-08 Commodity information reasoning method and device Pending CN115374845A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210946464.6A CN115374845A (en) 2022-08-08 2022-08-08 Commodity information reasoning method and device

Publications (1)

Publication Number Publication Date
CN115374845A true CN115374845A (en) 2022-11-22

Family

ID=84063375

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210946464.6A Pending CN115374845A (en) 2022-08-08 2022-08-08 Commodity information reasoning method and device

Country Status (1)

Country Link
CN (1) CN115374845A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116821489A (en) * 2023-06-21 2023-09-29 易方达基金管理有限公司 Stock screening method and system
CN116882412A (en) * 2023-06-29 2023-10-13 易方达基金管理有限公司 Semantic reasoning method and system based on NLP classification
CN116957140A (en) * 2023-06-29 2023-10-27 易方达基金管理有限公司 Stock prediction method and system based on NLP (non-linear point) factors

Similar Documents

Publication Publication Date Title
CN106599226B (en) Content recommendation method and content recommendation system
CN109376222B (en) Question-answer matching degree calculation method, question-answer automatic matching method and device
CN110069709B (en) Intention recognition method, device, computer readable medium and electronic equipment
CN115374845A (en) Commodity information reasoning method and device
CN111737474A (en) Method and device for training business model and determining text classification category
CN113095415B (en) Cross-modal hashing method and system based on multi-modal attention mechanism
CN109214006B (en) Natural language reasoning method for image enhanced hierarchical semantic representation
US11537950B2 (en) Utilizing a joint-learning self-distillation framework for improving text sequential labeling machine-learning models
CN110750640A (en) Text data classification method and device based on neural network model and storage medium
CN112052684A (en) Named entity identification method, device, equipment and storage medium for power metering
CN108959474B (en) Entity relation extraction method
CN110879938A (en) Text emotion classification method, device, equipment and storage medium
CN111966812A (en) Automatic question answering method based on dynamic word vector and storage medium
CN111581923A (en) Method, device and equipment for generating file and computer readable storage medium
CN114926835A (en) Text generation method and device, and model training method and device
CN109614611B (en) Emotion analysis method for fusion generation of non-antagonistic network and convolutional neural network
Drissi et al. Program language translation using a grammar-driven tree-to-tree model
CN113392209A (en) Text clustering method based on artificial intelligence, related equipment and storage medium
CN111241232A (en) Business service processing method and device, service platform and storage medium
CN110969023B (en) Text similarity determination method and device
CN113011191A (en) Knowledge joint extraction model training method
CN114372475A (en) Network public opinion emotion analysis method and system based on RoBERTA model
CN110674642A (en) Semantic relation extraction method for noisy sparse text
CN112182167B (en) Text matching method and device, terminal equipment and storage medium
CN110852066A (en) Multi-language entity relation extraction method and system based on confrontation training mechanism

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination