WO2024022066A1 - Copywriting generation method, device and storage medium - Google Patents
Copywriting generation method, device and storage medium
- Publication number
- WO2024022066A1 (PCT/CN2023/105876)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- vector
- information
- sample
- item
- semantic representation
- Prior art date
Classifications
- G06F40/258 — Electric digital data processing; Handling natural language data; Natural language analysis; Heading extraction; Automatic titling; Numbering
- G06F40/30 — Electric digital data processing; Handling natural language data; Semantic analysis
- G06N3/04 — Computing arrangements based on biological models; Neural networks; Architecture, e.g. interconnection topology
- G06N3/048 — Neural networks; Activation functions
- G06N3/0499 — Neural networks; Feedforward networks
- G06N3/08 — Neural networks; Learning methods
- G06Q30/06 — Commerce; Buying, selling or leasing transactions
Definitions
- the present application relates to the field of data processing technology, and in particular to a copywriting generation method, device and storage medium.
- Item description copy plays an important role in e-commerce systems. Compared with merely presenting item titles, carefully written item description copy can better improve the user experience and save users from reading through dense and lengthy item details.
- the copywriting generation technology used in the current art mainly extracts relevant information manually from external databases, and lacks the ability to extract knowledge useful for copywriting from the item details that contain all item information. This approach is not only inefficient, but the generated copy also fails to accurately reflect real item properties.
- Embodiments of the present application provide a copywriting generation method, device and storage medium.
- the embodiment of this application provides a method for generating copywriting, including:
- receive the search information sent by the client, and respond to the search information to obtain the corresponding item title information, item feature information and item detail information;
- the item detail information is a collection of multiple item information fragments obtained by processing a variety of related information;
- a first preset latent variable model is used to process the calculated first semantic representation vector to obtain a first target vector; the first preset latent variable model is obtained by training a first pair of prior and posterior distributions based on sample item related information; the first semantic representation vector is calculated from the item title information and the item feature information;
- a second preset latent variable model is used to process the calculated target latent vector, the first target vector and a second semantic representation vector calculated from the item detail information, to obtain a second target vector;
- the second preset latent variable model is obtained by training a second pair of prior and posterior distributions using the first vector obtained in each training iteration of the first preset latent variable model together with the sample item related information;
- the target latent vector is calculated by combining the first target vector with the item detail information;
- combining the second target vector and the first semantic representation vector, the target copy is determined through a preset dialogue system model and sent to the client.
- Embodiments of this application also provide a copywriting generation device, including:
- the receiving response module is configured to receive the search information sent by the client, and respond to the search information to obtain the corresponding item title information, item feature information and item detail information; the item detail information is a collection of multiple item information fragments obtained by processing a variety of related information;
- the processing module is configured to use a first preset latent variable model to process the calculated first semantic representation vector to obtain a first target vector; the first preset latent variable model is obtained by training a first pair of prior and posterior distributions based on the sample item related information; the first semantic representation vector is calculated from the item title information and the item feature information;
- the processing module is configured to use a second preset latent variable model to process the calculated target latent vector, the first target vector, and the second semantic representation vector calculated from the item detail information to obtain the second target vector;
- the second preset latent variable model is obtained by training a second pair of prior and posterior distributions using the first vector obtained in each training iteration of the first preset latent variable model together with the sample item related information;
- the target latent vector is calculated by combining the first target vector with the item detail information;
- the copy determination module is configured to combine the second target vector and the first semantic representation vector, determine the target copy through a preset dialogue system model, and send it to the client.
- An embodiment of the present application also provides a copywriting generating device, which includes a memory and a processor.
- the memory stores a computer program that can be run on the processor.
- when the processor executes the program, the steps in the above method are implemented.
- Embodiments of the present application also provide a computer-readable storage medium on which a computer program is stored. When the computer program is executed by a processor, the steps in the above method are implemented.
- Figure 1 is an optional flow diagram of a copywriting generation method provided by an embodiment of the present application
- Figure 2 is an optional schematic diagram of the effect of the copywriting generation method provided by the embodiment of the present application.
- Figure 3 is an optional flow diagram of the copywriting generation method provided by the embodiment of the present application.
- Figure 4 is an optional flow diagram of the copywriting generation method provided by the embodiment of the present application.
- Figure 5 is an optional flow diagram of the copywriting generation method provided by the embodiment of the present application.
- Figure 6 is an optional flow diagram of the copywriting generation method provided by the embodiment of the present application.
- Figure 7 is an optional flow diagram of the copywriting generation method provided by the embodiment of the present application.
- Figure 8 is an optional flow diagram of the copywriting generation method provided by the embodiment of the present application.
- Figure 9 is a schematic structural diagram of a copywriting generation device provided by an embodiment of the present application.
- Figure 10 is a schematic diagram of a hardware entity of a copywriting generation device provided by an embodiment of the present application.
- The terms "first", "second" and "third" are only used to distinguish similar objects and do not denote a specific ordering of objects. It can be understood that "first", "second" and "third" may be interchanged in specific order or sequence where permitted, so that the embodiments of the present application described here can be performed in an order other than that shown or described here.
- The automatic generation technology for item descriptions is a technology for automatically generating item description copy based on the basic information of a given item, such as the item title, item characteristics and item detail introduction.
- Existing automatic item copy generation technology mainly relies on item title information and item feature information, which are input into an end-to-end generation model to obtain the final item description copy. This approach is limited by the information fed into the deep learning model.
- the existing technology mainly inputs streamlined information such as item title information and item feature information into the deep learning model.
- the copywriting generated in this way is generic and lacks the specific characteristics of specific products. Users cannot obtain specific and unique product features, and the recommendation effect is poor.
- The knowledge enhancement technology used in another solution mainly extracts relevant information manually from external databases. It lacks the ability to extract knowledge useful for copy generation from the item details that contain all item information. Manual knowledge extraction requires a lot of manpower, is complicated, and is not always related to the specific item; moreover, an external knowledge base cannot provide relevant information for every item, for example relatively novel items or very unpopular items. Therefore, in the related art, the generation scheme is not only inefficient, but the generated copy also cannot accurately reflect the characteristics of the real item.
- the embodiment of the present application provides a copywriting generation method. Please refer to Figure 1 , which is an optional flow diagram of the copywriting generation method provided by the embodiment of the present application, which will be described in conjunction with the steps shown in Figure 1 .
- S101 Receive search information sent by the client, and respond to the search information to obtain corresponding item title information, item feature information, and item detail information; item detail information is a collection of multiple item information fragments obtained by processing multiple related information.
- the server receives the search information sent by the client, and responds to the search information to obtain corresponding item title information, item feature information, and item detail information.
- the item detailed information is a collection of multiple item information fragments obtained by processing multiple related information.
- the server receives the item information keyword sent by the client, responds to the item information keyword, and retrieves item title information, item feature information, and various related information corresponding to the item information keyword in the local database.
- the server then performs denoising, filtering and classification processing on a variety of related information to obtain detailed item information.
- the server receives the item encoding information sent by the client, responds to the item encoding information, and retrieves the item title information, item feature information, and various related information corresponding to the item encoding information in the local database. The server then performs denoising, filtering and classification processing on the various related information to obtain the item detail information.
- various related information may include: stored item advertisement information in text form, item model information in digital form, specific usage methods of items described in text, and item evaluation information.
- Servers can use heuristics to filter noise across a variety of relevant information.
- the server can use heuristic rules such as stop words to divide the various related information after noise filtering into keyword fragments.
- the server obtains the keyword fragment vector of each keyword fragment through the preset semantic model, and then obtains multiple keyword fragment vectors.
- the server clusters multiple keyword fragment vectors to obtain multiple clusters.
- the server combines the keyword fragments corresponding to the keyword fragment vectors in each cluster to obtain an item information fragment. Then, item detailed information including multiple item information fragments is obtained.
- the search information received by the server may be "high-definition network set-top box".
- based on "HD network set-top box", the server locally retrieves the item title information "HD network set-top box" and the item feature information "computer, office network box, box, HD, set-top box and wireless".
- a variety of related information can include: “Mobile phone screen projection, small screen becomes large screen. Just tap on the phone to cast pictures and videos to the TV. The field of view will be wider and the viewing will be more shocking. Voice control, smart and obedient support For video on demand, channel switching, volume adjustment, etc., you can give voice commands on your mobile phone and it will understand your words.” "I want to watch an action movie. What will the weather be like in Beijing tomorrow? Let the sweeping robot go sweep the floor.” "The box contains a switch. The advertising video at startup cannot be deleted or changed, and the advertising video of third-party content cannot be controlled.”
- the server can filter and classify a variety of related information to obtain item detailed description information.
- S102 Use the first preset latent variable model to process the calculated first semantic representation vector to obtain the first target vector; the first preset latent variable model is obtained by training the first pair of prior and posterior distributions based on the sample item related information; the first semantic representation vector is calculated from the item title information and item feature information.
- the server uses the first preset latent variable model to process the calculated first semantic representation vector to obtain the first target vector.
- the first preset latent variable model is obtained by training the first pair of prior and posterior distributions based on the sample item title information, sample item feature information, sample item detail information and sample item description copy information.
- the first semantic representation vector is calculated from the item title information and item feature information.
- Information related to sample items includes: sample item title information, sample item feature information, sample item detail information, and sample item description copy information.
- the server can use the first preset latent variable model to process the calculated first semantic representation vector, and determine the first target vector from the Gaussian distribution vector corresponding to the prior distribution in the first preset latent variable model.
- the server can combine the item title information and the item feature information to obtain the basic information of the item.
- the server inputs the basic information of the item into the deep learning model encoder to obtain the first intermediate semantic representation vector.
- the server performs average pooling on the first intermediate semantic representation vector to obtain the first semantic representation vector.
- the server calculates a plurality of first correlation vectors using the sample item title information, sample item feature information, similar sample information fragments (the similar sample information fragments are the fragments, among the multiple sample information fragments, with the greatest similarity to the sample item description copy information) and the sample item description copy information, and determines a first posterior parameter of the first posterior distribution through the plurality of first correlation vectors.
- the first vector is determined through the first posterior parameter to iteratively train the first posterior distribution and the first prior distribution until the training condition is reached, and the first preset latent variable model including the trained first posterior distribution and first prior distribution is obtained.
- S103 Use the second preset latent variable model to process the calculated target latent vector, the first target vector and the second semantic representation vector calculated from the item detail information to obtain the second target vector;
- the second preset latent variable model is obtained by training the second pair of prior and posterior distributions using the first vector obtained in each training iteration of the first preset latent variable model together with the sample item related information;
- the target latent vector is calculated by combining the first target vector with the item detail information.
- the server uses the second preset latent variable model to process the calculated target latent vector, the first target vector and the second semantic representation vector calculated through the item details information to obtain the second target vector, where,
- the second preset latent variable model is obtained by training the second pair of prior distribution and posterior distribution by combining the first vector obtained by each training of the first preset latent variable model and the sample item detailed information in the sample item related information;
- the target latent vector is calculated by combining the first target vector with the item detail information.
- the server uses the second preset latent variable model to process the target latent vector, the first target vector and the second semantic representation vector, and determines the second target vector from the categorical distribution vector corresponding to the prior distribution in the second preset latent variable model.
- the server will obtain a first vector after each training of the first preset latent variable model.
- the server combines the first vector and the sample item detail information to train the second pair of prior and posterior distributions to obtain the second preset latent variable model.
- the server combines the second target vector and the first semantic representation vector, determines the target copy through the preset dialogue system model, and sends it to the client.
- the server uses the second target vector and the first semantic representation vector to calculate the target representation recognition vector.
- the server processes the target representation recognition vector through the preset perceptual neural network model to obtain the target serial number; determines the corresponding target item information fragment among the multiple item information fragments through the target serial number; and processes the target item information fragment through the last layer of the deep learning model encoder to obtain the target item fragment vector.
- the server inputs the target item fragment vector, the first target vector and the first intermediate semantic representation vector into the preset dialogue system model to obtain the target copy.
- the preset perceptual neural network model is obtained during the training process of the first preset latent variable model and the second preset latent variable model.
- the final target copy obtained by the server can be: "HD network set-top box, supports voice control. This TV box supports voice control, and can realize video on demand, channel switching, volume adjustment and other functions through voice; the voice control function brings a high-quality audio-visual experience."
- The search information sent by the client is received, and the corresponding item title information, item feature information and item detail information are obtained in response to the search information; the item detail information is a collection of multiple item information fragments obtained by processing a variety of related information. The first preset latent variable model is used to process the calculated first semantic representation vector to obtain the first target vector; the first preset latent variable model is obtained by training the first pair of prior and posterior distributions based on the sample item related information; the first semantic representation vector is calculated from the item title information and item feature information. The second preset latent variable model is used to process the calculated target latent vector, the first target vector and the second semantic representation vector calculated from the item detail information to obtain the second target vector; the second preset latent variable model is obtained by training the second pair of prior and posterior distributions using the first vector obtained in each training iteration of the first preset latent variable model together with the sample item detail information; the target latent vector is calculated by combining the first target vector with the item detail information. The second target vector and the first semantic representation vector are combined, and the target copy is determined through the preset dialogue system model and sent to the client.
- Since the item detail information in this scheme is composed of a variety of related information, and this scheme uses a pair of interactive preset latent variable models to learn two pairs of prior and posterior distributions, useful knowledge can be automatically selected from the dense and lengthy item detail information to form the target copy. Therefore, this solution can improve the efficiency of generating copy, and also improves the accuracy with which the copy reflects the true characteristics of the item.
- Figure 3 is an optional flow diagram of the copywriting generation method provided by the embodiment of the present application. S102 shown in Figure 1 may also be implemented through S105 to S110, which will be described in conjunction with each step.
- the server obtains the sample item title information, sample item feature information, sample item detail information and sample item description copy information, and determines, in the sample item detail information, the similar sample information fragment that is most similar to the sample item description copy information.
- the server can receive sample search information sent by any client within the historical time.
- the server obtains sample item title information, sample item feature information, sample item detail information, and sample item description copy information.
- the sample item detailed information includes: multiple sample information fragments.
- the server calculates the similarity between the read sample information fragment and the sample item description copy information, and the server determines a similar sample information fragment among multiple sample information fragments that has the greatest similarity with the sample item description copy information.
- the sample item description copy information is pre-written copy information corresponding to the sample search information.
- S106 Calculate multiple first correlation vectors using the sample item title information, sample item feature information, similar sample information fragments and sample item description copy information, and determine the first posterior parameter of the first posterior distribution through the multiple first correlation vectors.
- the server uses the sample item title information, sample item feature information, similar sample information fragments and sample item description copy information to calculate multiple first correlation vectors, and determines the first posterior parameter of the first posterior distribution through the multiple first correlation vectors.
- the server combines the sample item title information and the sample item feature information to calculate the first sample semantic representation vector.
- the server determines the basic semantic representation vector and the similar semantic representation vector through similar sample information fragments.
- the server uses the sample item description copy information to determine the basic semantic representation vector of the copy and the semantic representation vector of the copy.
- the server determines the first related latent vector and the second related latent vector through the basic semantic representation vector and the copywriting basic semantic representation vector.
- the plurality of first correlation vectors include: a first sample semantic representation vector, a similar semantic representation vector, a copywriting similarity semantic representation vector, a first correlation latent vector and a second correlation latent vector.
- the server inputs the first sample semantic representation vector, the similar semantic representation vector, the copywriting similar semantic representation vector, the first related latent vector and the second related latent vector into a layer of perceptual neural network model to obtain the first posterior parameters.
- S107 Determine the first vector through the first posterior parameter to iteratively train the first posterior distribution and the first prior distribution until the training condition is reached, and obtain the first preset latent variable model including the trained first posterior distribution and first prior distribution.
- the server determines the first vector through the first posterior parameter to iteratively train the first posterior distribution and the first prior distribution until the training condition is reached, and obtains the first preset latent variable model including the trained first posterior distribution and first prior distribution.
- reaching the training condition may include: the fitting function of the first posterior distribution and the first prior distribution reaches convergence.
- the server may randomly determine the first vector among the Gaussian distribution vectors determined by the first posterior parameter.
- the server calculates the training latent vector by combining the first vector with the detailed information of the sample item.
- the server calculates the first sample final representation vector and the second sample final representation vector through the multiple sample information fragments included in the sample item detail information. After the server multiplies the first vector, the second sample final representation vector and the preset parameters, the intermediate training latent vector is obtained through activation function processing. The server multiplies the intermediate training latent vector with the first sample final representation vector to obtain the training latent vector.
- S109 Determine the second posterior parameters of the second posterior distribution through the training latent vector, the first vector and multiple second correlation vectors; the multiple second correlation vectors are determined through the sample item title information, sample item feature information and sample item detail information.
- the server determines the second posterior parameters of the second posterior distribution through the training latent vector, the first vector and the multiple second correlation vectors.
- the plurality of second correlation vectors are determined through the sample item title information, sample item feature information and sample item detail information.
- the server inputs the training latent vector, the first vector and a plurality of second correlation vectors into a layer of perceptual neural network model to obtain the second posterior parameters.
- the server determines the second vector through the second posterior parameter to iteratively train the second posterior distribution and the second prior distribution, stopping when the training condition is reached, and obtains the second preset latent variable model including the trained second posterior distribution and second prior distribution.
- the server may determine the second vector by determining the classification distribution vector based on the second posterior parameter.
- reaching the training condition may include: convergence of the fitting functions of the second posterior distribution and the second prior distribution.
- This scheme uses a pair of interactive preset latent variable models to learn two pairs of prior and posterior distributions, automatically selects useful knowledge from the dense and lengthy item detail information, and then forms the target copy, thus improving the accuracy with which the copy reflects the characteristics of the item.
- Figure 4 is an optional flow diagram of the copywriting generation method provided by the embodiment of the present application. S105 to S107 shown in Figure 3 can also be implemented through S111 to S121, which will be described in conjunction with each step.
- S111 Obtain sample item title information, sample item feature information, sample item detail information, and sample item description copy information, and use a preset semantic model to process multiple sample information fragments to obtain multiple sample information semantic vectors.
- the server obtains sample item title information, sample item feature information, sample item detail information, and sample item description copy information, and uses a preset semantic model to process multiple sample information fragments to obtain multiple sample information semantic vectors.
- the sample item detailed information includes: multiple sample information fragments.
- the server can preprocess various sample-related information and divide the sample item details into multiple sample keyword fragments.
- the server uses heuristic rules such as stop words to divide the preprocessed sample item detail information into different fragments and retains only fragments with a length of 10 to 64 words, obtaining a series of sub-fragments KF. Through this method, fragments that are not useful for model generation, such as instructions for use, product parameters and other extraneous information, are filtered out.
- K_total is the sample item detail information after preprocessing, K_fragi is the i-th sample keyword fragment, and m is the number of sample keyword fragments.
- the server can obtain the representation vector of each sample keyword fragment through preset algorithm processing.
- the server uses Sentence-Bert [8] to obtain the representation vector E_fragi of each sample keyword fragment K_fragi ∈ KF.
- the server can use the K-means [9] algorithm to process multiple representation vectors and cluster semantically similar ones into the same group, thus obtaining the item knowledge group KP.
- K_i is the i-th sample information fragment, and KP is the set of multiple sample information fragments.
- the server processes multiple sample information fragments through a preset semantic model to obtain multiple sample information semantic vectors.
- the preset semantic model can be the Sentence-Bert model.
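- As a rough illustration of this fragment grouping step (not the patent's own code), the following sketch assumes the sentence-transformers and scikit-learn packages; the model name and the cluster count are placeholder choices.

```python
# Illustrative sketch of the fragment grouping step (not the patent's implementation).
# Assumes: sentence-transformers and scikit-learn are installed; the model name
# and the number of clusters are arbitrary placeholder choices.
from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans

def build_information_fragments(keyword_fragments, n_clusters=8):
    """Embed keyword fragments with a Sentence-BERT model, cluster them with
    k-means, and merge the fragments of each cluster into one information fragment."""
    encoder = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")  # placeholder model
    embeddings = encoder.encode(keyword_fragments)                 # one vector per keyword fragment
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(embeddings)

    groups = {}
    for fragment, label in zip(keyword_fragments, labels):
        groups.setdefault(label, []).append(fragment)
    # Each cluster becomes one sample information fragment K_i; KP is the set of them.
    return [" ".join(frags) for frags in groups.values()]
```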
- S112. Use the preset semantic model to process the sample item description copy information to obtain the sample item description semantic vector.
- the server uses a preset semantic model to process the sample item description copy information to obtain a sample item description semantic vector.
- the server uses the Sentence-Bert model to process the sample item description copy information to obtain the sample item description semantic vector.
- the preset semantic model can also be other semantic models, which is not limited in the embodiment of the present application.
- the server performs similarity calculation on each sample information semantic vector and the sample item description semantic vector, and obtains multiple similarities corresponding to multiple sample information semantic vectors.
- the server can perform cosine similarity calculation on each sample information semantic vector and the sample item description semantic vector to obtain multiple similarities corresponding to multiple sample information semantic vectors.
- the server determines that the sample information fragment corresponding to the maximum similarity is a similar sample information fragment.
- the server uses the Sentence-Bert model to obtain the semantic vector of each knowledge fragment K_i ∈ KP and of the item description copy information, represented as R_Ki and R_D respectively.
- the cosine similarity cos(R_Ki, R_D) between R_Ki and R_D is calculated, and the fragment with the greatest similarity is used as the pseudo-labeled similar sample information fragment K_pse for the subsequent model training process.
- K_pse can be calculated through formula (1).
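- A minimal sketch of the pseudo-label selection of formula (1), assuming plain NumPy vectors for R_Ki and R_D; the argmax-over-cosine form follows the description above.

```python
import numpy as np

def select_pseudo_label(fragment_vectors, copy_vector):
    """Pick the sample information fragment whose vector R_Ki has the highest
    cosine similarity with the copy vector R_D, as in formula (1)."""
    def cosine(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
    similarities = [cosine(r_ki, copy_vector) for r_ki in fragment_vectors]
    return int(np.argmax(similarities))   # index of the similar sample information fragment K_pse
```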
- the server combines the sample item title information and the sample item feature information to calculate the first sample semantic representation vector.
- the server collects the sample keywords in the sample title information and the sample item feature information to obtain basic information of the sample item.
- the server obtains the first sample semantic representation vector through encoder and pooling processing on the basic information of the sample items.
- S115 can also be implemented through S1151 to S1153, which will be described in combination with each step.
- the server combines multiple sample keywords in the sample item title information and the sample item feature information to obtain basic information of the sample item.
- S1152. Process the basic information of the sample item through the deep learning model encoder to obtain the first sample intermediate semantic representation vector.
- the server processes the basic information of the sample item through the last layer of the deep learning model encoder to obtain the first sample intermediate semantic representation vector.
- S1153. Perform average pooling on the first sample intermediate semantic representation vector to obtain the first sample semantic representation vector.
- the server performs average pooling on the first sample intermediate semantic representation vector to obtain the first sample semantic representation vector.
- the server training data construction module adopts the native Transformer encoder network structure.
- P = {T; a_1; a_2; …; a_A}
- a_1 represents the first sample keyword in the sample item feature information; {T; a_1; a_2; …; a_A} denotes concatenating the sequence, that is, directly joining the texts.
- the server inputs the sample item basic information P into the first layer of the deep learning model encoder, and obtains the first sample intermediate semantic representation vector E_P, which is the sum of the word vector (word embedding) WE and the position vector (position embedding) PE.
- E_P can be calculated through formula (2).
- E_P = WE(P) + PE(P)  (2)
- the server passes the first sample intermediate semantic representation vector through the multi-layer Transformer encoder, and performs an average pooling operation on the output of the last layer of the network to obtain the first sample semantic representation vector H_P.
- H_P can be calculated through formula (3).
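- An illustrative PyTorch sketch of formulas (2) and (3): word plus position embeddings, a Transformer encoder, and average pooling over the last layer. All dimensions, layer counts and the use of nn.TransformerEncoder building blocks are assumptions, not the architecture specified by the patent.

```python
import torch
import torch.nn as nn

class BasicInfoEncoder(nn.Module):
    """Sketch of formulas (2)-(3): E_P = WE(P) + PE(P), a Transformer encoder,
    and average pooling of the last-layer output to obtain H_P.
    Dimensions and layer counts are placeholder assumptions."""
    def __init__(self, vocab_size=30000, d_model=256, max_len=512, n_layers=4):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, d_model)         # WE
        self.pos_emb = nn.Embedding(max_len, d_model)             # PE
        layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)

    def forward(self, token_ids):                                  # (batch, seq_len)
        positions = torch.arange(token_ids.size(1), device=token_ids.device)
        e_p = self.word_emb(token_ids) + self.pos_emb(positions)   # formula (2)
        hidden = self.encoder(e_p)                                  # last-layer outputs
        h_p = hidden.mean(dim=1)                                    # average pooling, formula (3)
        return hidden, h_p
```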
- the server determines the basic semantic representation vector and the similar semantic representation vector through similar sample information fragments.
- the server inputs the similar sample information fragment into the first layer of the deep learning model encoder, passes it through the multi-layer Transformer encoder, and takes the output of the last layer of the network as the basic semantic representation vector.
- the server performs average pooling on the basic semantic representation vectors to obtain similar semantic representation vectors.
- S116 can also be implemented through S1161 to S1162, which will be explained in conjunction with each step.
- the server processes the similar sample information fragment through the last layer of the deep learning model encoder to obtain the basic semantic representation vector.
- the server performs average pooling on the basic semantic representation vectors to obtain similar semantic representation vectors.
- the server uses the sample item description copy information to determine the basic semantic representation vector of the copy and the semantic representation vector of the copy.
- the server inputs the sample item description copy information into the first layer of the deep learning model encoder, obtains the copy basic semantic representation vector from the output of the last layer of the Transformer encoder network, and then performs an average pooling operation on it to obtain the copy semantic representation vector H_D.
- S117 can also be implemented through S1171 to S1172, which will be described in conjunction with each step.
- S1171. Process the sample item description copy information through the deep learning model encoder to obtain the basic semantic representation vector of the copy.
- the server processes the sample item description copy information through the last layer of the deep learning model encoder to obtain the basic semantic representation vector of the copy.
- the server processes the sample item description copy information from the first layer to the last layer of the deep learning model encoder to obtain the basic semantic representation vector of the copy.
- the server performs average pooling on the basic semantic representation vectors of the copywriting to obtain the semantic representation vector of the copywriting.
- the server determines the first related latent vector and the second related latent vector through the basic semantic representation vector and the basic semantic representation vector of the copy.
- S118 can also be implemented through S1181 to S1182, which will be explained in conjunction with each step.
- the server calculates the first related latent vector through the basic semantic representation vector and the basic semantic representation vector of the copy, combined with the first corresponding parameter.
- S1181 can also be implemented through S11811 to S11813, which will be described in conjunction with each step.
- the server obtains the first product vector of the basic semantic representation vector and the first parameter.
- the server can calculate the first product vector K_K through formula (4).
- K_K = W_K · E_K  (4)
- where W_K is the first parameter and E_K is the basic semantic representation vector.
- the server obtains the second product vector of the basic semantic representation vector of the copy and the second parameter.
- the server can calculate the second product vector Q_D through formula (5).
- Q_D = W_Q · E_D  (5)
- where W_Q is the second parameter and E_D is the copy basic semantic representation vector.
- the server multiplies the first product vector and the second product vector and then performs activation and average pooling processing to obtain the first related latent vector.
- the server can calculate the first related latent vector through formula (6)
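- One plausible, attention-style reading of formulas (4) to (6) is sketched below; the patent only states that the two product vectors are multiplied, activated and average-pooled, so the dot-product between projections, the tanh activation and the re-weighting step are assumptions. The second related latent vector of formulas (7) to (9) mirrors this computation with the roles of the two inputs swapped.

```python
import torch
import torch.nn as nn

class RelatedLatentVector(nn.Module):
    """Project fragment representations E_K with W_K (formula (4)) and copy
    representations E_D with W_Q (formula (5)), multiply, activate and
    average-pool to get the first related latent vector (formula (6))."""
    def __init__(self, d_model=256):
        super().__init__()
        self.w_k = nn.Linear(d_model, d_model, bias=False)   # W_K
        self.w_q = nn.Linear(d_model, d_model, bias=False)   # W_Q

    def forward(self, e_k, e_d):
        # e_k: (batch, len_k, d_model) fragment vectors, e_d: (batch, len_d, d_model) copy vectors
        k_k = self.w_k(e_k)                                          # first product vector
        q_d = self.w_q(e_d)                                          # second product vector
        scores = torch.tanh(torch.bmm(q_d, k_k.transpose(1, 2)))     # multiply + activation (assumed form)
        attended = torch.bmm(scores, k_k)                            # re-weight fragment projections (assumption)
        return attended.mean(dim=1)                                  # average pooling -> related latent vector
```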
- the server calculates the second related latent vector through the basic semantic representation vector and the basic semantic representation vector of the copy, combined with the second corresponding parameter.
- S1182 can also be implemented through S11821 to S11823, which will be described in conjunction with each step.
- the server obtains the third product vector of the basic semantic representation vector and the third parameter.
- the server can calculate the third product vector through formula (7)
- where W_D is the third parameter and E_D is the basic semantic representation vector.
- the server obtains the fourth product vector of the basic semantic representation vector of the copy and the fourth parameter.
- the server can calculate the fourth product vector through formula (8).
- where W_k is the fourth parameter and E_k is the copy basic semantic representation vector.
- the server multiplies the third product vector by the fourth product vector and then performs activation and average pooling processing to obtain the second related latent vector.
- the server can calculate the second related latent vector through formula (9).
- the server inputs the first sample semantic representation vector, the similar semantic representation vector, the copy similar semantic representation vector, the first related latent vector and the second related latent vector into a one-layer perceptual neural network model to obtain the first posterior mathematical expectation.
- a pair of interactive variational autoencoders are designed to learn item description latent variables and item knowledge latent variables respectively.
- For the item description latent variable learning module, it is intended to improve the diversity of the generated copy and guide the process of knowledge selection. This module learns the Gaussian distribution of the item information.
- S120 Process the first posterior mathematical expectation through the activation function to obtain the first posterior variance.
- the server processes the first posterior mathematical expectation through an activation function to obtain the first posterior variance.
- the server can concatenate H_P, H_K and H_D into H_des and input it into a one-layer perceptron neural network model (Multi-layer Perceptron, MLP) to obtain the parameters μ and σ of the posterior Gaussian distribution.
- the server randomly determines the first vector from the Gaussian distribution vector determined by the first posterior parameter, to iteratively train the first posterior distribution and the first prior distribution; training stops when the training condition is reached, and the first preset latent variable model including the trained first posterior distribution and first prior distribution is obtained.
- the server can calculate the parameters μ and σ through formula (10).
- the posterior Gaussian distribution can be described as: q_φ(z_d | T, A, K, D) = N(μ, σ²), where:
- T is the sample item title information
- A is the sample item feature information
- K is the sample item detail information
- D is the sample item description copy information.
- the server can calculate the prior parameters μ′ and σ′ through formula (11).
- the server samples z_d from the posterior distribution during the training process, and samples from the prior distribution during the inference process.
- the server determines the first posterior parameters through the first sample semantic representation vector, the similar semantic representation vector, the copy similar semantic representation vector, the first related latent vector and the second related latent vector, and then determines the first vector from these parameters to train the first preset latent variable model and the second preset latent variable model, so that the two preset latent variable models interactively learn two pairs of prior and posterior distributions. In this way, useful knowledge is automatically selected from the dense and lengthy item detail information to form the target copy, which improves the accuracy with which the copy reflects the true characteristics of the item.
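- A hedged sketch of the item-description latent variable module described above: a one-layer MLP maps the concatenated representations to the posterior Gaussian parameters (formula (10)), a second MLP gives the prior parameters (formula (11)), and z_d is sampled from the posterior during training and from the prior at inference. The reparameterization trick and the explicit Gaussian KL term are standard conditional-VAE practice and are assumptions beyond the text.

```python
import torch
import torch.nn as nn

class DescriptionLatentModule(nn.Module):
    """Sketch of the item-description latent variable (formulas (10)-(11)):
    one MLP maps H_des to the posterior parameters (mu, sigma); another MLP,
    fed only the prior-side inputs, gives (mu', sigma'). The reparameterized
    sampling and KL expression are assumptions."""
    def __init__(self, d_in_post, d_in_prior, d_latent=64):
        super().__init__()
        self.posterior_mlp = nn.Linear(d_in_post, 2 * d_latent)   # -> [mu, log sigma]
        self.prior_mlp = nn.Linear(d_in_prior, 2 * d_latent)      # -> [mu', log sigma']

    @staticmethod
    def _sample(mu, log_sigma):
        return mu + torch.randn_like(mu) * log_sigma.exp()        # reparameterized sample

    def forward(self, h_des, h_prior, training=True):
        mu, log_sigma = self.posterior_mlp(h_des).chunk(2, dim=-1)      # formula (10)
        mu_p, log_sigma_p = self.prior_mlp(h_prior).chunk(2, dim=-1)    # formula (11)
        z_d = self._sample(mu, log_sigma) if training else self._sample(mu_p, log_sigma_p)
        # Gaussian KL(posterior || prior), used to fit the two distributions during training.
        kl = (log_sigma_p - log_sigma
              + (log_sigma.exp() ** 2 + (mu - mu_p) ** 2) / (2 * log_sigma_p.exp() ** 2)
              - 0.5).sum(dim=-1)
        return z_d, kl
```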
- Figure 5 is an optional flow diagram of the copywriting generation method provided by the embodiment of the present application.
- S108 to S109 shown in Figure 3 can also be implemented through S122 to S125, which will be described in conjunction with each step.
- S122 Calculate the final representation vector of the first sample and the final representation vector of the second sample through multiple sample information fragments included in the sample item detail information.
- the server calculates the final representation vector of the first sample and the final representation vector of the second sample through multiple sample information fragments included in the sample item detail information.
- S122 can also be implemented through S1221 to S1222, which will be described in conjunction with each step.
- S1221. Process multiple sample information fragments through the deep learning model encoder to obtain multiple first intermediate sample final representation vectors corresponding to the multiple sample information fragments, and then perform average pooling processing on the multiple first intermediate sample final representation vectors. , get the final representation vector of the first sample.
- the server processes the multiple sample information fragments through the last layer of the deep learning model encoder to obtain multiple first intermediate sample final representation vectors corresponding to the multiple sample information fragments, and then performs average pooling on the multiple first intermediate sample final representation vectors to obtain the first sample final representation vector.
- where WE(K_j) is the corresponding word vector, PE(K_j) is the corresponding position vector, and SE(K_j) is the corresponding paragraph vector.
- the server can calculate the first sample final representation vector H_KP through formula (13).
- the semantic representation vector of the knowledge base is also obtained through the transformer encoder. After the average pooling operation, the final representation vector of the knowledge base is obtained.
- the server combines the multiple sample information fragments, obtains the second intermediate sample final representation vectors through the last layer of the deep learning model encoder, and then performs average pooling on the multiple second intermediate sample final representation vectors to obtain the second sample final representation vector.
- after the server multiplies the first vector, the second sample final representation vector and the preset parameter, the intermediate training latent vector is obtained through activation function processing.
- the server multiplies the intermediate calculated training latent vector and the final representation vector of the first sample to obtain the training latent vector.
- the server can calculate the training latent vector through formula (14).
- where W_d is the preset parameter, z_d is the first vector, and H_KP is the first sample final representation vector.
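- One possible reading of formula (14), treating the products as element-wise and using tanh as the activation; both choices are assumptions, since the text only states "multiply, apply an activation function, then multiply with the first sample final representation vector".

```python
import torch

def training_latent_vector(z_d, h_second, h_kp, w_d):
    """Formula (14) sketch: multiply the first vector z_d, the second sample final
    representation vector and the preset parameter W_d, apply an activation, then
    multiply with the first sample final representation vector H_KP.
    Element-wise products and tanh are assumptions."""
    intermediate = torch.tanh(w_d * z_d * h_second)   # intermediate training latent vector
    return intermediate * h_kp                        # training latent vector
```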
- the server inputs the training latent vector, the first vector, the similar semantic representation vector and the first sample final representation vector into a one-layer perceptual neural network model to obtain the second posterior parameters; the multiple second correlation vectors are determined through the sample item title information, sample item feature information and sample item detail information.
- S126 Determine a second vector from the categorical distribution vector determined through the second posterior parameter, to iteratively train the second posterior distribution and the second prior distribution until the training condition is reached, and obtain the second preset latent variable model including the trained second posterior distribution and second prior distribution.
- the server determines a second vector from the categorical distribution vector determined through the second posterior parameter to iteratively train the second posterior distribution and the second prior distribution until the training condition is reached, obtaining the second preset latent variable model including the trained second posterior distribution and second prior distribution.
- during the training phase, the server inputs the training latent vector, the first vector z_d, the first sample final representation vector H_KP and the similar semantic representation vector H_K into the MLP layer to calculate the second posterior parameter γ.
- the server can remove H_K and input the remaining vectors into another MLP layer to obtain the parameter γ′ of the corresponding prior distribution.
- the server can calculate the second posterior parameter γ through formula (15).
- where H_KP is the first sample final representation vector and H_K is the similar semantic representation vector.
- the server can calculate the second prior parameter γ′ through formula (16).
- the second posterior distribution and the second prior distribution can be described as: q_φ(z_k | z_d, K, KP) = Cat(γ), and the corresponding prior is a categorical distribution Cat(γ′).
- z k is sampled from the second posterior distribution and the second prior distribution respectively.
- the server needs to calculate the KL divergence between the second posterior distribution and the second prior distribution to fit the distance.
- the fitting formula can be formula (17).
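- A hedged sketch of the item-knowledge latent variable (formulas (15) to (17)): one MLP produces the posterior parameter γ from the training latent vector, z_d, H_KP and H_K, another produces the prior parameter γ′ without H_K, and the KL divergence between the two categorical distributions is used as the fitting term. The softmax parameterization and the exact KL expression follow standard practice and are assumptions not spelled out in the text.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class KnowledgeLatentModule(nn.Module):
    """Posterior gamma from [t, z_d, H_KP, H_K]; prior gamma' from the same inputs
    without H_K; KL(Cat(gamma) || Cat(gamma')) fits the two distributions (formula (17))."""
    def __init__(self, d_post, d_prior, n_fragments):
        super().__init__()
        self.posterior_mlp = nn.Linear(d_post, n_fragments)   # formula (15)
        self.prior_mlp = nn.Linear(d_prior, n_fragments)      # formula (16)

    def forward(self, post_input, prior_input):
        gamma = F.softmax(self.posterior_mlp(post_input), dim=-1)
        gamma_prior = F.softmax(self.prior_mlp(prior_input), dim=-1)
        kl = (gamma * (gamma.clamp_min(1e-8).log()
                       - gamma_prior.clamp_min(1e-8).log())).sum(dim=-1)
        return gamma, gamma_prior, kl
```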
- the server trains the second pair of posterior and prior distributions, and through backpropagation during neural network training and calculation, the second pair of posterior and prior distributions indirectly affects the first pair of posterior and prior distributions. This allows the two pairs of posterior and prior distributions to be trained interactively, thereby extracting accurate target copy.
- FIG. 6 is an optional flow diagram of the copy generation method provided by the embodiment of the present application.
- S101 shown in Figure 1 can also be implemented through S127 to S133, which will be explained in conjunction with each step.
- the server obtains a variety of related information, performs denoising and filtering processing on the multiple related information, and then divides the multiple related information into multiple keyword segments.
- each keyword fragment includes: multiple keywords.
- the server can use a heuristic algorithm to perform noise filtering on various related information, and then use heuristic rules such as stop words to divide it into multiple keyword fragments.
- the server uses a preset semantic model to process each keyword fragment, obtains a keyword fragment vector corresponding to each keyword fragment, and then obtains multiple keyword fragment vectors.
- the preset semantic model can be the Sentence-Bert model.
- S129 Process multiple keyword fragment vectors through a clustering algorithm to obtain multiple clusters.
- the server processes multiple keyword fragment vectors through a clustering algorithm to obtain multiple clusters.
- the clustering algorithm can be the k-means clustering algorithm (k-means) [9].
- the server combines the keyword fragments corresponding to the keyword fragment vectors in each cluster to obtain an item information fragment, and then obtains multiple item information fragments to form item detailed information.
- the server combines the keywords in the item title information and the item feature information to obtain the item basic information.
- the server combines the keywords in the item title information and the item feature information in order to obtain the basic information of the item.
- S132 Process the basic information of the item through the deep learning model encoder to obtain the first intermediate semantic representation vector.
- the server processes the basic information of the item through the last layer of the deep learning model encoder to obtain the first intermediate semantic representation vector.
- the server processes the basic information of the item through the first layer of the deep learning model encoder and then transmits it to the next layer until the last layer of processing obtains the first intermediate semantic representation vector.
- the server performs average pooling on the first intermediate semantic representation vector to obtain the first semantic representation vector.
- the server extracts item detail information from a variety of related information to calculate the first semantic representation vector. Since the sources of the multiple related information are wide, the server can expand the search scope through this solution to obtain an accurate target copy.
- Figure 7 is an optional flow diagram of the copy generation method provided by the embodiment of the present application.
- S104 shown in Figure 1 can also be implemented through S134 to S138, which will be explained in conjunction with each step.
- the server uses the second target vector and the first semantic representation vector to calculate the target representation recognition vector.
- S134 can also be implemented through S1341 to S1343.
- the server calculates the difference vector between the second target vector and the first semantic representation vector.
- the server calculates the product vector of the second target vector and the first semantic representation vector.
- the server concatenates the first semantic representation vector, the second target vector, the difference vector and the product vector in order to form the target representation vector.
- after the server obtains the item knowledge latent variables sampled from the posterior and the prior, it performs the item knowledge selection process.
- in the item knowledge selection module, a heuristic matching algorithm is used to calculate the target representation vector for knowledge selection.
- the server can calculate the target representation vector H_sel through formula (18).
- H_sel = [H_P, z_k, H_P - z_k, H_P * z_k]    (18)
- where H_P is the first semantic representation vector and z_k is the second target vector.
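A one-line realisation of formula (18) in PyTorch, assuming H_P and z_k are vectors of the same dimensionality and that "*" denotes the element-wise product.

```python
import torch

def target_representation(h_p: torch.Tensor, z_k: torch.Tensor) -> torch.Tensor:
    # H_sel = [H_P, z_k, H_P - z_k, H_P * z_k]   (formula (18))
    return torch.cat([h_p, z_k, h_p - z_k, h_p * z_k], dim=-1)
```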
- S135. Process the target representation vector through the preset perceptron neural network model to obtain the target serial number.
- the server processes the target representation vector through the preset perceptron neural network model to obtain the target serial number.
- the server inputs the target representation vector into the MLP layer to predict the label of the target knowledge.
- the preset perceptron neural network model is trained with a dedicated loss function.
- the preset perceptron neural network model is the neural network model of the last layer of the Transformer Encoder.
- the server determines the corresponding target item information fragment among multiple item information fragments through the target serial number.
- each item information fragment in the item detail information corresponds to a sequence number.
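A sketch of S135–S136, assuming the preset perceptron neural network model is a small MLP classifier over H_sel whose output size equals the number of item information fragments, so that the argmax is the target serial number; the hidden layer and activation are assumptions.

```python
import torch
import torch.nn as nn

class KnowledgeSelector(nn.Module):
    def __init__(self, sel_dim, n_fragments):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(sel_dim, sel_dim), nn.Tanh(),
                                 nn.Linear(sel_dim, n_fragments))

    def forward(self, h_sel, fragments):
        logits = self.mlp(h_sel)             # S135: predict the label of the target knowledge
        target_idx = logits.argmax(dim=-1)   # target serial number
        return [fragments[i] for i in target_idx.tolist()]  # S136: pick the target fragment
```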
- S137 Process the target item information fragment through the deep learning model encoder to obtain the target item fragment vector.
- the server will process the target item information fragment through the last layer of the deep learning model encoder to obtain the target item fragment vector.
- the server inputs the target item fragment vector, the first target vector and the first intermediate semantic representation vector into the preset dialogue system model to obtain the target copy.
- the server inputs the first intermediate semantic representation vector, the first target vector and the target item fragment vector into the Transformer decoder layer with copy mechanism to generate item copy.
- the server adds the first intermediate semantic representation vector, the first target vector, and the target item fragment vector together and inputs them into the decoder.
- Copy mechanism is used to copy text from item detail information, item title information and item feature information to obtain the target copy.
- the server inputs the first intermediate semantic representation vector, the first target vector and the target item fragment vector into the Transformer decoder layer with copy mechanism until it stops after generating a predetermined number of words to obtain the target copy.
- the server inputs the target item fragment vector, the first target vector and the first intermediate semantic representation vector into the preset dialogue system model to obtain the target copy. Since the target item fragment corresponding to the target item fragment vector is determined from the item detail information through the first preset latent variable model and the second preset latent variable model, and these two models interact fully during training, this helps determine an accurate target copy.
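A highly simplified sketch of the copy mechanism used in S138, written as a pointer-generator style output head: at each decoding step a gate mixes a vocabulary distribution with a distribution copied from the source tokens. The actual decoder in this disclosure is a Transformer decoder; only the mixing step is shown, and all layer shapes are assumptions.

```python
import torch
import torch.nn as nn

class CopyHead(nn.Module):
    """Mix generation and copy distributions over the vocabulary."""
    def __init__(self, d_model, vocab_size):
        super().__init__()
        self.gen = nn.Linear(d_model, vocab_size)
        self.gate = nn.Linear(d_model, 1)

    def forward(self, dec_state, src_token_ids, copy_attn):
        # dec_state: (batch, d_model); src_token_ids (long), copy_attn: (batch, src_len)
        p_vocab = torch.softmax(self.gen(dec_state), dim=-1)
        p_gen = torch.sigmoid(self.gate(dec_state))        # probability of generating
        p_copy = torch.zeros_like(p_vocab)
        p_copy.scatter_add_(-1, src_token_ids, copy_attn)  # project copy attention onto vocab
        return p_gen * p_vocab + (1 - p_gen) * p_copy
```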
- Figure 8 is an optional flow diagram of the copy generation method provided by the embodiment of the present application.
- S101 to S104 shown in Figure 1 can also be implemented through S201 to S207, which will be explained in conjunction with each step.
- the item detail information preprocessing module is used to process a variety of related information to obtain item detail information.
- the training data construction module is used to calculate training data based on sample information.
- the item description latent variable learning module is used to train the first preset latent variable model.
- the item knowledge latent variable learning module is used to train the second preset latent variable model.
- the item knowledge selection module is used to determine the target item information fragment in the item detail information.
- the item description generation module is used to input the target item fragment vector, the first target vector and the first intermediate semantic representation vector into the preset dialogue system model to obtain the target copy.
- FIG. 9 is a schematic structural diagram of a copywriting generating device provided by an embodiment of the present application.
- the embodiment of the present application also provides a copywriting generation device 800, which includes: a response receiving module 803, a processing module 804, and a copywriting determination module 805.
- the receiving response module 803 is configured to receive the search information sent by the client and, in response to the search information, obtain the corresponding item title information, item feature information and item detail information; the item detail information is a collection of multiple item information fragments obtained by processing a variety of related information;
- the processing module 804 is configured to use the first preset latent variable model to process the calculated first semantic representation vector to obtain the first target vector; the first preset latent variable model is obtained by training a first pair of prior and posterior distributions with the sample item related information; the first semantic representation vector is calculated from the item title information and the item feature information;
- the processing module 804 is also configured to use the second preset latent variable model to process the calculated target latent vector, the first target vector and the second semantic representation vector calculated from the item detail information, to obtain a second target vector;
- the second preset latent variable model is obtained by training a second pair of prior and posterior distributions with the first vector obtained in each training iteration of the first preset latent variable model, combined with the sample item related information;
- the target latent vector is calculated by combining the first target vector with the item detail information;
- the copy determination module 805 is configured to combine the second target vector and the first semantic representation vector, determine the target copy through the preset dialogue system model, and send it to the client.
- the copy generation device 800 is configured to: obtain the sample item title information, the sample item feature information, the sample item detail information and the sample item description copy information, and determine, from the sample item detail information, the similar sample information fragment that is most similar to the sample item description copy information; calculate a plurality of first correlation vectors using the sample item title information, the sample item feature information, the similar sample information fragment and the sample item description copy information, and determine a first posterior parameter of the first posterior distribution through the plurality of first correlation vectors; determine the first vector through the first posterior parameter, so as to iteratively train the first posterior distribution and the first prior distribution until the training condition is reached, obtaining the first preset latent variable model that includes the trained first posterior distribution and first prior distribution; calculate a training latent vector through the first vector combined with the sample item detail information; determine a second posterior parameter of the second posterior distribution through the training latent vector, the first vector and a plurality of second correlation vectors, where the plurality of second correlation vectors are determined from the sample item title information, the sample item feature information and the sample item detail information; and determine the second vector through the second posterior parameter, so as to iteratively train the second posterior distribution and the second prior distribution until the training condition is reached, obtaining the second preset latent variable model that includes the trained second posterior distribution and second prior distribution.
- the sample item detail information includes: multiple sample information fragments; the copywriting generation device 800 is configured to use a preset semantic model to process the multiple sample information fragments to obtain multiple sample information semantic vectors; use the preset semantic model to process the sample item description copy information to obtain a sample item description semantic vector; perform a similarity calculation between each sample information semantic vector and the sample item description semantic vector to obtain multiple similarities corresponding to the multiple sample information semantic vectors; and determine the sample information fragment corresponding to the maximum similarity as the similar sample information fragment.
- the copywriting generation device 800 is configured to combine the sample item title information and the sample item feature information to calculate a first sample semantic representation vector; determine the basic semantic representation vector and the similar semantic representation vector through the similar sample information fragment; determine the copy basic semantic representation vector and the copy semantic representation vector using the sample item description copy information; and determine the first related latent vector and the second related latent vector through the basic semantic representation vector and the copy basic semantic representation vector; the plurality of first correlation vectors include: the first sample semantic representation vector, the similar semantic representation vector, the copy similar semantic representation vector, the first related latent vector and the second related latent vector.
- the copywriting generation device 800 is configured to combine the sample item title information and multiple sample keywords in the sample item feature information to obtain sample item basic information; the sample item basic information is processed by the deep learning model encoder to obtain the first sample intermediate semantic representation vector; average pooling is performed on the first sample intermediate semantic representation vector to obtain the first sample semantic representation vector.
- the copywriting generation device 800 is configured to process the similar sample information fragment through a deep learning model encoder to obtain the basic semantic representation vector; average pooling is performed on the basic semantic representation vector to obtain the similar semantic representation vector.
- the copywriting generation device 800 is configured to process the sample item description copy information through a deep learning model encoder to obtain the copy basic semantic representation vector; average pooling is performed on the copy basic semantic representation vector to obtain the copy semantic representation vector.
- the copywriting generation device 800 is configured to calculate the first related latent vector through the basic semantic representation vector and the copywriting basic semantic representation vector, combined with the first corresponding parameter; through the basic semantic representation vector and the basic semantic representation vector of the copy, combined with the second corresponding parameter to calculate the second related latent vector.
- the copywriting generation device 800 is configured to compute the first product vector of the basic semantic representation vector and the first parameter; compute the second product vector of the copy basic semantic representation vector and the second parameter; and multiply the first product vector by the second product vector, then apply activation and average pooling, to obtain the first related latent vector.
- the copywriting generation device 800 is configured to compute the third product vector of the basic semantic representation vector and the third parameter; compute the fourth product vector of the copy basic semantic representation vector and the fourth parameter; and multiply the third product vector by the fourth product vector, then apply activation and average pooling, to obtain the second related latent vector.
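A sketch of how the two related latent vectors might be computed, assuming the "parameters" are learned projection matrices, that the basic and copy basic semantic representation vectors are token-level matrices, and that the activated interaction scores are re-projected onto the fragment tokens before averaging; the source text does not fix these shapes, so they are assumptions.

```python
import torch

def related_latent(e_frag: torch.Tensor, e_copy: torch.Tensor,
                   w_a: torch.Tensor, w_b: torch.Tensor) -> torch.Tensor:
    # e_frag: (len_frag, d) basic semantic representation (similar sample fragment)
    # e_copy: (len_copy, d) copy basic semantic representation
    k = e_frag @ w_a                 # first / third "product vector"
    q = e_copy @ w_b                 # second / fourth "product vector"
    scores = torch.tanh(q @ k.T)     # multiply the product vectors, then activation
    weighted = scores @ k            # assumption: re-expand scores onto fragment tokens
    return weighted.mean(dim=0)      # average pooling -> related latent vector of size d
```

Calling this once with the first pair of parameters and once with the second pair yields the first and second related latent vectors, respectively.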
- the copywriting generating device 800 is configured to input the first semantic representation vector, the similar semantic representation vector, the copy similar semantic representation vector, the first related latent vector and the second related latent vector into a one-layer perceptron neural network model to obtain the first posterior mathematical expectation; the first posterior mathematical expectation is processed through an activation function to obtain the first posterior variance.
- the copywriting generation device 800 is configured to randomly determine the first vector among the Gaussian distribution vectors determined through the first posterior parameter.
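A sketch of how the first posterior parameters and the first vector might be obtained, following the description that a one-layer perceptron produces the posterior mathematical expectation, an activation of that expectation gives the variance, and the sample is drawn with the reparameterisation trick; the specific activation mapping is an assumption.

```python
import torch
import torch.nn as nn

class DescriptionPosterior(nn.Module):
    def __init__(self, in_dim, z_dim):
        super().__init__()
        self.mlp = nn.Linear(in_dim, z_dim)        # one-layer perceptron: expectation mu
        self.to_logvar = nn.Linear(z_dim, z_dim)   # assumption: activation mapping mu -> variance

    def forward(self, h_des):                      # h_des: concatenated first correlation vectors
        mu = self.mlp(h_des)                       # first posterior mathematical expectation
        logvar = torch.tanh(self.to_logvar(mu))    # first posterior (log-)variance
        eps = torch.randn_like(mu)
        z_d = mu + torch.exp(0.5 * logvar) * eps   # reparameterised random sample: the first vector
        return z_d, mu, logvar
```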
- the copywriting generation device 800 is configured to calculate the final representation vector of the first sample and the final representation vector of the second sample through the plurality of sample information fragments included in the sample item detailed information;
- the first vector is multiplied by the second sample final representation vector and preset parameters, and an intermediate training latent vector is then obtained through activation function processing;
- the intermediate training latent vector is multiplied by the first sample final representation vector to obtain the training latent vector.
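A sketch of the training latent vector computation, assuming z_d, the second sample final representation vector and the preset parameter matrix share the same dimensionality and that tanh is the activation; these choices are not fixed by the text.

```python
import torch

def training_latent_vector(z_d, h_second, h_kp, w_d):
    # z_d:      first vector sampled from the first posterior, shape (d,)
    # h_second: second sample final representation vector, shape (d,)
    # h_kp:     first sample final representation vector(s), e.g. (n_fragments, d)
    # w_d:      preset parameter matrix, shape (d, d) -- an assumption
    intermediate = torch.tanh((z_d @ w_d) * h_second)   # intermediate training latent vector
    return intermediate * h_kp                          # training latent vector
```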
- the copywriting generation device 800 is configured to process the plurality of sample information fragments separately through the deep learning model encoder to obtain a plurality of first intermediate sample final representation vectors corresponding to the plurality of sample information fragments, and then perform average pooling on the plurality of first intermediate sample final representation vectors to obtain the first sample final representation vector; after combining the plurality of sample information fragments, the second intermediate sample final representation vector is obtained through the deep learning model encoder, and average pooling is then performed on the plurality of second intermediate sample final representation vectors to obtain the second sample final representation vector.
- the plurality of second correlation vectors include: similar semantic representation vectors and the first sample final representation vector.
- the copywriting generation device 800 is configured to input the training latent vector, the first vector, the similar semantic representation vector and the first sample final representation vector into a layer of perceptual neural network model to obtain the second posterior parameters.
- the copywriting generation device 800 is configured to determine the second vector among the category distribution vectors determined through the second posterior parameter.
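A sketch of the second posterior and the sampling of the second vector, assuming a one-layer perceptron outputs the categorical parameter π as logits and that, as noted in the description, a Gumbel-softmax is used because the categorical distribution is discrete.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class KnowledgePosterior(nn.Module):
    def __init__(self, in_dim, n_fragments):
        super().__init__()
        self.mlp = nn.Linear(in_dim, n_fragments)    # one-layer perceptron

    def forward(self, features, tau=1.0):
        pi_logits = self.mlp(features)               # second posterior parameter (logits of pi)
        # differentiable, (near-)one-hot sample from the categorical distribution
        z_k = F.gumbel_softmax(pi_logits, tau=tau, hard=True)   # the second vector
        return z_k, pi_logits
```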
- the receiving response module 803 in the copy generation device 800 is configured to obtain the multiple related information, perform denoising and filtering on the multiple related information and then divide it into multiple keyword fragments, wherein each keyword fragment includes multiple keywords; process each keyword fragment using a preset semantic model to obtain a keyword fragment vector corresponding to each keyword fragment, thereby obtaining multiple keyword fragment vectors; process the multiple keyword fragment vectors through a clustering algorithm to obtain multiple clusters; and combine the keyword fragments corresponding to the keyword fragment vectors in each cluster to obtain an item information fragment, thereby obtaining multiple item information fragments to form the item detail information.
- the processing module 804 in the copy generation device 800 is configured to combine the item title information and the keywords in the item feature information to obtain basic item information; the basic item information is processed by the deep learning model encoder to obtain a first intermediate semantic representation vector; average pooling is performed on the first intermediate semantic representation vector to obtain the first semantic representation vector.
- the copy determination module 805 in the copy generation device 800 is configured to calculate a target representation vector using the second target vector and the first semantic representation vector; process the target representation vector through the preset perceptron neural network model to obtain the target serial number; determine the corresponding target item information fragment among the multiple item information fragments through the target serial number; process the target item information fragment through the deep learning model encoder to obtain the target item fragment vector; and input the target item fragment vector, the first target vector and the first intermediate semantic representation vector into the preset dialogue system model to obtain the target copy.
- the copy determination module 805 in the copy generation device 800 is configured to calculate the norm of the difference between the second target vector and the first semantic representation vector; calculate the product vector of the second target vector and the first semantic representation vector; and concatenate the first semantic representation vector, the second target vector, the norm and the product vector, in that order, to form the target representation vector.
- the receiving response module 803 receives the search information sent by the client and, in response to the search information, obtains the corresponding item title information, item feature information and item detail information; the item detail information is a collection of multiple item information fragments obtained by processing a variety of related information; the processing module 804 uses the first preset latent variable model to process the calculated first semantic representation vector to obtain the first target vector, where the first preset latent variable model is obtained by training a first pair of prior and posterior distributions with the sample item related information, and the first semantic representation vector is calculated from the item title information and the item feature information; the processing module 804 uses the second preset latent variable model to process the calculated target latent vector, the first target vector and the second semantic representation vector calculated from the item detail information to obtain the second target vector, where the second preset latent variable model is obtained by training a second pair of prior and posterior distributions with the first vector obtained in each training iteration of the first preset latent variable model combined with the sample item detail information, and the target latent vector is calculated by combining the first target vector with the item detail information; the copy determination module 805 combines the second target vector and the first semantic representation vector, determines the target copy through the preset dialogue system model, and sends it to the client. Since the item detail information in this scheme is composed of a variety of related information, and the scheme uses a pair of interactive preset latent variable models to learn two pairs of prior and posterior distributions, useful knowledge can be automatically selected from the heavy and lengthy item detail information to form the target copy; this solution therefore improves the efficiency of generating copy and also improves the accuracy with which the copy reflects the true characteristics of the item.
- if the above copywriting generation method is implemented in the form of a software function module and sold or used as an independent product, it can also be stored in a computer-readable storage medium.
- the computer software product is stored in a storage medium and includes a number of instructions to enable a copywriting generating device (which may be a personal computer, etc.) to execute all or part of the methods described in the various embodiments of this application.
- the aforementioned storage media include: USB flash drives, removable hard disks, read-only memory (Read Only Memory, ROM), magnetic disks, optical disks and other media that can store program code.
- embodiments of the present application are not limited to any specific combination of hardware and software.
- embodiments of the present application provide a computer-readable storage medium on which a computer program is stored, and when the computer program is executed by a processor, the steps in the above method are implemented.
- the embodiment of the present application provides a copywriting generation device, including a memory 802 and a processor 801.
- the memory 802 stores a computer program that can be run on the processor 801.
- when the processor 801 executes the program, the steps in the above method are implemented.
- Figure 10 is a schematic diagram of a hardware entity of the copywriting generation device provided by the embodiment of the present application.
- the hardware entity of the copywriting generation device 800 includes: a processor 801 and a memory 802, wherein;
- the processor 801 generally controls the overall operation of the copy generation device 800.
- the memory 802 is configured to store instructions and applications executable by the processor 801, and can also cache data to be processed or already processed by the processor 801 and each module in the copy generation device 800 (for example, image data, audio data, voice communication data and video communication data), which can be implemented by flash memory (FLASH) or random access memory (Random Access Memory, RAM).
- the disclosed devices and methods can be implemented in other ways.
- the device embodiments described above are only illustrative.
- the division of the units is only a logical function division.
- the coupling, direct coupling, or communication connection between the components shown or discussed may be through some interfaces, and the indirect coupling or communication connection of the devices or units may be electrical, mechanical, or in other forms.
- the units described above as separate components may or may not be physically separated; the components shown as units may or may not be physical units; they may be located in one place or distributed to multiple network units; Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
- all functional units in the embodiments of the present application can be integrated into one processing unit, or each unit can be used separately as one unit, or two or more units can be integrated into one unit; the above-mentioned integrated unit can be implemented in the form of hardware or in the form of hardware plus software functional units.
- the aforementioned program can be stored in a computer-readable storage medium.
- when executed, the program performs the steps of the above method embodiments; the aforementioned storage media include: removable storage devices, read-only memory (Read Only Memory, ROM), magnetic disks, optical disks and other media that can store program code.
- if the integrated units mentioned above in this application are implemented in the form of software function modules and sold or used as independent products, they can also be stored in a computer-readable storage medium.
- the computer software product is stored in a storage medium and includes a number of instructions to enable a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the methods described in the various embodiments of this application.
- the aforementioned storage media include: removable storage devices, ROMs, magnetic disks, optical disks and other media that can store program codes.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- General Engineering & Computer Science (AREA)
- Molecular Biology (AREA)
- Evolutionary Computation (AREA)
- Biomedical Technology (AREA)
- Computing Systems (AREA)
- Data Mining & Analysis (AREA)
- Life Sciences & Earth Sciences (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Biophysics (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Business, Economics & Management (AREA)
- Accounting & Taxation (AREA)
- Finance (AREA)
- Development Economics (AREA)
- Economics (AREA)
- Marketing (AREA)
- Strategic Management (AREA)
- General Business, Economics & Management (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Machine Translation (AREA)
Abstract
本申请提供了一种文案生成方法、装置及存储介质,方法包括:接收客户端发送的搜索信息,响应搜索信息获取对应的物品标题信息、物品特征信息和物品详情信息;利用第一预设潜变量模型对计算的第一语义表示向量进行处理,得到第一目标向量;利用第二预设潜变量模型对计算的目标隐向量、第一目标向量和通过物品详情信息计算的第二语义表示向量进行处理,得到第二目标向量;结合第二目标向量与第一语义表示向量,通过预设对话系统模型确定出目标文案,以发送给客户端。
Description
相关申请的交叉引用
本申请基于申请号为202210905016.1、申请日为2022年07月29日的中国专利申请提出,并要求该中国专利申请的优先权,该中国专利申请的全部内容在此引入本申请作为参考。
本申请涉及数据处理技术领域,尤其涉及一种文案生成方法、装置及存储介质。
物品描述文案在电商系统中有着重要的作用,与仅仅推荐物品标题相比,精心撰写的物品描述文案可以更好地提升用户体验,避免用户阅读繁重冗长的物品详细信息。当前技术所用的文案生成技术主要是从外源数据库中手动地提取相关信息,缺少从包含全部物品信息的物品详情介绍中提取对文案有用知识,该方案不仅效率低,而且生成的文案也不能准确反映真实的物品特性。
发明内容
本申请实施例提供的一种文案生成方法、装置及存储介质。
本申请的技术方案是这样实现的:
本申请实施例提供了一种文案生成方法,包括:
接收客户端发送的搜索信息,响应所述搜索信息获取对应的物品标题信息、物品特征信息和物品详情信息;所述物品详情信息是通过对多种相关信息进行处理得到的多个物品信息片段的集合;
利用第一预设潜变量模型对计算的第一语义表示向量进行处理,得到第一目标向量;所述第一预设潜变量模型是通过样本物品相关信息对第一对先验分布和后验分布训练得到的;所述第一语义表示向量是通过所述物品标题信息和所述物品特征信息计算的;
利用第二预设潜变量模型对计算的目标隐向量、所述第一目标向量和通过所述物品详情信息计算的第二语义表示向量进行处理,得到第二目标向量;所述第二预设潜变量模型是通过所述第一预设潜变量模型每次训练得到的第一向量,结合所述样本物品相关信息对第二对先验分布和后验分布训练得到的;所述目标隐向量是通过所述第一目标向量结合所述物品详情信息计算的;
结合所述第二目标向量与所述第一语义表示向量,通过预设对话系统模型确定出目标文案,发送给所述客户端。
本申请实施例还提供了一种文案生成装置,包括:
接收响应模块,被配置为接收客户端发送的搜索信息,响应所述搜索信息获取对应的物品标题信息、物品特征信息和物品详情信息;所述物品详情信息是通过对多种相关信息进行处理得到的多个物品信息片段的集合;
处理模块,被配置为利用第一预设潜变量模型对计算的第一语义表示向量进行处理,得到第一目标向量;所述第一预设潜变量模型是通过样本物品相关信息对第一对先验分布和后验分布训练得到的;所述第一语义表示向量是通过所述物品标题信息和所述物品特征
信息计算的;
所述处理模块,被配置为利用第二预设潜变量模型对计算的目标隐向量、所述第一目标向量和通过所述物品详情信息计算的第二语义表示向量进行处理,得到第二目标向量;所述第二预设潜变量模型是通过所述第一预设潜变量模型每次训练得到的第一向量,结合所述样本物品相关信息对第二对先验分布和后验分布训练得到的;所述目标隐向量是通过所述第一目标向量结合所述物品详情信息计算的;
文案确定模块,被配置为结合所述第二目标向量与所述第一语义表示向量,通过预设对话系统模型确定出目标文案,发送给所述客户端。
本申请实施例还提供了一种文案生成装置,包括存储器和处理器,存储器存储有可在处理器上运行的计算机程序,处理器执行程序时实现上述方法中的步骤。
本申请实施例还提供了一种计算机可读存储介质,其上存储有计算机程序,该计算机程序被处理器执行时实现上述方法中的步骤。
图1为本申请实施例提供的文案生成方法的一个可选的流程示意图;
图2为本申请实施例提供的文案生成方法的一个可选的效果示意图;
图3为本申请实施例提供的文案生成方法的一个可选的流程示意图;
图4为本申请实施例提供的文案生成方法的一个可选的流程示意图;
图5为本申请实施例提供的文案生成方法的一个可选的流程示意图;
图6为本申请实施例提供的文案生成方法的一个可选的流程示意图;
图7为本申请实施例提供的文案生成方法的一个可选的流程示意图;
图8为本申请实施例提供的文案生成方法的一个可选的流程示意图;
图9为本申请实施例提供的文案生成装置的结构示意图;
图10为本申请实施例提供的文案生成装置的一种硬件实体示意图。
为了使本申请的目的、技术方案和优点更加清楚,下面结合附图和实施例对本申请的技术方案进一步详细阐述,所描述的实施例不应视为对本申请的限制,本领域普通技术人员在没有做出创造性劳动前提下所获得的所有其它实施例,都属于本申请保护的范围。
在以下的描述中,涉及到“一些实施例”,其描述了所有可能实施例的子集,但是可以理解,“一些实施例”可以是所有可能实施例的相同子集或不同子集,并且可以在不冲突的情况下相互结合。
如果申请文件中出现“第一/第二”的类似描述则增加以下的说明,在以下的描述中,所涉及的术语“第一\第二\第三”仅仅是区别类似的对象,不代表针对对象的特定排序,可以理解地,“第一\第二\第三”在允许的情况下可以互换特定的顺序或先后次序,以使这里描述的本申请实施例能够以除了在这里图示或描述的以外的顺序实施。
除非另有定义,本文所使用的所有的技术和科学术语与属于本申请的技术领域的技术人员通常理解的含义相同。本文中所使用的术语只是为了描述本申请实施例的目的,不是旨在限制本申请。
传统的电商推荐系统将物品以列表的形式推荐给消费者。物品描述文案在电商系统中有着重要的作用,与仅仅推荐物品标题相比,精心撰写的物品描述文案可以更好地提升用户的购物体验,避免用户阅读繁重冗长的物品详细信息。物品文案自动生成技术就是给定物品的基础信息,如物品标题、物品特征和物品详情介绍等,自动生成物品描述文案的技
术。
现有物品文案自动生成技术主要依赖于物品标题信息和物品特征信息,通过将其输入到端到端的生成模型中,得到最终的物品描述文案。本方法受限于深度学习模型信息输入,现有技术主要是将物品标题信息和物品特征信息等精简的信息输入到深度学习模型中,这样生成的文案都是通用的缺少具体产品的具体特征,对用户而言无法获得具体的独特的产品特性,推荐效果差。
另一种方案所用的知识增强技术主要是从外源数据库中人工提取相关信息,缺少从包含全部物品信息的物品详情介绍中提取对模型生成文案有用知识,人工提取知识需要耗费大量的人力且并不是与具体物品完全相关的,外源知识库也并不能针对任意物品都提取出相关信息,比如相对新颖的物品或非常冷门的物品。所以相关技术中,生成方案不仅效率低,而且生成的文案也不能准确反映真实的物品特性。
本申请实施例提供了一种文案生成方法,请参阅图1,为本申请实施例提供的文案生成方法的一个可选的流程示意图,将结合图1示出的步骤进行说明。
S101、接收客户端发送的搜索信息,响应搜索信息获取对应的物品标题信息、物品特征信息和物品详情信息;物品详情信息是通过对多种相关信息进行处理得到的多个物品信息片段的集合。
本申请实施例中,服务器接收客户端发送的搜索信息,响应搜索信息获取对应的物品标题信息、物品特征信息和物品详情信息。其中,物品详情信息是通过对多种相关信息进行处理得到的多个物品信息片段的集合。
本申请实施例中,服务器接收客户端发送的物品信息关键词,响应该物品信息关键词在本地数据库中检索得到与该物品信息关键词对应的物品标题信息、物品特征信息和多种相关信息。服务器再对多种相关信息进行除噪过滤分类处理得到物品详情信息。
本申请实施例中,服务器接收到客户端发送的物品编码信息,响应该物品编码信息在本地数据库中检索得到与该物品信息关键词对应的物品标题信息、物品特征信息和多种相关信息。服务器再对多种相关信息进行除噪过滤分类处理得到物品详情信息。
本申请实施例中,多种相关信息可以包括:存储的文本形式的物品广告信息、数字形式的物品型号信息、文字描述的物品具体使用方法和物品评价信息。服务器可以使用启发式算法对多种相关信息进行噪音过滤。服务器可以使用停用词等启发式规则将噪音过滤处理后的多种相关信息分为关键词片段。服务器通过预设语义模型得到每个关键词片段的关键词片段向量,进而得到多个关键词片段向量。服务器对多个关键词片段向量进行聚类,得到多个簇。服务器将每个簇内的关键词片段向量对应的关键词片段组合得到一个物品信息片段。进而得到了包括多个物品信息片段的物品详情信息。
示例性的,结合图2,服务器接收的搜索信息可以为高清网络机顶盒。服务器通过高清网络机顶盒在本地检索到物品标题信息“高清网络机顶盒”,物品特征信息“电脑、办公网络盒子、盒子、高清、机顶盒和无线”。多种相关信息可以包括:“手机投屏,小屏变大屏在手机上轻轻一点,即可将图片、视频投到电视上视野更宽阔,观影更震撼。语音控制,聪明又听话支持视频点播、频道切换、音量调节等,在手机上发出语音指令,它就能听懂你的话。”“我想看动作片,明天北京天气怎么样,扫地机器人去扫地。”“该盒子含开关机等形式的广告,开机时的广告视频不能删除、更改,且第三方内容的广告视频无法控制。”服务器可以对多种相关信息通过过滤分类处理得到物品详情描述信息。
S102、利用第一预设潜变量模型对计算的第一语义表示向量进行处理,得到第一目标向量;第一预设潜变量模型是通过样本物品相关信息对第一对先验分布和后验分布训练得到的;第一语义表示向量是通过物品标题信息和物品特征信息计算的。
本申请实施例中,服务器利用第一预设潜变量模型对计算的第一语义表示向量进行处理,得到第一目标向量。其中,第一预设潜变量模型是通过样本物品标题信息、样本物品
特征信息、样本物品详情信息和样本物品描述文案信息对第一对先验分布和后验分布训练得到的。第一语义表示向量是通过物品标题信息和物品特征信息计算的。样本物品相关信息包括:样本物品标题信息、样本物品特征信息、样本物品详情信息和样本物品描述文案信息。
本申请实施例中,服务器可以利用第一预设潜变量模型对计算的第一语义表示向量进行处理,通过在第一预设潜变量模型中的先验分布对应的高斯向量中确定出第一目标向量。
本申请实施例中,服务器可以将物品标题信息和物品特征信息组合,得到物品基础信息。服务器将物品基础信息输入深度学习模型编码器得到第一中间语义表示向量。服务器对第一中间语义表示向量进行平均池化处理,得到第一语义表示向量。
本申请实施例中,服务器利用样本物品标题信息、样本物品特征信息、相似样本信息片段(相似样本信息片段是多个物品信息片段中与样本物品描述文案信息相似度最大的一个片段)和样本物品描述文案信息计算出多个第一相关向量,通过多个第一相关向量确定出第一后验分布的第一后验参数。通过第一后验参数确定出第一向量,以对第一后验分布和第一先验分布进行迭代训练,直至达到训练条件时停止,得到包括了训练好的第一后验分布、第一先验分布的第一预设潜变量模型。
S103、利用第二预设潜变量模型对计算的目标隐向量、第一目标向量和通过物品详情信息计算的第二语义表示向量进行处理,得到第二目标向量;第二预设潜变量模型是通过第一预设潜变量模型每次训练得到的第一向量,结合样本物品相关信息对第二对先验分布和后验分布训练得到的;目标隐向量是通过第一目标向量结合物品详情信息计算的。
本申请实施例中,服务器利用第二预设潜变量模型对计算的目标隐向量、第一目标向量和通过物品详情信息计算的第二语义表示向量进行处理,得到第二目标向量,其中,第二预设潜变量模型是通过第一预设潜变量模型每次训练得到的第一向量,结合样本物品相关信息中的样本物品详情信息对第二对先验分布和后验分布训练得到的;目标隐向量是通过第一目标向量结合物品详情信息计算的。
本申请实施例中,服务器利用第二预设潜变量模型对目标隐向量、第一目标向量和第二语义表示向量进行处理,通过在第二预设潜变量模型中的先验分布对应的分类向量中确定出第二目标向量。
本申请实施例中,服务器在每次进行第一预设潜变量模型训练之后都会得到一个第一向量,服务器结合第一向量和样本物品详情信息对第二对先验分布和后验分布训练得到的第二预设潜变量模型。
S104、结合第二目标向量与第一语义表示向量,通过预设对话系统模型确定出目标文案,发送给客户端。
本申请实施例中,服务器结合第二目标向量与第一语义表示向量,通过预设对话系统模型确定出目标文案,发送给客户端。
本申请实施例中,服务器利用第二目标向量与第一语义表示向量计算得到目标表示识向量。服务器将目标表示识向量通过预设感知神经网络模型处理,得到目标序号;通过目标序号在多个物品信息片段中确定出对应的目标物品信息片段;将目标物品信息片段通过深度学习模型编码器的最后一层处理得到目标物品片段向量。服务器将目标物品片段向量、第一目标向量和第一中间语义表示向量输入预设对话系统模型得到目标文案。其中,预设感知神经网络模型是在第一预设潜变量模型和第二预设潜变量模型训练过程中得到的。
示例性的,结合图2,服务器最终得到的目标文案可以为“高清网络机顶盒,支持语音控制。这款电视盒子支持语音操控,可通过语音实现视频点播、频道切换、音量调节等功能,声控功能助力优质影音体验。”
本申请实施例中,接收客户端发送的搜索信息,响应搜索信息获取对应的物品标题信息、物品特征信息和物品详情信息;物品详情信息是通过对多种相关信息进行处理得到的
多个物品信息片段的集合;利用第一预设潜变量模型对计算的第一语义表示向量进行处理,得到第一目标向量;第一预设潜变量模型是通过样本物品相关信息对第一对先验分布和后验分布训练得到的;第一语义表示向量是通过物品标题信息和物品特征信息计算的;利用第二预设潜变量模型对计算的目标隐向量、第一目标向量和通过物品相关信息计算的第二语义表示向量进行处理,得到第二目标向量;第二预设潜变量模型是通过第一预设潜变量模型每次训练得到的第一向量,结合样本物品详情信息对第二对先验分布和后验分布训练得到的;目标隐向量是通过第一目标向量结合物品详情信息计算的;结合第二目标向量与第一语义表示向量,通过预设对话系统模型确定出目标文案,发送给客户端。由于本方案中的物品详情信息是由多种相关信息组成的,而且本方案利用一对交互的预设潜变量模型分别学习两对先验与后验分布,可以自动地从繁重冗长的物品详情信息中选择有用的知识,进而形成目标文案,所以本方案可以提高生成文案的效率,而且也提高了文案对物品真实特性的反映准确度。
在一些实施例中,参见图3,图3为本申请实施例提供的文案生成方法的一个可选的流程示意图,图1示出的S102之前还可以包括S105至S110实现,将结合各步骤进行说明。
S105、获取样本物品标题信息、样本物品特征信息、样本物品详情信息和样本物品描述文案信息,并在样本物品详情信息中确定出,与样本物品描述文案信息相似度最大的相似样本信息片段。
本申请实施例中,服务器获取样本物品标题信息、样本物品特征信息、样本物品详情信息和样本物品描述文案信息,并在样本物品详情信息中确定出,与样本物品描述文案信息相似度最大的相似样本信息片段。
本申请实施例中，服务器可以在历史时间内接收到任意客户端发送的样本搜索信息，服务器响应该样本搜索信息获取得到样本物品标题信息、样本物品特征信息、样本物品详情信息和样本物品描述文案信息。其中，样本物品详情信息包括：多个样本信息片段。服务器计算每个样本信息片段与样本物品描述文案信息之间的相似度，服务器确定出多个样本信息片段中与样本物品描述文案信息相似度最大的一个相似样本信息片段。
其中,样本物品描述文案信息是预先撰写完成的对应该样本搜索信息的文案信息。
S106、利用样本物品标题信息、样本物品特征信息、相似样本信息片段和样本物品描述文案信息计算出多个第一相关向量,通过多个第一相关向量确定出第一后验分布的第一后验参数。
本申请实施例中,服务器利用样本物品标题信息、样本物品特征信息、相似样本信息片段和样本物品描述文案信息计算出多个第一相关向量,通过多个第一相关向量确定出第一后验分布的第一后验参数。
本申请实施例中,服务器将样本物品标题信息和样本物品特征信息结合,计算第一样本语义表示向量。服务器通过相似样本信息片段,确定出基础语义表示向量和相似语义表示向量。服务器利用样本物品描述文案信息,确定出文案基础语义表示向量和文案语义表示向量。服务器通过基础语义表示向量和文案基础语义表示向量,确定出第一相关隐向量和第二相关隐向量。多个第一相关向量包括:第一样本语义表示向量、相似语义表示向量、文案相似语义表示向量、第一相关隐向量和第二相关隐向量。服务器将第一样本语义表示向量、相似语义表示向量、文案相似语义表示向量、第一相关隐向量和第二相关隐向量输入一层感知神经网络模型得到第一后验参数。
S107、通过第一后验参数确定出第一向量,以对第一后验分布和第一先验分布进行迭代训练,直至达到训练条件时停止,得到包括了训练好的第一后验分布、第一先验分布的第一预设潜变量模型。
本申请实施例中,服务器通过第一后验参数确定出第一向量,以对第一后验分布和第
一先验分布进行迭代训练,直至达到训练条件时停止,得到包括了训练好的第一后验分布、第一先验分布的第一预设潜变量模型。
本申请实施例中,达到训练条件可以包括:第一后验分布和第一先验分布的拟合函数达到收敛。
本申请实施例中,服务器可以在第一后验参数确定的高斯分布向量中随机确定出第一向量。
S108、通过第一向量结合样本物品详情信息计算训练隐向量。
本申请实施例中,服务器通过第一向量结合样本物品详情信息计算训练隐向量。
本申请实施例中,服务器通过样本物品详情信息包括的多个样本信息片段,计算得到第一样本最终表示向量和第二样本最终表示向量。服务器将第一向量与第二样本最终表示向量,及预设参数相乘之后,通过激活函数处理得到中间计算训练隐向量。服务器将中间计算训练隐向量与第一样本最终表示向量相乘,得到训练隐向量。
S109、通过训练隐向量、第一向量和多个第二相关向量确定出第二后验分布的第二后验参数;多个第二相关向量是通过样本物品标题信息、样本物品特征信息和样本物品详情信息确定出的。
本申请实施例中,服务器通过训练隐向量、第一向量和多个第二相关向量确定出第二后验分布的第二后验参数。其中,多个第二相关向量是通过样本物品标题信息、样本物品特征信息和样本物品详情信息确定出的。
本申请实施例中,服务器将训练隐向量、第一向量和多个第二相关向量输入一层感知神经网络模型,得到第二后验参数。
S110、通过第二后验参数确定出第二向量,以对第二后验分布和第二先验分布进行迭代训练,直至达到训练条件时停止,得到包括了训练好的第二后验分布、第二先验分布的第二预设潜变量模型。
本申请实施例中,服务器通过第二后验参数确定出第二向量,以对第二后验分布和第二先验分布进行迭代训练,直至达到训练条件时停止,得到包括了训练好的第二后验分布、第二先验分布的第二预设潜变量模型。
本申请实施例中,服务器可以通过在第二后验参数确定出分类分布向量中确定出第二向量。
本申请实施例中,达到训练条件可以包括:第二后验分布和第二先验分布的拟合函数收敛。
本方案利用一对交互的预设潜变量模型分别学习两对先验与后验分布,自动地从繁重冗长的物品详情信息中选择有用的知识,进而形成目标文案,所以提高了文案对物品真实特性的反映准确度。
在一些实施例中,参见图4,图4为本申请实施例提供的文案生成方法的一个可选的流程示意图,图3示出的S105至S107还可以通过S111至S121实现,将结合各步骤进行说明。
S111、获取样本物品标题信息、样本物品特征信息、样本物品详情信息和样本物品描述文案信息,利用预设语义模型对多个样本信息片段进行处理,得到多个样本信息语义向量。
本申请实施例中,服务器获取样本物品标题信息、样本物品特征信息、样本物品详情信息和样本物品描述文案信息,利用预设语义模型对多个样本信息片段进行处理,得到多个样本信息语义向量。
本申请实施例中,样本物品详情信息包括:多个样本信息片段。服务器可以将多种样本相关信息预处理后样本物品详情,分为多个样本关键词片段。服务器利用停用词等启发式规则将预处理后样本物品详情划分为不同的片段并且仅保留长度在10到64个词的片段,
得到一系列子片段KF。通过这种方法我们过滤掉了如使用说明、产品参数和人工信息等对于模型生成效果没用的片段。
示例性的,Ktotal为预处理后样本物品详情,Kfragi为第i个样本关键词片段。m是样本关键词片段个数。
本申请实施例中,服务器可以通过预设算法处理得到每一样本关键词片段的表示向量。
本申请实施例中,服务器使用Sentence-Bert[8]获得每一样本关键词片段Kfragi∈KF的表示向量Efragi。服务器可以使用K-means[9]算法处理多个表示向量,将语义相似的聚类到同一组,这样得到了物品知识组KP。我们将同一蔟内(即语义表示相似的片段)的片段按照字母表的顺序连接在一起获得样本信息片段Ki,表示为KP,即样本物品详情信息。
示例性的,Ki为第i个样本信息片段,KP为多个样本信息片段。
本申请实施例中,服务器再通过预设语义模型对多个样本信息片段处理,得到多个样本信息语义向量。
其中,预设语义模型可以为Sentence-Bert模型。
S112、利用预设语义模型对样本物品描述文案信息进行处理,得到样本物品描述语义向量。
本申请实施例中,服务器利用预设语义模型对样本物品描述文案信息进行处理,得到样本物品描述语义向量。
本申请实施例中,服务器利用Sentence-Bert模型对样本物品描述文案信息进行处理,得到样本物品描述语义向量。
本申请实施例中,预设语义模型还可以为其他的语义模型,本申请实施例中不做限制。
S113、对每个样本信息语义向量与样本物品描述语义向量进行相似度计算,得到对应多个样本信息语义向量的多个相似度。
本申请实施例中,服务器对每个样本信息语义向量与样本物品描述语义向量进行相似度计算,得到对应多个样本信息语义向量的多个相似度。
本申请实施例中,服务器可以对每个样本信息语义向量与样本物品描述语义向量进行余弦相似度计算,得到对应多个样本信息语义向量的多个相似度。
S114、确定最大相似度对应的样本信息片段为相似样本信息片段。
本申请实施例中,服务器确定最大相似度对应的样本信息片段为相似样本信息片段。
本申请实施例中，服务器使用Sentence-Bert模型获得每个知识片段ki∈KP和物品描述文案信息的语义向量，分别表示为Rki和RD。我们计算Rki和RD之间的余弦相似度COS(Rki,RD)，并将相似度最大的片段作为伪标签标注的相似样本信息片段Kpse用于后续模型训练过程。示例性的，可以通过公式(1)计算Kpse。
S115、将样本物品标题信息和样本物品特征信息结合,计算第一样本语义表示向量。
本申请实施例中,服务器将样本物品标题信息和样本物品特征信息结合,计算第一样本语义表示向量。
本申请实施例中，服务器将样本物品标题信息和样本物品特征信息中的样本关键词组合，得到样本物品基础信息。服务器对样本物品基础信息通过编码器和池化处理得到第一样本语义表示向量。
示出的S115还可以通过S1151至S1153实现,将结合各步骤进行说明。
S1151、将样本物品标题信息和样本物品特征信息中的多个样本关键词组合,得到样本物品基础信息。
本申请实施例中,服务器将样本物品标题信息和样本物品特征信息中的多个样本关键词组合,得到样本物品基础信息。
S1152、将样本物品基础信息通过深度学习模型编码器处理得到第一样本中间语义表示向量。
本申请实施例中,服务器将样本物品基础信息通过深度学习模型编码器的最后一层处理得到第一样本中间语义表示向量。
S1153、对第一样本中间语义表示向量进行平均池化,得到第一样本语义表示向量。
本申请实施例中,服务器对第一样本中间语义表示向量进行平均池化,得到第一样本语义表示向量。
本申请实施例中,服务器训练数据构造模块采用原生的Transformer encoder网络结构。将样本物品标题信息T和样本物品特征信息A={a1;a2;.....;aA}连接起来作为样本物品基础信息P。
P={T;a1;a2;.....;aA}
其中，a1代表样本物品特征信息中的第一个样本关键词，{T;a1;a2;.....;aA}代表将序列连接，即将文本直接接在后面。服务器将样本物品基础信息P输入到深度学习模型编码器的第一层，得到第一样本中间语义表示向量EP；EP是词向量(word embedding)WE与位置向量(position embedding)PE的和。示例性的，可以通过公式(2)计算EP。
EP=WE(P)+PE(P) (2)
服务器将第一样本中间语义表示向量通过多层Transformer Encoder并将最后一层网络的输出经过average pooling池化操作得到第一样本语义表示向量Hp。示例性的,可以通过公式(3)计算Hp。
S116、通过相似样本信息片段,确定出基础语义表示向量和相似语义表示向量。
本申请实施例中,服务器通过相似样本信息片段,确定出基础语义表示向量和相似语义表示向量。
本申请实施例中，服务器将样本信息片段输入到深度学习模型编码器的第一层，通过多层Transformer Encoder并将最后一层网络的输出作为基础语义表示向量。服务器对基础语义表示向量进行average pooling池化操作得到相似语义表示向量。
示出的S116还可以通过S1161至S1162实现,将结合各步骤进行说明。
S1161、将相似样本信息片段通过深度学习模型编码器处理得到基础语义表示向量。
本申请实施例中，服务器将相似样本信息片段通过深度学习模型编码器的最后一层处理得到基础语义表示向量。
S1162、对基础语义表示向量进行平均池化,得到相似语义表示向量。
本申请实施例中,服务器对基础语义表示向量进行平均池化,得到相似语义表示向量。
S117、利用样本物品描述文案信息,确定出文案基础语义表示向量和文案语义表示向量。
本申请实施例中,服务器利用样本物品描述文案信息,确定出文案基础语义表示向量和文案语义表示向量。
本申请实施例中,服务器将样本物品描述文案信息输入到深度学习模型编码器的第一
层，通过Transformer Encoder最后一层网络的输出获取文案基础语义表示向量，服务器对文案基础语义表示向量进行average pooling池化操作得到文案语义表示向量HD。
示出的S117还可以通过S1171至S1172实现,将结合各步骤进行说明。
S1171、将样本物品描述文案信息通过深度学习模型编码器处理得到文案基础语义表示向量。
本申请实施例中,服务器将样本物品描述文案信息通过深度学习模型编码器的最后一层处理得到文案基础语义表示向量。
本申请实施例中,服务器将样本物品描述文案信息通过深度学习模型编码器的第一层至最后一层处理得到文案基础语义表示向量。
S1172、对文案基础语义表示向量进行平均池化,得到文案语义表示向量。
本申请实施例中,服务器对文案基础语义表示向量进行平均池化,得到文案语义表示向量。
S118、通过基础语义表示向量和文案基础语义表示向量,确定出第一相关隐向量和第二相关隐向量。
本申请实施例中,服务器通过基础语义表示向量和文案基础语义表示向量,确定出第一相关隐向量和第二相关隐向量。
示出的S118还可以通过S1181至S1182实现,将结合各步骤进行说明。
S1181、通过基础语义表示向量和文案基础语义表示向量,结合第一对应参数计算出第一相关隐向量。
本申请实施例中,服务器通过基础语义表示向量和文案基础语义表示向量,结合第一对应参数计算出第一相关隐向量。
示出的S1181还可以通过S11811至S11813实现,将结合各步骤进行说明。
S11811、求基础语义表示向量与第一参数的第一乘积向量。
本申请实施例中,服务器求基础语义表示向量与第一参数的第一乘积向量。
本申请实施例中,服务器可以通过公式(4)计算得到第一乘积向量KK。
KK=WKEK (4)
其中，WK为第一参数，EK为基础语义表示向量。
S11812、求文案基础语义表示向量与第二参数的第二乘积向量。
本申请实施例中,服务器求文案基础语义表示向量与第二参数的第二乘积向量。
本申请实施例中,服务器可以通过公式(5)计算得到第二乘积向量QD。
QD=WQED (5)
其中，WQ为第二参数，ED为文案基础语义表示向量。
S11813、将第一乘积向量,与第二乘积向量相乘再通过激活和平均池化处理,得到第一相关隐向量。
本申请实施例中,服务器将第一乘积向量,与第二乘积向量相乘再通过激活和平均池化处理,得到第一相关隐向量。
本申请实施例中,服务器可以通过公式(6)计算得到第一相关隐向量
S1182、通过基础语义表示向量和文案基础语义表示向量,结合第二对应参数计算出第二相关隐向量。
本申请实施例中,服务器通过基础语义表示向量和文案基础语义表示向量,结合第二对应参数计算出第二相关隐向量。
示出的S1182还可以通过S11821至S11823实现,将结合各步骤进行说明。
S11821、求基础语义表示向量与第三参数的第三乘积向量。
本申请实施例中,服务器求基础语义表示向量与第三参数的第三乘积向量。
本申请实施例中,服务器可以通过公式(7)计算得到第三乘积向量
其中,WD为第三参数,ED为基础语义表示向量。
S11822、求文案基础语义表示向量与第四参数的第四乘积向量。
本申请实施例中,服务器求文案基础语义表示向量与第四参数的第四乘积向量。
本申请实施例中,服务器可以通过公式(8)计算得到第三乘积向量
其中，Wk为第四参数，Ek为文案基础语义表示向量。
S11823、将第三乘积向量,与第四乘积向量相乘再通过激活和平均池化处理,得到第二相关隐向量。
本申请实施例中，服务器将第三乘积向量，与第四乘积向量相乘再通过激活和平均池化处理，得到第二相关隐向量。
本申请实施例中,服务器可以通过公式(9)计算得到第二相关隐向量
S119、将第一语义表示向量、相似语义表示向量、文案相似语义表示向量、第一相关隐向量和第二相关隐向量输入一层感知神经网络模型,得到第一后验数学期望。
本申请实施例中,服务器将第一语义表示向量、相似语义表示向量、文案相似语义表示向量、第一相关隐向量和第二相关隐向量输入一层感知神经网络模型,得到第一后验数学期望。
为建立知识选择与物品文案生成的联系设计了一对交互的变分自编码器，分别学习物品描述潜在变量和物品知识潜在变量。对于物品描述潜在变量学习模块，意在提升生成文案的多样性并指导知识选择的过程，本模块学习了关于物品信息的高斯分布。首先，为了增强文案和伪标签之间的关系，计算了他们之间的隐向量表示（即第一相关隐向量和第二相关隐向量）。
S120、对第一后验数学期望通过激活函数处理,得到第一后验方差。
本申请实施例中,服务器对第一后验数学期望通过激活函数处理,得到第一后验方差。
本申请实施例中，服务器可以将第一相关隐向量、第二相关隐向量同HP、HK、HD连接起来作为Hdes，输入一层感知神经网络模型(Multi-layer Perceptron，MLP)获得后验高斯分布的参数μ和σ。
S121、在通过第一后验参数确定出的高斯分布向量中随机确定出第一向量,以对第一后验分布和第一先验分布进行迭代训练,直至达到训练条件时停止,得到包括了训练好的第一后验分布、第一先验分布的第一预设潜变量模型。
本申请实施例中,服务器在通过第一后验参数确定出的高斯分布向量中随机确定出第一向量,以对第一后验分布和第一先验分布进行迭代训练,直至达到训练条件时停止,得到包括了训练好的第一后验分布、第一先验分布的第一预设潜变量模型。
本申请实施例中,服务器可以通过公式(10)计算得到参数μ和σ。
后验高斯分布可以描述为:
qφ(zd|D,A,T,K)=Nφ(zd|μ,σI)
其中，zd为第一向量。T为样本物品标题信息、A为样本物品特征信息、K为样本物品详情信息，D为样本物品描述文案信息。
对于先验高斯分布服务器仅输入HP,参数μ′和σ′的计算方法同后验分布相似。
本申请实施例中,服务器可以通过公式(11)计算得到参数μ′和σ′。
本申请实施例中,服务器在训练过程中从后验分布采样zd,在推断过程中我们从先验分布中采样。
本申请实施例中,服务器训练过程中,由于高斯分布不可导,我们使用重参数技巧进行随机采样。为拟合先验与后验分布表示,我们引入KL散度。第一后验分布和第一先验分布的拟合函数可以为公式(12)。
本申请实施例中,服务器通过第一样本语义表示向量、相似语义表示向量、文案相似语义表示向量、所述第一相关隐向量和所述第二相关隐向量确定出参数μ′和σ′,再结合参数μ′和σ′确定出第一向量来进行第一预设潜变量模型和第二预设潜变量模型的训练,使得两个预设潜变量模型分别交互学习了两对先验与后验分布,自动地从繁重冗长的物品详情信息中选择有用的知识,进而形成目标文案,提高了文案对物品真实特性的反映准确度。
在一些实施例中,参见图5,图5为本申请实施例提供的文案生成方法的一个可选的流程示意图,图3示出的S108至S109还可以通过S122至S125实现,将结合各步骤进行说明。
S122、通过样本物品详情信息包括的多个样本信息片段,计算得到第一样本最终表示向量和第二样本最终表示向量。
本申请实施例中,服务器通过样本物品详情信息包括的多个样本信息片段,计算得到第一样本最终表示向量和第二样本最终表示向量。
示出的S122还可以通过S1221至S1222实现,将结合各步骤进行说明。
S1221、将多个样本信息片段分别通过深度学习模型编码器处理得到多个样本信息片段对应的多个第一中间样本最终表示向量,再对多个第一中间样本最终表示向量进行平均池化处理,得到第一样本最终表示向量。
本申请实施例中,服务器将多个样本信息片段分别通过深度学习模型编码器的最后一层处理得到多个样本信息片段对应的多个第一中间样本最终表示向量,再对多个第一中间样本最终表示向量进行平均池化处理,得到第一样本最终表示向量。
本申请实施例中,对于样本物品详情信息,在其预处理模块我们获得了物品知识组KP,为了区别同一组内的不同知识,在其基础向量中增加段落向量。
其中,为第一中间样本最终表示向量,WE(Kj)为对应的词向量,PE(Kj)为对应的位置向量,SE(Kj)为对应的段落向量。
本申请实施例中,服务器可以通过公式(13)计算第一样本最终表示向量HKP。
其中,表示多个样本信息片段中第j个样本信息片段的第一中间样本最终表示向量。同样经过transformer encoder获得知识库的语义表示向量经过average pooling操作获得知识库的最终表示向量。
S1222、将多个样本信息片段组合后,通过深度学习模型编码器处理得到第二中间样本最终表示向量,再对多个第二中间样本最终表示向量进行平均池化处理,得到第二样本最终表示向量。
本申请实施例中，服务器将多个样本信息片段组合后，通过深度学习模型编码器的最后一层处理得到第二中间样本最终表示向量，再对多个第二中间样本最终表示向量进行平均池化处理，得到第二样本最终表示向量。
S123、将第一向量与第二样本最终表示向量,及预设参数相乘之后,通过激活函数处理得到中间训练隐向量。
本申请实施例中,服务器将第一向量与第二样本最终表示向量,及预设参数相乘之后,通过激活函数处理得到中间训练隐向量。
S124、将中间计算训练隐向量与第一样本最终表示向量相乘,得到训练隐向量。
本申请实施例中,服务器将中间计算训练隐向量与第一样本最终表示向量相乘,得到训练隐向量。
本申请实施例中,服务器可以通过公式(14)计算得到训练隐向量
其中,Wd是预设参数,是第二样本最终表示向量,zd是第一向量,HKP是第一样本最终表示向量。
S125、将训练隐向量、第一向量、相似语义表示向量和第一样本最终表示向量输入一层感知神经网络模型,得到第二后验参数;多个第二相关向量是通过样本物品标题信息、样本物品特征信息和样本物品详情信息确定出的。
本申请实施例中,服务器将训练隐向量、第一向量、相似语义表示向量和第一样本最终表示向量输入一层感知神经网络模型,得到第二后验参数;多个第二相关向量是通过样本物品标题信息、样本物品特征信息和样本物品详情信息确定出的。
S126、在通过第二后验参数确定出的类别分布向量中确定出第二向量,以对第二后验分布和第二先验分布进行迭代训练,直至达到训练条件时停止,得到包括了训练好的第二后验分布、第二先验分布的第二预设潜变量模型。
本申请实施例中,服务器在通过第二后验参数确定出的类别分布向量中确定出第二向量,以对第二后验分布和第二先验分布进行迭代训练,直至达到训练条件时停止,得到包括了训练好的第二后验分布、第二先验分布的第二预设潜变量模型。
本申请实施例中,服务器在训练阶段将训练隐向量和第一向量zd同第一样本最终表示向量HKP和相似语义表示向量HK一同输入MLP层计算第二后验参数π。服务器可以去掉HK后输入另一个MLP层获得相应先验分布的参数π′。
示例性的,服务器可以通过公式(15)计算第二后验参数π。
其中,是训练隐向量,zd是第一向量,HKP是第一样本最终表示向量,HK是相似语义表示向量。
示例性的，服务器可以通过公式(16)计算先验分布的参数π′。
第二后验分布和第二先验分布可以被描述为：
qφ(zk|zd,K,KP)=Catφ(π)
对于训练过程和推断过程,zk分别从第二后验分布与第二先验分布中采样。
服务器需要计算第二后验分布与第二先验分布之间的KL散度来拟合其距离,其拟合公式可以为公式(17)。
本申请实施例中，由于分布不可导我们使用重参数技巧进行采样，由于分布是离散的我们使用gumbel-softmax。
本申请实施例中，服务器通过训练第二对后验分布与先验分布，而且第二对后验分布与先验分布会通过神经网络训练计算过程中的反向传播间接地影响第一对后验分布与先验分布。使得两对后验分布与先验分布交互训练，进而能提取出准确的目标文案。
在一些实施例中,参见图6,图6为本申请实施例提供的文案生成方法的一个可选的流程示意图,图1示出的S101还可以通过S127至S133实现,将结合各步骤进行说明。
S127、获取多种相关信息,对多种相关信息进行除噪过滤处理后划分为多个关键词片段。
本申请实施例中,服务器获取多种相关信息,对多种相关信息进行除噪过滤处理后划分为多个关键词片段。
其中,每个关键词片段包括:多个关键词。
本申请实施例中,服务器可以采用启发式算法对多种相关信息进行噪音过滤,再利用停用词等启发式规则将其划分为多个关键词片段。
S128、利用预设语义模型对每个关键词片段进行处理,得到每个关键词片段对应的关键词片段向量,进而得到多个关键词片段向量。
本申请实施例中,服务器利用预设语义模型对每个关键词片段进行处理,得到每个关键词片段对应的关键词片段向量,进而得到多个关键词片段向量。
其中,预设语义模型可以为Sentence-Bert模型。
S129、通过聚类算法对多个关键词片段向量进行处理,得到多个簇。
本申请实施例中,服务器通过聚类算法对多个关键词片段向量进行处理,得到多个簇。
其中,聚类算法可以为k均值聚类算法(k-means clustering algorithm-9,k-means-9)。
S130、将每个簇中的关键词片段向量对应的关键词片段组合得到一个物品信息片段,进而得到多个物品信息片段,以形成物品详情信息。
本申请实施例中，服务器将每个簇中的关键词片段向量对应的关键词片段组合得到一个物品信息片段，进而得到多个物品信息片段，以形成物品详情信息。
S131、将物品标题信息和物品特征信息中的关键词组合,得到物品基础信息。
本申请实施例中，服务器将物品标题信息和物品特征信息中的关键词组合，得到物品基础信息。
本申请实施例中,服务器将物品标题信息和物品特征信息中的关键词按照顺序组合,得到物品基础信息。
S132、将物品基础信息通过深度学习模型编码器处理得到第一中间语义表示向量。
本申请实施例中,服务器将物品基础信息通过深度学习模型编码器的最后一层处理得到第一中间语义表示向量。
本申请实施例中,服务器将物品基础信息通过深度学习模型编码器第一层处理后,传输到下一层,直至最后一层处理得到第一中间语义表示向量。
S133、对第一中间语义表示向量进行平均池化,得到第一语义表示向量。
本申请实施例中,服务器对第一中间语义表示向量进行平均池化,得到第一语义表示向量。
本申请实施例中,服务器通过在多种相关信息中提取出物品详情信息,以计算第一语义表示向量,由于多种相关信息的来源广泛,进而服务器可以通过该方案扩大查找范围,以得到准确的目标文案。
在一些实施例中,参见图7,图7为本申请实施例提供的文案生成方法的一个可选的流程示意图,图1示出的S104还可以通过S134至S138实现,将结合各步骤进行说明。
S134、利用第二目标向量与第一语义表示向量计算得到目标表示识向量。
本申请实施例中,服务器利用第二目标向量与第一语义表示向量计算得到目标表示识向量。
示出的S134还可以通过S1341至S1343实现。
S1341、计算第二目标向量与第一语义表示向量之差的模长。
本申请实施例中,服务器计算第二目标向量与第一语义表示向量之差的模长。
S1342、计算第二目标向量与第一语义表示向量的乘积向量。
本申请实施例中,服务器计算第二目标向量与第一语义表示向量的乘积向量。
S13431、将第一语义表示向量、第二目标向量、模长和乘积向量按照顺序组成目标表示向量。
本申请实施例中,服务器将第一语义表示向量、第二目标向量、模长和乘积向量按照顺序组成目标表示向量。
服务器获得了从后验和先验采样的物品知识潜在变量后,进行物品知识选择过程。在物品知识选择模块中,使用启发式匹配算法计算知识选择的目标表示向量。
示例性的,服务器可以通过公式(18)计算目标表示向量Hsel。
Hsel=[HP,zk,HP-zk,HP*zk](18)
其中,HP为第一语义表示向量,zk为第二目标向量。
S135、将目标表示识向量通过预设感知神经网络模型处理,得到目标序号。
本申请实施例中,服务器将目标表示识向量通过预设感知神经网络模型处理,得到目标序号。
本申请实施例中,服务器将目标表示识向量输入到MLP层预测目标知识的标签。预设感知神经网络模型的损失函数是:
本申请实施例中,预设感知神经网络模型是Transformer Encoder最后一层的神经网络模型。
S136、通过目标序号在多个物品信息片段中确定出对应的目标物品信息片段。
本申请实施例中,服务器通过目标序号在多个物品信息片段中确定出对应的目标物品信息片段。
本申请实施例中,由于物品详情信息中的每个物品信息片段都对应有序号。
S137、将目标物品信息片段通过深度学习模型编码器处理得到目标物品片段向量。
本申请实施例中，服务器将目标物品信息片段通过深度学习模型编码器的最后一层处理得到目标物品片段向量。
S138、将目标物品片段向量、第一目标向量和第一中间语义表示向量输入预设对话系统模型得到目标文案。
本申请实施例中,服务器将目标物品片段向量、第一目标向量和第一中间语义表示向量输入预设对话系统模型得到目标文案。
本申请实施例中,服务器将第一中间语义表示向量,第一目标向量和目标物品片段向量,输入到带有copy mechanism的Transformer decoder层去生成物品文案。服务器将第一中间语义表示向量,第一目标向量,和目标物品片段向量相加在一起共同输入decoder。Copy mechanism用于从物品详情信息、物品标题信息和物品特征信息中复制文字,得到目标文案。本申请实施例中,服务器将第一中间语义表示向量,第一目标向量和目标物品片段向量,输入到带有copy mechanism的Transformer decoder层,直至生成预定个数词后停止,得到目标文案。
本申请实施例中,服务器将所述目标物品片段向量、所述第一目标向量和第一中间语义表示向量输入所述预设对话系统模型得到目标文案,由于目标物品片段向量对应的目标物品片段是通过第一预设潜变量模型和第二预设潜变量模型在物品详情信息中确定出来的,而且第一预设潜变量模型和第二预设潜变量模型在训练过程中得到的充分的交互,有利于确定出准确的目标文案。
在一些实施例中,参见图8,图8为本申请实施例提供的文案生成方法的一个可选的流程示意图,图1示出的S101至S104还可以通过S201至S207实现,将结合各步骤进行说明。
S201、物品详情信息预处理模块。
本申请实施例中,物品详情信息预处理模块用于对多种相关信息处理,得到物品详情信息。
S202、训练数据构造模块。
本申请实施例中,训练数据构造模块用于根据样本信息,计算训练数据。
S203、物品信息语义理解模块。
S204、物品描述隐变量学习模块。
本申请实施例中,物品描述隐变量学习模块用于对第一预设潜变量模型进行训练。
S205、物品知识隐变量学习模块。
本申请实施例中,物品知识隐变量学习模块用于对第二预设潜变量模型进行训练。
S206、物品知识选择模块。
本申请实施例中,物品知识选择模块用于在物品详情信息中确定出目标物品信息片段
S207、物品描述生成模块。
本申请实施例中,物品描述生成模块用于利用目标物品片段向量、第一目标向量和第一中间语义表示向量输入预设对话系统模型得到目标文案。
参见图9,图9为本申请实施例提供的文案生成装置的结构示意图。
本申请实施例还提供了一种文案生成装置800,包括:接收响应模块803、处理模块804和文案确定模块805。
接收响应模块803,被配置为接收客户端发送的搜索信息,响应所述搜索信息获取对应的物品标题信息、物品特征信息和物品详情信息;所述物品详情信息是通过对多种相关信息进行处理得到的多个物品信息片段的集合;
处理模块804,被配置为利用第一预设潜变量模型对计算的第一语义表示向量进行处理,得到第一目标向量;所述第一预设潜变量模型是通过样本物品相关信息对第一对先验分布和后验分布训练得到的;所述第一语义表示向量是通过所述物品标题信息和所述物品
特征信息计算的;
所述处理模块805,还被配置为利用第二预设潜变量模型对计算的目标隐向量、所述第一目标向量和通过所述物品详情信息计算的第二语义表示向量进行处理,得到第二目标向量;所述第二预设潜变量模型是通过所述第一预设潜变量模型每次训练得到的第一向量,结合所述样本物品相关信息对第二对先验分布和后验分布训练得到的;所述目标隐向量是通过所述第一目标向量结合所述物品详情信息计算的;
文案确定模块,被配置为结合所述第二目标向量与所述第一语义表示向量,通过预设对话系统模型确定出目标文案,发送给所述客户端。
本申请实施例中,文案生成装置800被配置为获取所述样本物品标题信息、所述样本物品特征信息、所述样本物品详情信息和所述样本物品描述文案信息,并在所述样本物品详情信息中确定出,与所述样本物品描述文案信息相似度最大的相似样本信息片段;利用所述样本物品标题信息、所述样本物品特征信息、所述相似样本信息片段和所述样本物品描述文案信息计算出多个第一相关向量,通过所述多个第一相关向量确定出第一后验分布的第一后验参数;通过所述第一后验参数确定出所述第一向量,以对所述第一后验分布和第一先验分布进行迭代训练,直至达到训练条件时停止,得到包括了训练好的所述第一后验分布、所述第一先验分布的所述第一预设潜变量模型;通过所述第一向量结合所述样本物品详情信息计算训练隐向量;通过所述训练隐向量、所述第一向量和多个第二相关向量确定出第二后验分布的第二后验参数;所述多个第二相关向量是通过所述样本物品标题信息、所述样本物品特征信息和所述样本物品详情信息确定出的;通过所述第二后验参数确定出第二向量,以对所述第二后验分布和第二先验分布进行迭代训练,直至达到训练条件时停止,得到包括了训练好的所述第二后验分布、所述第二先验分布的所述第二预设潜变量模型。
本申请实施例中,所述样本物品详情信息包括:多个样本信息片段;文案生成装置800被配置为利用预设语义模型对所述多个样本信息片段进行处理,得到多个样本信息语义向量;利用所述预设语义模型对所述样本物品描述文案信息进行处理,得到样本物品描述语义向量;对每个样本信息语义向量与所述样本物品描述语义向量进行相似度计算,得到对应所述多个样本信息语义向量的多个相似度;确定最大相似度对应的样本信息片段为所述相似样本信息片段。
本申请实施例中,文案生成装置800被配置为将所述样本物品标题信息和所述样本物品特征信息结合,计算第一样本语义表示向量;通过所述相似样本信息片段,确定出基础语义表示向量和相似语义表示向量;利用所述样本物品描述文案信息,确定出文案基础语义表示向量和文案语义表示向量;通过所述基础语义表示向量和所述文案基础语义表示向量,确定出第一相关隐向量和第二相关隐向量;所述多个第一相关向量包括:所述第一样本语义表示向量、相似语义表示向量、所述文案相似语义表示向量、所述第一相关隐向量和所述第二相关隐向量。
本申请实施例中,文案生成装置800被配置为将所述样本物品标题信息和所述样本物品特征信息中的多个样本关键词组合,得到样本物品基础信息;将所述样本物品基础信息通过深度学习模型编码器处理得到第一样本中间语义表示向量;对所述第一样本中间语义表示向量进行平均池化,得到所述第一样本语义表示向量。
本申请实施例中,文案生成装置800被配置为将所述相似样本信息片段通过深度学习模型编码器处理得到所述基础语义表示向量;对所述基础语义表示向量进行平均池化,得到所述相似语义表示向量。
本申请实施例中,文案生成装置800被配置为将所述样本物品描述文案信息通过深度学习模型编码器处理得到所述文案基础语义表示向量;对所述文案基础语义表示向量进行平均池化,得到所述文案语义表示向量。
本申请实施例中,文案生成装置800被配置为通过所述基础语义表示向量和所述文案基础语义表示向量,结合第一对应参数计算出所述第一相关隐向量;通过所述基础语义表示向量和所述文案基础语义表示向量,结合第二对应参数计算出所述第二相关隐向量。
本申请实施例中,文案生成装置800被配置为求所述基础语义表示向量与第一参数的第一乘积向量;求所述文案基础语义表示向量与第二参数的第二乘积向量;将所述第一乘积向量,与所述第二乘积向量相乘再通过激活和平均池化处理,得到所述第一相关隐向量。
本申请实施例中,文案生成装置800被配置为求所述基础语义表示向量与第三参数的第三乘积向量;求所述文案基础语义表示向量与第四参数的第四乘积向量;将所述第三乘积向量,与所述第四乘积向量相乘再通过激活和平均池化处理,得到所述第二相关隐向量。
本申请实施例中,文案生成装置800被配置为将所述第一语义表示向量、相似语义表示向量、所述文案相似语义表示向量、所述第一相关隐向量和所述第二相关隐向量输入一层感知神经网络模型,得到第一后验数学期望;
对所述第一后验数学期望通过激活函数处理,得到第一后验方差。
本申请实施例中,文案生成装置800被配置为在通过所述第一后验参数确定出的高斯分布向量中随机确定出所述第一向量。
本申请实施例中,文案生成装置800被配置为通过所述样本物品详情信息包括的所述多个样本信息片段,计算得到第一样本最终表示向量和第二样本最终表示向量;
将所述第一向量与所述第二样本最终表示向量,及预设参数相乘之后,通过激活函数处理得到中间训练隐向量;
将所述中间计算训练隐向量与所述第一样本最终表示向量相乘,得到所述训练隐向量。
本申请实施例中,文案生成装置800被配置为将所述多个样本信息片段分别通过深度学习模型编码器处理得到所述多个样本信息片段对应的多个第一中间样本最终表示向量,再对所述多个第一中间样本最终表示向量进行平均池化处理,得到所述第一样本最终表示向量;将所述多个样本信息片段组合后,通过深度学习模型编码器处理得到第二中间样本最终表示向量,再对所述多个第二中间样本最终表示向量进行平均池化处理,得到所述第二样本最终表示向量。
本申请实施例中，所述多个第二相关向量包括：相似语义表示向量和所述第一样本最终表示向量。文案生成装置800被配置为将所述训练隐向量、所述第一向量、所述相似语义表示向量和所述第一样本最终表示向量输入一层感知神经网络模型，得到所述第二后验参数。
本申请实施例中,文案生成装置800被配置为在通过所述第二后验参数确定出的类别分布向量中确定出所述第二向量。
本申请实施例中,文案生成装置800中的接收响应模块803被配置为获取所述多种相关信息,对所述多种相关信息进行除噪过滤处理后划分为多个关键词片段;其中,每个关键词片段包括:多个关键词;利用预设语义模型对每个关键词片段进行处理,得到所述每个关键词片段对应的关键词片段向量,进而得到多个关键词片段向量;通过聚类算法对所述多个关键词片段向量进行处理,得到多个簇。将每个簇中的关键词片段向量对应的关键词片段组合得到一个物品信息片段,进而得到多个物品信息片段,以形成所述物品详情信息。
本申请实施例中,文案生成装置800中的处理模块804被配置为将所述物品标题信息和所述物品特征信息中的关键词组合,得到物品基础信息;将所述物品基础信息通过深度学习模型编码器处理得到第一中间语义表示向量;对所述第一中间语义表示向量进行平均池化,得到所述第一语义表示向量。
本申请实施例中,文案生成装置800中的文案确定模块805被配置为利用所述第二目标向量与所述第一语义表示向量计算得到目标表示识向量;将所述目标表示识向量通过预
设感知神经网络模型处理,得到目标序号;通过所述目标序号在所述多个物品信息片段中确定出对应的目标物品信息片段;将目标物品信息片段通过深度学习模型编码器处理得到目标物品片段向量;将所述目标物品片段向量、所述第一目标向量和第一中间语义表示向量输入所述预设对话系统模型得到所述目标文案。
本申请实施例中,文案生成装置800中的文案确定模块805被配置为计算所述第二目标向量与所述第一语义表示向量之差的模长;计算所述第二目标向量与所述第一语义表示向量的乘积向量;将所述第一语义表示向量、所述第二目标向量、所述模长和所述乘积向量按照顺序组成所述目标表示向量。
本申请实施例中,通过接收响应模块803接收客户端发送的搜索信息,响应搜索信息获取对应的物品标题信息、物品特征信息和物品详情信息;物品详情信息是通过对多种相关信息进行处理得到的多个物品信息片段的集合;通过处理模块804利用第一预设潜变量模型对计算的第一语义表示向量进行处理,得到第一目标向量;第一预设潜变量模型是通过样本物品相关信息对第一对先验分布和后验分布训练得到的;第一语义表示向量是通过物品标题信息和物品特征信息计算的;通过处理模块804利用第二预设潜变量模型对计算的目标隐向量、第一目标向量和通过物品相关信息计算的第二语义表示向量进行处理,得到第二目标向量;第二预设潜变量模型是通过第一预设潜变量模型每次训练得到的第一向量,结合样本物品详情信息对第二对先验分布和后验分布训练得到的;目标隐向量是通过第一目标向量结合物品详情信息计算的;通过目标文案确定模块805结合第二目标向量与第一语义表示向量,通过预设对话系统模型确定出目标文案,发送给客户端。由于本方案中的物品详情信息是由多种相关信息组成的,而且本方案利用一对交互的预设潜变量模型分别学习两对先验与后验分布,自动地从繁重冗长的物品详情信息中选择有用的知识,进而形成目标文案,所以本方案可以提高生成文案的效率,而且也提高了文案对物品真实特性的反映准确度。
需要说明的是,本申请实施例中,如果以软件功能模块的形式实现上述的文案生成方法,并作为独立的产品销售或使用时,也可以存储在一个计算机可读取存储介质中。基于这样的理解,本申请实施例的技术方案本质上或者说对相关技术做出贡献的部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台文案生成装置(可以是个人计算机等)执行本申请各个实施例所述方法的全部或部分。而前述的存储介质包括:U盘、移动硬盘、只读存储器(Read Only Memory,ROM)、磁碟或者光盘等各种可以存储程序代码的介质。这样,本申请实施例不限制于任何特定的硬件和软件结合。
对应地,本申请实施例提供一种计算机可读存储介质,其上存储有计算机程序,该计算机程序被处理器执行时实现上述方法中的步骤。
对应地,本申请实施例提供一种文案生成装置,包括存储器802和处理器801,所述存储器802存储有可在处理器801上运行的计算机程序,所述处理器801执行所述程序时实现上述方法中的步骤。
这里需要指出的是:以上存储介质和装置实施例的描述,与上述方法实施例的描述是类似的,具有同方法实施例相似的有益效果。对于本申请存储介质和装置实施例中未披露的技术细节,请参照本申请方法实施例的描述而理解。
需要说明的是,图10为本申请实施例提供的文案生成装置的一种硬件实体示意图,如图10所示,该文案生成装置800的硬件实体包括:处理器801和存储器802,其中;
处理器801通常控制文案生成装置800的总体操作。
存储器802配置为存储由处理器801可执行的指令和应用,还可以缓存待处理器801以及文案生成装置800中各模块待处理或已经处理的数据(例如,图像数据、音频数据、语音通信数据和视频通信数据),可以通过闪存(FLASH)或随机访问存储器(Random
Access Memory,RAM)实现。
应理解,说明书通篇中提到的“一个实施例”或“一实施例”意味着与实施例有关的特定特征、结构或特性包括在本申请的至少一个实施例中。因此,在整个说明书各处出现的“在一个实施例中”或“在一实施例中”未必一定指相同的实施例。此外,这些特定的特征、结构或特性可以任意适合的方式结合在一个或多个实施例中。应理解,在本申请的各种实施例中,上述各过程的序号的大小并不意味着执行顺序的先后,各过程的执行顺序应以其功能和内在逻辑确定,而不应对本申请实施例的实施过程构成任何限定。上述本申请实施例序号仅仅为了描述,不代表实施例的优劣。
需要说明的是,在本文中,术语“包括”、“包含”或者其任何其他变体意在涵盖非排他性的包含,从而使得包括一系列要素的过程、方法、物品或者装置不仅包括那些要素,而且还包括没有明确列出的其他要素,或者是还包括为这种过程、方法、物品或者装置所固有的要素。在没有更多限制的情况下,由语句“包括一个……”限定的要素,并不排除在包括该要素的过程、方法、物品或者装置中还存在另外的相同要素。
在本申请所提供的几个实施例中,应该理解到,所揭露的装置和方法,可以通过其它的方式实现。以上所描述的装置实施例仅仅是示意性的,例如,所述单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,如:多个单元或组件可以结合,或可以集成到另一个系统,或一些特征可以忽略,或不执行。另外,所显示或讨论的各组成部分相互之间的耦合、或直接耦合、或通信连接可以是通过一些接口,装置或单元的间接耦合或通信连接,可以是电性的、机械的或其它形式的。
上述作为分离部件说明的单元可以是、或也可以不是物理上分开的,作为单元显示的部件可以是、或也可以不是物理单元;既可以位于一个地方,也可以分布到多个网络单元上;可以根据实际的需要选择其中的部分或全部单元来实现本实施例方案的目的。
另外,在本申请各实施例中的各功能单元可以全部集成在一个处理单元中,也可以是各单元分别单独作为一个单元,也可以两个或两个以上单元集成在一个单元中;上述集成的单元既可以采用硬件的形式实现,也可以采用硬件加软件功能单元的形式实现。
本领域普通技术人员可以理解:实现上述方法实施例的全部或部分步骤可以通过程序指令相关的硬件来完成,前述的程序可以存储于计算机可读取存储介质中,该程序在执行时,执行包括上述方法实施例的步骤;而前述的存储介质包括:移动存储装置、只读存储器(Read Only Memory,ROM)、磁碟或者光盘等各种可以存储程序代码的介质。
或者,本申请上述集成的单元如果以软件功能模块的形式实现并作为独立的产品销售或使用时,也可以存储在一个计算机可读取存储介质中。基于这样的理解,本申请实施例的技术方案本质上或者说对相关技术做出贡献的部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机装置(可以是个人计算机、服务器、或者网络装置等)执行本申请各个实施例所述方法的全部或部分。而前述的存储介质包括:移动存储装置、ROM、磁碟或者光盘等各种可以存储程序代码的介质。
以上所述,仅为本申请的实施方式,但本申请的保护范围并不局限于此,任何熟悉本技术领域的技术人员在本申请揭露的技术范围内,可轻易想到变化或替换,都应涵盖在本申请的保护范围之内。因此,本申请的保护范围应以所述权利要求的保护范围为准。
Claims (20)
- 一种文案生成方法,包括:接收客户端发送的搜索信息,响应所述搜索信息获取对应的物品标题信息、物品特征信息和物品详情信息;所述物品详情信息是通过对多种相关信息进行处理得到的多个物品信息片段的集合;利用第一预设潜变量模型对计算的第一语义表示向量进行处理,得到第一目标向量;所述第一预设潜变量模型是通过样本物品相关信息对第一对先验分布和后验分布训练得到的;所述第一语义表示向量是通过所述物品标题信息和所述物品特征信息计算的;利用第二预设潜变量模型对计算的目标隐向量、所述第一目标向量和通过所述物品详情信息计算的第二语义表示向量进行处理,得到第二目标向量;所述第二预设潜变量模型是通过所述第一预设潜变量模型每次训练得到的第一向量,结合所述样本物品相关信息对第二对先验分布和后验分布训练得到的;所述目标隐向量是通过所述第一目标向量结合所述物品详情信息计算的;结合所述第二目标向量与所述第一语义表示向量,通过预设对话系统模型确定出目标文案,发送给所述客户端。
- 根据权利要求1所述的文案生成方法,其中,所述利用第一预设潜变量模型对计算的第一语义表示向量进行处理,得到第一目标向量之前,所述方法还包括:获取样本物品标题信息、样本物品特征信息、样本物品详情信息和样本物品描述文案信息,并在所述样本物品详情信息中确定出,与所述样本物品描述文案信息相似度最大的相似样本信息片段;所述样本物品相关信息包括:样本物品标题信息、样本物品特征信息、样本物品详情信息和样本物品描述文案信息;利用所述样本物品标题信息、所述样本物品特征信息、所述相似样本信息片段和所述样本物品描述文案信息计算出多个第一相关向量,通过所述多个第一相关向量确定出第一后验分布的第一后验参数;通过所述第一后验参数确定出所述第一向量,以对所述第一后验分布和第一先验分布进行迭代训练,直至达到训练条件时停止,得到包括了训练好的所述第一后验分布、所述第一先验分布的所述第一预设潜变量模型;通过所述第一向量结合所述样本物品详情信息计算训练隐向量;通过所述训练隐向量、所述第一向量和多个第二相关向量确定出第二后验分布的第二后验参数;所述多个第二相关向量是通过所述样本物品标题信息、所述样本物品特征信息和所述样本物品详情信息确定出的;通过所述第二后验参数确定出第二向量,以对所述第二后验分布和第二先验分布进行迭代训练,直至达到训练条件时停止,得到包括了训练好的所述第二后验分布、所述第二先验分布的所述第二预设潜变量模型。
- 根据权利要求2所述的文案生成方法,其中,所述样本物品详情信息包括:多个样本信息片段;所述在所述样本物品详情信息中确定出,与所述样本物品描述文案信息相似度最大的相似样本信息片段,包括:利用预设语义模型对所述多个样本信息片段进行处理,得到多个样本信息语义向量;利用所述预设语义模型对所述样本物品描述文案信息进行处理,得到样本物品描述语义向量;对每个样本信息语义向量与所述样本物品描述语义向量进行相似度计算,得到对应所述多个样本信息语义向量的多个相似度;确定最大相似度对应的样本信息片段为所述相似样本信息片段。
- 根据权利要求3所述的文案生成方法,其中,所述利用所述样本物品标题信息、所述样本物品特征信息、所述相似样本信息片段和所述样本物品描述文案信息计算出多个第一相关向量,包括:将所述样本物品标题信息和所述样本物品特征信息结合,计算第一样本语义表示向量;通过所述相似样本信息片段,确定出基础语义表示向量和相似语义表示向量;利用所述样本物品描述文案信息,确定出文案基础语义表示向量和文案语义表示向量;通过所述基础语义表示向量和所述文案基础语义表示向量,确定出第一相关隐向量和第二相关隐向量;所述多个第一相关向量包括:所述第一样本语义表示向量、相似语义表示向量、所述文案相似语义表示向量、所述第一相关隐向量和所述第二相关隐向量。
- 根据权利要求4所述的文案生成方法,其中,所述将所述样本物品标题信息和所述样本物品特征信息结合,计算第一样本语义表示向量,包括:将所述样本物品标题信息和所述样本物品特征信息中的多个样本关键词组合,得到样本物品基础信息;将所述样本物品基础信息通过深度学习模型编码器处理得到第一样本中间语义表示向量;对所述第一样本中间语义表示向量进行平均池化,得到所述第一样本语义表示向量。
- 根据权利要求4所述的文案生成方法,其中,所述通过所述相似样本信息片段,确定出基础语义表示向量和相似语义表示向量,包括:将所述相似样本信息片段通过深度学习模型编码器处理得到所述基础语义表示向量;对所述基础语义表示向量进行平均池化,得到所述相似语义表示向量。
- 根据权利要求4所述的文案生成方法,其中,所述利用所述样本物品描述文案信息,确定出文案基础语义表示向量和文案语义表示向量,包括:将所述样本物品描述文案信息通过深度学习模型编码器处理得到所述文案基础语义表示向量;对所述文案基础语义表示向量进行平均池化,得到所述文案语义表示向量。
- 根据权利要求4所述的文案生成方法,其中,所述通过所述基础语义表示向量和所述文案基础语义表示向量,确定出第一相关隐向量和第二相关隐向量,包括:通过所述基础语义表示向量和所述文案基础语义表示向量,结合第一对应参数计算出所述第一相关隐向量;通过所述基础语义表示向量和所述文案基础语义表示向量,结合第二对应参数计算出所述第二相关隐向量。
- 根据权利要求4所述的文案生成方法,其中,所述通过所述多个第一相关向量确定出第一后验分布的第一后验参数,包括:将所述第一语义表示向量、相似语义表示向量、所述文案相似语义表示向量、所述第一相关隐向量和所述第二相关隐向量输入一层感知神经网络模型,得到第一后验数学期望;对所述第一后验数学期望通过激活函数处理,得到第一后验方差。
- 根据权利要求2所述的文案生成方法,其中,所述通过所述第一后验参数确定出所述第一向量,包括:在通过所述第一后验参数确定出的高斯分布向量中随机确定出所述第一向量。
- 根据权利要求3所述的文案生成方法,其中,所述通过所述第一向量结合所述样本物品详情信息计算训练隐向量,包括:通过所述样本物品详情信息包括的所述多个样本信息片段,计算得到第一样本最终表示向量和第二样本最终表示向量;将所述第一向量与所述第二样本最终表示向量,及预设参数相乘之后,通过激活函数处理得到中间训练隐向量;将所述中间计算训练隐向量与所述第一样本最终表示向量相乘,得到所述训练隐向量。
- 根据权利要求11所述的文案生成方法,其中,所述多个第二相关向量包括:相似语义表示向量和所述第一样本最终表示向量;所述通过所述训练隐向量、所述第一向量和多个第二相关向量确定出第二后验分布的第二后验参数,包括:将所述训练隐向量、所述第一向量、所述相似语义表示向量和所述第一样本最终表示向量输入一层感知神经网络模型,得到所述第二后验参数。
- 根据权利要求2所述的文案生成方法,其中,所述通过所述第二后验参数确定出第二向量,包括:在通过所述第二后验参数确定出的类别分布向量中确定出所述第二向量。
- 根据权利要求1所述的文案生成方法,其中,所述接收客户端发送的搜索信息,响应所述搜索信息获取对应的物品标题信息、物品特征信息和物品详情信息之前,所述方法还包括:获取所述多种相关信息,对所述多种相关信息进行除噪过滤处理后划分为多个关键词片段;其中,每个关键词片段包括:多个关键词;利用预设语义模型对每个关键词片段进行处理,得到所述每个关键词片段对应的关键词片段向量,进而得到多个关键词片段向量;通过聚类算法对所述多个关键词片段向量进行处理,得到多个簇;将每个簇中的关键词片段向量对应的关键词片段组合得到一个物品信息片段,进而得到多个物品信息片段,以形成所述物品详情信息。
- 根据权利要求1所述的文案生成方法,其中,所述接收客户端发送的搜索信息,响应所述搜索信息获取对应的物品标题信息、物品特征信息和物品详情信息之后,所述利用第一预设潜变量模型对计算的第一语义表示向量进行处理,得到第一目标向量之前,所述方法还包括:将所述物品标题信息和所述物品特征信息中的关键词组合,得到物品基础信息;将所述物品基础信息通过深度学习模型编码器处理得到第一中间语义表示向量;对所述第一中间语义表示向量进行平均池化,得到所述第一语义表示向量。
- 根据权利要求15所述的文案生成方法,其中,所述结合所述第二目标向量与所述第一语义表示向量,通过预设对话系统模型确定出目标文案,包括:利用所述第二目标向量与所述第一语义表示向量计算得到目标表示识向量;将所述目标表示识向量通过预设感知神经网络模型处理,得到目标序号;通过所述目标序号在所述多个物品信息片段中确定出对应的目标物品信息片段;将目标物品信息片段通过深度学习模型编码器处理得到目标物品片段向量;将所述目标物品片段向量、所述第一目标向量和所述第一中间语义表示向量输入所述预设对话系统模型得到所述目标文案。
- 根据权利要求16所述的文案生成方法,其中,所述利用所述第二目标向量与所述第一语义表示向量计算得到目标表示识向量,包括:计算所述第二目标向量与所述第一语义表示向量之差的模长;计算所述第二目标向量与所述第一语义表示向量的乘积向量;将所述第一语义表示向量、所述第二目标向量、所述模长和所述乘积向量按照顺序组成所述目标表示向量。
- 一种文案生成装置,包括:接收响应模块,被配置为接收客户端发送的搜索信息,响应所述搜索信息获取对应的物品标题信息、物品特征信息和物品详情信息;所述物品详情信息是通过对多种相关信息进行处理得到的多个物品信息片段的集合;处理模块,被配置为利用第一预设潜变量模型对计算的第一语义表示向量进行处理, 得到第一目标向量;所述第一预设潜变量模型是通过样本物品相关信息对第一对先验分布和后验分布训练得到的;所述第一语义表示向量是通过所述物品标题信息和所述物品特征信息计算的;所述处理模块,被配置为利用第二预设潜变量模型对计算的目标隐向量、所述第一目标向量和通过所述物品详情信息计算的第二语义表示向量进行处理,得到第二目标向量;所述第二预设潜变量模型是通过所述第一预设潜变量模型每次训练得到的第一向量,结合所述样本物品相关信息对第二对先验分布和后验分布训练得到的;所述目标隐向量是通过所述第一目标向量结合所述物品详情信息计算的;文案确定模块,被配置为结合所述第二目标向量与所述第一语义表示向量,通过预设对话系统模型确定出目标文案,发送给所述客户端。
- 一种文案生成装置,包括存储器和处理器,所述存储器存储有可在处理器上运行的计算机程序,所述处理器执行所述程序时实现权利要求1至17任一项所述方法中的步骤。
- 一种计算机可读存储介质,其上存储有计算机程序,该计算机程序被处理器执行时实现权利要求1至17任一项所述方法中的步骤。
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210905016.1A CN117521660A (zh) | 2022-07-29 | 2022-07-29 | 文案生成方法、装置及存储介质 |
CN202210905016.1 | 2022-07-29 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2024022066A1 true WO2024022066A1 (zh) | 2024-02-01 |
Family
ID=89705305
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2023/105876 WO2024022066A1 (zh) | 2022-07-29 | 2023-07-05 | 文案生成方法、装置及存储介质 |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN117521660A (zh) |
WO (1) | WO2024022066A1 (zh) |
- 2022-07-29: CN application CN202210905016.1A filed (published as CN117521660A, status: pending)
- 2023-07-05: PCT application PCT/CN2023/105876 filed (published as WO2024022066A1)
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112148959A (zh) * | 2019-06-27 | 2020-12-29 | 百度在线网络技术(北京)有限公司 | 信息推荐方法和装置 |
WO2022001896A1 (zh) * | 2020-07-01 | 2022-01-06 | 北京沃东天骏信息技术有限公司 | 推荐理由生成方法、装置、设备及存储介质 |
Also Published As
Publication number | Publication date |
---|---|
CN117521660A (zh) | 2024-02-06 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110232152B (zh) | 内容推荐方法、装置、服务器以及存储介质 | |
CN108509465B (zh) | 一种视频数据的推荐方法、装置和服务器 | |
CN110442781B (zh) | 一种基于生成对抗网络的对级排序项目推荐方法 | |
CN102334118B (zh) | 基于用户兴趣学习的个性化广告推送方法与系统 | |
KR20230087622A (ko) | 스트리밍 비디오 내의 객체를 검출하고, 필터링하고 식별하기 위한 방법 및 장치 | |
US20120123978A1 (en) | Learning Tags for Video Annotation Using Latent Subtags | |
WO2022134701A1 (zh) | 视频处理方法及装置 | |
CN102622399A (zh) | 搜索装置、搜索方法和程序 | |
WO2022089467A1 (zh) | 视频数据的排序方法、装置、计算机设备和存储介质 | |
US20230237093A1 (en) | Video recommender system by knowledge based multi-modal graph neural networks | |
CN113806588B (zh) | 搜索视频的方法和装置 | |
CN112528053A (zh) | 多媒体库分类检索管理系统 | |
CN116975615A (zh) | 基于视频多模态信息的任务预测方法和装置 | |
Levinas | An analysis of memory based collaborative filtering recommender systems with improvement proposals | |
US8688716B1 (en) | Recommending pairwise video comparisons to improve ranking | |
Tautkute et al. | What looks good with my sofa: Multimodal search engine for interior design | |
US12056174B2 (en) | System and method for improved content discovery | |
CN117390289B (zh) | 基于用户画像的房屋建造方案推荐方法、装置、设备 | |
CN116051192A (zh) | 处理数据的方法和装置 | |
WO2024022066A1 (zh) | 文案生成方法、装置及存储介质 | |
CN112685623B (zh) | 一种数据处理方法、装置、电子设备及存储介质 | |
JP5625792B2 (ja) | 情報処理装置、潜在特徴量算出方法、及びプログラム | |
CN116523024B (zh) | 召回模型的训练方法、装置、设备及存储介质 | |
Wang et al. | This text has the scent of starbucks: A laplacian structured sparsity model for computational branding analytics | |
CN118152668B (zh) | 媒体信息处理方法及装置、设备、存储介质、程序产品 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 23845279; Country of ref document: EP; Kind code of ref document: A1 |