WO2022156576A1 - Item copy generation network training method, item copy generation method, and apparatus - Google Patents

Item copy generation network training method, item copy generation method, and apparatus

Info

Publication number
WO2022156576A1
WO2022156576A1 PCT/CN2022/071588 CN2022071588W WO2022156576A1 WO 2022156576 A1 WO2022156576 A1 WO 2022156576A1 CN 2022071588 W CN2022071588 W CN 2022071588W WO 2022156576 A1 WO2022156576 A1 WO 2022156576A1
Authority
WO
WIPO (PCT)
Prior art keywords
item
vector
generation network
information
copy
Prior art date
Application number
PCT/CN2022/071588
Other languages
English (en)
French (fr)
Inventor
张海楠
陈宏申
丁卓冶
包勇军
龙波
Original Assignee
北京沃东天骏信息技术有限公司
北京京东世纪贸易有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京沃东天骏信息技术有限公司, 北京京东世纪贸易有限公司
Priority to US18/273,473 priority Critical patent/US20240135146A1/en
Publication of WO2022156576A1 publication Critical patent/WO2022156576A1/zh

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • G06N3/0455Auto-encoder networks; Encoder-decoder networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/253Grammatical analysis; Style critique
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/096Transfer learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/06Buying, selling or leasing transactions
    • G06Q30/0601Electronic shopping [e-shopping]
    • G06Q30/0623Item investigation
    • G06Q30/0625Directed, with specific intent or strategy
    • G06Q30/0627Directed, with specific intent or strategy using item specifications
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0475Generative networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/048Activation functions

Definitions

  • Embodiments of the present disclosure relate to the field of computer technologies, and in particular to an item copy generation network training method, an item copy generation method, an apparatus, an electronic device, and a computer-readable medium.
  • Some embodiments of the present disclosure propose an item copy generation network training method, an item copy generation method, an apparatus, an electronic device, and a computer-readable medium to solve one or more of the technical problems mentioned in the background section above.
  • Some embodiments of the present disclosure provide an item copy generation network training method. The method includes: acquiring item description information of each item in an item set, wherein the item description information includes: title information of the item, attribute information of the item, and at least one piece of comment information of the item; performing data preprocessing on the item description information set corresponding to the above item set to obtain a processed item description information set; and training an initial first item copy generation network to obtain a trained first item copy generation network.
  • The training samples corresponding to the above-mentioned initial first item copy generation network include: the item description information in the above-mentioned processed item description information set and the corresponding item copy, and the above-mentioned item copy is pre-written.
  • The title information and attribute information of each piece of item description information in the above-mentioned processed item description information set, together with the item copy corresponding to each piece of item description information, are used as the training samples of an initial second item copy generation network.
  • Guided by the above-mentioned trained first item copy generation network, the knowledge distillation method is used to train the initial second item copy generation network to obtain a trained second item copy generation network.
  • Some embodiments of the present disclosure provide an item copy generation method. The method includes: acquiring title information and attribute information of a target item; and inputting the title information and the attribute information into a trained second item copy generation network to obtain the item copy corresponding to the target item, wherein the trained second item copy generation network is obtained by training an initial second item copy generation network with a knowledge distillation method, guided by a trained first item copy generation network.
  • Some embodiments of the present disclosure provide an item copy generation network training apparatus.
  • The apparatus includes: an acquisition unit configured to acquire item description information of each item in an item set, wherein the item description information includes: title information of the item, attribute information of the item, and at least one piece of comment information of the item; a preprocessing unit configured to perform data preprocessing on the item description information set corresponding to the above item set to obtain a processed item description information set; a first training unit configured to train an initial first item copy generation network to obtain a trained first item copy generation network, wherein the training samples corresponding to the above-mentioned initial first item copy generation network include: the item description information in the above-mentioned processed item description information set and the corresponding pre-written item copy; and a second training unit configured to take the title information and attribute information of each piece of item description information in the processed item description information set, together with the item copy corresponding to each piece of item description information, as the training samples of an initial second item copy generation network, and, guided by the above-mentioned trained first item copy generation network, train the initial second item copy generation network with a knowledge distillation method to obtain a trained second item copy generation network.
  • Some embodiments of the present disclosure provide an item copy generation apparatus. The apparatus includes: an acquisition unit configured to acquire title information and attribute information of a target item; and an input unit configured to input the above title information and the above attribute information into a trained second item copy generation network to obtain the item copy corresponding to the target item, wherein the trained second item copy generation network is obtained by training an initial second item copy generation network with a knowledge distillation method, guided by a trained first item copy generation network.
  • Some embodiments of the present disclosure provide an electronic device, comprising: one or more processors; and a storage device on which one or more programs are stored, which, when executed by the one or more processors, cause the one or more processors to implement the method according to any one of the first aspect or the second aspect.
  • some embodiments of the present disclosure provide a computer-readable medium having a computer program stored thereon, wherein the program, when executed by a processor, implements the method according to any one of the first aspect or the second aspect.
  • FIG. 1 is a schematic diagram of an application scenario of the item copy generation network training method according to some embodiments of the present disclosure;
  • FIG. 2 is a flow chart of some embodiments of the item copy generation network training method according to the present disclosure;
  • FIG. 3 is a schematic diagram of an application scenario of the item copy generation method according to some embodiments of the present disclosure;
  • FIG. 5 is a schematic structural diagram of some embodiments of an item copy generation network training apparatus according to the present disclosure;
  • FIG. 6 is a schematic structural diagram of some embodiments of an item copy generation apparatus according to the present disclosure;
  • FIG. 7 is a schematic structural diagram of an electronic device suitable for implementing some embodiments of the present disclosure.
  • Some embodiments of the present disclosure propose a method and apparatus for training an item copy generation network, in which the trained first item copy generation network guides the training of the initial second item copy generation network, so that the trained second item copy generation network can accurately and effectively generate item copy according to the title information and attribute information of an item.
  • FIG. 1 is a schematic diagram of an application scenario of the item copy generation network training method according to some embodiments of the present disclosure.
  • the electronic device 101 may first obtain item description information of each item in the item set 102 .
  • the above-mentioned item description information includes: title information of the item, attribute information of the item, and at least one comment information of the item.
  • the above-mentioned item set 102 includes: a first item 1021 , a second item 1022 , and a third item 1023 .
  • the above-mentioned first item 1021 corresponds to the item description information 103 .
  • the above-mentioned second item 1022 corresponds to the item description information 104 .
  • the above-mentioned third item 1023 corresponds to the item description information 105 .
  • the above-mentioned item description information 103 includes: title information 1031 , attribute information 1033 , and at least one comment information 1032 .
  • the at least one comment information 1032 includes: first comment information, second comment information, and third comment information.
  • the above item description information 104 includes: title information 1041 , attribute information 1043 , and at least one comment information 1042 .
  • the above at least one comment information 1042 includes: fourth comment information and fifth comment information.
  • the above item description information 105 includes: title information 1051 , attribute information 1053 , and at least one comment information 1052 .
  • the at least one comment information 1052 includes: sixth comment information, seventh comment information, and eighth comment information.
  • the above processed item description information set includes: item description information 103 and item description information 105 .
  • the initial first item copy generation network 108 is trained to obtain the trained first item copy generation network 109 .
  • the training samples corresponding to the initial first item copy generation network 108 include: item description information and corresponding item copy in the processed item description information set, and the item copy is pre-written.
  • the training sample set of the initial first item copy generation network 108 may include: a training sample consisting of item description information 105 and item copy 106 , and a training sample consisting of item description information 103 and item copy 107 .
  • The title information and attribute information of each piece of item description information in the above processed item description information set, together with the item copy corresponding to each piece of item description information, are used as the training samples of the initial second item copy generation network 110.
  • Guided by the trained first item copy generation network 109, the knowledge distillation method is used to train the above-mentioned initial second item copy generation network 110 to obtain the trained second item copy generation network 111.
  • The training sample set of the initial second item copy generation network 110 includes: a training sample consisting of item copy 106, the attribute information in item description information 105, and the title information in item description information 105; and a training sample consisting of item copy 107, the attribute information in item description information 103, and the title information in item description information 103.
  • the above electronic device 101 may be hardware or software.
  • When the electronic device is hardware, it can be implemented as a distributed cluster composed of multiple servers or terminal devices, or as a single server or a single terminal device.
  • When the electronic device is embodied as software, it can be installed in the hardware devices listed above, and can be implemented, for example, as multiple software or software modules for providing distributed services, or as a single software or software module. There is no specific limitation here.
  • The number of electronic devices in FIG. 1 is merely illustrative. There may be any number of electronic devices depending on implementation needs.
  • The item copy generation network training method includes the following steps:
  • Step 201 Obtain item description information of each item in the item set.
  • the execution body of the item copy generation network training method may acquire item description information of each item in the item set through a wired connection or a wireless connection.
  • the above-mentioned item description information includes: title information of the item, attribute information of the item, and at least one comment information of the item.
  • The title information of the above-mentioned item may be a short sentence indicating the content of the item.
  • the attribute information of the item may include, but is not limited to, at least one of the following: function information of the item, appearance color information of the item, material information of the item, and style information to which the item belongs.
  • the above-mentioned item may be a shoe.
  • the title information of the item can be: "Special price, *** official school, *** co-branded women's shoes, retro canvas shoes, women's shoes, Baogai shoes, white and red”.
  • Item attribute information can be:
  • Item review information can be:
  • Wireless connection methods may include, but are not limited to, 3G/4G/5G connections, WiFi connections, Bluetooth connections, WiMAX connections, Zigbee connections, UWB (ultra wideband) connections, and other wireless connection methods currently known or developed in the future.
  • Step 202 Perform data preprocessing on the item description information set corresponding to the above item set to obtain a processed item description information set.
  • the above-mentioned execution body may perform data preprocessing on the item description information set corresponding to the above-mentioned item set to obtain the processed item description information set.
  • the above-mentioned data preprocessing on the item description information set corresponding to the above-mentioned item set to obtain the processed item description information set may include the following steps:
  • the first step is to determine the number of comment information of each item description information in the above item description information set.
  • the above-mentioned execution body may determine the number of comment information of each item description information in the above-mentioned item description information set by querying a database storing comment information.
  • item description information whose number of comment information is less than a predetermined threshold is removed from the above item description information set, and a removed item description information set is obtained.
  • a predetermined threshold may be the value "3".
  • the fourth step is to remove the comment information whose comment content satisfies the predetermined condition from each item description information in the above-mentioned removed item description information set to generate processed item description information, and obtain the above-mentioned processed item description information set.
  • the above-mentioned comment content satisfying the predetermined condition may be that the comment content is content satisfying a predetermined template or the comment content is content that does not have much reference value.
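The preprocessing steps above can be sketched as follows. This is a minimal illustration only: the field names and the templated-comment check are assumptions, since the disclosure does not fix the exact form of the predetermined condition.

```python
# Sketch of the data preprocessing described above (illustrative field names;
# is_low_value() is a stand-in for the "predetermined condition" on comments).

MIN_COMMENTS = 3  # the predetermined threshold (the example value "3" above)
TEMPLATE_PHRASES = {"good", "default praise"}  # assumed low-value templates


def is_low_value(comment: str) -> bool:
    """Assumed check for comments whose content satisfies the predetermined condition."""
    return comment.strip().lower() in TEMPLATE_PHRASES


def preprocess(item_descriptions):
    """item_descriptions: list of dicts with 'title', 'attributes', 'comments'."""
    processed = []
    for desc in item_descriptions:
        # Steps 1-2: count comments and drop items with too few of them.
        if len(desc["comments"]) < MIN_COMMENTS:
            continue
        # Step 4: drop comments whose content satisfies the predetermined condition.
        kept = [c for c in desc["comments"] if not is_low_value(c)]
        processed.append({**desc, "comments": kept})
    return processed
```

Applied to the FIG. 1 scenario, an item with only two comments (like the second item) would be removed, while items with three or more comments survive with their low-value comments stripped.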
  • Step 203 Train the initial first item copy generation network to obtain a trained first item copy generation network.
  • The above-mentioned execution body trains the initial first item copy generation network to obtain a trained first item copy generation network.
  • The training samples corresponding to the above-mentioned initial first item copy generation network include: the item description information in the above-mentioned processed item description information set and the corresponding item copy.
  • The above item copy is pre-written. It should be noted that the above-mentioned training process of the initial first item copy generation network is a relatively conventional training step at present, and will not be repeated here.
  • Step 204 Use the knowledge distillation method to train the above-mentioned initial second item copy generation network to obtain a trained second item copy generation network.
  • The above-mentioned execution body may use the title information and attribute information of each piece of item description information in the above-mentioned processed item description information set as the input data of each training sample, and the item copy corresponding to each piece of item description information as the corresponding label.
  • Guided by the trained first item copy generation network, the knowledge distillation method is used to train the above-mentioned initial second item copy generation network to obtain the trained second item copy generation network.
  • the first item copy generation network may be a Teacher network in a Teacher-Student network.
  • the second item copy generation network may be a Student network in the Teacher-Student network.
  • The above-mentioned knowledge distillation method transfers knowledge from the trained large model to obtain a smaller model that is more suitable for inference.
  • the trained first item copy generation network can learn to generate high-quality item copy by using the title information, attribute information and at least one comment information of the item description information.
  • In a recommendation system, however, most items have only a small amount of comment information, which leads to low quality of the item copy generated by the trained first item copy generation network for such items.
  • the training sample of the second item copy generation network includes the title information and attribute information of the item description information, but does not include the comment information of the item.
  • The trained first item copy generation network guides the training of the second item copy generation network, so that the second item copy generation network learns the knowledge of the trained first item copy generation network and thereby generates high-quality item copy.
  • The KL distance set between the conditional probability set corresponding to the first target vector set internally output by the trained first item copy generation network and the conditional probability set corresponding to the second target vector set internally output by the second item copy generation network is used as a training constraint, and the above-mentioned initial second item copy generation network is trained to obtain the above-mentioned trained second item copy generation network.
  • The conditional probability corresponding to the first target vector may be p(y_t | y_<t, H_item), where H_item is the first target vector corresponding to the item.
  • The conditional probability corresponding to the second target vector may be p(y_t | y_<t, E′_R).
  • The KL distance can be found by the following formula:
  • KL(p ∥ q) = Σ_t p(y_t | y_<t, H_item) · log( p(y_t | y_<t, H_item) / p(y_t | y_<t, E′_R) )
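A minimal sketch of the KL distance used as the training constraint. The two distributions stand in for the teacher's p(y_t | y_<t, H_item) and the student's p(y_t | y_<t, E′_R) over the vocabulary at each decoding step; the epsilon smoothing is an implementation detail assumed here, not part of the disclosure.

```python
import math


def kl_distance(p, q, eps=1e-12):
    """KL(p || q) = sum_i p_i * log(p_i / q_i) over a vocabulary.

    p: teacher next-token distribution, q: student next-token distribution.
    """
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def distillation_constraint(teacher_steps, student_steps):
    """Distillation constraint over a sequence: sum the per-step KL terms."""
    return sum(kl_distance(p, q) for p, q in zip(teacher_steps, student_steps))
```

The constraint is zero when the student's per-step distributions match the teacher's, and grows as they diverge, which is what drives the student toward the teacher's behavior.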
  • In the trained first item copy generation network, word vector conversion is first performed on the at least one comment information, the attribute information, and the title information of the target item to obtain the vector corresponding to the above at least one comment information, the vector corresponding to the above attribute information, and the vector corresponding to the above title information.
  • These vectors are then respectively input into the encoding network including multiple encoding layers to obtain the encoded vector set corresponding to the at least one comment information, the encoded vector set corresponding to the above attribute information, and the encoded vector set corresponding to the above title information.
  • the above-mentioned encoding network is a network in which the multi-layer encoding layers are connected in series.
  • Each encoding layer in the above encoding network corresponds to an encoding output vector.
  • the encoding network described above may be the encoding network of the Transformer model.
  • the encoding network of the above Transformer model includes multiple layers of encoding layers.
  • the vector with the highest weight is selected as the first target vector from the encoded vector set corresponding to the at least one comment information.
  • the above-mentioned first target vector is input into the pre-trained forward neural network to obtain the first output vector.
  • The encoded vector set corresponding to the above attribute information and the encoded vector set corresponding to the above title information are subjected to feature fusion to obtain a fused vector set.
  • The above activation function may be a GELU (Gaussian Error Linear Units) activation function.
  • the vector with the highest weight is selected from the set of the first output vector and the second output vector as the third output vector.
  • the above-mentioned third output vector and the above-mentioned first output vector are vector-added to obtain a first addition vector.
  • the above-mentioned first added vector is normalized to obtain a fourth output vector.
  • the above fourth output vector is input to the pre-trained forward neural network to obtain the fifth output vector.
  • the above-mentioned fifth output vector and the above-mentioned fourth output vector are added to obtain a second addition vector.
  • The above-mentioned second addition vector is normalized to obtain a sixth output vector as the first target vector output by the trained first item copy generation network.
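The select → feed-forward → add → normalize pipeline above can be sketched with plain Python vectors. The toy feed-forward network and the vector dimensions are illustrative assumptions; only the structure (residual addition followed by normalization) mirrors the steps described.

```python
import math


def select_max_weight(vectors, weights):
    """Pick the vector with the highest weight (the selection steps above)."""
    return max(zip(weights, vectors))[1]


def add_vectors(a, b):
    """Vector addition used for the first/second addition vectors."""
    return [x + y for x, y in zip(a, b)]


def layer_norm(v, eps=1e-6):
    """Normalization applied to the addition vectors (zero mean, unit variance)."""
    mean = sum(v) / len(v)
    var = sum((x - mean) ** 2 for x in v) / len(v)
    return [(x - mean) / math.sqrt(var + eps) for x in v]


def feed_forward(v):
    """Stand-in for the pre-trained forward neural network (illustrative only)."""
    return [x + 1.0 for x in v]


def residual_block(v):
    """One sub-block: feed-forward, then add & normalize (fourth -> sixth output)."""
    return layer_norm(add_vectors(feed_forward(v), v))
```

Chaining two such blocks reproduces the described flow from the fourth output vector to the sixth output vector.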
  • the second target vector set internally output by the second item copy generation network may correspond to the fifth vector set in the above item copy generation method.
  • The loss function of the above-mentioned initial second item copy generation network is generated based on the KL distance formula, the loss function representing the correlation between the item copy generated by the above-mentioned first item copy generation network and the attribute information of the item, and the loss function representing the correlation between the item copy generated by the above-mentioned second item copy generation network and the attribute information of the item.
  • The loss function of the above initial second item copy generation network may be the following formula:
  • L = λ · L_S + (1 − λ) · L_T
  • λ is an adjustment parameter, and its value range is [0, 1]. L_S is the loss function representing the correlation between the item copy generated by the second item copy generation network and the attribute information of the item. L_T is the loss function representing the correlation between the item copy generated by the above-mentioned first item copy generation network and the attribute information of the item.
  • E′_R may be the second target vector corresponding to the target item.
  • y t may be the t-th word of the generated copy of the target item.
  • p(y_t | y_<t, T, A, E′_R) can characterize the output of the decoding network.
  • can be a parameter.
  • y t can be the t-th word of the generated copy of the target item.
  • can be the number of words in the generated item copy.
  • y_<t can represent the set of words from the 1st word to the (t−1)-th word in the generated copy of the target item.
  • T may be the first vector corresponding to the title information.
  • A may be the second vector corresponding to the attribute information.
  • R_fuse(y_<t) may be the relevance score over the 1st word to the (t−1)-th word.
  • R_fuse(y_<t) is generated by the following formula:
  • R_fuse(y*) = γ · R_Coh(y*) + (1 − γ) · R_RG(y*)
  • γ can be an adjustment parameter, and the value range is [0, 1].
  • y* may be y_<t.
  • R_Coh(y*) can characterize the degree of overlap of the generated y* with the pre-written copy.
  • R_RG(y*) can characterize the ROUGE score of y*.
  • R_Coh(y*) can be the following formula:
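The relevance score R_fuse can be sketched as below. Since the disclosure does not fix the exact form of its components here, the word-overlap ratio standing in for R_Coh and the unigram ROUGE-style recall standing in for R_RG are illustrative assumptions.

```python
def r_coh(generated, reference):
    """Assumed overlap measure: fraction of generated words that also appear
    in the pre-written copy (stand-in for R_Coh)."""
    if not generated:
        return 0.0
    ref = set(reference)
    return sum(w in ref for w in generated) / len(generated)


def r_rg(generated, reference):
    """Unigram ROUGE-style recall of the reference (stand-in for R_RG)."""
    if not reference:
        return 0.0
    gen = set(generated)
    return sum(w in gen for w in reference) / len(reference)


def r_fuse(generated, reference, gamma):
    """R_fuse(y*) = gamma * R_Coh(y*) + (1 - gamma) * R_RG(y*), gamma in [0, 1]."""
    return gamma * r_coh(generated, reference) + (1 - gamma) * r_rg(generated, reference)
```

With γ near 1 the score rewards staying close to the pre-written copy; with γ near 0 it rewards covering the reference content.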
  • The trained first item copy generation network can be used to guide the training of the initial second item copy generation network to generate item copy.
  • The trained second item copy generation network can accurately and effectively generate item copy according to the title information and attribute information of the item. Specifically, most items have little comment information, or the value of the existing comment information is low, so high-quality item copy cannot be effectively generated based on the comment information alone.
  • The item copy generation network training method in some embodiments of the present disclosure first acquires the item description information of each item in the item set, wherein the item description information includes: title information of the item, attribute information of the item, and at least one piece of comment information of the item. Then, data preprocessing is performed on the item description information set corresponding to the above item set to obtain the processed item description information set. Here, data preprocessing removes meaningless comment information from the above item description information set, so as to avoid affecting the training accuracy of the second item copy generation network.
  • the initial first item copy generation network is trained to obtain a trained first item copy generation network, wherein the training samples corresponding to the above-mentioned initial first item copy generation network include: the items in the above processed item description information set Description information and corresponding item copy, the above item copy is pre-written.
  • the trained first item copy generation network can generate high-quality item copy according to the input item description information.
  • the title information and attribute information of each item description information in the above processed item description information set, and the item copy corresponding to each item description information above are used as the training samples of the initial second item copy generation network.
  • the first item copy generation network uses the knowledge distillation method to train the above-mentioned initial second item copy generation network to obtain a trained second item copy generation network.
  • Having the trained first item copy generation network guide the training of the initial second item copy generation network enables the second item copy generation network to learn, without relying on the at least one piece of comment information of the input item, the characteristic information with which the trained first item copy generation network generates high-quality item copy. In this way, it effectively solves the problem that most items have little comment information, or the value of the existing comment information is low, so that relatively high-quality item copy cannot be effectively generated based on the comment information. It can be seen that the above-mentioned item copy generation network training method enables the trained second item copy generation network to accurately and effectively generate relatively high-quality item copy according to the title information and attribute information of the item.
  • FIG. 3 is a schematic diagram of an application scenario of the item copy generation method according to some embodiments of the present disclosure.
  • the electronic device 301 can acquire title information 3031 and attribute information 3032 of the target item 302 . Then, the above-mentioned title information 3031 and the above-mentioned attribute information 3032 are input into the trained second item copy generation network 304 to obtain the item copy 305 corresponding to the above-mentioned target item 302 .
  • The above-mentioned trained second item copy generation network 304 is obtained by training the initial second item copy generation network with the knowledge distillation method, guided by the trained first item copy generation network.
  • the above-mentioned target item 302 may be: "shoes".
  • the title information 3031 in the above item description information 303 may be: "Title information: special offer, *** official solo, *** co-branded women's shoes, retro canvas shoes, *** women's shoes Baogai shoes women's white and red”.
  • the attribute information 3032 in the above item description information 303 may be: "attribute information: function: breathable, wear-resistant; style: leisure; color: white, red, black, blue; upper material: fabric;”.
  • the above item copy 305 can be: "Shoe copy: *** joint cooperation model, combining classic elements with current trends, logo design embellished on the tongue, personalized fashion, high street style, take you to the streets easily, Soft and comfortable, enhancing the wearing experience.”
  • the above electronic device 301 may be hardware or software.
  • When the electronic device is hardware, it can be implemented as a distributed cluster composed of multiple servers or terminal devices, or as a single server or a single terminal device.
  • When the electronic device is embodied as software, it can be installed in the hardware devices listed above, and can be implemented, for example, as multiple software or software modules for providing distributed services, or as a single software or software module. There is no specific limitation here.
  • It should be understood that the number of electronic devices in FIG. 3 is merely illustrative. There may be any number of electronic devices depending on implementation needs.
  • With continued reference to FIG. 4, a flow 400 of some embodiments of the item copy generation method according to the present disclosure is shown. The item copy generation method includes the following steps:
  • Step 401: obtain title information and attribute information of the target item.
  • In some embodiments, the execution body of the item copy generation method may obtain the title information and attribute information of the target item through a wired connection or a wireless connection.
  • Step 402: input the above title information and the above attribute information into the trained second item copy generation network to obtain the item copy corresponding to the above target item.
  • In some embodiments, the above-mentioned execution body may input the above title information and the above attribute information into the trained second item copy generation network to obtain the item copy corresponding to the above target item.
  • the above-mentioned trained second item copy generation network is obtained by training the initial second item copy generation network with a knowledge distillation method, guided by the trained first item copy generation network.
  • In some optional implementations, inputting the above title information and the above attribute information into the trained second item copy generation network to obtain the item copy corresponding to the above target item may include the following steps:
  • In the first step, word vector conversion is performed on the above title information and the above attribute information to obtain a first vector corresponding to the above title information and a second vector corresponding to the above attribute information.
  • As an example, the above-mentioned execution body may first perform word segmentation on the above title information and the above attribute information to obtain a word set corresponding to the title information and a word set corresponding to the attribute information. Word embedding is then performed on the two word sets to obtain the above first vector and the above second vector.
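A minimal sketch of this first step, assuming a toy vocabulary and a randomly initialized embedding table (both are illustrative stand-ins, not the patent's actual tokenizer or embeddings):

```python
import numpy as np

# Toy vocabulary and embedding matrix; a real system would learn these.
VOCAB = {"<unk>": 0, "breathable": 1, "wear-resistant": 2, "canvas": 3, "shoes": 4}
EMBED_DIM = 8
rng = np.random.default_rng(0)
EMBEDDINGS = rng.normal(size=(len(VOCAB), EMBED_DIM))

def segment(text: str) -> list:
    # A real system would use a Chinese word segmenter; whitespace
    # splitting stands in for it here.
    return text.lower().replace(",", " ").split()

def embed(words: list) -> np.ndarray:
    # Look each word up in the embedding table; unknown words map to <unk>.
    ids = [VOCAB.get(w, VOCAB["<unk>"]) for w in words]
    return EMBEDDINGS[ids]  # shape: (num_words, EMBED_DIM)

first_vector = embed(segment("canvas shoes"))                 # from the title
second_vector = embed(segment("breathable wear-resistant"))   # from attributes
print(first_vector.shape, second_vector.shape)  # (2, 8) (2, 8)
```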
  • In the second step, the above first vector and the above second vector are encoded to obtain a third vector set corresponding to the above title information and a fourth vector set corresponding to the above attribute information.
  • In the third step, for each third vector in the third vector set and the fourth vector corresponding to that third vector, the third vector and the fourth vector are linearly combined to obtain a fifth vector.
  • In the fourth step, the resulting fifth vector set is decoded to obtain the item copy corresponding to the above target item.
  • Optionally, the above first vector and the above second vector are respectively input into a pre-trained encoding network to obtain the above third vector set and the above fourth vector set.
  • the above encoding network includes at least one encoding layer. It should be noted that the above encoding network is a network in which multiple encoding layers are connected in series. Each encoding layer in the above encoding network corresponds to one encoding output vector.
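The series-connected encoding network described above can be sketched as follows; the per-layer transform is a stand-in for a real encoding layer (e.g., a Transformer encoder layer), chosen only to show that each layer consumes the previous layer's output and contributes one output vector:

```python
import numpy as np

def encode(x: np.ndarray, num_layers: int = 3, seed: int = 0) -> list:
    # Layers are connected in series: layer i consumes layer i-1's output.
    # Every layer's output vector is kept, so the result is one encoded
    # vector per encoding layer. The tanh-of-random-projection transform
    # is purely illustrative.
    rng = np.random.default_rng(seed)
    outputs = []
    for _ in range(num_layers):
        w = rng.normal(size=(x.shape[-1], x.shape[-1]))
        x = np.tanh(x @ w)
        outputs.append(x)
    return outputs

outs = encode(np.ones((4,)))
print(len(outs))  # 3 encoded vectors, one per layer
```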
  • linearly combining the third vector and the fourth vector to obtain a fifth vector may include the following steps:
  • In the first step, the above third vector is multiplied by a value η to obtain a first multiplication result.
  • the above value η is a value between 0 and 1.
  • In the second step, the above fourth vector is multiplied by the value 1 − η to obtain a second multiplication result.
  • In the third step, the above first multiplication result and the above second multiplication result are added to obtain the above fifth vector.
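The three steps above amount to a convex combination of the two vectors, which can be sketched as (the vectors and the value of η are illustrative):

```python
import numpy as np

def combine(third: np.ndarray, fourth: np.ndarray, eta: float = 0.6) -> np.ndarray:
    # fifth = eta * third + (1 - eta) * fourth, with eta in [0, 1].
    assert 0.0 <= eta <= 1.0
    return eta * third + (1.0 - eta) * fourth

third = np.array([1.0, 2.0, 3.0])
fourth = np.array([3.0, 2.0, 1.0])
fifth = combine(third, fourth, eta=0.5)
print(fifth)  # [2. 2. 2.]
```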
  • Optionally, the above fifth vector set is input into a pre-trained decoding network with a copy mechanism to obtain the item copy corresponding to the above target item.
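A hedged sketch of one decoding step with a copy mechanism, in the pointer-generator style (the patent does not spell out the exact variant; the distributions below are illustrative stand-ins). The final distribution mixes the decoder's vocabulary distribution with the attention weights over source words, so rare title or attribute words can be copied verbatim:

```python
import numpy as np

def copy_step(p_vocab: np.ndarray, attention: np.ndarray,
              source_ids: list, p_gen: float) -> np.ndarray:
    # Mix the generation distribution with a copy distribution built from
    # the attention weights over the source positions.
    p_final = p_gen * p_vocab
    for pos, word_id in enumerate(source_ids):
        p_final[word_id] += (1.0 - p_gen) * attention[pos]
    return p_final

p_vocab = np.array([0.1, 0.4, 0.3, 0.1, 0.1])  # decoder's vocab distribution
attention = np.array([0.7, 0.3])               # attention over two source tokens
source_ids = [2, 4]                            # source tokens' vocabulary ids
p = copy_step(p_vocab, attention, source_ids, p_gen=0.8)
print(round(p.sum(), 6))  # 1.0 (still a valid distribution)
```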
  • With the item copy generation method of some embodiments of the present disclosure, the title information and attribute information of the target item can first be obtained. Then, inputting the above title information and the above attribute information into the trained second item copy generation network can accurately and effectively generate relatively high-quality item copy corresponding to the above target item.
  • the above-mentioned trained second item copy generation network is obtained by training the initial second item copy generation network with a knowledge distillation method, guided by the trained first item copy generation network.
  • With further reference to FIG. 5, as an implementation of the methods of the above figures, the present disclosure provides some embodiments of an item copy generation network training apparatus. These apparatus embodiments correspond to the method embodiments described above in FIG. 2, and the apparatus may be specifically applied to various electronic devices.
  • the item copy generation network training apparatus 500 of some embodiments includes: an acquisition unit 501, a preprocessing unit 502, a first training unit 503, and a second training unit 504.
  • the acquisition unit 501 is configured to acquire item description information of each item in an item set, wherein the item description information includes: title information of the item, attribute information of the item, and at least one piece of comment information of the item.
  • the preprocessing unit 502 is configured to perform data preprocessing on the item description information set corresponding to the above item set to obtain a processed item description information set.
  • the first training unit 503 is configured to train an initial first item copy generation network to obtain a trained first item copy generation network, wherein the training samples corresponding to the above initial first item copy generation network include: the item description information in the above processed item description information set and the corresponding item copy, the above item copy being pre-written.
  • the second training unit 504 is configured to take the title information and attribute information of each piece of item description information in the above processed item description information set, and the item copy corresponding to each piece of item description information, as training samples of an initial second item copy generation network, and, according to the above trained first item copy generation network, train the above initial second item copy generation network with a knowledge distillation method to obtain a trained second item copy generation network.
  • In some optional implementations, the preprocessing unit 502 of the item copy generation network training apparatus 500 may be further configured to: determine the number of pieces of comment information of each piece of item description information in the above item description information set; remove from the item description information set the item description information whose number of pieces of comment information is less than a predetermined threshold, to obtain a filtered item description information set; and remove, from each piece of item description information in the filtered set, the comment information whose content meets a predetermined condition, to generate processed item description information and obtain the above processed item description information set.
  • In some optional implementations, the second training unit 504 of the item copy generation network training apparatus 500 may be further configured to: take the KL distance set between the conditional probability corresponding to the first target vector output inside the above trained first item copy generation network and the conditional probability set corresponding to the second target vector set output inside the above second item copy generation network as a training constraint, and train the above initial second item copy generation network to obtain the above trained second item copy generation network.
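The KL-distance constraint can be sketched as follows; the teacher and student distributions below are illustrative stand-ins for p(y_t|H_item) and p(y_t|E′_R):

```python
import numpy as np

def kl(teacher: np.ndarray, student: np.ndarray, eps: float = 1e-12) -> float:
    # KL(teacher || student); eps guards against log(0). During distillation
    # this term pulls the student's next-word distribution toward the
    # teacher's.
    return float(np.sum(teacher * np.log((teacher + eps) / (student + eps))))

teacher_p = np.array([0.7, 0.2, 0.1])  # stand-in for p(y_t | H_item)
student_p = np.array([0.5, 0.3, 0.2])  # stand-in for p(y_t | E'_R)
print(kl(teacher_p, teacher_p))        # 0.0 when the two match
print(kl(teacher_p, student_p) > 0.0)  # True: penalty when they differ
```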
  • the units recorded in the apparatus 500 correspond to the respective steps in the method described with reference to FIG. 2 . Therefore, the operations, features and beneficial effects described above with respect to the method are also applicable to the apparatus 500 and the units included therein, and details are not described herein again.
  • With further reference to FIG. 6, the present disclosure provides some embodiments of an item copy generation apparatus. These apparatus embodiments correspond to the method embodiments described above in FIG. 4, and the apparatus may be specifically applied to various electronic devices.
  • the item copy generation apparatus 600 of some embodiments includes: an acquisition unit 601 and an input unit 602.
  • the obtaining unit 601 is configured to obtain title information and attribute information of the target item.
  • the input unit 602 is configured to input the above title information and the above attribute information into the trained second item copy generation network to obtain the item copy corresponding to the above target item, wherein the above trained second item copy generation network is obtained by training the initial second item copy generation network with a knowledge distillation method, guided by the trained first item copy generation network.
  • In some optional implementations, the input unit 602 of the item copy generation apparatus 600 may be further configured to: perform word vector conversion on the above title information and the above attribute information to obtain a first vector corresponding to the above title information and a second vector corresponding to the above attribute information; encode the above first vector and the above second vector to obtain a third vector set corresponding to the above title information and a fourth vector set corresponding to the above attribute information; for each third vector in the third vector set and the fourth vector corresponding to that third vector, linearly combine the third vector and the fourth vector to obtain a fifth vector; and decode the resulting fifth vector set to obtain the item copy corresponding to the above target item.
  • In some optional implementations, the input unit 602 of the item copy generation apparatus 600 may be further configured to: input the above first vector and the above second vector respectively into a pre-trained encoding network to obtain the above third vector set and the above fourth vector set, wherein the above encoding network includes at least one encoding layer.
  • In some optional implementations, the input unit 602 of the item copy generation apparatus 600 may be further configured to: multiply the above third vector by a value η to obtain a first multiplication result, wherein the above value η is a value between 0 and 1; multiply the above fourth vector by the value 1 − η to obtain a second multiplication result; and add the above first multiplication result and the above second multiplication result to obtain the above fifth vector.
  • In some optional implementations, the input unit 602 of the item copy generation apparatus 600 may be further configured to: input the above fifth vector set into a pre-trained decoding network with a copy mechanism to obtain the item copy corresponding to the above target item.
  • the units recorded in the apparatus 600 correspond to the respective steps in the method described with reference to FIG. 4 . Therefore, the operations, features, and beneficial effects described above with respect to the method are also applicable to the apparatus 600 and the units included therein, and will not be repeated here.
  • Referring now to FIG. 7, a schematic structural diagram of an electronic device (e.g., the electronic device in FIG. 1 or FIG. 3) 700 suitable for implementing some embodiments of the present disclosure is shown.
  • the electronic device shown in FIG. 7 is only an example, and should not impose any limitation on the function and scope of use of the embodiments of the present disclosure.
  • the electronic device 700 may include a processing device (e.g., a central processing unit, a graphics processor, etc.) 701 that may perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 702 or a program loaded from a storage device 708 into a random access memory (RAM) 703. The RAM 703 also stores various programs and data required for the operation of the electronic device 700.
  • the processing device 701, the ROM 702, and the RAM 703 are connected to each other through a bus 704.
  • An input/output (I/O) interface 705 is also connected to bus 704 .
  • In general, the following devices may be connected to the I/O interface 705: input devices 706 including, for example, a touch screen, touchpad, keyboard, mouse, camera, microphone, accelerometer, and gyroscope; output devices 707 including, for example, a liquid crystal display (LCD), speakers, and vibrators; storage devices 708 including, for example, a magnetic tape and a hard disk; and a communication device 709. The communication device 709 may allow the electronic device 700 to communicate wirelessly or by wire with other devices to exchange data.
  • Although FIG. 7 shows an electronic device 700 having various devices, it should be understood that not all of the illustrated devices are required to be implemented or provided; more or fewer devices may alternatively be implemented or provided. Each block shown in FIG. 7 may represent one device or, as required, multiple devices.
  • the processes described above with reference to the flowcharts may be implemented as computer software programs.
  • some embodiments of the present disclosure include a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for performing the method illustrated in the flowchart.
  • the computer program may be downloaded and installed from a network via communication device 709, or from storage device 708, or from ROM 702.
  • When the computer program is executed by the processing device 701, the above-mentioned functions defined in the methods of some embodiments of the present disclosure are performed.
  • the computer-readable medium described above may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the foregoing two.
  • the computer-readable storage medium can be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus or device, or a combination of any of the above.
  • a computer-readable storage medium can be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.
  • a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code therein.
  • Such propagated data signals may take a variety of forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • a computer-readable signal medium can also be any computer-readable medium other than a computer-readable storage medium that can transmit, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
  • Program code embodied on a computer readable medium may be transmitted using any suitable medium including, but not limited to, electrical wire, optical fiber cable, RF (radio frequency), etc., or any suitable combination of the foregoing.
  • In some embodiments, the client and the server may communicate using any currently known or future-developed network protocol, such as HTTP (HyperText Transfer Protocol), and may be interconnected with digital data communication in any form or medium (e.g., a communication network).
  • Examples of communication networks include local area networks ("LAN"), wide area networks ("WAN"), internetworks (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future-developed networks.
  • the above-mentioned computer-readable medium may be included in the above-mentioned electronic device, or may exist alone without being assembled into the electronic device.
  • the above-mentioned computer-readable medium carries one or more programs. When executed by the electronic device, the one or more programs cause the electronic device to: acquire item description information of each item in an item set, wherein the item description information includes title information of the item, attribute information of the item, and at least one piece of comment information of the item; perform data preprocessing on the item description information set corresponding to the above item set to obtain a processed item description information set; train an initial first item copy generation network to obtain a trained first item copy generation network, wherein the training samples corresponding to the initial first item copy generation network include the item description information in the processed item description information set and the corresponding pre-written item copy; and take the title information and attribute information of each piece of item description information in the processed item description information set and the corresponding item copy as training samples of an initial second item copy generation network, and, according to the trained first item copy generation network, train the initial second item copy generation network with a knowledge distillation method to obtain a trained second item copy generation network.
  • Alternatively, the one or more programs cause the electronic device to: obtain title information and attribute information of a target item; and input the above title information and the above attribute information into the trained second item copy generation network to obtain the item copy corresponding to the above target item, wherein the trained second item copy generation
  • network is obtained by training the initial second item copy generation network with a knowledge distillation method, guided by the trained first item copy generation network.
  • Computer program code for carrying out operations of some embodiments of the present disclosure may be written in one or more programming languages, or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
  • the program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server.
  • the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (e.g., through the Internet using an Internet service provider).
  • each block in the flowcharts or block diagrams may represent a module, program segment, or portion of code that contains one or more executable instructions for implementing the specified logical functions.
  • the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved.
  • each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by dedicated hardware-based systems that perform the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
  • the units described in some embodiments of the present disclosure may be implemented by means of software, and may also be implemented by means of hardware.
  • the described units may also be provided in a processor, which may, for example, be described as: a processor including an acquisition unit, a preprocessing unit, a first training unit, and a second training unit. The names of these units do not, under certain circumstances, constitute a limitation on the units themselves.
  • For example, the acquisition unit may also be described as "a unit that acquires item description information of each item in the item set".
  • exemplary types of hardware logic components that may be used include: Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on Chip (SOCs), Complex Programmable Logic Devices (CPLDs), and so on.


Abstract

Embodiments of the present disclosure disclose an item copy generation network training method, an item copy generation method, an apparatus, an electronic device, and a computer-readable medium. A specific implementation of the method includes: acquiring item description information of each item in an item set; performing data preprocessing on the item description information set corresponding to the item set to obtain a processed item description information set; training an initial first item copy generation network to obtain a trained first item copy generation network; and training an initial second item copy generation network by a knowledge distillation method to obtain a trained second item copy generation network. In this implementation, the trained first item copy generation network guides the training of the initial second item copy generation network to generate item copy, so that the trained second item copy generation network can accurately and effectively generate item copy from the title information and attribute information of an item.

Description

Item Copy Generation Network Training Method, Item Copy Generation Method, and Apparatus
CROSS-REFERENCE TO RELATED APPLICATIONS
This application claims priority to Chinese Patent Application No. 202110084578.X, filed on January 21, 2021 and entitled "Item Copy Generation Network Training Method, Item Copy Generation Method, and Apparatus", the entire contents of which are incorporated herein by reference.
TECHNICAL FIELD
Embodiments of the present disclosure relate to the field of computer technology, and in particular to an item copy generation network training method, an item copy generation method, an apparatus, an electronic device, and a computer-readable medium.
BACKGROUND
At present, conventional item search and recommendation techniques can no longer satisfy users' growing needs well. When browsing a recommendation system, users often face information overload. Users hope that well-crafted item copy will help them quickly find the products they need and save effort. The commonly adopted approach is to manually analyze content information such as an item's title and attributes together with the item's comment information to produce the item copy.
SUMMARY
This summary introduces concepts in a brief form that are described in detail in the detailed description below. It is not intended to identify key or essential features of the claimed technical solutions, nor is it intended to limit their scope.
Some embodiments of the present disclosure provide item copy generation network training methods, item copy generation methods, apparatuses, devices, and computer-readable media to solve one or more of the technical problems mentioned in the background section above.
In a first aspect, some embodiments of the present disclosure provide an item copy generation network training method, including: acquiring item description information of each item in an item set, wherein the item description information includes title information of the item, attribute information of the item, and at least one piece of comment information of the item; performing data preprocessing on the item description information set corresponding to the item set to obtain a processed item description information set; training an initial first item copy generation network to obtain a trained first item copy generation network, wherein the training samples corresponding to the initial first item copy generation network include the item description information in the processed item description information set and the corresponding item copy, the item copy being pre-written; and taking the title information and attribute information of each piece of item description information in the processed item description information set and the item copy corresponding to each piece of item description information as training samples of an initial second item copy generation network, and training the initial second item copy generation network by a knowledge distillation method according to the trained first item copy generation network to obtain a trained second item copy generation network.
In a second aspect, some embodiments of the present disclosure provide an item copy generation method, including: acquiring title information and attribute information of a target item; and inputting the title information and the attribute information into a trained second item copy generation network to obtain item copy corresponding to the target item, wherein the trained second item copy generation network is trained from an initial second item copy generation network by a knowledge distillation method according to a trained first item copy generation network.
In a third aspect, some embodiments of the present disclosure provide an item copy generation network training apparatus, including: an acquisition unit configured to acquire item description information of each item in an item set, wherein the item description information includes title information of the item, attribute information of the item, and at least one piece of comment information of the item; a preprocessing unit configured to perform data preprocessing on the item description information set corresponding to the item set to obtain a processed item description information set; a first training unit configured to train an initial first item copy generation network to obtain a trained first item copy generation network, wherein the training samples corresponding to the initial first item copy generation network include the item description information in the processed item description information set and the corresponding item copy, the item copy being pre-written; and a second training unit configured to take the title information and attribute information of each piece of item description information in the processed item description information set and the item copy corresponding to each piece of item description information as training samples of an initial second item copy generation network, and to train the initial second item copy generation network by a knowledge distillation method according to the trained first item copy generation network to obtain a trained second item copy generation network.
In a fourth aspect, some embodiments of the present disclosure provide an item copy generation apparatus, including: an acquisition unit configured to acquire title information and attribute information of a target item; and an input unit configured to input the title information and the attribute information into a trained second item copy generation network to obtain item copy corresponding to the target item, wherein the trained second item copy generation network is trained from an initial second item copy generation network by a knowledge distillation method according to a trained first item copy generation network.
In a fifth aspect, some embodiments of the present disclosure provide an electronic device, including: one or more processors; and a storage device having one or more programs stored thereon that, when executed by the one or more processors, cause the one or more processors to implement the method of any implementation of the first or second aspect.
In a sixth aspect, some embodiments of the present disclosure provide a computer-readable medium having a computer program stored thereon, wherein the program, when executed by a processor, implements the method of any implementation of the first or second aspect.
BRIEF DESCRIPTION OF THE DRAWINGS
The above and other features, advantages, and aspects of embodiments of the present disclosure become more apparent with reference to the following detailed description taken in conjunction with the accompanying drawings. Throughout the drawings, identical or similar reference numerals denote identical or similar elements. It should be understood that the drawings are schematic and that components and elements are not necessarily drawn to scale.
FIG. 1 is a schematic diagram of an application scenario of the item copy generation network training method of some embodiments of the present disclosure;
FIG. 2 is a flowchart of some embodiments of the item copy generation network training method according to the present disclosure;
FIG. 3 is a schematic diagram of an application scenario of the item copy generation method of some embodiments of the present disclosure;
FIG. 4 is a flowchart of some embodiments of the item copy generation method according to the present disclosure;
FIG. 5 is a schematic structural diagram of some embodiments of the item copy generation network training apparatus according to the present disclosure;
FIG. 6 is a schematic structural diagram of some embodiments of the item copy generation apparatus according to the present disclosure;
FIG. 7 is a schematic structural diagram of an electronic device suitable for implementing some embodiments of the present disclosure.
DETAILED DESCRIPTION
Embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although certain embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be implemented in various forms and should not be construed as limited to the embodiments set forth here; rather, these embodiments are provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the present disclosure are merely exemplary and are not intended to limit the protection scope of the present disclosure.
It should further be noted that, for ease of description, only the parts related to the relevant invention are shown in the drawings. The embodiments of the present disclosure and the features in the embodiments may be combined with each other in the absence of conflict.
It should be noted that concepts such as "first" and "second" mentioned in the present disclosure are only used to distinguish different apparatuses, modules, or units, and are not used to limit the order of, or interdependence between, the functions performed by these apparatuses, modules, or units.
It should be noted that the modifiers "one" and "multiple" mentioned in the present disclosure are illustrative rather than restrictive; those skilled in the art should understand that, unless the context clearly indicates otherwise, they should be understood as "one or more".
The names of the messages or information exchanged between multiple apparatuses in the embodiments of the present disclosure are for illustrative purposes only and are not intended to limit the scope of these messages or information.
Related copy generation methods, for example, manually analyzing content information such as an item's title and attributes together with the item's comment information to produce item copy, often suffer from the following technical problem: most items have little comment information, or the existing comment information is of low value, so relatively high-quality item copy cannot be generated effectively from the comment information.
To solve the problem stated above, some embodiments of the present disclosure propose an item copy generation network training method and apparatus, in which the trained first item copy generation network guides the training of the initial second item copy generation network to generate item copy, so that the trained second item copy generation network can accurately and effectively generate item copy from the title information and attribute information of an item.
The present disclosure is described in detail below with reference to the drawings and in conjunction with embodiments.
FIG. 1 is a schematic diagram of an application scenario of the item copy generation network training method of some embodiments of the present disclosure.
As shown in FIG. 1, the electronic device 101 may first acquire item description information of each item in the item set 102, where the item description information includes title information of the item, attribute information of the item, and at least one piece of comment information of the item. In this application scenario, the item set 102 includes a first item 1021, a second item 1022, and a third item 1023. The first item 1021 corresponds to item description information 103, the second item 1022 corresponds to item description information 104, and the third item 1023 corresponds to item description information 105. The item description information 103 includes title information 1031, attribute information 1033, and at least one piece of comment information 1032, the latter including first, second, and third comment information. The item description information 104 includes title information 1041, attribute information 1043, and at least one piece of comment information 1042, the latter including fourth and fifth comment information. The item description information 105 includes title information 1051, attribute information 1053, and at least one piece of comment information 1052, the latter including sixth, seventh, and eighth comment information. Then, data preprocessing is performed on the item description information set corresponding to the item set 102 to obtain a processed item description information set; in this application scenario, the processed item description information set includes the item description information 103 and the item description information 105. Further, the initial first item copy generation network 108 is trained to obtain a trained first item copy generation network 109, where the training samples corresponding to the initial first item copy generation network 108 include the item description information in the processed item description information set and the corresponding pre-written item copy; in this application scenario, the training sample set of the initial first item copy generation network 108 may include a training sample composed of the item description information 105 and item copy 106, and a training sample composed of the item description information 103 and item copy 107. Finally, the title information and attribute information of each piece of item description information in the processed item description information set and the corresponding item copy are taken as training samples of the initial second item copy generation network 110, and, according to the trained first item copy generation network 109, the initial second item copy generation network 110 is trained by a knowledge distillation method to obtain a trained second item copy generation network 111. In this application scenario, the training sample set of the initial second item copy generation network 110 includes a training sample composed of the item copy 106 and the attribute information and title information in the item description information 105, and a training sample composed of the item copy 107 and the attribute information and title information in the item description information 103.
It should be noted that the electronic device 101 may be hardware or software. When the electronic device is hardware, it may be implemented as a distributed cluster composed of multiple servers or terminal devices, or as a single server or a single terminal device. When the electronic device is embodied as software, it may be installed in the hardware devices listed above and implemented, for example, as multiple pieces of software or software modules for providing distributed services, or as a single piece of software or software module. No specific limitation is made here.
It should be understood that the number of electronic devices in FIG. 1 is merely illustrative. There may be any number of electronic devices depending on implementation needs.
With continued reference to FIG. 2, a flow 200 of some embodiments of the item copy generation network training method according to the present disclosure is shown. The item copy generation network training method includes the following steps:
Step 201: acquire item description information of each item in an item set.
In some embodiments, the execution body of the item copy generation network training method (for example, the electronic device 101 shown in FIG. 1) may acquire the item description information of each item in the item set through a wired or wireless connection. The item description information includes title information of the item, attribute information of the item, and at least one piece of comment information of the item. Here, the title information of an item may be a short statement indicating the content of the item. The attribute information of an item may include, but is not limited to, at least one of the following: function information of the item, appearance color information of the item, material information of the item, and style information of the item.
As an example, the item may be a pair of shoes.
The title information of the item may be: "Special offer, *** official flagship, *** co-branded women's shoes, retro canvas shoes, women's shoes, Baogai shoes, white and red".
The attribute information of the item may be:
"Function: breathable, wear-resistant;
Style: casual;
Color: white, black, blue;
Upper material: fabric".
The comment information of the item may be:
"The shoes are very breathable, and the color is very pretty",
"The style is quite novel, and the price is quite affordable",
"The quality is good, but delivery took a long time".
It should be pointed out that the above wireless connection may include, but is not limited to, 3G/4G/5G connections, WiFi connections, Bluetooth connections, WiMAX connections, Zigbee connections, UWB (ultra wideband) connections, and other currently known or future-developed wireless connection methods.
Step 202: perform data preprocessing on the item description information set corresponding to the item set to obtain a processed item description information set.
In some embodiments, the execution body may perform data preprocessing on the item description information set corresponding to the item set to obtain the processed item description information set.
In some optional implementations of some embodiments, performing data preprocessing on the item description information set corresponding to the item set to obtain the processed item description information set may include the following steps:
First, determine the number of pieces of comment information of each piece of item description information in the item description information set. As an example, the execution body may determine this number by querying a database that stores comment information.
Second, remove from the item description information set the item description information whose number of pieces of comment information is less than a predetermined threshold, to obtain a filtered item description information set. As an example, the predetermined threshold may be the value 3.
Third, remove from each piece of item description information in the filtered item description information set the comment information whose content meets a predetermined condition, to generate processed item description information and thereby obtain the processed item description information set. Here, comment content meeting the predetermined condition may be content that matches a predetermined template or content of little reference value.
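The preprocessing steps above can be sketched as follows; the comment threshold and the template phrases are illustrative assumptions, not values from the patent:

```python
# Illustrative assumptions: a minimum comment count and a set of
# template-like (low-value) comments to drop.
THRESHOLD = 3
TEMPLATE_PHRASES = {"good", "default praise"}

def preprocess(descriptions: list) -> list:
    # First drop items with too few comments, then drop template-like
    # comments from the remaining items.
    processed = []
    for d in descriptions:
        if len(d["comments"]) >= THRESHOLD:
            cleaned = [c for c in d["comments"] if c not in TEMPLATE_PHRASES]
            processed.append({**d, "comments": cleaned})
    return processed

items = [
    {"title": "shoes", "comments": ["breathable", "good", "nice color"]},
    {"title": "hat", "comments": ["good"]},
]
result = preprocess(items)
print(len(result))  # 1 (the hat has too few comments)
```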
Step 203: train an initial first item copy generation network to obtain a trained first item copy generation network.
In some embodiments, the execution body trains the initial first item copy generation network to obtain the trained first item copy generation network. The training samples corresponding to the initial first item copy generation network include the item description information in the processed item description information set and the corresponding item copy; the item copy is pre-written. It should be noted that the training process of the initial first item copy generation network follows conventional training steps and is not repeated here.
Step 204: train the initial second item copy generation network by a knowledge distillation method to obtain a trained second item copy generation network.
In some embodiments, the execution body may take the title information and attribute information of each piece of item description information in the processed item description information set as the training data of a training sample, and the item copy corresponding to each piece of item description information as the label of that training data, and, according to the trained first item copy generation network, train the initial second item copy generation network by a knowledge distillation method to obtain the trained second item copy generation network. The first item copy generation network may be the teacher network in a teacher-student framework; correspondingly, the second item copy generation network may be the student network. The knowledge distillation method transfers knowledge so that a smaller model better suited to inference is obtained from a trained large model.
It should be noted that, because the training samples of the trained first item copy generation network include the comment information of items, the trained first network can learn to generate high-quality item copy from the title information, attribute information, and at least one piece of comment information of the item description information. However, in a recommendation system most items have only a small amount of comment information, so item copy generated by the trained first network alone would not be of sufficient quality. Therefore, the first item copy generation network serves as the teacher network and the second item copy generation network as the student network, where the training samples of the second network include the title information and attribute information of the item description information but not the comment information of the item. The trained first network guides the training of the second network, so that the second network learns the trained first network's knowledge of generating high-quality item copy.
In some optional implementations of some embodiments, the KL distance set between the conditional probability corresponding to the first target vector output inside the trained first item copy generation network and the conditional probability set corresponding to the second target vector set output inside the second item copy generation network is used as a training constraint, and the initial second item copy generation network is trained to obtain the trained second item copy generation network. The conditional probability corresponding to the first target vector may be p(y_t|H_item), where y_t may be the t-th word of the generated item copy and H_item may be the first target vector corresponding to the item. The conditional probability corresponding to the second target vector may be p(y_t|E′_R), where E′_R may be the second target vector corresponding to the item. The KL distance, with parameter θ, may be obtained by a formula of the form (the equation appears as images appb-000001 and appb-000002 in the original):

L_KD(θ) = Σ_t KL( p(y_t | H_item) ‖ p(y_t | E′_R) ),

where θ may be the parameter and L_KD(θ) may be the KL distance with parameter θ.
As an example, for the trained first item copy generation network: first, word vector conversion is performed on the at least one piece of comment information, the attribute information, and the title information of the target item to obtain a vector corresponding to the at least one piece of comment information, a vector corresponding to the attribute information, and a vector corresponding to the title information. These vectors are respectively input into an encoding network including multiple encoding layers to obtain an encoded vector set corresponding to the at least one piece of comment information, an encoded vector set corresponding to the attribute information, and an encoded vector set corresponding to the title information. The encoding network is a network in which multiple encoding layers are connected in series, and each encoding layer corresponds to one encoding output vector. As an example, the encoding network may be the encoder of a Transformer model, which includes multiple encoding layers.
Then, the vector with the highest weight is selected from the encoded vector set corresponding to the at least one piece of comment information as the first target vector. The first target vector is input into a pre-trained feed-forward neural network to obtain a first output vector.
Next, based on the encoding layer corresponding to each vector in the encoded vector set corresponding to the attribute information and each vector in the encoded vector set corresponding to the title information, feature fusion is performed on the two encoded vector sets to obtain a fused vector set. The fused vector set is input into an activation function to obtain a second output vector set. The activation function may be the GELU (Gaussian Error Linear Units) activation function.
Finally, the vector with the highest weight is selected from the first output vector and the second output vector set as a third output vector. The third output vector is added to the first output vector to obtain a first sum vector. The first sum vector is normalized to obtain a fourth output vector. The fourth output vector is input into a pre-trained feed-forward neural network to obtain a fifth output vector. The fifth output vector and the fourth output vector are added to obtain a second sum vector. The second sum vector is normalized to obtain a sixth output vector, which serves as the first target vector output inside the trained first item copy generation network.
With respect to the trained first item copy generation network, the second target vector set output inside the second item copy generation network may correspond to the fifth vector set in the item copy generation method described herein.
In some optional implementations of some embodiments, the loss function of the initial second item copy generation network is generated from the KL distance formula, a loss function characterizing the relevance between the item copy generated by the first item copy generation network and the attribute information of the item, and a loss function characterizing the relevance between the item copy generated by the second item copy generation network and the attribute information of the item. As an example, the loss function of the initial second item copy generation network is given by the formula shown as equation image appb-000003 in the original, in which: the term shown as image appb-000004 may be the loss function of the initial second item copy generation network with parameter θ; α may be an adjustment parameter with a value range of [0, 1]; the term shown as image appb-000005 may be the loss function, with parameter θ, characterizing the relevance between the item copy generated by the second item copy generation network and the attribute information of the item; and the term shown as image appb-000006 may be the loss function, with parameter θ, characterizing the relevance between the item copy generated by the first item copy generation network and the attribute information of the item.
Here, the loss term shown as image appb-000007 may be given by the formula shown as image appb-000008, where E′_R may be the second target vector corresponding to the target item, y_t may be the t-th word of the generated copy of the target item, p(y_t|E′_R) may characterize the output of the decoding network, and θ may be a parameter.
The term shown as image appb-000009 may be given by the formula shown as image appb-000010, where y_t may be the t-th word of the generated copy of the target item, |S| may be the number of words of the generated item copy, y<t may characterize the set of the 1st to (t−1)-th words of the generated copy of the target item, T may be the first vector corresponding to the title information, A may be the second vector corresponding to the attribute information, and R_fuse(y<t) may be the relevance score of each of the 1st to (t−1)-th words.
R_fuse(y<t) is generated by the following formula:

R_fuse(y*) = β·R_Coh(y*) + (1 − β)·R_RG(y*),

where β may be an adjustment parameter with a value range of [0, 1], y* may be y<t, R_Coh(y*) may characterize the overlap between the generated y* and the pre-written copy, and R_RG(y*) may characterize the ROUGE score of y*.
R_Coh(y*) may be given by the formula shown as image appb-000011, where |y*| may be the number of words and f(·) may be a word-frequency function.
It should be noted that, if the words in y* exist in the pre-written item copy used as the label, then f(·) takes the value shown as image appb-000012; if the words in y* do not exist in the pre-written item copy used as the label, then f(·) takes the value shown as image appb-000013.
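The fused relevance score R_fuse can be sketched as follows; the component scores below stand in for R_Coh (overlap with the pre-written copy) and R_RG (ROUGE score), which are not implemented here:

```python
def r_fuse(r_coh: float, r_rg: float, beta: float = 0.5) -> float:
    # R_fuse(y*) = beta * R_Coh(y*) + (1 - beta) * R_RG(y*),
    # with beta in [0, 1].
    assert 0.0 <= beta <= 1.0
    return beta * r_coh + (1.0 - beta) * r_rg

# Stand-in component scores: 0.8 for overlap, 0.4 for ROUGE.
print(round(r_fuse(0.8, 0.4, beta=0.5), 6))  # 0.6
```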
The above embodiments of the present disclosure have the following beneficial effects: with the item copy generation network training method of some embodiments of the present disclosure, the trained first item copy generation network guides the training of the initial second item copy generation network to generate item copy, so that the trained second item copy generation network can accurately and effectively generate item copy from the title information and attribute information of an item. Specifically, most items have little comment information, or the existing comment information is of low value, so relatively high-quality item copy cannot be generated effectively from the comment information. On this basis, the item copy generation network training method of some embodiments of the present disclosure first acquires the item description information of each item in the item set, where the item description information includes the title information of the item, the attribute information of the item, and at least one piece of comment information of the item. Then, data preprocessing is performed on the item description information set corresponding to the item set to obtain a processed item description information set; here, the preprocessing removes meaningless comment information and avoids degrading the training accuracy of the second item copy generation network. Further, the initial first item copy generation network is trained to obtain a trained first item copy generation network, where the corresponding training samples include the item description information in the processed item description information set and the corresponding pre-written item copy; the trained first network can generate high-quality item copy from the input item description information. Finally, the title information and attribute information of each piece of item description information in the processed item description information set and the corresponding item copy are taken as training samples of the initial second item copy generation network, and, according to the trained first item copy generation network, the initial second item copy generation network is trained by a knowledge distillation method to obtain a trained second item copy generation network. Having the trained first network guide the training of the initial second network enables the second network to learn, without relying on the at least one piece of comment information of the input item, some of the characteristic information with which the trained first network generates high-quality item copy. This effectively solves the problem that most items have little comment information, or comment information of low value, from which relatively high-quality item copy cannot be generated effectively. It follows that the above item copy generation network training method enables the trained second item copy generation network to accurately and effectively generate relatively high-quality item copy from the title information and attribute information of an item.
图3是本公开的一些实施例的物品文案生成方法的一个应用场景图的示意图。
如图3所示,电子设备301可以获取目标物品302的标题信息3031和属性信息3032。然后,将上述标题信息3031和上述属性信息3032输入至训练后的第二物品文案生成网络304,得到上述目标物品302对应的物品文案305。其中,上述训练后的第二物品文案生成网络304是根据训练后的第一物品文案生成网络,利用知识蒸馏方法对初始第二物品文案生成网络训练的。在本应用场景中,上述目标物品302可以是:“鞋子”。上述物品描述信息303中的标题信息3031可以是:“标题信息:特价款,***官方旗舰,***联名女鞋,复古帆布鞋,***女鞋宝盖鞋子女白红”。上述物品描述信息303中的属性信息3032可以是:“属性信息:功能:透气,耐磨;风格:休闲;颜色:白色,红色,黑色,蓝色;鞋面材质:织物;”。上述物品文案305可以是:“鞋子文案:***联名合作款,将经典元素与当下潮流结合,鞋舌处点缀的logo设计,个性时尚,高街范立现,带你轻松玩转街头,柔软舒适,提升穿着体验”。
需要说明的是,上述电子设备301可以是硬件,也可以是软件。当电子设备为硬件时,可以实现成多个服务器或终端设备组成的分布式集群,也可以实现成单个服务器或单个终端设备。当电子设备体现为软件时,可以安装在上述所列举的硬件设备中。其可以实现成例如用来提供分布式服务的多个软件或软件模块,也可以实现成单个软件或软件模块。在此不做具体限定。
应该理解,图3中的电子设备的数目仅仅是示意性的。根据实现需要,可以具有任意数目的电子设备。
继续参考图4,示出了根据本公开的物品文案生成方法的一些实施例的流程400。该物品文案生成方法,包括以下步骤:
步骤401,获取目标物品的标题信息和属性信息。
在一些实施例中,物品文案生成方法的执行主体(例如图3所示的电子设备301)可以通过有线连接方式或者无线连接方式获取目标物品的标题信息和属性信息。
步骤402,将上述标题信息和上述属性信息输入至训练后的第二物品文案生成网络,得到上述目标物品对应的物品文案。
在一些实施例中,上述执行主体可以将上述标题信息和上述属性信息输入至训练后的第二物品文案生成网络,得到上述目标物品对应的物品文案。其中,上述训练后的第二物品文案生成网络是根据训练后的第一物品文案生成网络,利用知识蒸馏方法对初始第二物品文案生成网络训练的。
在一些实施例的一些可选的实现方式中,上述将上述标题信息和上述属性信息输入至训练后的第二物品文案生成网络,得到上述目标物品对应的物品文案,可以包括以下步骤:
第一步,对上述标题信息和上述属性信息进行词向量转换,得到与上述标题信息对应的第一向量和上述属性信息对应的第二向量。作为示例,上述执行主体可以首先对上述标题信息和上述属性信息进行分词,得到上述标题信息对应的词集和上述属性信息对应的词集。然后对上述标题信息对应的词集和上述属性信息对应的词集进行词嵌入处理,得到上述第一向量和上述第二向量。
第二步,对上述第一向量和上述第二向量进行编码,得到上述标题信息对应的第三向量集和上述属性信息对应的第四向量集。
第三步,对于第三向量集中每个第三向量和上述第三向量对应的第四向量,对上述第三向量和上述第四向量进行线性组合,得到第五向量。
第四步,对所得到的第五向量集进行解码,得到上述目标物品对应的物品文案。
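上述四个步骤的数据流可以用如下玩具代码示意(仅为说明向量如何流转,非本公开限定的实现:真实的词嵌入与预先训练的编码网络此处分别以确定性字符编码和恒等映射代替,word_vectors、encode、combine 等名称均为示意性假设):

```python
def word_vectors(tokens, dim=4):
    # 第一步示意: 以确定性的字符编码代替真实的词嵌入处理(仅为演示)
    return [[(ord(t[0]) % (i + 2)) / 10.0 for i in range(dim)] for t in tokens]


def encode(vectors):
    # 第二步示意: 此处用恒等映射代替预先训练的编码网络
    return [list(v) for v in vectors]


def combine(title_encoded, attr_encoded, eta=0.6):
    # 第三步: 第五向量 = η·第三向量 + (1-η)·第四向量, η 为 0-1 之间的数值
    return [[eta * t + (1.0 - eta) * a for t, a in zip(tv, av)]
            for tv, av in zip(title_encoded, attr_encoded)]
```

第四步的解码由带拷贝机制的解码网络完成,此处不展开;所得第五向量集即解码网络的输入。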
可选的,将上述第一向量和上述第二向量分别输入至预先训练的编码网络,得到上述第三向量集和上述第四向量集。其中,上述编码网络包括至少一层编码层。需要说明的是,上述编码网络是多层编码层串行连接的网络。上述编码网络中的每层编码层对应着一个编码输出向量。
可选的,上述对于第三向量集中每个第三向量和上述第三向量对应的第四向量,对上述第三向量和上述第四向量进行线性组合,得到第五向量,可以包括以下步骤:
第一步,将上述第三向量与数值η相乘,得到第一相乘结果。其中,上述数值η为0-1之间的数值。
第二步,将上述第四向量与数值1-η相乘,得到第二相乘结果。
第三步,将上述第一相乘结果与上述第二相乘结果进行相加,得到上述第五向量。
可选的,将上述第五向量集输入至预先训练的带有拷贝(copy)机制的解码网络,得到上述目标物品对应的物品文案。
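带有拷贝机制的解码网络在每个时间步将词表生成分布与从输入序列拷贝的分布按门控概率混合,可以用如下代码示意(copy_mix 为示意性名称,gate 假设为解码网络计算得到的生成概率,此处作为参数传入):

```python
def copy_mix(p_vocab, p_copy, gate):
    # 拷贝机制示意: 最终词分布 = gate·词表生成分布 + (1-gate)·从输入拷贝的分布;
    # 两个输入分布均以 {词语: 概率} 字典表示
    words = set(p_vocab) | set(p_copy)
    return {w: gate * p_vocab.get(w, 0.0) + (1.0 - gate) * p_copy.get(w, 0.0)
            for w in words}
```

这样,标题或属性中的低频词(如品牌名、材质词)即使不在生成词表中,也能被直接拷贝到物品文案里。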
本公开的上述各个实施例中具有如下有益效果:通过本公开的一些实施例的物品文案生成方法可以首先获取目标物品的标题信息和属性信息。然后,将上述标题信息和上述属性信息输入至训练后的第二物品文案生成网络,可以准确、有效的生成上述目标物品对应的较为优质的物品文案。其中,上述训练后的第二物品文案生成网络是根据训练后的第一物品文案生成网络,利用知识蒸馏方法对初始第二物品文案生成网络训练的。
继续参考图5,作为对上述各图所示方法的实现,本公开提供了一种物品文案生成网络训练装置的一些实施例,这些装置实施例与图2所示的那些方法实施例相对应,该装置具体可以应用于各种电子设备中。
如图5所示,一些实施例的物品文案生成网络训练装置500包括:获取单元501、预处理单元502、第一训练单元503和第二训练单元504。其中,获取单元501,被配置成获取物品集中每个物品的物品描述信息,其中,上述物品描述信息包括:物品的标题信息、物品的属性信息和物品的至少一个评论信息。预处理单元502,被配置成对上述物品集对应的物品描述信息集进行数据预处理,得到处理后的物品描述信息集。第一训练单元503,被配置成对初始第一物品文案生成网络进行训练,得到训练后的第一物品文案生成网络,其中,上述初始第一物品文案生成网络对应的训练样本包括:上述处理后的物品描述信息集中的物品描述信息和对应的物品文案,上述物品文案是预先撰写的。第二训练单元504,被配置成将上述处理后的物品描述信息集中每个物品描述信息的标题信息、属性信息和上述每个物品描述信息对应的物品文案作为初始第二物品文案生成网络的训练样本,根据上述训练后的第一物品文案生成网络,利用知识蒸馏方法,对上述初始第二物品文案生成网络进行训练,得到训练后的第二物品文案生成网络。
在一些实施例的一些可选的实现方式中,物品文案生成网络训练装置500的预处理单元502可以进一步被配置成:确定上述物品描述信息集中每个物品描述信息的评论信息的数目;从上述物品描述信息集中去除评论信息的数目小于预定阈值的物品描述信息,得到去除后的物品描述信息集;从上述去除后的物品描述信息集中的每个物品描述信息中去除评论内容满足预定条件的评论信息以生成处理后的物品描述信息,得到上述处理后的物品描述信息集。
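预处理单元502的两步过滤逻辑可以草拟如下(仅为示意:preprocess、min_reviews、is_noise 均为假设的名称与参数,预定阈值与"评论内容满足预定条件"此处以评论数目下限和评论过短近似):

```python
def preprocess(descriptions, min_reviews=2, is_noise=lambda c: len(c) < 5):
    # 第一步: 从物品描述信息集中去除评论数目小于预定阈值的物品描述信息
    kept = [d for d in descriptions if len(d["reviews"]) >= min_reviews]
    # 第二步: 从剩余的每个物品描述信息中去除满足预定条件(此处示意为过短)的评论
    return [dict(d, reviews=[c for c in d["reviews"] if not is_noise(c)])
            for d in kept]
```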
在一些实施例的一些可选的实现方式中,物品文案生成网络训练装置500的第二训练单元504可以进一步被配置成:将上述训练后的第一物品文案生成网络内部输出的第一目标向量对应的条件概率与上述第二物品文案生成网络内部输出的第二目标向量集对应的条件概率集之间的KL距离集作为训练约束条件,对上述初始第二物品文案生成网络进行训练,得到上述训练后的第二物品文案生成网络。
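上述以KL距离作为训练约束条件的做法,其核心是最小化教师网络(第一物品文案生成网络)与学生网络(第二物品文案生成网络)目标向量对应的条件概率分布之间的KL距离,可以用如下代码示意(仅为离散分布上KL距离的计算草图,eps 为避免取对数溢出的示意性平滑项):

```python
import math


def kl_divergence(p, q, eps=1e-12):
    # KL(P‖Q): 教师网络条件概率 P 与学生网络条件概率 Q 之间的 KL 距离;
    # 两分布以等长的概率列表表示, 值越小表示学生越接近教师
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))
```

训练时可将该距离(或其在第二目标向量集上的求和)作为约束项加入学生网络的损失中。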
可以理解的是,该装置500中记载的诸单元与参考图2描述的方法中的各个步骤相对应。由此,上文针对方法描述的操作、特征以及产生的有益效果同样适用于装置500及其中包含的单元,在此不再赘述。
继续参考图6,作为对上述各图上述方法的实现,本公开提供了一种物品文案生成装置的一些实施例,这些装置实施例与图4上述的那些方法实施例相对应,该装置具体可以应用于各种电子设备中。
如图6所示,一些实施例的物品文案生成装置600包括:获取单元601和输入单元602。其中,获取单元601,被配置成获取目标物品的标题信息和属性信息。输入单元602,被配置成将上述标题信息和上述属性信息输入至训练后的第二物品文案生成网络,得到上述目标物品对应的物品文案,其中,上述训练后的第二物品文案生成网络是根据训练后的第一物品文案生成网络,利用知识蒸馏方法对初始第二物品文案生成网络训练的。
在一些实施例的一些可选的实现方式中,物品文案生成装置600的输入单元602可以进一步被配置成:对上述标题信息和上述属性信息进行词向量转换,得到与上述标题信息对应的第一向量和上述属性信息对应的第二向量;对上述第一向量和上述第二向量进行编码,得到上述标题信息对应的第三向量集和上述属性信息对应第四向量集;对于第三向量集中每个第三向量和上述第三向量对应的第四向量,对上述第三向量和上述第四向量进行线性组合,得到第五向量;对所得到的第五向量集进行解码,得到上述目标物品对应的物品文案。
在一些实施例的一些可选的实现方式中,物品文案生成装置600的输入单元602可以进一步被配置成:将上述第一向量和上述第二向量分别输入至预先训练的编码网络,得到上述第三向量集和上述第四向量集,其中,上述编码网络包括至少一层编码层。
在一些实施例的一些可选的实现方式中,物品文案生成装置600的输入单元602可以进一步被配置成:将上述第三向量与数值η相乘,得到第一相乘结果,其中,上述数值η为0-1之间的数值;将上述第四向量与数值1-η相乘,得到第二相乘结果;将上述第一相乘结果与上述第二相乘结果进行相加,得到上述第五向量。
在一些实施例的一些可选的实现方式中,物品文案生成装置600的输入单元602可以进一步被配置成:将上述第五向量集输入至预先训练的带有拷贝机制的解码网络,得到上述目标物品对应的物品文案。
可以理解的是,该装置600中记载的诸单元与参考图4描述的方法中的各个步骤相对应。由此,上文针对方法描述的操作、特征以及产生的有益效果同样适用于装置600及其中包含的单元,在此不再赘述。
下面参考图7,其示出了适于用来实现本公开的一些实施例的电子设备(例如图1或图3中的电子设备)700的结构示意图。图7示出的电子设备仅仅是一个示例,不应对本公开的实施例的功能和使用范围带来任何限制。
如图7所示,电子设备700可以包括处理装置(例如中央处理器、图形处理器等)701,其可以根据存储在只读存储器(ROM)702中的程序或者从存储装置708加载到随机访问存储器(RAM)703中的程序而执行各种适当的动作和处理。在RAM 703中,还存储有电子设备700操作所需的各种程序和数据。处理装置701、ROM 702以及RAM 703通过总线704彼此相连。输入/输出(I/O)接口705也连接至总线704。
通常,以下装置可以连接至I/O接口705:包括例如触摸屏、触摸板、键盘、鼠标、摄像头、麦克风、加速度计、陀螺仪等的输入装置706;包括例如液晶显示器(LCD)、扬声器、振动器等的输出装置707;包括例如磁带、硬盘等的存储装置708;以及通信装置709。通信装置709可以允许电子设备700与其他设备进行无线或有线通信以交换数据。虽然图7示出了具有各种装置的电子设备700,但是应理解的是,并不要求实施或具备所有示出的装置。可以替代地实施或具备更多或更少的装置。图7中示出的每个方框可以代表一个装置,也可以根据需要代表多个装置。
特别地,根据本公开的一些实施例,上文参考流程图描述的过程可以被实现为计算机软件程序。例如,本公开的一些实施例包括一种计算机程序产品,其包括承载在计算机可读介质上的计算机程序,该计算机程序包含用于执行流程图所示的方法的程序代码。在这样的一些实施例中,该计算机程序可以通过通信装置709从网络上被下载和安装,或者从存储装置708被安装,或者从ROM 702被安装。在该计算机程序被处理装置701执行时,执行本公开的一些实施例的方法中限定的上述功能。
需要说明的是,本公开的一些实施例上述的计算机可读介质可以是计算机可读信号介质或者计算机可读存储介质或者是上述两者的任意组合。计算机可读存储介质例如可以是——但不限于——电、磁、光、电磁、红外线、或半导体的系统、装置或器件,或者任意以上的组合。计算机可读存储介质的更具体的例子可以包括但不限于:具有一个或多个导线的电连接、便携式计算机磁盘、硬盘、随机访问存储器(RAM)、只读存储器(ROM)、可擦式可编程只读存储器(EPROM或闪存)、光纤、便携式紧凑磁盘只读存储器(CD-ROM)、光存储器件、磁存储器件、或者上述的任意合适的组合。在本公开的一些实施例中,计算机可读存储介质可以是任何包含或存储程序的有形介质,该程序可以被指令执行系统、装置或者器件使用或者与其结合使用。而在本公开的一些实施例中,计算机可读信号介质可以包括在基带中或者作为载波一部分传播的数据信号,其中承载了计算机可读的程序代码。这种传播的数据信号可以采用多种形式,包括但不限于电磁信号、光信号或上述的任意合适的组合。计算机可读信号介质还可以是计算机可读存储介质以外的任何计算机可读介质,该计算机可读信号介质可以发送、传播或者传输用于由指令执行系统、装置或者器件使用或者与其结合使用的程序。计算机可读介质上包含的程序代码可以用任何适当的介质传输,包括但不限于:电线、光缆、RF(射频)等等,或者上述的任意合适的组合。
在一些实施方式中,客户端、服务器可以利用诸如HTTP(HyperText Transfer Protocol,超文本传输协议)之类的任何当前已知或未来研发的网络协议进行通信,并且可以与任意形式或介质的数字数据通信(例如,通信网络)互连。通信网络的示例包括局域网(“LAN”),广域网(“WAN”),网际网(例如,互联网)以及端对端网络(例如,ad hoc端对端网络),以及任何当前已知或未来研发的网络。
上述计算机可读介质可以是上述装置中所包含的;也可以是单独存在,而未装配入该电子设备中。上述计算机可读介质承载有一个或者多个程序,当上述一个或者多个程序被该电子设备执行时,使得该电子设备:获取物品集中每个物品的物品描述信息,其中,上述物品 描述信息包括:物品的标题信息、物品的属性信息和物品的至少一个评论信息;对上述物品集对应的物品描述信息集进行数据预处理,得到处理后的物品描述信息集;对初始第一物品文案生成网络进行训练,得到训练后的第一物品文案生成网络,其中,上述初始第一物品文案生成网络对应的训练样本包括:上述处理后的物品描述信息集中的物品描述信息和对应的物品文案,上述物品文案是预先撰写的;将上述处理后的物品描述信息集中每个物品描述信息的标题信息、属性信息和上述每个物品描述信息对应的物品文案作为初始第二物品文案生成网络的训练样本,根据上述训练后的第一物品文案生成网络,利用知识蒸馏方法,对上述初始第二物品文案生成网络进行训练,得到训练后的第二物品文案生成网络。获取目标物品的标题信息和属性信息;将上述标题信息和上述属性信息输入至训练后的第二物品文案生成网络,得到上述目标物品对应的物品文案,其中,上述训练后的第二物品文案生成网络是根据训练后的第一物品文案生成网络,利用知识蒸馏方法对初始第二物品文案生成网络训练的。
可以以一种或多种程序设计语言或其组合来编写用于执行本公开的一些实施例的操作的计算机程序代码,上述程序设计语言包括面向对象的程序设计语言—诸如Java、Smalltalk、C++,还包括常规的过程式程序设计语言—诸如“C”语言或类似的程序设计语言。程序代码可以完全地在用户计算机上执行、部分地在用户计算机上执行、作为一个独立的软件包执行、部分在用户计算机上部分在远程计算机上执行、或者完全在远程计算机或服务器上执行。在涉及远程计算机的情形中,远程计算机可以通过任意种类的网络——包括局域网(LAN)或广域网(WAN)——连接到用户计算机,或者,可以连接到外部计算机(例如利用因特网服务提供商来通过因特网连接)。
附图中的流程图和框图,图示了按照本公开各种实施例的系统、方法和计算机程序产品的可能实现的体系架构、功能和操作。在这点上,流程图或框图中的每个方框可以代表一个模块、程序段、或代码的一部分,该模块、程序段、或代码的一部分包含一个或多个用于实现规定的逻辑功能的可执行指令。也应当注意,在有些作为替换的实 现中,方框中所标注的功能也可以以不同于附图中所标注的顺序发生。例如,两个接连地表示的方框实际上可以基本并行地执行,它们有时也可以按相反的顺序执行,这依所涉及的功能而定。也要注意的是,框图和/或流程图中的每个方框、以及框图和/或流程图中的方框的组合,可以用执行规定的功能或操作的专用的基于硬件的系统来实现,或者可以用专用硬件与计算机指令的组合来实现。
描述于本公开的一些实施例中的单元可以通过软件的方式实现,也可以通过硬件的方式来实现。所描述的单元也可以设置在处理器中,例如,可以描述为:一种处理器包括获取单元、预处理单元、第一训练单元和第二训练单元。其中,这些单元的名称在某种情况下并不构成对该单元本身的限定,例如,获取单元还可以被描述为“获取物品集中每个物品的物品描述信息的单元”。
本文中以上描述的功能可以至少部分地由一个或多个硬件逻辑部件来执行。例如,非限制性地,可以使用的示范类型的硬件逻辑部件包括:现场可编程门阵列(FPGA)、专用集成电路(ASIC)、专用标准产品(ASSP)、片上系统(SOC)、复杂可编程逻辑设备(CPLD)等等。
以上描述仅为本公开的一些较佳实施例以及对所运用技术原理的说明。本领域技术人员应当理解,本公开的实施例中所涉及的发明范围,并不限于上述技术特征的特定组合而成的技术方案,同时也应涵盖在不脱离上述发明构思的情况下,由上述技术特征或其等同特征进行任意组合而形成的其它技术方案。例如上述特征与本公开的实施例中公开的(但不限于)具有类似功能的技术特征进行互相替换而形成的技术方案。

Claims (13)

  1. 一种物品文案生成网络训练方法,包括:
    获取物品集中每个物品的物品描述信息,其中,所述物品描述信息包括:物品的标题信息、物品的属性信息和物品的至少一个评论信息;
    对所述物品集对应的物品描述信息集进行数据预处理,得到处理后的物品描述信息集;
    对初始第一物品文案生成网络进行训练,得到训练后的第一物品文案生成网络,其中,所述初始第一物品文案生成网络对应的训练样本包括:所述处理后的物品描述信息集中的物品描述信息和对应的物品文案,所述物品文案是预先撰写的;
    将所述处理后的物品描述信息集中每个物品描述信息的标题信息、属性信息和所述每个物品描述信息对应的物品文案作为初始第二物品文案生成网络的训练样本,根据所述训练后的第一物品文案生成网络,利用知识蒸馏方法,对所述初始第二物品文案生成网络进行训练,得到训练后的第二物品文案生成网络。
  2. 根据权利要求1所述的方法,其中,所述对所述物品集对应的物品描述信息集进行数据预处理,得到处理后的物品描述信息集,包括:
    确定所述物品描述信息集中每个物品描述信息的评论信息的数目;
    从所述物品描述信息集中去除评论信息的数目小于预定阈值的物品描述信息,得到去除后的物品描述信息集;
    从所述去除后的物品描述信息集中的每个物品描述信息中去除评论内容满足预定条件的评论信息以生成处理后的物品描述信息,得到所述处理后的物品描述信息集。
  3. 根据权利要求1或2所述的方法,其中,所述根据所述训练后的第一物品文案生成网络,利用知识蒸馏方法,对所述初始第二物品文案生成网络进行训练,得到训练后的第二物品文案生成网络,包括:
    将所述训练后的第一物品文案生成网络内部输出的第一目标向量对应的条件概率与所述第二物品文案生成网络内部输出的第二目标向量集对应的条件概率集之间的KL距离集作为训练约束条件,对所述初始第二物品文案生成网络进行训练,得到所述训练后的第二物品文案生成网络。
  4. 根据权利要求1-3之一所述的方法,其中,所述初始第二物品文案生成网络的损失函数是根据KL距离公式、表征所述第一物品文案生成网络生成的物品文案与物品的属性信息相关性的损失函数和表征所述第二物品文案生成网络生成的物品文案与物品的属性信息相关性的损失函数生成的。
  5. 一种物品文案生成方法,包括:
    获取目标物品的标题信息和属性信息;
    将所述标题信息和所述属性信息输入至训练后的第二物品文案生成网络,得到所述目标物品对应的物品文案,其中,所述训练后的第二物品文案生成网络是根据训练后的第一物品文案生成网络,利用知识蒸馏方法对初始第二物品文案生成网络训练的。
  6. 根据权利要求5所述的方法,其中,所述将所述标题信息和所述属性信息输入至训练后的第二物品文案生成网络,得到所述目标物品对应的物品文案,包括:
    对所述标题信息和所述属性信息进行词向量转换,得到与所述标题信息对应的第一向量和所述属性信息对应的第二向量;
    对所述第一向量和所述第二向量进行编码,得到所述标题信息对应的第三向量集和所述属性信息对应的第四向量集;
    对于第三向量集中每个第三向量和所述第三向量对应的第四向量,对所述第三向量和所述第四向量进行线性组合,得到第五向量;
    对所得到的第五向量集进行解码,得到所述目标物品对应的物品文案。
  7. 根据权利要求6所述的方法,其中,所述对所述第一向量和所述第二向量进行编码,得到所述标题信息对应的第三向量集和所述属性信息对应的第四向量集,包括:
    将所述第一向量和所述第二向量分别输入至预先训练的编码网络,得到所述第三向量集和所述第四向量集,其中,所述编码网络包括至少一层编码层。
  8. 根据权利要求6或7所述的方法,其中,所述对于第三向量集中每个第三向量和所述第三向量对应的第四向量,对所述第三向量和所述第四向量进行线性组合,得到第五向量,包括:
    将所述第三向量与数值η相乘,得到第一相乘结果,其中,所述数值η为0-1之间的数值;
    将所述第四向量与数值1-η相乘,得到第二相乘结果;
    将所述第一相乘结果与所述第二相乘结果进行相加,得到所述第五向量。
  9. 根据权利要求6-8之一所述的方法,其中,所述对所得到的第五向量集进行解码,得到所述目标物品对应的物品文案,包括:
    将所述第五向量集输入至预先训练的带有拷贝机制的解码网络,得到所述目标物品对应的物品文案。
  10. 一种物品文案生成网络训练装置,包括:
    获取单元,被配置成获取物品集中每个物品的物品描述信息,其中,所述物品描述信息包括:物品的标题信息、物品的属性信息和物品的至少一个评论信息;
    预处理单元,被配置成对所述物品集对应的物品描述信息集进行数据预处理,得到处理后的物品描述信息集;
    第一训练单元,被配置成对初始第一物品文案生成网络进行训练, 得到训练后的第一物品文案生成网络,其中,所述初始第一物品文案生成网络对应的训练样本包括:所述处理后的物品描述信息集中的物品描述信息和对应的物品文案,所述物品文案是预先撰写的;
    第二训练单元,被配置成将所述处理后的物品描述信息集中每个物品描述信息的标题信息、属性信息和所述每个物品描述信息对应的物品文案作为初始第二物品文案生成网络的训练样本,根据所述训练后的第一物品文案生成网络,利用知识蒸馏方法,对所述初始第二物品文案生成网络进行训练,得到训练后的第二物品文案生成网络。
  11. 一种物品文案生成装置,包括:
    获取单元,被配置成获取目标物品的标题信息和属性信息;
    输入单元,被配置成将所述标题信息和所述属性信息输入至训练后的第二物品文案生成网络,得到所述目标物品对应的物品文案,其中,所述训练后的第二物品文案生成网络是根据训练后的第一物品文案生成网络,利用知识蒸馏方法对初始第二物品文案生成网络训练的。
  12. 一种电子设备,包括:
    一个或多个处理器;
    存储装置,用于存储一个或多个程序;
    当所述一个或多个程序被所述一个或多个处理器执行时,使得所述一个或多个处理器实现如权利要求1-9中任一所述的方法。
  13. 一种计算机可读介质,其上存储有计算机程序,其中,所述程序被处理器执行时实现如权利要求1-9中任一所述的方法。
PCT/CN2022/071588 2021-01-21 2022-01-12 物品文案生成网络训练方法、物品文案生成方法、装置 WO2022156576A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US18/273,473 US20240135146A1 (en) 2021-01-21 2022-01-12 Method and Apparatus for Training Item Copy-writing Generation Network, and Method and Apparatus for Generating Item Copy-writing

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110084578.X 2021-01-21
CN202110084578.XA CN113780516A (zh) 2021-01-21 2021-01-21 物品文案生成网络训练方法、物品文案生成方法、装置

Publications (1)

Publication Number Publication Date
WO2022156576A1 true WO2022156576A1 (zh) 2022-07-28

Family

ID=78835547

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/071588 WO2022156576A1 (zh) 2021-01-21 2022-01-12 物品文案生成网络训练方法、物品文案生成方法、装置

Country Status (3)

Country Link
US (1) US20240135146A1 (zh)
CN (1) CN113780516A (zh)
WO (1) WO2022156576A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113780516A (zh) * 2021-01-21 2021-12-10 北京沃东天骏信息技术有限公司 物品文案生成网络训练方法、物品文案生成方法、装置

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190205748A1 (en) * 2018-01-02 2019-07-04 International Business Machines Corporation Soft label generation for knowledge distillation
CN110196972A (zh) * 2019-04-24 2019-09-03 北京奇艺世纪科技有限公司 文案生成方法、装置及计算机可读存储介质
CN110765273A (zh) * 2019-09-17 2020-02-07 北京三快在线科技有限公司 推荐文案生成方法、装置、电子设备及可读存储介质
CN113377914A (zh) * 2021-06-10 2021-09-10 北京沃东天骏信息技术有限公司 推荐文本生成方法、装置、电子设备和计算机可读介质
CN113780516A (zh) * 2021-01-21 2021-12-10 北京沃东天骏信息技术有限公司 物品文案生成网络训练方法、物品文案生成方法、装置

Also Published As

Publication number Publication date
US20240135146A1 (en) 2024-04-25
CN113780516A (zh) 2021-12-10

Similar Documents

Publication Publication Date Title
US20210200951A1 (en) Method and apparatus for outputting information
CN106685916B (zh) 电子会议智能装置及方法
CN111680159B (zh) 数据处理方法、装置及电子设备
CN109614111B (zh) 用于生成代码的方法和装置
CN110688528B (zh) 生成视频的分类信息的方法、装置、电子设备和介质
TW201915790A (zh) 關注點文案的生成
CN112148784A (zh) 用于汇总和引导多用户协作数据分析的系统和方法
CN110209774A (zh) 处理会话信息的方法、装置及终端设备
US11669679B2 (en) Text sequence generating method and apparatus, device and medium
CN113254684B (zh) 一种内容时效的确定方法、相关装置、设备以及存储介质
JP7337172B2 (ja) 音声パケット推薦方法、装置、電子機器およびプログラム
CN111582360A (zh) 用于标注数据的方法、装置、设备和介质
WO2022156576A1 (zh) 物品文案生成网络训练方法、物品文案生成方法、装置
WO2024099171A1 (zh) 视频生成方法和装置
WO2022174669A1 (zh) 信息生成方法、装置、电子设备和计算机可读介质
US20230367972A1 (en) Method and apparatus for processing model data, electronic device, and computer readable medium
WO2022017299A1 (zh) 一种文本检测方法、装置、电子设备及存储介质
CN116127080A (zh) 描述对象的属性值提取方法及相关设备
WO2023221661A1 (zh) 用户喜好物品信息生成方法、装置、电子设备和介质
Tok et al. Practical Weak Supervision
CN115062119B (zh) 政务事件办理推荐方法、装置
CN116977885A (zh) 视频文本任务处理方法、装置、电子设备及可读存储介质
US12009071B2 (en) System and/or method for determining service codes from electronic signals and/or states using machine learning
CN114817559A (zh) 问答方法、装置、计算机设备和存储介质
CN110633476B (zh) 用于获取知识标注信息的方法及装置

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22742055

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 18273473

Country of ref document: US

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 22742055

Country of ref document: EP

Kind code of ref document: A1