CN114997920A - Method for generating advertisement copy, and apparatus, device, medium and product thereof - Google Patents

Method for generating advertisement copy, and apparatus, device, medium and product thereof

Info

Publication number
CN114997920A
Authority
CN
China
Prior art keywords
sample
generator
advertisement
title
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210621648.5A
Other languages
Chinese (zh)
Inventor
胡凌宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Huanju Shidai Information Technology Co Ltd
Original Assignee
Guangzhou Huanju Shidai Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Huanju Shidai Information Technology Co Ltd filed Critical Guangzhou Huanju Shidai Information Technology Co Ltd
Priority to CN202210621648.5A priority Critical patent/CN114997920A/en
Publication of CN114997920A publication Critical patent/CN114997920A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0241 Advertisements
    • G06Q30/0276 Advertisement creation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Business, Economics & Management (AREA)
  • Finance (AREA)
  • Databases & Information Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Accounting & Taxation (AREA)
  • Development Economics (AREA)
  • Strategic Management (AREA)
  • Mathematical Physics (AREA)
  • Game Theory and Decision Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application relates to an advertisement copy generation method, and a device, equipment and medium thereof. The method inputs a commodity title to a generator, and the generator generates the corresponding advertisement copy; the training process of the generator comprises the following steps: iteratively training a first model architecture to convergence using sample titles, wherein the first model architecture comprises the generator and a constrainer, the generator predicting the advertisement copy corresponding to a sample title and the constrainer predicting the result title corresponding to that advertisement copy; iteratively training a discriminator to convergence using sample copies, the sample copies comprising negative samples generated by the generator and pre-collected positive samples, the discriminator predicting whether a sample copy is a positive or a negative sample; and iteratively training a second model architecture to convergence using sample titles, the second model architecture comprising the generator and the discriminator, with the weights of the discriminator frozen. The trained generator can efficiently produce advertisement copy from a given commodity title.

Description

Method for generating advertisement copy, and apparatus, device, medium and product thereof
Technical Field
The present application relates to the field of e-commerce information technology, and in particular to an advertisement copy generation method and the corresponding apparatus, computer device, computer-readable storage medium and computer program product.
Background
An e-commerce platform is usually provided with an advertisement-placement page through which a shop user can place advertisements for the commodities listed in the shop into the advertisement system, so as to attract online traffic and increase the volume of commodity transactions.
Writing the advertisement copy for an advertisement requires a certain degree of expertise, and the task is especially difficult when a shop user must create the copy in a non-native language. To simplify the creation of advertisement copy, auxiliary creation tools are often provided as services for the user to call.
One traditional auxiliary creation approach uses a neural network model to construct the advertisement copy: the user provides the access parameters required by the model, and the model generates the corresponding advertisement copy from those parameters according to its learned ability. In practice, known neural network models do not perform well in either generation efficiency or generation quality, so the advertisement copy they produce is unsatisfactory.
Therefore, how to improve the assisted creation of advertisement copy remains to be explored.
Disclosure of Invention
The present application is directed to solving the above problems and providing an advertisement copy generation method and the corresponding apparatus, computer device, computer-readable storage medium and computer program product.
To serve the various purposes of the present application, the following technical solutions are adopted:
In one aspect, a method for generating advertisement copy is provided, in which a commodity title is input to a generator and the generator generates the corresponding advertisement copy, wherein the training process of the generator comprises:
iteratively training a first model architecture to convergence using sample titles, wherein the first model architecture comprises the generator and a constrainer, the generator predicting the advertisement copy corresponding to a sample title, and the constrainer predicting the result title corresponding to that advertisement copy;
iteratively training a discriminator to convergence using sample copies, the sample copies comprising negative samples generated by the generator and pre-collected positive samples, the discriminator predicting whether a sample copy is a positive sample or a negative sample;
iteratively training a second model architecture comprising the generator and the discriminator to convergence using sample titles, wherein the weights of the discriminator are frozen.
In another aspect, in accordance with one of the objects of the present application, there is provided an advertisement copy generation device that inputs a commodity title to a generator, the generator generating the corresponding advertisement copy; matched with it is a training device comprising: a first training module for iteratively training a first model architecture to convergence using sample titles, the first model architecture comprising the generator and a constrainer, the generator predicting the advertisement copy corresponding to a sample title and the constrainer predicting the result title corresponding to that advertisement copy; a second training module for iteratively training a discriminator to convergence using sample copies, the sample copies comprising negative samples generated by the generator and pre-collected positive samples, the discriminator predicting whether a sample copy is a positive sample or a negative sample; and a third training module for iteratively training a second model architecture to convergence using sample titles, the second model architecture comprising the generator and the discriminator, wherein the weights of the discriminator are frozen.
In yet another aspect, a computer device adapted to one of the purposes of the present application comprises a central processor and a memory, the central processor being configured to invoke and execute a computer program stored in the memory so as to perform the steps of the advertisement copy generation method described herein.
In still another aspect, a computer-readable storage medium is provided that stores, in the form of computer-readable instructions, a computer program implementing the advertisement copy generation method; when the computer program is invoked by a computer, it executes the steps comprised in the method.
In yet another aspect, to serve another object of the present application, a computer program product is provided, comprising a computer program/instructions which, when executed by a processor, implement the steps of the advertisement copy generation method described in any embodiment of the present application.
Compared with the prior art, the present application has various advantages, including at least the following:
The generator can generate high-quality advertisement copy from a commodity title because it is trained effectively. In the first stage of its training, a constrainer is placed after the generator: the generator produces advertisement copy from a sample title, and the constrainer restores that copy back into a result title, so gradient updates for both the generator and the constrainer can be computed from the difference between the result title and the sample title, without relying on annotation of the sample titles. The second stage separately trains the discriminator's ability to judge whether a piece of advertisement copy meets requirements, using copy generated by the generator as sample copy for the classification decision, again without manual labeling. The third stage jointly trains the generator and the discriminator prepared in the first two stages: the generator produces advertisement copy from sample titles, while the discriminator, whose weights are frozen and do not participate in updates, is responsible only for judging whether the generated copy meets requirements; this stage needs only the assurance that the sample titles come from a valid source, and likewise requires no additional manual labeling. Through this process the training of the generator is completed: the generator first learns the ability to generate advertisement copy in the first stage, then adjusts its weights under the supervision of the discriminator in the third stage to correct that ability, so that it can generate high-quality advertisement copy for a given commodity title.
Throughout the training process, the sample titles are reused across the different training stages, and the sample copies are themselves products of the generator's training, which greatly reduces the dependence on sample volume, improves data utilization, and undoubtedly lowers the training cost of the generator.
Drawings
The above and/or additional aspects and advantages of the present application will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
Fig. 1 is a flowchart of an exemplary embodiment of the advertisement copy generation method of the present application.
Fig. 2 is a schematic structural diagram of an exemplary first model architecture according to the present application.
Fig. 3 is a schematic structural diagram of an exemplary second model architecture of the present application.
Fig. 4 is a schematic flowchart of training a first model architecture in the embodiment of the present application.
Fig. 5 is a schematic flowchart of training a discriminator in the embodiment of the present application.
Fig. 6 is a schematic flowchart illustrating a process of training a second model architecture according to an embodiment of the present application.
Fig. 7 is a schematic workflow diagram of an exemplary generator of the present application.
Fig. 8 is a functional block diagram of the advertisement copy generation apparatus of the present application.
fig. 9 is a schematic structural diagram of a computer device used in the present application.
Detailed Description
Reference will now be made in detail to embodiments of the present application, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are exemplary only for the purpose of explaining the present application and are not to be construed as limiting the present application.
As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element, or intervening elements may also be present. Further, "connected" or "coupled" as used herein may include wirelessly connected or wirelessly coupled. As used herein, the term "and/or" includes all or any element and all combinations of one or more of the associated listed items.
It will be understood by those within the art that, unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the prior art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
As will be appreciated by those skilled in the art, "client," "terminal," and "terminal device" as used herein include both devices that are wireless signal receivers, which are devices having only wireless signal receivers without transmit capability, and devices that are receive and transmit hardware, which have receive and transmit hardware capable of two-way communication over a two-way communication link. Such a device may include: cellular or other communication devices such as personal computers, tablets, etc. having single or multi-line displays or cellular or other communication devices without multi-line displays; PCS (Personal Communications Service), which may combine voice, data processing, facsimile and/or data communication capabilities; a PDA (Personal Digital Assistant), which may include a radio frequency receiver, a pager, internet/intranet access, a web browser, a notepad, a calendar and/or a GPS (Global Positioning System) receiver; a conventional laptop and/or palmtop computer or other device having and/or including a radio frequency receiver. As used herein, a "client," "terminal device" can be portable, transportable, installed in a vehicle (aeronautical, maritime, and/or land-based), or situated and/or configured to operate locally and/or in a distributed fashion at any other location(s) on earth and/or in space. The "client", "terminal Device" used herein may also be a communication terminal, a web terminal, a music/video playing terminal, such as a PDA, an MID (Mobile Internet Device) and/or a Mobile phone with music/video playing function, and may also be a smart tv, a set-top box, and the like.
The hardware referred to by the names "server", "client", "service node", etc. is essentially an electronic device with the performance of a personal computer: a hardware device having the necessary components described by the von Neumann architecture, such as a central processing unit (including an arithmetic unit and a controller), memory, input devices and output devices. A computer program is stored in the memory; the central processing unit loads a program stored in external memory into internal memory to run, executes the instructions in the program, and interacts with the input and output devices, thereby completing a specific function.
It should be noted that the concept of "server" as referred to in this application can be extended to the case of a server cluster. According to the network deployment principle understood by those skilled in the art, the servers should be logically divided, and in physical space, the servers can be independent of each other but can be called through an interface, or can be integrated into a physical computer or a set of computer clusters. Those skilled in the art will appreciate this variation and should not be so limited as to restrict the implementation of the network deployment of the present application.
One or more technical features of the present application, unless expressly specified otherwise, may be deployed to a server for implementation by a client remotely invoking an online service interface provided by a capture server for access, or may be deployed directly and run on the client for access.
Unless specified in clear text, the neural network model referred to or possibly referred to in the application can be deployed in a remote server and used for remote call at a client, and can also be deployed in a client with qualified equipment capability for direct call.
Various data referred to in the present application may be stored in a server remotely or in a local terminal device unless specified in the clear text, as long as the data is suitable for being called by the technical solution of the present application.
Those skilled in the art will appreciate that, although the various methods of the present application are described on the basis of the same concept so as to be common to one another, each may be performed independently unless otherwise specified. Likewise, each embodiment disclosed in the present application proceeds from the same inventive concept, so concepts expressed in the same way, and concepts whose expressions merely differ for convenience, are to be understood equally.
The embodiments to be disclosed herein can be flexibly constructed by cross-linking related technical features of the embodiments unless the mutual exclusion relationship between the related technical features is stated in the clear text, as long as the combination does not depart from the inventive spirit of the present application and can meet the needs of the prior art or solve the deficiencies of the prior art. Those skilled in the art will appreciate variations therefrom.
The advertisement copy generation method of the present application can be programmed into a computer program product and deployed to run in a client or a server. For example, in an exemplary application scenario of the present application, it may be deployed on a server of an e-commerce platform, whereby the method can be executed by accessing the interface opened after the computer program product runs and interacting with its process through a graphical user interface.
Referring to Fig. 1, in an exemplary embodiment of the advertisement copy generation method of the present application, a commodity title is input to a generator, and the generator generates the corresponding advertisement copy. So that the generator possesses this copy-generation capability, it is first trained; in this embodiment, the training process of the generator comprises the following steps:
step S1100, iteratively training a first model architecture to be convergent by adopting a sample title, wherein the first model architecture comprises a generator and a constrainer, the generator is used for predicting an advertisement case corresponding to the sample title, and the constrainer is used for predicting a result title corresponding to the advertisement case;
the purpose of this step is to perform a first stage of training on the generator. To implement the first stage of training, a first model architecture is built. As shown in fig. 2, the first model architecture includes a generator and a constrainer, which may have identical network structures, for example, a deep learning model implemented by using a basic neural network model such as CNN (convolutional neural network), RNN (recurrent neural network), and the like.
In one embodiment, the network structures of the generator and the constrainer are implemented with mature RNN-based models such as an LSTM (long short-term memory network) or a BiLSTM (bidirectional LSTM), in which a first LSTM base model serves as the encoder and a second LSTM base model serves as the decoder; the encoder encodes the input first text information to extract its features, and the decoder predicts the second text information corresponding to the first from the extracted features. There is thus a conversion relationship between the input of the encoder and the output of the decoder.
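The encoder-decoder pairing described above can be illustrated with a minimal sketch. The stub classes below stand in for the LSTM/BiLSTM or Transformer models a real implementation would use in a deep-learning framework; they show only the data flow, and every class and variable name here is hypothetical, not from the patent.

```python
class Encoder:
    def encode(self, tokens):
        # stand-in for feature extraction: a bag-of-tokens "feature"
        return set(tokens)

class Decoder:
    def decode(self, features):
        # stand-in for autoregressive decoding from the extracted features
        return sorted(features)

class Seq2Seq:
    """First text in, second text out: the conversion relationship above."""
    def __init__(self):
        self.encoder, self.decoder = Encoder(), Decoder()

    def __call__(self, tokens):
        return self.decoder.decode(self.encoder.encode(tokens))

generator = Seq2Seq()    # commodity title -> advertisement copy
constrainer = Seq2Seq()  # advertisement copy -> result title

sample_title = ["wireless", "bluetooth", "earbuds"]
advertisement_copy = generator(sample_title)
result_title = constrainer(advertisement_copy)
```

The point of the pairing is that the constrainer's output is comparable with the generator's input, which is what makes the unsupervised first-stage training described below possible.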
In another embodiment, the encoder and decoder in the network structure may be replaced by other mature models suitable for extracting text features and decoding an output from them, such as a Transformer-series model.
In yet another embodiment, the generator and the constrainer may have different network architectures; for example, the generator may be implemented with a Transformer-series base model and the constrainer with a BiLSTM, and those skilled in the art may choose among such alternatives flexibly.
During training of the first model architecture, a sample title serving as a training sample is input to the generator, which encodes and decodes it to predict the corresponding advertisement copy; that copy is then input to the constrainer, which encodes and decodes it to predict the corresponding result title. The generator and the constrainer thus implement mutually inverse functions: the constrainer restores the advertisement copy produced by the generator back into a result title. As long as the result title is driven to be consistent with the sample title, the generator and the constrainer can be trained by minimizing the reconstruction error between the two titles, which ensures that the advertisement copy generated by the generator retains the key semantic features of the sample title and lets the generator learn to produce the corresponding copy from an input title.
While the first-stage training of the first model architecture has not yet converged, different sample titles are continually drawn to train it iteratively; the first stage terminates once the generator and the constrainer are judged to have reached convergence according to the loss value between the sample title and the result title, i.e. the reconstruction error.
Because the input of the generator is a sample title, and that same sample title serves to compute the loss of the result title produced by the constrainer, the iterative training of the first model architecture is effectively unsupervised: no additional manual annotation of the sample titles is required in advance, which correspondingly reduces the cost of the first-stage training.
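The first-stage loop can be sketched as follows. The toy reconstruction error (a Jaccard distance over tokens) stands in for the token-level loss (e.g. cross-entropy) a real system would use, the gradient update is elided, and all names are illustrative assumptions.

```python
def reconstruction_error(sample_title, result_title):
    # Toy proxy for the loss between the sample title and the result title;
    # a real system would compare token probabilities instead.
    a, b = set(sample_title.split()), set(result_title.split())
    return 1.0 - len(a & b) / len(a | b)

def train_stage1(generator, constrainer, sample_titles, threshold=0.1):
    """Iterate over sample titles until the mean reconstruction error
    falls below a convergence threshold (gradient updates omitted)."""
    errors = []
    for title in sample_titles:
        copy = generator(title)      # predict advertisement copy
        result = constrainer(copy)   # restore a result title from the copy
        errors.append(reconstruction_error(title, result))
        # ...gradient update of generator and constrainer would go here...
    return sum(errors) / len(errors) < threshold  # converged?
```

Note that the sample title appears twice, as the generator's input and as the supervision target, which is exactly why no external labels are needed.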
Step S1200: iteratively training a discriminator to convergence using sample copies, wherein the sample copies comprise negative samples generated by the generator and pre-collected positive samples, and the discriminator predicts whether a sample copy is a positive sample or a negative sample;
the purpose of this step is to prepare a discriminator suitable for making a decision on the validity of the advertising copy generated by the generator, for which a second stage of training is carried out.
The discriminator may be implemented as a neural network suitable for text feature extraction connected to a classifier. In an exemplary network architecture, the text-feature-extraction model may be a mature model such as TextCNN or BERT, and the classifier a binary classifier. The discriminator extracts deep semantic information from the text input to it through the feature-extraction model, then performs classification mapping on that information into a binary classification space, thereby predicting whether the input text is a positive sample or a negative sample.
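As a hedged sketch of that two-part structure, the feature extractor below is a trivial keyword counter and the classifier a logistic threshold; a real discriminator would use learned TextCNN/BERT features and a trained binary head. The vocabulary, weights and names are all assumptions made for illustration.

```python
import math

PROMO_WORDS = {"best", "free", "new", "sale"}  # hypothetical feature vocabulary

def extract_features(copy_text):
    # stand-in for deep semantic feature extraction (e.g. TextCNN, BERT):
    # fraction of tokens that look like advertising language
    tokens = copy_text.lower().split()
    return sum(t in PROMO_WORDS for t in tokens) / max(len(tokens), 1)

def classify(feature, weight=4.0, bias=-1.0):
    # binary classifier: sigmoid probability that the copy is a positive sample
    p_positive = 1.0 / (1.0 + math.exp(-(weight * feature + bias)))
    return "positive" if p_positive >= 0.5 else "negative"
```

The split between `extract_features` and `classify` mirrors the patent's separation of the feature-extraction model from the classifier.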
Following this principle, two types of copy can be prepared. The first is the advertisement copy produced by the first-stage-trained generator with each sample title as input; having been generated by the generator, it is labelled automatically by the program, its sample type marked as negative without manual intervention. The second may be advertisement copy extracted from the historical advertisements of the advertisement system; after being collected by the program it is likewise labelled automatically, its sample type marked as positive without manual intervention. Two types of sample copy are thus obtained: positive-sample advertisement copy and negative-sample advertisement copy.
In one embodiment, the sample copy serving as a negative sample may also be advertisement copy saved from the generator during the first-stage training, or advertisement copy generated by the generator from other given text information.
When the second stage of training is executed, the sample copies are used to train the discriminator iteratively. For each sample copy, its embedding vector is input to the discriminator's text-feature-extraction model to extract the copy's deep semantic information; that information is then mapped through a fully connected layer into the classification space of the discriminator's classifier, and the classification probabilities of the two classes in that space are computed. The class with the greatest probability is the discriminator's prediction, by which the sample copy used for training is judged to be a positive or a negative sample.
After the discriminator predicts a result for a sample copy, the sample type with which the copy was automatically pre-labelled serves as the supervision label: the loss of the predicted result is computed, the discriminator is gradient-updated from that loss, and the next sample copy is drawn to continue the iterative training, until the discriminator is determined from the loss value to have converged.
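The automatic labelling that makes this stage label-free can be sketched as below: negatives come from the generator, positives from pre-collected historical advertisements, and the program assigns the supervision labels itself. Function names and the loss proxy are illustrative assumptions, and the gradient update is again elided.

```python
def build_sample_copies(generator, sample_titles, historical_copies):
    # Negative samples: generator output, auto-labelled 0 by the program.
    negatives = [(generator(title), 0) for title in sample_titles]
    # Positive samples: copy collected from historical ads, auto-labelled 1.
    positives = [(copy, 1) for copy in historical_copies]
    return negatives + positives

def train_discriminator(samples, predict):
    """One pass of supervised training: the auto-assigned label is the
    supervision signal for each predicted result (updates omitted)."""
    mistakes = [(text, label) for text, label in samples
                if predict(text) != label]
    return len(mistakes)  # loss proxy; iterate passes until this converges
```

A usage sketch: `build_sample_copies(gen, titles, ads)` yields a mixed, fully labelled training set with no human annotation step anywhere in the pipeline.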
In one embodiment, the discriminator may be attached to the generator after its first-stage training: a sample title is input to the generator, the generator automatically produces a sample copy as a negative sample, and the corresponding negative label is supplied to the discriminator for immediate training. This avoids intermediate storage and processing of negative samples, since the generator produces them on the fly and hands them directly to the discriminator, which may further improve training efficiency. Optionally, while the generator is producing negative samples for the discriminator, its weights may be temporarily frozen so that the negative samples obtained are of uniform quality.
Once the discriminator has been trained to convergence on the two types of sample copy, it acquires the ability to judge from a given piece of copy whether it meets the specification, where the specification chiefly concerns aspects of advertisement copy such as expression style, idiom, grammatical structure, content and readability. Through its trained weights the discriminator can extract the features corresponding to these aspects of a piece of copy and render a judgment on them; the discriminator obtained from the second-stage training can therefore subsequently serve as a standardized refereeing mechanism.
Step S1300, iteratively training a second model architecture to a convergence state by using the sample titles, wherein the second model architecture comprises the generator and the discriminator, and the weights of the discriminator are frozen.
The purpose of this step is to use the discriminator obtained by the second-stage training as the refereeing mechanism for the advertising copy generated by the generator obtained by the first-stage training, to supervise the fine-tuning of the generator, and to train the generator to a convergence state again, thereby completing the final preparation of the generator. The third-stage training is executed for this purpose.
In order to implement the third-stage training, a second model architecture as shown in fig. 3 is built. The second model architecture is composed of the generator finished by the first-stage training and the discriminator finished by the second-stage training: the generator takes a sample title as input, the advertising copy it outputs is taken as the input of the discriminator, and the discriminator outputs a prediction of the sample type of that advertising copy. Since the discriminator serves as the refereeing mechanism during the third-stage training, its weights are frozen in advance and no longer participate in gradient updates during training. That is, during the third-stage training, the second model architecture updates only the weights of the generator, so as to realize fine-tuning training of the generator.
In the third-stage training corresponding to the second model architecture, in an embodiment, the sample titles used in the first-stage training may still be used. Since these sample titles are generally extracted in advance from the commodity information in the commodity database of an e-commerce platform and are relatively standard data, their sample types may be automatically labeled as positive samples in batches in advance, indicating that these sample titles serve as positive samples in the third-stage training. Of course, in an embodiment, some text that does not meet the commodity title specification may additionally be collected to construct negative samples, so as to provide richer training samples for the third-stage training; training the second model architecture with both positive and negative samples can further improve its convergence speed.
When the third-stage training is carried out, each sample title is called and input into the generator. According to its learned capability, the generator encodes the sample title to extract its feature vector, and then decodes that feature vector to generate corresponding advertising copy. The advertising copy is input into the discriminator, which extracts its deep semantic information and, by means of its classifier, performs classification mapping on that information to predict a corresponding result label. The loss value of the result label is then calculated against the sample type labeled in advance for the sample title, i.e. whether it is a positive or negative sample, and when the second model architecture is judged not to have converged according to the loss value, back-propagation is performed according to the loss value so as to realize the gradient update of the weights.
It can be understood that, since the weights of the discriminator are frozen, they are not modified during the gradient update; only the weights of the generator are modified, so the third-stage training of the second model architecture is essentially fine-tuning training of the generator. When the model has not converged, the third-stage training continues by calling the next sample title for iterative training. As the number of iterations increases, the loss value obtained in each iteration is expected to gradually reach a preset threshold corresponding to the convergence state. When the loss value reaches the preset threshold, the generator can be judged to have converged, the third-stage training can be terminated, and the generator is then ready to generate effective advertising copy from an input commodity title.
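A minimal numeric sketch of this loop, assuming one-parameter toy models: the discriminator is a fixed (frozen) scorer, and only the generator's parameter is updated until the loss falls below a preset convergence threshold. None of this is the patent's actual architecture; it only illustrates the control flow.

```python
# Toy third-stage loop: frozen discriminator, generator-only updates,
# threshold-based convergence check. All models are illustrative scalars.

def frozen_discriminator(x):
    # frozen scorer mapping a "copy quality" value into [0, 1];
    # its internals never change during the third stage
    return max(0.0, min(1.0, x))

def third_stage_train(w, titles, lr=0.5, threshold=1e-3, max_iters=1000):
    """Fine-tune generator parameter w so its output is scored as a
    positive sample (target label 1.0) by the frozen discriminator."""
    for _ in range(max_iters):
        total = 0.0
        for t in titles:
            copy_quality = w * t                # "generator" forward pass
            pred = frozen_discriminator(copy_quality)
            loss = (pred - 1.0) ** 2            # stand-in for cross entropy
            grad = 2 * (pred - 1.0) * t         # gradient w.r.t. w only
            w -= lr * grad                      # discriminator untouched
            total += loss
        if total / len(titles) < threshold:     # convergence check
            break
    return w

w_final = third_stage_train(w=0.1, titles=[0.5, 0.8, 1.0])
```

The key point mirrored from the text is that the update step touches only the generator's parameter; the discriminator is consulted but never modified.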
It can be seen that the above process of preparing the generator never relies on correspondence information between a commodity's title and the advertising copy in advertisements published for that commodity; that is, no training stage needs label information describing a title-to-copy correspondence. The constrainer restores the advertising copy output by the generator back into a title, implementing unsupervised training, and the positive and negative samples needed to train the generator and the discriminator can be determined automatically according to the sources of the sample titles and sample copy, so that the generator learns the ability to generate effective advertising copy from a given title. In this process, the sample collection cost is low while the training efficiency is very high.
In summary, according to the above embodiments, because the generator of the present application is efficiently trained, it can generate good-quality advertising copy from a commodity title. In the training process of the generator, the first stage connects a constrainer after the generator: the generator generates advertising copy from a sample title, and the constrainer restores the advertising copy back into a result title, so that the generator and the constrainer can be gradient-updated according to the result title and the sample title without depending on labeling information for the sample title. The second stage trains the discriminator's ability to identify whether advertising copy meets the requirements, where the advertising copy generated by the generator is used as sample copy for the classification judgment, without depending on manual labeling. In the third stage, the generator and the discriminator trained in the first two stages undergo joint training: the generator is responsible for generating advertising copy from a sample title, while the discriminator, its weights frozen and not participating in updates, is only responsible for judging whether that advertising copy meets the requirements; this stage only needs to establish that the sample titles are positive samples from a valid source, and can likewise be independent of additional manual labeling. Through the above process, the training of the generator is completed: the generator first learns the ability to generate advertising copy in the first stage, then in the third stage has its weights adjusted under the supervision of the discriminator to correct that ability, so that it can generate high-quality advertising copy from a given commodity title.
Throughout the training process, the sample titles can be reused across different training stages, and the sample copy can reuse the output produced by the generator during its own training, so the dependence on sample volume is greatly reduced and the data utilization rate improved, which undoubtedly reduces the training cost of the generator.
In an embodiment expanded on the basis of any of the above embodiments, referring to fig. 4, in step S1100, iteratively training the first model architecture to convergence by using the sample titles includes:
Step S1110, calling a single sample title in a first data set, wherein the sample title comprises a commodity title adopted by a commodity in an online store of an e-commerce platform;
In order to ensure data-calling efficiency when the sample titles are called, a first data set is prepared. The first data set contains a sufficient number of commodity titles serving as sample titles; these commodity titles can be collected from the commodity database of the online stores of an e-commerce platform, specifically from the commodity title in the commodity information of each commodity, and such commodity titles conform to the platform's commodity title specification.
In another embodiment, as also required by the third-stage training of the present application, some scrambled text that does not meet the commodity title specification may be used as sample titles, with their sample types synchronously labeled as negative samples. It should be noted that the first-stage training of the present application does not require the participation of these negative samples.
At each iteration of the first-stage training, a single sample title, in this embodiment specifically a sample title belonging to the positive samples, is called from the first data set to perform one iteration of training on the first model architecture.
Step S1120, inputting the sample title into a generator in the first model architecture, and encoding and decoding the sample title by the generator to obtain decoded advertising copy;
The called sample title undergoes conventional text preprocessing, word segmentation and vectorization to obtain a corresponding embedded vector. The embedded vector enters the generator in the first model architecture, where an encoder performs feature extraction on it to obtain a corresponding feature vector; the feature vector is then provided to a decoder in the generator, which decodes it to generate the advertising copy.
It is easy to understand that, when the generator extracts features from the sample title, it can extract key features related to the commodity, such as product words, core words and other keywords in the commodity title, and can also be expected to extract style features, expression-habit features, structural features, content features, readability features and the like from the sample title. All of these features can guide the decoding process of the decoder, so that advertising copy matching the sample title can be expected to be generated.
Step S1130, inputting the advertising copy into a constrainer in the first model architecture, and encoding and decoding the advertising copy by the constrainer to obtain a decoded result title;
The constrainer takes the output of the generator as its input; that is, the advertising copy output by the generator serves as the input of the constrainer. The encoder in the constrainer performs feature extraction on the advertising copy to obtain its feature vector, and the decoder of the constrainer decodes that feature vector to generate a result title. The result title corresponds to the commodity title, and this correspondence is realized by using the sample title to supervise and correct the result title.
It can be seen that, although the constrainer and the generator can be implemented with the same network architecture, the flow directions of the information they process are opposite: the generator generates copy from a given title, while the constrainer recovers the given title from that copy. The two form a game process, so the closer the result title generated by the constrainer is to the sample title, the better both the generator and the constrainer have learned the capabilities required to generate advertising copy.
Step S1140, calculating the loss value of the result title according to the sample title, performing gradient updates on the generator and the constrainer according to the loss value, and, when the loss value has not reached convergence, continuing to call the next sample title to iteratively train the first model architecture.
Considering the principle of cooperative game training between the constrainer and the generator, for the result title obtained in each iteration, the sample title is used as the supervision label, and a loss function such as cross entropy is applied to calculate the loss value of the result title relative to the sample title, aiming at minimization of that loss value. It is then determined whether the loss value has reached a preset threshold, where the preset threshold may be 0 or a value infinitely close to 0. When the loss value reaches the preset threshold, the first model architecture has been trained to the convergence state and the first-stage training may be terminated. When the loss value has not reached the preset threshold, the first model architecture has not converged; back-propagation is then carried out on the generator and the constrainer according to the loss value, gradient updates are applied to the weights of each link of the generator and the constrainer through the back-propagation to push them further toward convergence, and the next sample title in the first data set is called to continue iteratively training the first model architecture until it is judged to have converged.
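One first-stage iteration can be sketched as the following control flow, where the generator, constrainer, loss function and update step are stand-in callables assumed purely for illustration, not the patent's networks:

```python
# Control-flow sketch of one first-stage iteration: the generator produces
# advertising copy from the sample title, the constrainer maps the copy
# back to a result title, and the loss of the result title against the
# sample title supervises BOTH models (no manual label needed).

def first_stage_step(generator, constrainer, loss_fn, update_fn, sample_title):
    copy = generator(sample_title)               # title -> advertising copy
    result_title = constrainer(copy)             # copy  -> recovered title
    loss = loss_fn(result_title, sample_title)   # sample title supervises
    update_fn(loss)                              # gradients to both models
    return copy, result_title, loss

# toy components: exact-match "loss", no-op update, illustration only
toy_gen = lambda t: f"Get your {t} today"
toy_con = lambda c: " ".join(w for w in c.split()
                             if w not in ("Get", "your", "today"))
toy_loss = lambda a, b: 0.0 if a == b else 1.0

copy, result, loss = first_stage_step(toy_gen, toy_con, toy_loss,
                                      lambda l: None, "wireless headphones")
```

The toy constrainer happens to invert the toy generator exactly, so the loss is zero; in the real architecture, the loss driving both models toward that inversion is exactly what the game training optimizes.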
It can be understood that the first model architecture formed by the generator and the constrainer on the basis of zero-sum game theory is used to implement the first-stage training, so that no manually labeled supervision label needs to be additionally provided during training: the loss value of the result title (the output) is calculated directly against the sample title (the input), which suffices to supervise the training of the first model architecture and correct the weights of the generator and the constrainer, continuously improving the copy-generating capability of the generator, the training target, until the convergence state is reached. It can be seen that the training cost of the first stage is low, its dependence on data volume small, and the capability obtained by the generator is assured under the control of the constrainer.
In an embodiment expanded on the basis of any of the above embodiments, referring to fig. 5, in step S1200, iteratively training the discriminator to convergence by using the sample copy includes:
Step S1210, calling a single piece of sample copy in a second data set, wherein the sample copy comprises advertising copy generated by the generator serving as negative samples and advertising copy pre-collected from an advertisement publishing system serving as positive samples, and each piece of sample copy is provided with a first type label characterizing whether it corresponds to a positive sample or a negative sample;
To facilitate the second-stage training, a second data set may be prepared. The second data set comprises a sufficient amount of sample copy, which may include two types of advertising copy. The first type may be collected from the historical advertisement database of an advertisement publishing system, specifically the advertising copy in historical advertisements; this type is stored in the second data set with its sample type labeled as a positive sample, expressed as the type label corresponding to positive samples.
Since advertising copy that has been published historically is relatively effective, it is designated as positive samples, thereby providing a correct criterion for the discriminator's results when such copy is used to train it.
In one embodiment, the historical advertisement database may be screened in advance for high-quality historical advertisements with better advertisement performance data, and the advertising copy of those high-quality historical advertisements extracted as the positive-sample copy in the second data set. The advertisement performance data may be any one of the click-through rate (CTR), conversion rate (CVR) or return on ad spend (ROAS) obtained after the historical advertisement was published. When determining the high-quality historical advertisements, those whose performance data reaches a preset threshold are screened out as high-quality historical advertisements, and their corresponding advertising copy is obtained and used as positive samples in the second data set, with the corresponding first type label added to characterize them as positive samples. Using the copy of historically high-quality advertisements as positive samples guides the discriminator to raise its requirements for high-quality advertising copy, so that text input to the discriminator is judged to be effective advertising copy only when it contains the common characteristics of such copy, actually raising the discriminator's quality requirements for advertising copy. Since the positive samples are advertising copy of high-quality historical advertisements that have proven successful in practice, the quality of copy generated with reference to them is self-evident.
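The screening step of this embodiment can be sketched as a simple threshold filter; the record layout, field names and threshold value below are illustrative assumptions, not part of the patent.

```python
# Hypothetical sketch of screening high-quality historical ads by an
# advertisement-performance threshold (CTR here; CVR or ROAS would work
# the same way) and extracting their copy as positive samples.

def collect_positive_copy(history, metric="ctr", threshold=0.05):
    """Return (copy, first_type_label) pairs for ads whose chosen
    performance metric reaches the preset threshold."""
    return [(ad["copy"], 1)                  # label 1 = positive sample
            for ad in history
            if ad.get(metric, 0.0) >= threshold]

history = [
    {"copy": "Big summer sale on sneakers", "ctr": 0.08},
    {"copy": "buy buy buy!!!",              "ctr": 0.01},  # filtered out
]
positives = collect_positive_copy(history)
```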
The second type of advertising copy in the second data set is set to provide negative examples for the discriminator. Ingeniously, the discriminator is to serve as the refereeing mechanism supervising the advertising copy generated by the generator during the subsequent third-stage training; before the third-stage training, the generator has only been pre-trained in the first stage and its copy-generating capability is immature, so the advertising copy generated by the generator before the third-stage training is used as the negative samples in the second data set, with the corresponding first type label added to the sample type to indicate a negative sample.
In one embodiment, it is recommended to use the generator finished by the first-stage training to regenerate corresponding advertising copy for each sample title in the first data set, and use this copy as the negative samples in the second data set. In this case, the generator is a finished product trained to convergence in the first stage and already has a fairly strong copy-generating capability, so the obtained copy, representing the generator's pre-training result, serves as negative samples in the second data set; training the discriminator on it helps raise the discriminator's quality requirements for advertising copy, so that the discriminator predicts only copy of a high quality level as positive samples.
In another embodiment, the advertising copy that the generator predicted for the sample titles during the first-stage training itself may be used as the negative samples in the second data set; this is obviously relatively coarse, but also feasible.
In yet another embodiment, negative samples may also be constructed in other ways to enrich the sample copy in the second data set; a person skilled in the art can gather or construct the negative samples in the second data set according to the quality requirements for the advertising copy.
On the basis of the second data set, whenever an iteration of training the discriminator needs to be executed, one piece of sample copy is called from the second data set and input into the discriminator as a training sample, and the first type label corresponding to that sample copy serves as the discriminator's supervision label.
Step S1220, inputting the sample copy into the discriminator to extract deep semantic information, and performing classification mapping according to the deep semantic information to obtain a result label, wherein the result label represents the discriminator's prediction of the sample type as a positive sample or a negative sample;
After a piece of sample copy is converted into an embedded vector through conventional text preprocessing, word segmentation and vectorization, the embedded vector is input into the discriminator. The discriminator's text feature extraction model performs feature extraction on the embedded vector to extract the deep semantic information corresponding to the sample copy; a fully connected layer then maps this information into the discriminator's binary classification space, yielding the classification probabilities of the classes corresponding to positive and negative samples, and the result label corresponding to the class with the maximum classification probability is taken, so that the result label represents the discriminator's prediction of the sample type of the sample copy as a positive or negative sample.
Step S1230, calculating the loss value of the result label according to the first type label, performing a gradient update on the discriminator according to the loss value, and, when the loss value has not reached convergence, continuing to call the next piece of sample copy to iteratively train the discriminator.
After the discriminator predicts the corresponding result label for a piece of sample copy, the first type label pre-labeled for that sample copy is further used to calculate the loss value of the result label; a cross-entropy loss function may be adopted for this calculation. It is then judged whether the loss value reaches a preset threshold. If it does, the discriminator has converged and the second-stage training can be terminated. If it does not, the discriminator has not converged; its weight parameters are corrected by back-propagation according to the loss value to realize the gradient update, and then the next piece of sample copy is called from the second data set to continue the iterative training of the discriminator. By analogy, the discriminator continuously approaches convergence through this cyclic iterative training and finally reaches the convergence state.
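For the two-class case described here, the cross-entropy loss and the threshold-based convergence check can be sketched as follows; the probability values and the threshold are illustrative assumptions only.

```python
# Binary cross entropy of the discriminator's predicted positive-class
# probability against the pre-labeled first type label, plus the
# threshold-based convergence check described in the text.

import math

def binary_cross_entropy(p_positive, label, eps=1e-12):
    """label is 1 for a positive sample, 0 for a negative sample."""
    p = min(max(p_positive, eps), 1 - eps)   # numerical safety clamp
    return -(label * math.log(p) + (1 - label) * math.log(1 - p))

def has_converged(loss, preset_threshold=0.05):
    return loss <= preset_threshold

good = binary_cross_entropy(0.99, 1)  # confident, correct -> small loss
bad = binary_cross_entropy(0.30, 1)   # wrong-leaning      -> large loss
```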
It can be understood that, because the sample copy used during the second-stage training is selected from the second data set and includes the advertising copy generated by the generator as negative examples, training the discriminator to convergence on this sample copy lets the discriminator, with the help of the sample copy and its first type labels, continuously improve its ability to recognize good-quality advertising copy and to recognize as negative samples the advertising copy generated by a generator that has not undergone the third-stage training. The fine-tuning of the generator can subsequently be supervised on the basis of this ability, so that the generator further improves its capability to produce good-quality advertising copy.
In an embodiment expanded on the basis of any of the above embodiments, referring to fig. 6, in step S1300, iteratively training the second model architecture to the convergence state by using the sample titles includes:
Step S1310, calling a single sample title in the first data set, wherein the sample title carries a second type label representing whether the sample type of the sample title belongs to a positive sample or a negative sample;
After the first-stage and second-stage training, the generator has been pre-trained on sample titles, on the basis of a game mechanism, to acquire the ability to generate copy from a title, and the discriminator has been trained on the sample copy generated by the generator to acquire the ability to judge the quality of given copy. However, in the second stage, the advertising copy generated by the generator prepared by the first-stage training is regarded by the discriminator as negative samples, so the generator needs further fine-tuning training before it can generate advertising copy that the discriminator will identify as positive samples. A second model architecture is therefore set up for this purpose, and the generator is fine-tuned within it to improve its ability to generate high-quality advertising copy.
In order to perform the third-stage training, the first data set from the first-stage training may be directly reused for training the second model architecture, so as to keep the amount of training data low; alternatively, a third data set may be prepared. A sample title in the first or third data set may be a commodity title collected from the commodity information in a commodity database of the e-commerce platform. Since such a sample title conforms to the e-commerce platform's commodity title specification, a second type label may be preset in its sample type; the second type label may be a binary value indicating whether the corresponding sample title is a positive or negative sample, and for commodity titles collected from the e-commerce platform, the second type labels can, after the titles are collected into the data set, be quickly set in batches to represent positive samples.
In one embodiment, the first or third data set may further include random text collected through other channels as sample titles, with the second type label of their sample type set to characterize negative samples. While the positive samples help improve the quality of the advertising copy produced by the generator, the introduction of negative samples can further improve the generator's training speed.
Step S1320, inputting the sample title into the generator in the second model architecture, and encoding and decoding the sample title by the generator to obtain decoded advertising copy;
The second model architecture comprises the generator that completed the first-stage training and the discriminator that completed the second-stage training. The generator is responsible for generating advertising copy from a given title; the discriminator performs classification mapping on the advertising copy generated by the generator so as to predict its result label, the result label representing whether the prediction belongs to a positive or negative sample.
After the sample title called from the first or third data set undergoes conventional text preprocessing, word segmentation and word embedding to obtain an embedded vector, the embedded vector is input into the generator. The generator extracts from it a feature vector representing the deep semantic information of the sample title, and then decodes that feature vector to obtain the predicted advertising copy.
Step S1330, inputting the advertising copy into the discriminator in the second model architecture, and classifying and mapping the advertising copy by the discriminator to obtain a result label, wherein the weights of the discriminator are frozen in advance;
The advertising copy generated by the generator is input into the discriminator in the second model architecture. The discriminator extracts the deep semantic information of the advertising copy and performs classification mapping on it to predict a corresponding result label, the result label representing the discriminator's judgment of the generator's advertising copy as a positive or negative sample.
In the second model architecture, the weights of the discriminator, trained to convergence in the second stage, are frozen, so the discriminator acts as the refereeing mechanism within the second model architecture, judging the advertising copy generated by the generator according to the discrimination capability learned in the second-stage training and thereby producing the result label corresponding to the sample title used for training.
Step S1340, calculating the loss value of the result label according to the second type label, performing a gradient update on the generator according to the loss value, and, when the loss value has not reached convergence, continuing to call the next sample title to iteratively train the second model architecture.
As previously described, the sample titles in the first or third data set may each be provided with a second type label characterizing whether the respective sample title is in fact a positive or negative sample. The second type label serves as the supervision label of the second model architecture, so that after the discriminator predicts the result label corresponding to a sample title, a cross-entropy loss function can be applied with the second type label pre-labeled for that sample title to calculate the loss value of the result label relative to the second type label.
The loss value is then compared with a preset threshold. When the loss value reaches the preset threshold, the generator in the second model architecture has reached the convergence state. When it has not, the generator has not reached convergence, so back-propagation is performed on the second model architecture according to the loss value, the weight parameters of the generator within it are corrected by the gradient update to push the generator closer to convergence, and another sample title is called from the first or third data set to continue the iterative training.
During the back-propagation of the second model architecture on the basis of the loss value calculated from the discriminator's result label, since the weights of the discriminator have been frozen and do not participate in the gradient update, the fine-tuning training is essentially carried out only on the generator in the second model architecture.
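The effect of weight freezing on the update step can be illustrated minimally as follows. In a real framework this corresponds to disabling gradients on the discriminator's parameters; the dictionary-based parameter store and the gradient values here are toy assumptions.

```python
# Minimal illustration of weight freezing during a gradient update:
# parameters flagged as frozen receive no update, so only the generator
# moves while the discriminator stays fixed, as described in the text.

def sgd_step(params, grads, lr=0.1):
    """Apply one gradient step, skipping any parameter marked frozen."""
    for name, p in params.items():
        if p["frozen"]:
            continue                     # discriminator weights untouched
        p["value"] -= lr * grads[name]
    return params

params = {
    "generator.w":     {"value": 1.0, "frozen": False},
    "discriminator.w": {"value": 2.0, "frozen": True},
}
grads = {"generator.w": 0.5, "discriminator.w": 0.5}
sgd_step(params, grads)
```

Even though a gradient exists for both parameters, only the unfrozen one changes, which is exactly why the third-stage training amounts to fine-tuning the generator alone.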
It is easy to understand that in the third-stage training the generator can again be trained to the convergence state through multiple loop iterations. Because the second-stage training has raised the discriminator's requirements for the quality of advertising copy, the generator, guided by the discriminator during the third-stage training, inevitably further improves its capability to produce higher-quality advertising copy, thereby achieving the purpose of the fine-tuning training.
In addition, the third-stage training can reuse the sample titles in the first data set, so the total sample size needed across the generator's training is kept to a minimum and the training cost is effectively controlled. Moreover, training the second model architecture still does not depend on correspondence information between sample titles and advertising copy; only the second type label needs to be set, and a program can label it automatically according to the source of each sample title. Manual labeling is therefore unnecessary, and the cost of preparing the training data is low.
In an embodiment expanded on the basis of any of the above embodiments, referring to fig. 7, inputting a product title to the generator and generating the corresponding advertising copy by the generator includes:
step S2100, performing word embedding on the commodity title to obtain an embedded vector of the commodity title;
in one embodiment, the generator may be implemented using a Transformer, which is a deep learning model based on a self-attention mechanism and is suitable for parallelization computation. When inputting text data such as a commodity title and a sample title into the Transformer, the text data firstly passes through an encoder (Encoders), the encoder encodes the text, word embedding is firstly realized to obtain an embedded vector of the text, encoding is continued on the basis of the embedded vector to perform feature extraction, then the encoded data is transmitted into a decoder (Decoders) to be decoded, and a translated text is obtained after decoding.
Step S2200, the encoder of the generator performs feature extraction on the embedded vector to generate dense feature vectors;
The encoder in the generator is implemented based on the self-attention mechanism. It converts the embedded vector into a query vector, a key vector, and a value vector, extracts the salient features of the commodity title using the query and key vectors, and then combines them with the value vector to complete feature extraction, obtaining the corresponding dense feature vector. This feature vector represents key characteristics of the commodity title such as expression style, expression habits, compositional structure, compositional content, and readability.
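The query/key/value computation described here follows the standard scaled dot-product attention formulation. A self-contained sketch, with toy two-dimensional vectors purely for illustration:

```python
import math

def softmax(xs):
    # numerically stable softmax over a list of scores
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def scaled_dot_product_attention(Q, K, V):
    # Q, K, V: lists of vectors, one per token position
    d = len(K[0])
    out = []
    for q in Q:
        # dot each query against every key, scaled by sqrt(dimension)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K]
        weights = softmax(scores)
        # weighted sum of the value vectors
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out
```

Each output position is a mixture of all value vectors, weighted by how strongly its query matches each key, which is how the salient parts of the title come to dominate the resulting feature vector.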
Step S2300, decoding the feature vector by a decoder of the generator, and generating the advertisement copy word by word in sequence.
The decoder of the generator is also implemented based on the self-attention mechanism and has essentially the same structure as the encoder. The difference is that when the decoder computes the scores of its several self-attention modules, it recalculates the scores with reference to the encoder's feature vectors to obtain the self-attention scores, and then passes the result to a feed-forward neural network for processing and output.
The feature vector finally output by the decoder is mapped by a linear fully connected layer into a higher-dimensional vector, in which each dimension corresponds to the score of one word element (token) in the vocabulary. An output layer converts these scores into classification probabilities, and the word element with the highest classification probability is selected as the content of the corresponding sequence position of the advertising copy. A word element may be a single character or a word: for Chinese, a single Chinese character; for Latin-script languages such as English, a word.
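The projection and greedy selection step can be sketched as below. The toy vocabulary, weight matrix, and bias are made-up values for illustration, not the patent's actual parameters:

```python
import math

VOCAB = ["the", "shoe", "is", "light"]  # hypothetical toy vocabulary

def softmax(logits):
    # convert per-token scores into classification probabilities
    m = max(logits)
    es = [math.exp(x - m) for x in logits]
    s = sum(es)
    return [e / s for e in es]

def project_to_vocab(feature, weight, bias):
    # linear fully connected layer: one logit (score) per vocabulary entry
    return [sum(w * f for w, f in zip(row, feature)) + b
            for row, b in zip(weight, bias)]

def next_token(feature, weight, bias):
    # greedy pick: the word element with the highest classification probability
    probs = softmax(project_to_vocab(feature, weight, bias))
    best = max(range(len(probs)), key=probs.__getitem__)
    return VOCAB[best], probs[best]
```

Repeating this selection at each sequence position yields the copy word by word, which matches the word-by-word generation described in step S2300.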
It will be appreciated that a generator constructed from an encoder and a decoder can efficiently convert one piece of text into another, embodied here as converting a product title into advertising copy. Once trained to a converged state and having learned this conversion capability, the generator can serve the automatic creation of advertising copy, producing effective, high-quality copy from the product title given by the user.
In an embodiment expanded on the basis of any of the above embodiments, before inputting a product title to the generator and generating the corresponding advertising copy by the generator, the method includes:
step S3100, acquiring a document generation instruction submitted by the terminal equipment, and determining a target commodity specified by the document generation instruction;
When an online shop on an e-commerce platform needs to publish an advertisement and expects the generator implemented by the present application to generate the copy automatically, the service interface implemented by the present application can be invoked. A document generation instruction is triggered on the terminal device of the shop's administrative user, the target commodity for which the advertisement is to be published is included in the instruction, and the instruction is sent to the service interface; the service interface is then invoked to determine the target commodity specified in the document generation instruction.
The service interface may be deployed in an advertising system provided by the e-commerce platform and opened for invocation by the administrative users of the various online shops.
Step S3200, querying the product database to obtain the product title of the target product.
After the service interface obtains the target commodity, specifically its unique identifying information, it queries the online shop's commodity database according to that unique identifying information, locates the corresponding target commodity record, and then retrieves the commodity title from the commodity information of the target commodity. The commodity title can then be input into the generator of the present application to generate the advertising copy.
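The lookup path just described, from generation instruction to target commodity to commodity title, can be sketched with an in-memory stand-in for the commodity database. The SKU keys, field names, and record contents below are all hypothetical:

```python
# hypothetical in-memory stand-in for the online shop's commodity database
PRODUCT_DB = {
    "sku-001": {"title": "Lightweight breathable running shoes", "price": 39.9},
    "sku-002": {"title": "Waterproof hiking backpack 40L", "price": 59.9},
}

def resolve_product_title(copy_request):
    # 1. determine the target commodity named by the generation instruction
    sku = copy_request["target_product"]
    # 2. query the commodity database by the commodity's unique identifier
    record = PRODUCT_DB.get(sku)
    if record is None:
        raise KeyError("unknown product: " + sku)
    # 3. return the commodity title, ready to feed into the generator
    return record["title"]
```

In a deployed system the dictionary lookup would of course be a query against the shop's real commodity database, but the shape of the flow is the same.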
The generator can be deployed in the advertisement system to open a service for automatically generating advertising copy to the online shops of the e-commerce platform, thereby enriching the platform's service capability, optimizing the advertisement-publishing service logic, and improving the user experience.
In an embodiment expanded on the basis of any of the above embodiments, after inputting a product title to the generator and generating the corresponding advertising copy by the generator, the method includes:
step S4100, pushing the advertisement file to terminal equipment providing the commodity title for display;
After the generator, trained to convergence, produces the advertising copy corresponding to the commodity title, the copy can be pushed to the terminal device that submitted the commodity title for display.
In this embodiment, a user accesses an advertisement publishing page on the terminal device and imports the target product for which the advertisement is to be published. The page automatically retrieves the product title of the target product and displays it at the corresponding position, while triggering a copy generation request that contains the product title and submitting it to the server. The server then inputs the product title to the generator, the generator produces the advertising copy corresponding to the product title, and the copy is returned to the terminal device for display on the advertisement publishing page.
The user may further edit the advertising copy displayed on the advertisement publishing page; the unedited copy is the original version, and the edited copy is recorded as a revised version. When the user confirms publication, the user triggers the submission control to send an advertisement publishing instruction to the server. The instruction contains the advertising copy: if the copy has not been edited, the original version is sent; if it has been edited, the revised version is sent in place of the original.
Step S4200, in response to an advertisement publishing instruction submitted by the terminal device that contains an advertisement carrying the original or revised version of the advertising copy, submitting the advertisement to the advertisement system for publication.
In response to the advertisement publishing instruction submitted by the terminal device, the server obtains the advertising copy carried in the instruction, which may be the original version or the revised version depending on whether the user revised it on the terminal device. On the basis of the obtained copy, an advertisement corresponding to the target commodity can be constructed and submitted to the advertisement system matched with the e-commerce platform, and the advertisement system publishes it according to its inherent advertisement-publishing service logic. After the advertisement is delivered to any terminal device, the audience can read the advertising copy on that device and reach the commodity detail page of the target commodity directly through the advertisement.
It can be seen that the above embodiments further achieve publishing the advertising copy produced by the generator of the present application to the advertisement system, closing the service loop of generator-produced copy and completing the service function of the advertisement system matched with the e-commerce platform. Because the user publishing the advertisement does not need to write the copy from scratch, advertisement-publishing efficiency is greatly improved, especially for users who are not fluent in the language the advertisement requires; generating and publishing copy in this way effectively improves publishing efficiency for both the user and the platform.
In accordance with one of the objects of the present application, there is provided an advertising-copy generation device that functionally embodies the advertising-copy generation method of the present application. The device is configured to input a product title to a generator and generate the corresponding advertising copy by the generator; referring to fig. 8, a training device is provided, which includes: a first training module for iteratively training a first model architecture to convergence using sample titles, the first model architecture comprising a generator and a constrainer, the generator for predicting the advertising copy corresponding to a sample title, and the constrainer for predicting the result title corresponding to that advertising copy; a second training module for iteratively training a discriminator to convergence using sample copies, the sample copies including negative samples generated by the generator and pre-collected positive samples, the discriminator for predicting whether a sample copy is a positive or a negative sample; and a third training module for iteratively training a second model architecture to a converged state using sample titles, the second model architecture comprising the generator and the discriminator, wherein the weights of the discriminator are frozen.
In an embodiment expanded on the basis of any of the above embodiments, the first training module includes: a sample calling unit for calling a single sample title in a first data set, the sample titles including commodity titles used by commodities of online shops of an e-commerce platform; a generation execution unit for inputting the sample title into the generator in the first model architecture, the generator encoding and decoding the sample title to obtain the decoded advertising copy; a constraint execution unit for inputting the advertising copy into the constrainer in the first model architecture, the constrainer encoding and decoding the advertising copy to obtain the decoded result title; and an iteration decision unit for calculating a loss value of the result title according to the sample title, performing gradient updating on the generator and the constrainer according to the loss value, and continuing to call the next sample title to iteratively train the first model architecture when the loss value does not reach convergence.
In an embodiment expanded on the basis of any of the above embodiments, the second training module includes: a sample calling unit for calling a single sample copy in a second data set, the sample copies including advertising copy generated by the generator as negative samples and advertising copy pre-collected from an advertisement publishing system as positive samples, each sample copy carrying a first type label characterizing whether it corresponds to a positive or a negative sample; a classification discrimination unit for inputting the sample copy into the discriminator to extract deep semantic information and obtain a result label by classification mapping according to that information, the result label representing the sample type, positive or negative, that the discriminator assigns to the sample copy; and an iteration decision unit for calculating a loss value of the result label according to the first type label, performing gradient updating on the discriminator according to the loss value, and continuing to call the next sample copy to iteratively train the discriminator when the loss value does not reach convergence.
In an embodiment expanded on the basis of any of the above embodiments, the third training module includes: a sample calling unit for calling a single sample title in the first data set, the sample title carrying a second type label characterizing whether its sample type is a positive or a negative sample; a generation execution unit for inputting the sample title into the generator in the second model architecture, the generator encoding and decoding the sample title to obtain the decoded advertising copy; a classification discrimination unit for inputting the advertising copy into the discriminator in the second model architecture, the discriminator performing classification mapping on the advertising copy to obtain a result label, the weights of the discriminator having been frozen in advance; and an iteration decision unit for calculating a loss value of the result label according to the second type label, performing gradient updating on the generator according to the loss value, and continuing to call the next sample title to iteratively train the second model architecture when the loss value does not reach convergence.
In an embodiment expanded on the basis of any of the above embodiments, the generator includes: the embedding processing unit is used for embedding words into the commodity titles to obtain embedded vectors of the commodity titles; the encoding processing unit is used for performing feature extraction on the embedded vector by an encoder of the generator to generate a dense feature vector; and the decoding processing unit is used for decoding the characteristic vector by a decoder of the generator and generating the advertisement file word by word in sequence.
In an embodiment expanded on the basis of any of the above embodiments, the advertisement document generation apparatus includes: the target determining module is used for acquiring the file generation instruction submitted by the terminal equipment and determining the target commodity specified by the file generation instruction; and the title determining module is used for inquiring the commodity database to obtain the commodity title of the target commodity.
In an embodiment expanded on the basis of any of the above embodiments, the advertisement document generation apparatus includes: the file pushing module is used for pushing the advertisement file to the terminal equipment providing the commodity title for display; and the advertisement publishing module is used for responding to an advertisement publishing instruction which is submitted by the terminal equipment and contains the advertisement of the original version of the advertisement file or the revised version of the advertisement file, submitting the advertisement to an advertisement system for publishing.
In order to solve the above technical problem, an embodiment of the present application further provides a computer device. Fig. 9 schematically illustrates the internal structure of the computer device, which includes a processor, a computer-readable storage medium, a memory, and a network interface connected by a system bus. The computer-readable storage medium stores an operating system, a database, and computer-readable instructions; the database can store control information sequences, and when the computer-readable instructions are executed by the processor, the processor can implement the advertising-copy generation method. The processor provides the computation and control capability that supports the operation of the whole computer device. The memory may store computer-readable instructions that, when executed by the processor, cause the processor to perform the advertising-copy generation method of the present application. The network interface is used to connect and communicate with the terminal. Those skilled in the art will appreciate that the architecture shown in fig. 9 is merely a block diagram of some of the structures associated with the disclosed aspects and does not limit the computer devices to which the disclosed aspects apply; a particular computer device may include more or fewer components than shown, combine certain components, or arrange its components differently.
In this embodiment, the processor is configured to execute the specific functions of each module and its sub-modules in fig. 8, and the memory stores the program code and the various data required to execute those modules or sub-modules. The network interface is used for data transmission to and from a user terminal or another server. The memory in this embodiment stores the program code and data necessary for executing all modules and sub-modules of the advertising-copy generation device of the present application, and the server calls them to execute the functions of those modules and sub-modules.
The present application further provides a storage medium storing computer readable instructions which, when executed by one or more processors, cause the one or more processors to perform the steps of the method of generating an advertising copy of any of the embodiments of the present application.
The present application also provides a computer program product comprising computer programs/instructions which, when executed by one or more processors, implement the steps of the method as described in any of the embodiments of the present application.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments of the present application can be implemented by a computer program, which can be stored in a computer-readable storage medium, and when the computer program is executed, the processes of the embodiments of the methods can be included. The storage medium may be a computer-readable storage medium such as a magnetic disk, an optical disk, a Read-Only Memory (ROM), or a Random Access Memory (RAM).
In summary, the present application applies multi-stage training to the generator: the generator learns the capability of generating advertising copy in the first stage, and in the third stage its weights are adjusted under the discriminator's guidance, correcting its generation capability so that it can produce good-quality advertising copy from a given product title. Throughout training, the sample titles can be reused across the different training stages, and the sample copies can reuse the outputs produced during the generator's own training, which greatly reduces the dependence on sample volume, improves data utilization, and undoubtedly lowers the generator's training cost. Finally, applying the generator of the present application to the advertising field can improve the efficiency of assisted creation of advertising copy.
Those skilled in the art will appreciate that the various operations, methods, steps, measures, and schemes in the flows discussed in this application, including those already disclosed in the prior art, can be interchanged, modified, rearranged, decomposed, combined, or deleted.
The foregoing describes only some embodiments of the present application. It should be noted that those skilled in the art can make several improvements and refinements without departing from the principle of the present application, and such improvements and refinements shall also fall within the protection scope of the present application.

Claims (10)

1. A method for generating advertising copy, characterized in that a commodity title is input to a generator and the generator generates the corresponding advertising copy, wherein the training process of the generator comprises the following steps:
iteratively training a first model architecture to be convergent by adopting a sample title, wherein the first model architecture comprises a generator and a constrainer, the generator is used for predicting an advertisement scheme corresponding to the sample title, and the constrainer is used for predicting a result title corresponding to the advertisement scheme;
iteratively training a discriminator to converge using a sample pattern, the sample pattern including negative samples generated by the generator and pre-collected positive samples, the discriminator for predicting the sample pattern as a positive sample or a negative sample;
iteratively training a second model architecture comprising the generator and the discriminator to a converged state using a sample header, wherein weights of the discriminator are frozen.
2. The method of claim 1, wherein iteratively training the first model architecture to converge using sample headers comprises:
calling a single sample title in a first data set, wherein the sample title comprises a commodity title adopted by a commodity of an online shop of an e-commerce platform;
inputting the sample title into a generator in a first model architecture, and coding and decoding the sample title by the generator to obtain a decoded advertisement case;
inputting the advertisement file into a constrainer in a first model architecture, and coding and decoding the advertisement file by the constrainer to obtain a decoded result title;
and calculating a loss value of the result header according to the sample header, performing gradient updating on the generator and the constrainer according to the loss value, and continuously calling the next sample header to perform iterative training on the first model architecture when the loss value does not reach convergence.
3. The method of claim 1, wherein iteratively training a discriminator to converge using a sample pattern comprises:
calling a single sample file in a second data set, wherein the sample file comprises an advertisement file which is generated by the generator and is used as a negative sample and an advertisement file which is pre-collected from an advertisement publishing system and is used as a positive sample, and each sample file is provided with a first type label which is characterized in that the sample file corresponds to the positive sample or the negative sample;
inputting the sample file into the discriminator to extract deep semantic information, and obtaining a result label by classification mapping according to the deep semantic information, wherein the result label represents the sample type, positive or negative, that the discriminator assigns to the sample file;
and calculating a loss value of the result label according to the first type label, performing gradient updating on the discriminator according to the loss value, and continuing to call the next sample file to iteratively train the discriminator when the loss value does not reach convergence.
4. The method of claim 1, wherein iteratively training the second model architecture to a converged state using sample headers comprises:
calling a single sample title in the first data set, wherein the sample title carries a second type label for representing that the sample type of the sample title belongs to a positive sample or a negative sample;
inputting the sample title into a generator in a second model architecture, and coding and decoding the sample title by the generator to obtain a decoded advertisement file;
inputting the advertisement file into a discriminator in a second model architecture, and carrying out classification mapping on the advertisement file by the discriminator to obtain a result label of the advertisement file, wherein the weight of the discriminator is frozen and solidified in advance;
and calculating a loss value of the result label according to the second type label, performing gradient updating on the generator according to the loss value, and continuing to call the next sample title to iteratively train the second model architecture when the loss value does not reach convergence.
5. The method of claim 1, wherein inputting a title of a product to the generator and generating a corresponding advertising copy by the generator comprises:
performing word embedding on the commodity title to obtain an embedded vector of the commodity title;
performing feature extraction on the embedded vector by an encoder of a generator to generate a dense feature vector;
the feature vector is decoded by a decoder of the generator to generate the advertising copy word by word in sequence.
6. The method of claim 1, wherein inputting a title of a product to the generator and before generating a corresponding advertisement document by the generator, comprises:
acquiring a file generation instruction submitted by terminal equipment, and determining a target commodity specified by the file generation instruction;
and querying a commodity database to obtain the commodity title of the target commodity.
7. The method of claim 1, wherein inputting a title of a product to the generator and generating a corresponding advertisement document by the generator, comprises:
pushing the advertising copy to a terminal device providing the commodity title for display;
and submitting the advertisement to an advertisement system to realize the release in response to an advertisement release instruction which is submitted by the terminal equipment and contains the advertisement of the original version or the revised version of the advertisement file.
8. An advertising copy generation apparatus, configured to input a product title to a generator and generate the corresponding advertising copy by the generator, and provided in cooperation with a training apparatus, the training apparatus comprising:
a first training module for iteratively training a first model architecture to convergence using sample titles, the first model architecture comprising a generator and a constrainer, the generator for predicting the advertising copy corresponding to the sample title, and the constrainer for predicting a result title corresponding to the advertising copy;
a second training module for iteratively training a discriminator to converge using a sample pattern, the sample pattern including a negative sample generated by the generator and a pre-collected positive sample, the discriminator for predicting the sample pattern as a positive sample or a negative sample;
a third training module for iteratively training a second model architecture to a converged state using sample headers, the second model architecture comprising the generator and the discriminator, wherein weights of the discriminator are frozen.
9. A computer device comprising a central processor and a memory, characterized in that the central processor is adapted to invoke execution of a computer program stored in the memory to perform the steps of the method according to any one of claims 1 to 7.
10. A computer-readable storage medium, characterized in that it stores, in the form of computer-readable instructions, a computer program implemented according to the method of any one of claims 1 to 7, which, when invoked by a computer, performs the steps comprised by the corresponding method.
CN202210621648.5A 2022-06-01 2022-06-01 Method for generating advertisement file, device, equipment, medium and product thereof Pending CN114997920A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210621648.5A CN114997920A (en) 2022-06-01 2022-06-01 Method for generating advertisement file, device, equipment, medium and product thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210621648.5A CN114997920A (en) 2022-06-01 2022-06-01 Method for generating advertisement file, device, equipment, medium and product thereof

Publications (1)

Publication Number Publication Date
CN114997920A true CN114997920A (en) 2022-09-02

Family

ID=83031138

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210621648.5A Pending CN114997920A (en) 2022-06-01 2022-06-01 Method for generating advertisement file, device, equipment, medium and product thereof

Country Status (1)

Country Link
CN (1) CN114997920A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117648349A (en) * 2024-01-29 2024-03-05 河北省中医院 File calling method and platform
CN117648349B (en) * 2024-01-29 2024-04-09 河北省中医院 File calling method and platform

Similar Documents

Publication Publication Date Title
CN110097085B (en) Lyric text generation method, training method, device, server and storage medium
CN109062893B (en) Commodity name identification method based on full-text attention mechanism
CN111460838A (en) Pre-training method and device of intelligent translation model and storage medium
CN112926303B (en) Malicious URL detection method based on BERT-BiGRU
CN111694924A (en) Event extraction method and system
CN110232439B (en) Intention identification method based on deep learning network
CN104978587B (en) A kind of Entity recognition cooperative learning algorithm based on Doctype
CN108829823A (en) A kind of file classification method
CN116049412B (en) Text classification method, model training method, device and electronic equipment
CN111666400B (en) Message acquisition method, device, computer equipment and storage medium
CN115617955B (en) Hierarchical prediction model training method, punctuation symbol recovery method and device
CN113850201A (en) Cross-modal commodity classification method and device, equipment, medium and product thereof
CN113962224A (en) Named entity recognition method and device, equipment, medium and product thereof
CN115098673A (en) Business document information extraction method based on variant attention and hierarchical structure
CN115099854A (en) Method for creating advertisement file, device, equipment, medium and product thereof
CN114997920A (en) Method for generating advertisement file, device, equipment, medium and product thereof
CN117034921B (en) Prompt learning training method, device and medium based on user data
CN116244484B (en) Federal cross-modal retrieval method and system for unbalanced data
CN116842934A (en) Multi-document fusion deep learning title generation method based on continuous learning
CN115309905A (en) Advertisement text generation method, device, equipment and medium
CN115129819A (en) Text abstract model production method and device, equipment and medium thereof
CN115018548A (en) Advertisement case prediction method and device, equipment, medium and product thereof
CN113806536A (en) Text classification method and device, equipment, medium and product thereof
CN114444485B (en) Cloud environment network equipment entity identification method
CN114818644B (en) Text template generation method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination