CN116797280A

CN116797280A - Advertisement document generation method and device, equipment and medium thereof

Info

Publication number: CN116797280A
Application number: CN202310800576.5A
Authority: CN
Inventors: 葛莉
Original assignee: Guangzhou Shangyan Network Technology Co ltd
Current assignee: Guangzhou Shangyan Network Technology Co ltd
Priority date: 2023-06-30
Filing date: 2023-06-30
Publication date: 2023-09-22

Abstract

The application relates to an advertisement document generation method, and a device, equipment and medium thereof, in the technical field of electronic commerce. The method comprises the following steps: acquiring a commodity description text and a commodity picture of a commodity to be advertised; generating, by a selling point generation model, a plurality of corresponding selling point marketing descriptions according to the commodity description text and the commodity picture, and constructing a selling point marketing description set; screening out, from the selling point marketing description set, a plurality of target selling point marketing descriptions meeting a preset condition, wherein a selling point marketing description meets the preset condition and becomes a target selling point marketing description when its degree of difference from the other selling point marketing descriptions in the set and/or its degree of correlation with the commodity description text is large; and constructing a generation prompt text according to the plurality of target selling point marketing descriptions, and calling a large language model to generate a plurality of corresponding advertisement documents according to the generation prompt text. The application can generate advertisement documents that achieve the expected advertising effect.

Description

Advertisement document generation method and device, equipment and medium thereof

Technical Field

The present application relates to the field of electronic commerce technologies, and in particular, to a method for generating an advertisement document, and a corresponding apparatus, computer device, and computer readable storage medium thereof.

Background

E-commerce marketing often requires advertisement documents, which not only attract the attention of the audience but also convey the characteristics, selling points and/or advantages of a commodity, arouse the audience's desire to purchase, promote the completion of commodity transactions, and increase revenue.

In practice, creating an advertisement document requires a full understanding of the commodity, knowledge of certain writing conventions, and use of certain expression skills, so as to write a document that fits the market and achieves the commercial purpose to the greatest extent. In the prior art, with the rise of large language models represented by ChatGPT, advertisement documents can be generated automatically in natural language. In this approach, a large language model trained to convergence is typically used, with the commodity title fed directly to the model as input, so that a highly generic advertisement document is obtained. Such a document, however, is weakly targeted and cannot describe the selling points of the commodity sufficiently and accurately, so a good advertising effect is difficult to obtain.

In view of the shortcomings of the conventional technology, the inventor has long conducted research in the related field and has developed a new approach to solving this problem in the field of electronic commerce.

Disclosure of Invention

It is a primary object of the present application to solve at least one of the above problems and provide an advertisement document generating method and corresponding apparatus, computer device, computer readable storage medium.

In order to meet the purposes of the application, the application adopts the following technical scheme:

The application provides an advertisement document generation method adapted to one of the purposes of the application, which comprises the following steps:

acquiring a commodity description text and a commodity picture of a commodity to be advertised;

generating, by a selling point generation model, a plurality of corresponding selling point marketing descriptions according to the commodity description text and the commodity picture, and constructing a selling point marketing description set;

screening out, from the selling point marketing description set, a plurality of target selling point marketing descriptions meeting a preset condition, wherein a selling point marketing description meets the preset condition and becomes a target selling point marketing description when its degree of difference from the other selling point marketing descriptions in the set and/or its degree of correlation with the commodity description text is large;

constructing a generation prompt text according to the plurality of target selling point marketing descriptions, and calling a large language model to generate a plurality of corresponding advertisement documents according to the generation prompt text.

In another aspect, an advertisement document generation device adapted to one of the purposes of the application comprises a data acquisition module, a description set construction module, a target screening module and a document generation module. The data acquisition module is used for acquiring a commodity description text and a commodity picture of a commodity to be advertised. The description set construction module is used for generating, by a selling point generation model, a plurality of corresponding selling point marketing descriptions according to the commodity description text and the commodity picture, and constructing a selling point marketing description set. The target screening module is used for screening out, from the selling point marketing description set, a plurality of target selling point marketing descriptions meeting a preset condition, wherein a selling point marketing description meets the preset condition and becomes a target selling point marketing description when its degree of difference from the other selling point marketing descriptions in the set and/or its degree of correlation with the commodity description text is large. The document generation module is used for constructing a generation prompt text according to the plurality of target selling point marketing descriptions, and calling a large language model to generate a plurality of corresponding advertisement documents according to the generation prompt text.

In yet another aspect, a computer device adapted to one of the objects of the present application comprises a central processor and a memory, the central processor being configured to invoke and run a computer program stored in the memory so as to perform the steps of the advertisement document generation method of the present application.

In yet another aspect, a computer readable storage medium adapted to another object of the present application stores, in the form of computer readable instructions, a computer program implementing the advertisement document generation method, which, when invoked and run by a computer, performs the steps included in the method.

The technical scheme of the application has various advantages, including but not limited to the following aspects:

According to the method, a selling point generation model generates a plurality of corresponding selling point marketing descriptions from the commodity description text and the commodity picture of the commodity to be advertised, and a selling point marketing description set is constructed. A plurality of target selling point marketing descriptions meeting a preset condition are then screened out, a selling point marketing description becoming a target selling point marketing description when its degree of difference from the other selling point marketing descriptions in the set and/or its degree of correlation with the commodity description text is large. A generation prompt text is constructed from the plurality of target selling point marketing descriptions, and a large language model is called to generate a plurality of corresponding advertisement documents according to the generation prompt text. Because the generation process relies on accurate target selling point marketing descriptions, it can produce advertisement documents that are rich in content and closely fit the distinctive selling points of the commodity, and that are expected to achieve the intended advertising effect.

Drawings

The foregoing and/or additional aspects and advantages of the application will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings, in which:

FIG. 1 is a flowchart of an exemplary embodiment of an advertisement document generation method according to the present application;

FIG. 2 is a flow chart of generating a sales point marketing description in an embodiment of the present application;

FIG. 3 is a flow chart illustrating a process for determining a target selling point according to an embodiment of the present application;

FIG. 4 is a schematic flow chart of expanding a description text of a commodity according to an embodiment of the present application;

FIG. 5 is a flow chart of the selling point keyword screening according to the embodiment of the present application;

FIG. 6 is a flow chart of training a selling point generation model in accordance with an embodiment of the present application;

FIG. 7 is a schematic flow chart of constructing and labeling training samples in an embodiment of the application;

FIG. 8 is a schematic block diagram of an advertising document generating device of the present application;

FIG. 9 is a schematic structural diagram of a computer device used in the present application.

Detailed Description

Embodiments of the present application are described in detail below, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to like or similar elements or elements having like or similar functions throughout. The embodiments described below by referring to the drawings are illustrative only and are not to be construed as limiting the application.

As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless expressly stated otherwise, as understood by those skilled in the art. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element or intervening elements may also be present. Further, "connected" or "coupled" as used herein may include wirelessly connected or wirelessly coupled. The term "and/or" as used herein includes any and all combinations of one or more of the associated listed items.

It will be understood by those skilled in the art that all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs unless defined otherwise. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the prior art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.

As used herein, "client" and "terminal device" are understood by those skilled in the art to include both devices that have only a wireless signal receiver without transmitting capability and devices that have receiving and transmitting hardware capable of two-way communication over a two-way communication link. Such a device may include: a cellular or other communication device, such as a personal computer or tablet, with a single-line or multi-line display, or a cellular or other communication device without a multi-line display; a PCS (Personal Communications Service) device that may combine voice, data processing, facsimile and/or data communication capabilities; a PDA (Personal Digital Assistant) that can include a radio frequency receiver, pager, internet/intranet access, web browser, notepad, calendar and/or GPS (Global Positioning System) receiver; and a conventional laptop and/or palmtop computer or other appliance that has and/or includes a radio frequency receiver. As used herein, a "client" or "terminal device" may be portable, transportable, installed in a vehicle (aeronautical, maritime, and/or land-based), or adapted and/or configured to operate locally and/or in a distributed fashion at any other location(s) on earth and/or in space. As used herein, a "client" or "terminal device" may also be a communication terminal, an internet terminal, or a music/video playing terminal, for example a PDA, a MID (Mobile Internet Device) and/or a mobile phone with a music/video playing function, or a device such as a smart TV or a set-top box.

The hardware referred to in the application, such as a server, a client and a service node, is essentially an electronic device with the capabilities of a personal computer or the like. It is a hardware device having the necessary components described by the von Neumann architecture, such as a central processing unit (including an arithmetic unit and a controller), a memory, an input device and an output device. A computer program is stored in the memory; the central processing unit loads the program stored in the memory, executes the instructions in the program, and interacts with the input and output devices, thereby completing specific functions.

It should be noted that the concept referred to in the present application as a "server" is equally applicable to server clusters. According to network deployment principles understood by those skilled in the art, the servers are a logical division: physically, they may be separate from each other yet callable through interfaces, or they may be integrated into one physical computer or one group of computers. Those skilled in the art will appreciate such variations, which should not be construed as limiting the network deployment approach of the present application.

Unless expressly specified otherwise, one or more technical features of the present application may either be deployed on a server, with the client accessing them by remotely invoking an online service interface provided by the server, or be deployed and run directly on the client.

Unless expressly specified otherwise, a neural network model cited or possibly cited in the application may be deployed on a remote server and invoked remotely from a client, or may be deployed on a client with sufficient device capability and invoked directly. In some embodiments, when the neural network model runs on the client, its capability can be obtained through transfer learning, so as to reduce the demand on the client's hardware resources and avoid occupying them excessively.

Unless expressly specified otherwise, the various data involved in the present application may be stored either remotely on a server or on a local terminal device, as long as they can be invoked by the technical solution of the present application.

Those skilled in the art will appreciate that, although the various methods of the present application are described on the basis of the same concept and are therefore common to one another, they may be performed independently of each other unless specifically indicated otherwise. Similarly, the various embodiments disclosed herein are all presented on the basis of the same inventive concept; therefore, descriptions of the same concept, and descriptions that differ only through convenient and appropriate modification, should be understood as equivalent.

Unless a mutually exclusive relationship is expressly indicated, the technical features of the various embodiments disclosed herein may be cross-combined to flexibly construct new embodiments, as long as such a combination does not depart from the inventive spirit of the present application and can satisfy a need in the art or remedy a deficiency in the prior art. Such variations will be known to the person skilled in the art.

The advertisement document generation method of the present application may be programmed as a computer program product and deployed to run on a client or a server. For example, in an exemplary application scenario of the present application, it may be deployed on a server of an e-commerce platform, so that the method can be performed by accessing the interface opened after the computer program product is run and interacting with the process of the computer program product through a graphical user interface.

Referring to fig. 1, the advertisement document generation method of the present application, in an exemplary embodiment thereof, includes the following steps:

step S1100, acquiring commodity description text and commodity pictures of the commodity to be advertised;

According to the unique identifier of the commodity to be advertised, commodity information of the commodity is obtained from a commodity database, the commodity information including, but not limited to, a commodity description text and commodity pictures. Commodity pictures are pictures that display the commodity as a whole and/or from different sides, and include a commodity main picture, commodity detail pictures and the like; the commodity main picture usually serves as the primary display picture and shows the overall appearance of the commodity. The commodity description text generally refers to descriptive information that is stored in association with the commodity to be advertised and is suitable for being provided in text form, including, but not limited to, any one or more of the commodity title, commodity detail text, commodity price, product parameters, category labels and the like. In application, such descriptive information is generally used to describe specifics of the commodity such as its material, usage, functions and model. The unique identifier distinguishes and represents different commodities and can be set flexibly by a person skilled in the art, for example as a commodity ID. In the present application, it is recommended to use the commodity main picture of the commodity to be advertised as the commodity picture for subsequent processing.

In one embodiment, data cleaning operations are performed on the commodity description text. The data cleaning operations include the following. Removing special characters: special characters in the commodity description text, such as punctuation marks, HTML tags and garbled characters, are removed, which keeps the text clean and prevents special characters from interfering with subsequent processing. Case unification: the commodity description text is converted into a unified case form, typically lowercase, which prevents the same word from being recognized as different vocabulary because of differing case and improves the consistency of text analysis. Removing stop words: stop words are words that appear frequently in the commodity description text but carry no actual meaning, such as articles, prepositions and conjunctions; removing them reduces noise in the text and improves the accuracy of key information. Removing repeated vocabulary: repeated words in the commodity description text are removed, so that repeated content neither affects subsequent analysis nor occupies storage space. Spelling correction: a spelling correction algorithm may be used to automatically detect and correct spelling errors in the commodity description text, improving the accuracy and consistency of the text. A person skilled in the art may perform any one or more of these data cleaning operations on the commodity description text as needed.
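The cleaning operations described above can be sketched as follows. This is a minimal illustration rather than the implementation of the application: the function name `clean_description`, the tiny `STOP_WORDS` list and the choice of lowercase are assumptions made for the example, and spelling correction is left out.

```python
import html
import re

# Hypothetical, deliberately tiny stop-word list; a real system would use a fuller one.
STOP_WORDS = {"a", "an", "the", "of", "and", "or", "with", "for", "in", "on", "to"}


def clean_description(text: str) -> str:
    """Apply the optional cleaning steps, in order, to a commodity description text."""
    text = html.unescape(text)             # decode entities such as &amp;
    text = re.sub(r"<[^>]+>", " ", text)   # remove HTML tags
    text = re.sub(r"[^\w\s]", " ", text)   # remove punctuation and other special characters
    text = text.lower()                    # unify case (lowercase is an assumption)
    seen = set()
    kept = []
    for word in text.split():
        # Drop stop words and any word that has already been kept once.
        if word in STOP_WORDS or word in seen:
            continue
        seen.add(word)
        kept.append(word)
    return " ".join(kept)


print(clean_description("<b>Full Leg</b> and Foot Massager with Heat, Heat!"))
```

Each step is independent, so any subset of the operations can be kept or dropped to match the embodiment in use.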

The spelling correction algorithm may be any of the following. Rule-based spelling correction: spelling errors are checked and corrected according to a series of rules, for example by checking for consecutive repeated letters, consecutive substituted characters, keyboard typing errors and the like. Statistics-based spelling correction: a statistical model is used to detect and correct spelling errors, with a large-scale text corpus used to compute the probabilities and statistics of correct words; in general, the algorithm evaluates candidate corrections by word frequency and edit distance and selects the candidate with the highest probability as the correction result. Language-model-based spelling correction: a language model is used to evaluate the probabilities of word sequences, and the best correction is selected according to the prediction of the language model, which may be an n-gram model, a deep learning model (e.g., a recurrent neural network) or the like. A person skilled in the art may choose any of these spelling correction algorithms.
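As an illustration of the statistics-based variant, the sketch below proposes candidates within edit distance 1 and picks the one with the highest corpus frequency. The word counts in `WORD_COUNTS` are invented for the example, and the function names are hypothetical; a real system would count words in a large corpus and might also consider edit distance 2.

```python
from collections import Counter

# Hypothetical toy frequencies; in practice these come from a large text corpus.
WORD_COUNTS = Counter({"massager": 50, "massage": 30, "heat": 40, "foot": 35})
LETTERS = "abcdefghijklmnopqrstuvwxyz"


def edits1(word: str) -> set:
    """Return all strings at edit distance 1: deletes, transposes, replaces, inserts."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [l + r[1:] for l, r in splits if r]
    transposes = [l + r[1] + r[0] + r[2:] for l, r in splits if len(r) > 1]
    replaces = [l + c + r[1:] for l, r in splits if r for c in LETTERS]
    inserts = [l + c + r for l, r in splits for c in LETTERS]
    return set(deletes + transposes + replaces + inserts)


def correct(word: str) -> str:
    """Return the most frequent known word within edit distance 1, else the word itself."""
    if word in WORD_COUNTS:
        return word
    candidates = [w for w in edits1(word) if w in WORD_COUNTS]
    if not candidates:
        return word
    return max(candidates, key=WORD_COUNTS.__getitem__)


print(correct("masager"), correct("heta"))
```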

Step S1200, generating a plurality of corresponding selling point marketing descriptions according to the commodity description text and the commodity picture by adopting a selling point generation model, and constructing a selling point marketing description set;

The selling point generation model is trained to convergence in advance, so that it acquires the capability of generating corresponding selling point marketing descriptions from the commodity description text and the commodity picture of a commodity. The model takes the commodity description text and the commodity picture of the commodity to be advertised as input, extracts the corresponding text features and picture features, fuses them into image-text fusion features, and generates a plurality of corresponding texts as selling point marketing descriptions based on the fusion features. It can be understood that, because they draw on the multimodal information of the commodity, the selling point marketing descriptions can express the selling points of the commodity to be advertised fully and accurately. The collection of all the selling point marketing descriptions constitutes the selling point marketing description set.

The selling point generation model may include a BERT model to extract the text features of the commodity description text and a ResNet model to extract the picture features of the commodity picture. The application is not limited to these examples: in theory, any neural network model suitable for extracting deep semantic information from the commodity description text and the commodity picture can be used for the corresponding feature extraction, finally yielding the text features corresponding to the commodity description text and the picture features corresponding to the commodity picture. In addition, the selling point generation model may include a decoder, such as the Decoder module of a Transformer, to decode the corresponding text from the image-text fusion features. Again the application is not limited to this example: in theory, any decoder suitable for decoding the corresponding text from the image-text fusion features can be used.

Step S1300, screening out, from the selling point marketing description set, a plurality of target selling point marketing descriptions meeting a preset condition, wherein a selling point marketing description meets the preset condition and becomes a target selling point marketing description when its degree of difference from the other selling point marketing descriptions in the set and/or its degree of correlation with the commodity description text is large;

A text feature extraction model may be employed to determine the vectorized semantic features of each selling point marketing description in the selling point marketing description set, as well as the vectorized semantic features of the commodity description text. A vector distance algorithm is then used to compute the similarity between the semantic features of each selling point marketing description and those of the other selling point marketing descriptions, and the complement of that similarity is taken as the degree of difference. A vector distance algorithm is likewise used to compute the similarity between the semantic features of each selling point marketing description and those of the commodity description text, which is taken as the degree of correlation.

The text feature extraction model may be any of BERT, RNN, BiLSTM, BiGRU, RoBERTa, ALBERT, ERNIE, BERT-WWM and the like; since the training of these models is known in the art, it is not described in detail here. The vector distance algorithm may be any of the cosine similarity algorithm, the Euclidean distance algorithm, the Pearson correlation coefficient algorithm, the Jaccard coefficient algorithm and the like. The preset threshold may be set as needed by a person skilled in the art.
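To make the two measures concrete, the sketch below uses cosine similarity on already-extracted feature vectors. It assumes that the degree of difference of a description is one minus its average similarity to the other descriptions; the text does not state how the pairwise values are aggregated, so the averaging, like the function names, is an assumption of this example.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def difference_degrees(desc_vecs):
    """Degree of difference per description: 1 - mean similarity to the others (assumed)."""
    out = []
    for i, u in enumerate(desc_vecs):
        sims = [cosine_similarity(u, v) for j, v in enumerate(desc_vecs) if j != i]
        out.append(1.0 - sum(sims) / len(sims) if sims else 0.0)
    return out


def correlation_degrees(desc_vecs, text_vec):
    """Degree of correlation per description: similarity to the commodity description text."""
    return [cosine_similarity(u, text_vec) for u in desc_vecs]


# Toy 2-dimensional features standing in for model outputs.
descriptions = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
print(difference_degrees(descriptions))
print(correlation_degrees(descriptions, [1.0, 0.0]))
```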

When screening by degree of difference, a TopN algorithm may be used to determine the target selling point marketing descriptions. Specifically, the selling point marketing descriptions are sorted in descending order of their degrees of difference, and the top N (N ≥ 2) are screened out as the target selling point marketing descriptions. Alternatively, a preset threshold may be used: the selling point marketing descriptions whose degree of difference is greater than the preset threshold are screened out as the target selling point marketing descriptions. It can be understood that the target selling point marketing descriptions determined in this way differ substantially from one another, which ensures diversity, so that the advertisement documents subsequently generated from them are more diverse, that is, more varied and original in language and expression. This benefits marketing and improves the user experience: highly diverse advertisement documents attract readers' attention and create a sense of freshness and stimulation, thereby increasing users' interest in reading and desire to purchase. N and the preset threshold may be set as needed by a person skilled in the art.

When screening by degree of correlation, a TopN algorithm may likewise be used to determine the target selling point marketing descriptions. Specifically, the selling point marketing descriptions are sorted in descending order of their degrees of correlation, and the top N (N ≥ 2) are screened out as the target selling point marketing descriptions. Alternatively, a preset threshold may be used: the selling point marketing descriptions whose degree of correlation is greater than the preset threshold are screened out as the target selling point marketing descriptions. It can be understood that the target selling point marketing descriptions determined in this way are highly correlated with the commodity description text, so that the advertisement documents subsequently generated from them are closely related to the commodity description text. N and the preset threshold may be set as needed by a person skilled in the art.
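The two screening options above work the same way for either score, which the following sketch illustrates. The function names and the sample scores are hypothetical; `scores` can hold degrees of difference or degrees of correlation.

```python
def select_top_n(descriptions, scores, n=2):
    """Keep the n descriptions with the highest scores (the TopN option)."""
    ranked = sorted(zip(descriptions, scores), key=lambda pair: pair[1], reverse=True)
    return [desc for desc, _ in ranked[:n]]


def select_above_threshold(descriptions, scores, threshold):
    """Keep every description whose score is greater than the preset threshold."""
    return [desc for desc, score in zip(descriptions, scores) if score > threshold]


points = ["heated airbags", "five massage modes", "kneading simulation"]
scores = [0.42, 0.91, 0.77]  # invented values for illustration
print(select_top_n(points, scores, n=2))
print(select_above_threshold(points, scores, threshold=0.5))
```

TopN fixes how many descriptions reach the prompt, while the threshold fixes a minimum quality and lets the count vary.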

The target selling point marketing descriptions may also be determined with a TopN algorithm based on both the degree of difference of each selling point marketing description and its degree of correlation with the commodity description text. Specifically, the degree of difference and the degree of correlation of each selling point marketing description are multiplied by their respective weights and summed to obtain a ranking score; the selling point marketing descriptions are then sorted in descending order of ranking score, and the top N (N ≥ 2) are screened out as the target selling point marketing descriptions. It can be understood that the target selling point marketing descriptions determined in this way balance the degree of correlation against the degree of difference, taking both relevance and diversity into account. N may be set as needed by a person skilled in the art. The weights of the degree of difference and the degree of correlation may be set according to the relevance and diversity required of the target selling point marketing descriptions; recommended values are 0.6 for the degree of difference and 0.4 for the degree of correlation.
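The weighted ranking can be sketched as below, using the recommended weights of 0.6 and 0.4 as defaults. The function name and the input values are illustrative assumptions.

```python
def rank_by_weighted_score(descriptions, differences, correlations,
                           n=2, w_diff=0.6, w_corr=0.4):
    """Score each description as w_diff * difference + w_corr * correlation, keep the top n."""
    scored = [
        (w_diff * diff + w_corr * corr, desc)
        for desc, diff, corr in zip(descriptions, differences, correlations)
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [desc for _, desc in scored[:n]]


# Invented scores: A -> 0.70, B -> 0.38, C -> 0.60
print(rank_by_weighted_score(["A", "B", "C"], [0.5, 0.5, 1.0], [1.0, 0.2, 0.0]))
```

Raising `w_diff` favors variety among the selected descriptions, while raising `w_corr` keeps them closer to the commodity description text.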

Step S1400, constructing a generation prompt text according to the plurality of target selling point marketing descriptions, and calling a large language model to generate a plurality of corresponding advertisement documents according to the generation prompt text.

The large language model is suited to text processing in the NLP field. It is trained to convergence in advance on an extremely large corpus and has the capability of generating human language as well as accurate text semantic understanding and logical reasoning capabilities. Such large language models include OPT, Chinchilla, PaLM, LLaMA, Alpaca, Vicuna, GPT-3.5, GPT-4, ChatGPT and the like.

A task description is edited, which instructs the large language model to generate N corresponding advertisement documents (generally, N is greater than or equal to the number of target selling point marketing descriptions) according to the plurality of target selling point marketing descriptions and the commodity description text. The task description and N may be set flexibly by a person skilled in the art in light of the disclosure herein. For the commodity description text, a person skilled in the art may choose any one or more of the commodity title, commodity detail text, commodity price, product parameters, category labels and the like of the commodity to be advertised. A corresponding Prompt is constructed from the task description, the target selling point marketing descriptions and the commodity description text as the generation prompt text. For ease of understanding, an example of the generation prompt text is as follows:

This is a product title: Full Leg and Foot Massager with Heat. Please refer to these 3 selling points:

1. This foot massager with an internal heat and airbags on its side allows you to warm up your leg muscles or take relaxation as needed.

2. The leg massager works all around your ankles, knees and calf with 5 modes for maximum intensity.

3. With kneading simulation system, this full leg and foot massager delivers comfortable massage to bring new experience.

Please help me write 5 more Google SEO meta descriptions.
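A prompt of this shape could be assembled programmatically along the following lines. This is only a sketch: `build_prompt` is a hypothetical helper, the wording of the task description is taken from the example above, and the call to the large language model itself is deliberately left out.

```python
from typing import List


def build_prompt(title: str, selling_points: List[str], n_documents: int) -> str:
    """Assemble commodity title, selling points and task description into a prompt."""
    lines = [
        f"This is a product title: {title}. "
        f"Please refer to these {len(selling_points)} selling points:"
    ]
    # Number the target selling point marketing descriptions from 1.
    lines += [f"{i}. {point}" for i, point in enumerate(selling_points, start=1)]
    # Task description: how many advertisement documents to write.
    lines.append(
        f"Please help me write {n_documents} more Google SEO meta descriptions."
    )
    return "\n\n".join(lines)


prompt = build_prompt(
    "Full Leg and Foot Massager with Heat",
    [
        "This foot massager with an internal heat and airbags on its side allows "
        "you to warm up your leg muscles or take relaxation as needed.",
        "The leg massager works all around your ankles, knees and calf with "
        "5 modes for maximum intensity.",
        "With kneading simulation system, this full leg and foot massager "
        "delivers comfortable massage to bring new experience.",
    ],
    n_documents=5,
)
```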

In one embodiment, the large language model is implemented using GPT-3. The prompt text is fed to the model as input and passed to the encoding end of the model, which is a stack of multiple multi-head self-attention layers and fully connected layers that encode the prompt text. Specifically, as the prompt text passes through the multi-head attention layers, multi-head attention is computed over it, so that different dimensions of the prompt text are weighted by self-attention and corresponding weighted vector representations are obtained; after the fully connected layers, the encoded vector representation of the prompt text is obtained. The encoded representation is then input to the decoding end of the model, where the encoded representation corresponding to each token is decoded: the generation probability of each candidate word is calculated from the words already generated, the current word position and the encoded representation corresponding to the token, and a plurality of advertisement documents are generated from these probabilities using a diversified generation strategy. The diversified generation strategy can be implemented with temperature scaling, top-k sampling, random sampling and the like, as required by one skilled in the art.
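To illustrate how a diversified generation strategy turns generation probabilities into varied outputs, the sketch below combines temperature scaling with top-k sampling for a single decoding step. It is an assumption-laden toy: `sample_token` and the logits are hypothetical, and a real model would supply the scores for its whole vocabulary at every step.

```python
import math
import random
from typing import List, Optional


def sample_token(
    logits: List[float],
    temperature: float = 1.0,
    top_k: Optional[int] = None,
    rng: Optional[random.Random] = None,
) -> int:
    """Sample one token index using temperature scaling and optional top-k."""
    rng = rng or random.Random()
    # Temperature scaling: values below 1 sharpen the distribution,
    # values above 1 flatten it (temperature must be positive).
    scaled = [x / temperature for x in logits]
    # Top-k: keep only the k highest-scoring candidates.
    candidates = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)
    if top_k is not None:
        candidates = candidates[:top_k]
    # Numerically stable softmax weights over the remaining candidates.
    peak = max(scaled[i] for i in candidates)
    weights = [math.exp(scaled[i] - peak) for i in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]


rng = random.Random(0)
logits = [2.0, 1.0, 0.5, -1.0]  # hypothetical scores for a 4-word vocabulary
draws = [sample_token(logits, temperature=0.8, top_k=2, rng=rng) for _ in range(20)]
```

Repeating the sampling with different random states yields different word sequences from the same prompt, which is what allows several distinct advertisement documents to be produced.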

As can be appreciated from the exemplary embodiments of the present application, the technical solution of the present application has various advantages, including but not limited to the following aspects:

Referring to fig. 2, in a further embodiment, step S1200, generating a plurality of corresponding selling point marketing descriptions according to the commodity description text and the commodity picture by using a selling point generation model, includes the following steps:

Step S1210, inputting an image-text data pair formed by the commodity description text and the commodity picture into the selling point generation model;

The selling point generation model is trained to convergence in advance and thereby acquires the capability of generating corresponding selling point marketing descriptions from the commodity description text and commodity picture of a commodity. The selling point generation model comprises an image encoder, a text encoder, a fusion device and a decoder. The image encoder can be any model suitable for extracting image features; the recommended model is ViT (Vision Transformer), and other models such as CNN models and the deep convolutional models EfficientNet, DenseNet and ResNet can also be used. The text encoder can be any model in the NLP field suitable for extracting text features. For example, the BERT model is among the better neural network models to date for processing sequential text information and is suitable for the text feature extraction work in the present application; the ELECTRA model can achieve an effect equal or close to that of BERT with fewer parameters and is therefore also recommended. The fusion device can be a linear layer or a neural network model based on a multi-head attention mechanism, and the latter can be any such model, for example Transformer or ViT. The decoder may be the Decoder module of a Transformer, or a generative language model, which may be a GPT-series model.

Step S1220, applying the image encoder in the selling point generation model to extract a picture feature vector of the commodity picture in the image-text data pair, and applying the text encoder to extract a text feature vector of the commodity description text in the image-text data pair;

The image-text data pair is input into the selling point generation model. In one embodiment, the image encoder is implemented with ResNet50: the commodity picture in the image-text data pair is input to ResNet50 after conventional preprocessing, and the stem block and the 4 residual stages (composed of bottleneck blocks) of ResNet50 progressively extract the image features of the commodity picture. The shallow stages extract basic features such as details and edges of the commodity to be advertised in the commodity picture, the deep stages further extract deep semantic features and high-level logical features, and the vectorized image feature of the commodity picture output by the Res5 stage, namely the picture feature vector, is finally obtained.

The text encoder is implemented with a BERT model. The word-segmented commodity description text in the image-text data pair is input to BERT, which encodes three vectors from it: a token embedding vector (Token Embedding) representing each token of the commodity description text, a position embedding vector (Position Embedding) representing the position of each token, and a segment embedding vector (Segment Embedding) representing the distinction between sentences. The text encoder then extracts text features from these embedding vectors and finally obtains the corresponding text semantic vector, namely the text feature vector.

Step S1230, applying the fusion device in the selling point generation model to fuse the picture feature vector and the text feature vector to obtain an image-text fusion vector;

In one embodiment, the fusion device is implemented with a linear layer: the picture feature vector and the text feature vector are normalized to a uniform scale and then simply concatenated to obtain the image-text fusion vector.
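The normalize-then-concatenate fusion can be sketched as below. Two assumptions are made that the text does not specify: "uniform scale" is interpreted here as L2 normalization, and any learned linear projection is omitted, so the sketch shows only the splicing step. The function names are hypothetical.

```python
import math
from typing import List


def l2_normalize(vec: List[float]) -> List[float]:
    """Scale a vector to unit length (assumed meaning of 'uniform scale')."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fuse(picture_vec: List[float], text_vec: List[float]) -> List[float]:
    """Concatenate the normalized picture and text feature vectors."""
    return l2_normalize(picture_vec) + l2_normalize(text_vec)


# Toy feature vectors of different lengths and magnitudes.
fused = fuse([3.0, 4.0], [0.0, 2.0, 0.0])
```

After normalization neither modality dominates the fused vector merely because its raw feature values are larger.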

In another embodiment, the picture feature vector and the text feature vector are input into a neural network model based on a multi-head attention mechanism. Following the Transformer principle, the encoding path of the Transformer contains a plurality of encoders of identical structure and principle; each encoder contains a self-attention layer that performs feature interaction between the picture feature vector and the text feature vector input to it, and the result of the interaction is passed to the multi-layer perceptron of the encoder for high-level semantic extraction and output, thereby obtaining the image-text fusion vector. In this embodiment, performing multi-dimensional feature interaction between the picture feature vector and the text feature vector helps ensure the accuracy and reliability of the resulting image-text fusion vector.

Step S1240, applying the decoder in the selling point generation model to decode the image-text fusion vector, and obtaining a plurality of selling point marketing descriptions based on a diversified generation strategy.

In one embodiment, the decoder is implemented with the Decoder module of a Transformer. The image-text fusion vector is input into the decoder; during decoding, the generation probability of each candidate word is calculated from the words already generated, the current word position and the encoded representation corresponding to the token, and a plurality of corresponding selling point marketing descriptions are generated from these probabilities using a diversified generation strategy. The diversified generation strategy can be implemented with temperature scaling, top-k sampling, random sampling and the like, as required by one skilled in the art.

In this embodiment, the selling point generation model generates a plurality of corresponding selling point marketing descriptions from the multi-modal commodity information of the commodity to be advertised. By making full use of the detailed description of the commodity provided by the commodity description text and the intuitive visual information provided by the commodity picture, the generated selling point marketing descriptions are kept comprehensive and accurate.

Referring to fig. 3, in a further embodiment, step S1300, screening a plurality of target selling point marketing descriptions meeting preset conditions from the selling point marketing description set, includes the following steps:

Step S1310, extracting text feature vectors corresponding to each selling point marketing description in the selling point marketing description set by adopting a text feature extraction model, and extracting text feature vectors of the commodity description text;

A text feature extraction model is used to determine the vectorized semantic features, namely the text feature vector, of each selling point marketing description in the selling point marketing description set, and the vectorized semantic features, namely the text feature vector, of the commodity description text.

The text feature extraction model may be any of BERT, RNN, BiLSTM, BiGRU, RoBERTa, ALBERT, ERNIE, BERT-WWM and the like. As the training process of these models is known in the art, it is not described in detail.

Step S1320, determining the degree of correlation between each selling point marketing description and the other selling point marketing descriptions according to their respective text feature vectors, and determining the degree of correlation between each selling point marketing description and the commodity description text according to the text feature vector of the selling point marketing description and the text feature vector of the commodity description text;

Further, a vector distance algorithm is used to determine the vector distance between the text feature vectors of the selling point marketing descriptions as the degree of correlation between them, and a vector distance algorithm is used to determine the vector distance between the text feature vector of each selling point marketing description and the text feature vector of the commodity description text as the degree of correlation between that description and the commodity description text.

The vector distance algorithm can be any one of the cosine similarity algorithm, the Euclidean distance algorithm, the Pearson correlation coefficient algorithm, the Jaccard coefficient algorithm and the like. The preset threshold may be set as desired by one skilled in the art.
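As one of the listed options, cosine similarity between two text feature vectors can be computed as in the following sketch. The function name is illustrative, and the vectors are toy values rather than real model outputs.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two feature vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


same = cosine_similarity([1.0, 0.0], [1.0, 0.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
diagonal = cosine_similarity([1.0, 1.0], [1.0, 0.0])
```

A value near 1 indicates closely related texts, while a value near 0 indicates unrelated ones.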

Step S1330, determining a plurality of target selling point marketing descriptions by using the maximal marginal relevance (MMR) algorithm according to the degree of correlation between each selling point marketing description and the commodity description text.

In order to comprehensively consider the degree of correlation between each selling point marketing description and the commodity description text, the maximal marginal relevance (MMR) algorithm is applied iteratively, at least once, to obtain a final result set. The final result set contains a plurality of selling point marketing descriptions sorted in descending order of MMR score, and the top-ranked selling point marketing descriptions in the final result set are selected as the target selling point marketing descriptions. An exemplary formula of the MMR algorithm is as follows:

$$\mathrm{MMR} = \arg\max_{D_i \in R \setminus S} \left[ \lambda \, \mathrm{sim}_1(Q, D_i) - (1 - \lambda) \max_{D_j \in S} \mathrm{sim}_2(D_i, D_j) \right]$$

wherein: $Q$ is the commodity description text; $R$ is the selling point marketing description set; $D_i$ is a member of the selling point marketing description set, namely a selling point marketing description; $S$ is the currently returned result set; $\mathrm{sim}_1(Q, D_i)$ is the degree of correlation between a selling point marketing description and the commodity description text; $\mathrm{sim}_2(D_i, D_j)$ is the degree of correlation between selling point marketing descriptions; and $\lambda$ is a parameter that balances the relevance and diversity of the final result. Diversity is given more weight in this embodiment, so the recommended range of $\lambda$ is [0.2, 0.4], which those skilled in the art can flexibly adjust in light of this disclosure. It can be understood that target selling point marketing descriptions with a certain degree of diversity help generate varied advertisement documents, which attract the attention of readers and create a sense of freshness and stimulation, thereby increasing the reading interest and purchasing desire of users.
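The iterative MMR selection can be sketched as follows, assuming the two kinds of correlation have already been computed (for example by cosine similarity). The function `mmr_select`, the argument layout and the sample scores are hypothetical; an item's redundancy is taken as 0 while the result set is still empty.

```python
from typing import List, Sequence


def mmr_select(
    relevance: Sequence[float],            # sim_1(Q, D_i) for each description
    pairwise: Sequence[Sequence[float]],   # sim_2(D_i, D_j) between descriptions
    k: int,
    lam: float = 0.3,                      # within the recommended [0.2, 0.4] range
) -> List[int]:
    """Return indices of k descriptions chosen by maximal marginal relevance."""
    selected: List[int] = []
    remaining = list(range(len(relevance)))

    def mmr_score(i: int) -> float:
        # Highest similarity to anything already selected (0 if nothing is).
        redundancy = max((pairwise[i][j] for j in selected), default=0.0)
        return lam * relevance[i] - (1.0 - lam) * redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Descriptions 0 and 1 are near-duplicates; description 2 is different.
relevance = [0.9, 0.85, 0.3]
pairwise = [
    [1.0, 0.95, 0.1],
    [0.95, 1.0, 0.1],
    [0.1, 0.1, 1.0],
]
diverse = mmr_select(relevance, pairwise, k=2, lam=0.3)
relevance_only = mmr_select(relevance, pairwise, k=2, lam=1.0)
```

With a small λ the near-duplicate of an already selected description is penalized heavily, so the second pick is the dissimilar description; with λ = 1 the selection reduces to plain ranking by relevance.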

In this embodiment, the plurality of target selling point marketing descriptions are determined with the maximal marginal relevance algorithm based on the degree of correlation between each selling point marketing description and the commodity description text. This ensures that the target selling point marketing descriptions have a certain degree of both relevance and diversity, and helps the subsequent generation of varied advertisement documents from them, so that the advertisement documents are more accurate, comprehensive and attractive to readers, and the advertising effect and ROI (return on investment) are improved.

Referring to fig. 4, in a further embodiment, after step S1100, acquiring the commodity description text and the commodity picture of the commodity to be advertised, the method includes the following steps:

Step S1101, acquiring the commodity titles of all commodities under each commodity category in a commodity database, and segmenting each commodity title into words to obtain a token sequence corresponding to each commodity title;

The commodity titles can be segmented with a word segmentation algorithm such as Jieba, N-gram or WordPiece, which one of ordinary skill in the art can apply flexibly.

Step S1102, for each commodity category, determining tokens whose inverse document frequency (IDF) in the token sequence of each commodity title meets a preset condition as selling point keywords, determining tokens whose word frequency across the token sequences of all commodity titles meets a preset condition as selling point keywords, and constructing a selling point keyword set by associating the selling point keywords with the corresponding category;

For each commodity category, the inverse document frequency of each token in the token sequence of each commodity title is determined. Specifically, for a single token, the total number of token sequences (that is, the number of commodity titles in the commodity category) is divided by the number of token sequences in which the token occurs (that is, the number of commodity titles in the category that contain the token), and the base-10 logarithm of the quotient is taken as the inverse document frequency of the token. For the token sequence of each commodity title in the same commodity category, the tokens are sorted in descending order of inverse document frequency, and the top N tokens of each token sequence, namely words of strong general importance, are selected. N can be set as required by a person skilled in the art.
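The inverse document frequency calculation for one commodity category can be sketched as follows. The helper names and the tokenized titles are hypothetical, and word segmentation is assumed to have been done already.

```python
import math
from typing import Dict, List


def idf_scores(title_tokens: List[List[str]]) -> Dict[str, float]:
    """IDF per token: log10(number of titles / number of titles containing it)."""
    total = len(title_tokens)
    doc_freq: Dict[str, int] = {}
    for tokens in title_tokens:
        for token in set(tokens):  # count each title at most once per token
            doc_freq[token] = doc_freq.get(token, 0) + 1
    return {token: math.log10(total / df) for token, df in doc_freq.items()}


def top_idf_tokens(tokens: List[str], idf: Dict[str, float], n: int) -> List[str]:
    """Top-N tokens of one title, in descending order of IDF."""
    unique = list(dict.fromkeys(tokens))  # drop repeats, keep first-seen order
    return sorted(unique, key=lambda t: idf[t], reverse=True)[:n]


# Hypothetical token sequences of four titles in one commodity category.
titles = [
    ["foot", "massager", "heat"],
    ["leg", "massager", "airbag"],
    ["foot", "massager", "kneading"],
    ["massager", "shiatsu"],
]
idf = idf_scores(titles)
keywords = top_idf_tokens(titles[0], idf, n=2)
```

A token that appears in every title of the category, such as the category name itself, receives an IDF of 0 and is therefore ranked last.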

For each commodity category, all distinct tokens are determined from the token sequences of all corresponding commodity titles, and the word frequency of each token is counted. The tokens are sorted in descending order of word frequency, and the top N tokens, namely high-frequency words, are selected as selling point keywords of the commodity category. N can be set as required by a person skilled in the art.

Further, after all the selling point keywords of each commodity category are de-duplicated, each commodity category is associated with all of its selling point keywords to form association data, and all the association data are collected to form the selling point keyword set.
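The word frequency statistics, de-duplication and category association can be sketched together as below. The function names and the dictionary layout of the keyword set are assumptions made for illustration; the IDF-based keywords are taken as given.

```python
from collections import Counter
from typing import Dict, List


def top_frequency_tokens(title_tokens: List[List[str]], n: int) -> List[str]:
    """Top-N most frequent tokens across all titles of one category."""
    counts = Counter(token for tokens in title_tokens for token in tokens)
    return [token for token, _ in counts.most_common(n)]


def build_keyword_set(
    idf_keywords: Dict[str, List[str]],
    tf_keywords: Dict[str, List[str]],
) -> Dict[str, List[str]]:
    """Merge both keyword sources per category and remove duplicates."""
    keyword_set: Dict[str, List[str]] = {}
    for category in set(idf_keywords) | set(tf_keywords):
        merged = idf_keywords.get(category, []) + tf_keywords.get(category, [])
        keyword_set[category] = list(dict.fromkeys(merged))  # de-duplicate
    return keyword_set


titles = [["foot", "massager", "heat"], ["leg", "massager", "heat"]]
frequent = top_frequency_tokens(titles, n=2)
keyword_set = build_keyword_set(
    {"massager": ["heat", "kneading"]},   # hypothetical IDF-based keywords
    {"massager": ["massager", "heat"]},   # hypothetical frequency-based keywords
)
```

Recalling the keywords for a commodity to be advertised then amounts to a lookup such as `keyword_set["massager"]`.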

Step S1103, recalling selling point keywords from the selling point keyword set according to the commodity category of the commodity to be advertised, and determining, among the recalled selling point keywords, those matching the commodity description text and/or commodity picture of the commodity to be advertised as the expanded commodity description text of the commodity to be advertised.

According to the commodity category of the commodity to be advertised, all the selling point keywords associated with that commodity category are obtained from the selling point keyword set, thereby recalling the selling point keywords.

Further, in one embodiment, the vector distance between the vectorized representation of the picture features of the commodity picture of the commodity to be advertised and the vectorized representation of the text features of each recalled selling point keyword is used as the matching degree, and the selling point keywords whose matching degree meets a preset threshold are selected from all recalled selling point keywords as the expanded commodity description text of the commodity to be advertised.

In another embodiment, the vector distance between the vectorized representation of the text features of the commodity description text of the commodity to be advertised and the vectorized representation of the text features of each recalled selling point keyword is used as the matching degree, and the selling point keywords whose matching degree meets a preset threshold are selected from all recalled selling point keywords as the expanded commodity description text of the commodity to be advertised.

In still another embodiment, an image-text fusion vector is determined by fusing the features of the commodity description text and the commodity picture of the commodity to be advertised, the vector distance between the image-text fusion vector and the vectorized representation of the text features of each recalled selling point keyword is used as the matching degree, and the selling point keywords whose matching degree meets a preset threshold are selected from all recalled selling point keywords as the expanded commodity description text of the commodity to be advertised.

The vectorized representation of the text features of each recalled selling point keyword can be obtained with a BERT model pre-trained to convergence, taking each recalled selling point keyword as input to the model. Taking a single selling point keyword as an example, three vectors are encoded from it: a token embedding vector (Token Embedding) representing each token, a position embedding vector (Position Embedding) representing the position of each token, and a segment embedding vector (Segment Embedding) representing the distinction between sentences. The text encoder then extracts text features from these embedding vectors and finally obtains the vectorized representation of the corresponding text features.

The image-text fusion vector may be determined according to steps S1210-S1230, and the vectorized representation of the picture features of the commodity picture of the commodity to be advertised and the vectorized representation of the text features of the commodity description text may be obtained according to steps S1210-S1220, which are not described in detail again here.

In this embodiment, a selling point keyword set is constructed, and the selling point keywords in the set that match the commodity picture and/or the commodity description text of the commodity to be advertised are determined as the expanded commodity description text of the commodity to be advertised. On the one hand, a rich set of selling point keywords can be determined efficiently and accurately; on the other hand, selling point descriptions that are lacking in the commodity description text, and/or synonymous variants of them, can be supplemented.

Referring to fig. 5, in a further embodiment, step S1103, determining a selling point keyword matching with the commodity description text and the commodity picture of the commodity to be advertised from among the recalled plurality of selling point keywords, includes the following steps:

step S1104, extracting a picture feature vector of a commodity picture of the commodity to be advertised by using an image encoder, extracting a text feature vector of a commodity description text of the commodity to be advertised by using a text encoder, and fusing the picture feature vector and the text feature vector to obtain a picture-text fusion vector;

This step can be implemented according to steps S1210-S1230 and is not described in detail again.

Step S1105, extracting the text feature vector corresponding to each recalled selling point keyword by using a text encoder;

The text encoder can be any model in the NLP field suitable for extracting text features. For example, the BERT model is among the better neural network models to date for processing sequential text information and is suitable for the text feature extraction work in the present application; the ELECTRA model can achieve an effect equal or close to that of BERT with fewer parameters and is therefore also recommended. Since the training processes of the BERT and ELECTRA models are known to those skilled in the art, they are not described in detail.

In one embodiment, the text encoder is implemented with a BERT model trained to convergence in advance, and each recalled selling point keyword is taken as input to the model. Taking a single selling point keyword as an example, three vectors are encoded from it: a token embedding vector (Token Embedding) representing each token, a position embedding vector (Position Embedding) representing the position of each token, and a segment embedding vector (Segment Embedding) representing the distinction between sentences. The text encoder extracts text features from these embedding vectors and finally obtains the vectorized representation of the corresponding text features.

Step S1106, screening out the selling point keywords whose matching degree meets a preset condition according to the matching degree between the text feature vector of each recalled selling point keyword and the image-text fusion vector.

A vector distance algorithm is used to determine the vector distance between the text feature vector of each recalled selling point keyword and the image-text fusion vector as the matching degree. The selling point keywords are sorted in descending order of matching degree, and the top N selling point keywords are selected. N can be set as desired by a person skilled in the art.

In this embodiment, the matched selling point keywords are screened out according to the matching degree between the text feature vector of each recalled selling point keyword and the image-text fusion vector, which fuses the features of the commodity picture and the commodity description text of the commodity to be advertised.

Referring to fig. 6, in a further embodiment, before step S1100, acquiring the commodity description text and the commodity picture of the commodity to be advertised, the method includes the following steps:

Step S1000, acquiring a single training sample and its supervision label from a prepared training set, wherein the training sample is an image-text data pair formed by the commodity description text and the commodity picture of an advertised commodity, and the supervision label is a selling point marketing description in the advertisement document used when the commodity of the training sample was advertised;

To ensure that the selling point generation model, once trained to convergence, can generate selling point marketing descriptions that attract readers and accurately describe the selling points, in the recommended embodiment the advertised commodity is one whose advertising effect met expectations.

Step S1010, extracting a picture feature vector of a commodity picture in a training sample by using an image encoder in the selling point generation model, and extracting a text feature vector of a commodity description text in the training sample by using a text encoder;

Step S1020, a fusion device in a selling point generation model is applied to fuse the picture feature vector and the text feature vector to obtain a picture-text fusion vector;

Step S1030, decoding the image-text fusion vector by using the decoder in the selling point generation model, and obtaining a predicted selling point marketing description based on a greedy search strategy.

In one embodiment, the decoder is implemented with the Decoder module of a Transformer. The image-text fusion vector is input into the decoder; during decoding, the generation probability of each candidate word is calculated from the words already generated, the current word position and the encoded representation corresponding to the token, the word with the highest generation probability is selected at each step using a greedy search strategy, and the generated words are concatenated in order to obtain the predicted selling point marketing description.
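Greedy search can be illustrated with a toy stand-in for the decoder. Everything below is hypothetical: `toy_decoder` replaces the real Decoder module with a small lookup table of next-word probabilities, and `<eos>` is an assumed end-of-sequence marker.

```python
from typing import Callable, Dict, List


def greedy_decode(
    next_word_probs: Callable[[List[str]], Dict[str, float]],
    max_len: int = 20,
    eos: str = "<eos>",
) -> str:
    """At every step pick the word with the highest generation probability."""
    generated: List[str] = []
    for _ in range(max_len):
        probs = next_word_probs(generated)
        word = max(probs, key=probs.get)
        if word == eos:
            break
        generated.append(word)
    # Concatenate the generated words in order.
    return " ".join(generated)


# Toy next-word distributions, keyed by the words generated so far.
TABLE = {
    (): {"soothing": 0.6, "warm": 0.4},
    ("soothing",): {"heat": 0.7, "<eos>": 0.3},
    ("soothing", "heat"): {"<eos>": 0.9, "therapy": 0.1},
}


def toy_decoder(prefix: List[str]) -> Dict[str, float]:
    return TABLE[tuple(prefix)]


prediction = greedy_decode(toy_decoder)
```

Unlike the diversified strategies used at inference time, greedy search is deterministic, which gives a single predicted description to compare against the supervision label during training.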

Step S1040, determining a loss value of the predicted selling point marketing description by using the supervision label of the training sample, updating the weights of the selling point generation model when the loss value has not reached a preset threshold, and continuing to call other training samples for iterative training until the selling point generation model converges.

A preset cross-entropy loss function is invoked, which a person skilled in the art can set flexibly according to prior knowledge or experimental experience, and the cross-entropy loss value of the predicted selling point marketing description is calculated against the supervision label of the training sample. When the cross-entropy loss value reaches the preset threshold, the selling point generation model has been trained to a converged state and training can be terminated. When the cross-entropy loss value has not reached the preset threshold, the model has not yet converged; gradient updates are then applied to the model according to the cross-entropy loss value, the weight parameters of each part of the model are corrected through back propagation so that the model further approaches convergence, and other training samples continue to be called for iterative training until the model is trained to a converged state.
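The loss calculation and convergence check can be sketched as follows. This sketch assumes a token-level cross entropy averaged over the positions of the supervision label, and treats "reaching the threshold" as the loss falling to or below it; the gradient update itself is not shown. The function names and numbers are illustrative.

```python
import math
from typing import List


def cross_entropy(predicted: List[List[float]], target_ids: List[int]) -> float:
    """Mean negative log-probability assigned to the label's tokens."""
    eps = 1e-12  # guards against log(0)
    losses = [
        -math.log(max(step_probs[target], eps))
        for step_probs, target in zip(predicted, target_ids)
    ]
    return sum(losses) / len(losses)


def has_converged(loss: float, threshold: float) -> bool:
    """Assumed reading: converged once the loss is at or below the threshold."""
    return loss <= threshold


# Two decoding steps over a toy 2-word vocabulary; the label is [0, 1].
predicted = [[0.5, 0.5], [0.25, 0.75]]
loss = cross_entropy(predicted, [0, 1])
done = has_converged(loss, threshold=0.1)
```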

This embodiment discloses the training process of the selling point generation model. After training is completed, the model has learned to generate corresponding selling point marketing descriptions from the commodity description text and the commodity picture of a commodity, which helps ensure the accuracy and reliability of the generated selling point marketing descriptions.

Referring to fig. 7, in a further embodiment, before step S1000, acquiring a single training sample and its supervision label from the prepared training set, the method includes the following steps:

Step S2000, obtaining the delivery effect data of a plurality of advertised commodities, screening out the target advertised commodities whose delivery effect data meets preset conditions, and obtaining the advertisement documents of the target advertised commodities;

It can be understood that the delivery effect data of an advertised commodity can be obtained through its advertisement document. The delivery effect data includes but is not limited to click rate, conversion rate and ROI (return on investment); accordingly, the advertised commodities whose click rate, conversion rate and ROI all exceed the corresponding preset thresholds are screened out as the target advertised commodities. The click rate is the ratio of the number of clicks received by the advertisement document of an advertised commodity to the number of times it was displayed, and measures how attractive the advertisement is to the audience. The conversion rate is the ratio of the number of conversions (purchasing goods, filling in forms, subscribing to services and the like) that follow a click on the advertisement document to the number of clicks on the advertisement, and measures the effect of the advertisement and the achievement of its goals. The ROI is the ratio of the revenue generated to the advertising investment; the higher the ROI, the greater the return on the advertising investment. The preset thresholds for click rate, conversion rate and ROI can be set by a person skilled in the art according to the expected advertising effect.
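The screening of target advertised commodities by delivery effect data can be sketched as below. The `DeliveryStats` record, its field names and the threshold values are all hypothetical; ROI is computed here as revenue divided by advertising spend, in line with "the higher the ROI, the greater the return".

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DeliveryStats:
    commodity_id: str
    impressions: int
    clicks: int
    conversions: int
    ad_spend: float
    revenue: float

    @property
    def click_rate(self) -> float:
        return self.clicks / self.impressions if self.impressions else 0.0

    @property
    def conversion_rate(self) -> float:
        return self.conversions / self.clicks if self.clicks else 0.0

    @property
    def roi(self) -> float:
        return self.revenue / self.ad_spend if self.ad_spend else 0.0


def select_targets(
    stats: List[DeliveryStats], min_ctr: float, min_cvr: float, min_roi: float
) -> List[str]:
    """Keep commodities whose three indicators all exceed their thresholds."""
    return [
        s.commodity_id
        for s in stats
        if s.click_rate > min_ctr and s.conversion_rate > min_cvr and s.roi > min_roi
    ]


stats = [
    DeliveryStats("A", impressions=1000, clicks=50, conversions=5,
                  ad_spend=100.0, revenue=400.0),
    DeliveryStats("B", impressions=1000, clicks=10, conversions=0,
                  ad_spend=100.0, revenue=20.0),
]
targets = select_targets(stats, min_ctr=0.02, min_cvr=0.05, min_roi=2.0)
```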

Step S2010, determining whether each sentence in the advertisement document of the target advertisement commodity belongs to a marketing selling point by adopting a text classification model, and screening out sentences belonging to the marketing selling point as selling point marketing description;

The text classification model includes a text feature extraction layer and a classifier. The text feature extraction layer is adapted to extract the semantics of the input text as a vector representation and may be selected from a variety of known models, including but not limited to BERT, RNN, BiLSTM, BiGRU, RoBERTa, ALBERT, ERNIE, BERT-WWM and the like. The classifier is suited to a binary classification task and may be an MLP (multi-layer perceptron, a feed-forward neural network) or an FC (fully connected) layer. Since the training process of the text classification model is known to those skilled in the art, it is not described in detail.

The advertisement document of the target advertised commodity is split into sentences at periods, and each sentence is input into the text classification model. Taking a single sentence as an example, the text feature extraction layer in the text classification model extracts the text feature vector of the sentence, and the text feature vector is input into the classifier in the model and mapped to a preset pair of classes: a first class indicating that the sentence belongs to a marketing selling point and a second class indicating that it does not. The class with the highest classification probability is taken, which determines whether the sentence belongs to a marketing selling point.

Step S2020, forming an image-text data pair from the commodity description text and the commodity picture of the target advertisement commodity to serve as a training sample, and labeling the selling point marketing description in the advertisement document of the target advertisement commodity as the supervision label of the training sample.

Further, the training sample and the supervision label of each target advertisement commodity can be constructed correspondingly from the commodity description text, the commodity picture and the advertisement document of that target advertisement commodity.
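
A minimal sketch of constructing training samples and supervision labels is given below. The `TrainingSample` type and its field names are hypothetical, and producing one sample per selling point marketing description is an assumption of this sketch; the application does not state how several descriptions of the same commodity are arranged.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TrainingSample:
    """Hypothetical container: an image-text data pair plus its label."""
    description_text: str   # commodity description text
    picture_path: str       # location of the commodity picture
    label: str              # selling point marketing description


def build_samples(description_text: str, picture_path: str,
                  selling_points: List[str]) -> List[TrainingSample]:
    # Assumption: each screened selling point sentence labels its own
    # copy of the same image-text data pair.
    return [
        TrainingSample(description_text, picture_path, sp)
        for sp in selling_points
    ]
```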

In this embodiment, the training samples and their supervision labels are constructed from target advertisement commodities screened out because their advertisement delivery effect meets expectations. This lays a foundation for ensuring that, after the selling point generation model is trained to convergence, it can generate selling point marketing descriptions that attract readers and accurately describe the selling points.

Referring to fig. 8, an advertisement document generation apparatus provided in accordance with one of the purposes of the present application is a functional implementation of the advertisement document generation method of the present application. The apparatus includes a data acquisition module 1100, a description set construction module 1200, a target screening module 1300 and a document generation module 1400. The data acquisition module 1100 is configured to acquire a commodity description text and a commodity picture of a commodity to be advertised; the description set construction module 1200 is configured to generate a plurality of corresponding selling point marketing descriptions according to the commodity description text and the commodity picture by using a selling point generation model, and to construct a selling point marketing description set; the target screening module 1300 is configured to screen a plurality of target selling point marketing descriptions that meet a preset condition from the selling point marketing description set, where a selling point marketing description in the set meets the preset condition, and thus constitutes a target selling point marketing description, when its degree of difference from the other selling point marketing descriptions in the set and/or its degree of correlation with the commodity description text is large; the document generation module 1400 is configured to construct a prompt text from the plurality of target selling point marketing descriptions, and to call a large language model to generate a plurality of corresponding advertisement documents according to the prompt text.
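
By way of illustration only, the prompt construction performed by the document generation module could look like the sketch below. The wording of the prompt template is hypothetical, and `llm` is a hypothetical callable standing in for whatever large language model interface is used; neither is specified by the application.

```python
from typing import Callable, List


def build_prompt(selling_points: List[str], num_documents: int = 3) -> str:
    """Assemble a prompt text from the target selling point descriptions."""
    bullets = "\n".join(
        f"{i}. {sp}" for i, sp in enumerate(selling_points, start=1)
    )
    return (
        f"Write {num_documents} advertisement documents for a product.\n"
        f"Each one should cover the following selling points:\n{bullets}"
    )


def generate_documents(selling_points: List[str],
                       llm: Callable[[str], List[str]],
                       num_documents: int = 3) -> List[str]:
    # `llm` is any function mapping a prompt text to generated documents.
    return llm(build_prompt(selling_points, num_documents))
```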

In a further embodiment, the description set construction module 1200 includes: a model input sub-module, used for forming an image-text data pair from the commodity description text and the commodity picture to serve as the input of the selling point generation model; a first image-text feature extraction sub-module, used for extracting a picture feature vector of the commodity picture in the image-text data pair by applying an image encoder in the selling point generation model, and extracting a text feature vector of the commodity description text in the image-text data pair by applying a text encoder; an image-text feature fusion sub-module, used for fusing the picture feature vector and the text feature vector by applying a fusion device in the selling point generation model to obtain an image-text fusion vector; and a diversification generation sub-module, used for decoding the image-text fusion vector by applying a decoder in the selling point generation model and obtaining a plurality of selling point marketing descriptions based on a diversified generation strategy.
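
One possible diversified generation strategy is top-k sampling, sketched below. This passage does not name the strategy, so the choice of top-k sampling is an assumption of this example. `toy_next_probs` is a hypothetical stand-in for the decoder conditioned on the image-text fusion vector, returning a probability for each candidate next token.

```python
import random
from typing import Callable, Dict, List

NextProbs = Callable[[List[str]], Dict[str, float]]


def top_k_sample(next_probs: NextProbs, k: int, max_len: int,
                 rng: random.Random, eos: str = "<eos>") -> List[str]:
    """Sample one token sequence, restricted to the k likeliest tokens."""
    tokens: List[str] = []
    for _ in range(max_len):
        probs = next_probs(tokens)
        top = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
        words = [w for w, _ in top]
        weights = [p for _, p in top]
        token = rng.choices(words, weights=weights, k=1)[0]
        if token == eos:
            break
        tokens.append(token)
    return tokens


def generate_diverse(next_probs: NextProbs, n: int, k: int = 3,
                     max_len: int = 8, seed: int = 0) -> List[List[str]]:
    """Draw n sequences; sampling makes them differ from one another."""
    rng = random.Random(seed)
    return [top_k_sample(next_probs, k, max_len, rng) for _ in range(n)]


def toy_next_probs(prefix: List[str]) -> Dict[str, float]:
    # Hypothetical decoder stand-in: stops after three tokens.
    if len(prefix) >= 3:
        return {"<eos>": 1.0}
    return {"soft": 0.4, "warm": 0.3, "light": 0.2, "<eos>": 0.1}
```

Because each sequence is drawn at random, repeated calls yield different selling point marketing descriptions for the same input, which is the purpose of a diversified strategy.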

In a further embodiment, the target screening module 1300 includes: a second image-text feature extraction sub-module, used for extracting the text feature vector corresponding to each selling point marketing description in the selling point marketing description set by adopting a text feature extraction model, and extracting the text feature vector of the commodity description text; a relevance determining sub-module, used for determining, for each selling point marketing description, its degree of correlation with the other selling point marketing descriptions according to the corresponding text feature vectors, and its degree of correlation with the commodity description text according to the text feature vector of that description and the text feature vector of the commodity description text; and a description determining sub-module, used for determining a plurality of target selling point marketing descriptions by adopting a maximal marginal relevance (MMR) algorithm according to the degree of correlation between each selling point marketing description and the commodity description text.
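
The maximal marginal relevance selection can be sketched as follows. This is a minimal sketch assuming cosine similarity as the degree of correlation and a hypothetical trade-off weight `lam`; the feature vectors are assumed to have been produced already by the text feature extraction model.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def mmr_select(candidates: List[Sequence[float]], query: Sequence[float],
               top_n: int, lam: float = 0.5) -> List[int]:
    """Return indices of the selected candidates, in selection order.

    candidates: feature vectors of the selling point marketing descriptions
    query:      feature vector of the commodity description text
    lam:        weight of relevance against redundancy (assumed value)
    """
    selected: List[int] = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < top_n:
        def score(i: int) -> float:
            relevance = cosine(candidates[i], query)
            # similarity to the closest already selected description
            redundancy = max(
                (cosine(candidates[i], candidates[j]) for j in selected),
                default=0.0,
            )
            return lam * relevance - (1.0 - lam) * redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

With `lam` equal to 1.0 the selection reduces to ranking by relevance alone; lowering `lam` penalizes descriptions that resemble ones already chosen, so that the selected set is both relevant and mutually different.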

In a further embodiment, the data acquisition module 1100 further comprises: a word segmentation sequence sub-module, used for acquiring the commodity titles of all commodities corresponding to each commodity category in the commodity database, and segmenting each commodity title to obtain the word segmentation sequence corresponding to each commodity title; a word set construction sub-module, used for determining, for each commodity category, the segmented words whose inverse document frequency (IDF) meets a preset condition in the word segmentation sequence corresponding to each commodity title as selling point keywords, determining the segmented words whose term frequency over the word segmentation sequences of all commodity titles meets a preset condition as selling point keywords, and associating the selling point keywords with the corresponding category to construct a selling point keyword set; and a text expansion sub-module, used for recalling a plurality of corresponding selling point keywords from the selling point keyword set according to the commodity category of the commodity to be advertised, and determining, among the recalled selling point keywords, those matched with the commodity description text and/or the commodity picture of the commodity to be advertised as the expanded commodity description text of the commodity to be advertised.
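
The construction of the selling point keyword set for one commodity category can be illustrated as below. The sketch assumes that the titles have already been segmented into words and that the preset conditions are simple minimum thresholds (`idf_min`, `tf_min`); both assumptions are for illustration only.

```python
import math
from collections import Counter
from typing import Dict, List, Set


def selling_point_keywords(titles: List[List[str]],
                           idf_min: float, tf_min: int) -> Set[str]:
    """Keywords of one category from the segmented titles of its commodities."""
    n = len(titles)
    # number of titles containing each word
    doc_freq = Counter(tok for title in titles for tok in set(title))
    # total occurrences of each word over all titles
    term_freq = Counter(tok for title in titles for tok in title)
    idf = {tok: math.log(n / df) for tok, df in doc_freq.items()}
    by_idf = {tok for tok, value in idf.items() if value >= idf_min}
    by_tf = {tok for tok, count in term_freq.items() if count >= tf_min}
    return by_idf | by_tf


def build_keyword_set(titles_by_category: Dict[str, List[List[str]]],
                      idf_min: float, tf_min: int) -> Dict[str, Set[str]]:
    # associate the keywords with their commodity category
    return {
        category: selling_point_keywords(titles, idf_min, tf_min)
        for category, titles in titles_by_category.items()
    }
```

Recalling the keywords for a commodity to be advertised is then a lookup of its category in the returned dictionary.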

In a further embodiment, the text expansion sub-module includes: an image-text feature extraction and fusion unit, used for extracting the picture feature vector of the commodity picture of the commodity to be advertised by using the image encoder, extracting the text feature vector of the commodity description text of the commodity to be advertised by using the text encoder, and fusing the picture feature vector and the text feature vector to obtain an image-text fusion vector; a text feature extraction unit, used for extracting the text feature vector corresponding to each recalled selling point keyword by using the text encoder; and a keyword screening unit, used for screening out the selling point keywords whose matching degree meets a preset condition according to the matching degree between the text feature vector corresponding to each recalled selling point keyword and the image-text fusion vector.
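
The keyword screening unit can be sketched as follows, assuming that the matching degree is the cosine similarity between vectors of equal dimension and that the preset condition is a minimum value `min_match`. Both are assumptions of this example, and the vectors shown in the test data are invented.

```python
import math
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def screen_keywords(keyword_vectors: Dict[str, Sequence[float]],
                    fusion_vector: Sequence[float],
                    min_match: float) -> List[str]:
    """Keep recalled keywords that match the image-text fusion vector."""
    return [
        keyword for keyword, vector in keyword_vectors.items()
        if cosine(vector, fusion_vector) >= min_match
    ]
```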

In a further embodiment, the apparatus further includes the following modules, which operate before the data acquisition module 1100: a sample acquisition module, used for acquiring a single training sample and its supervision label from a prepared training set, wherein the training sample is an image-text data pair formed by the commodity description text and the commodity picture of an advertisement commodity, and the supervision label is the selling point marketing description in the advertisement document used when the commodity of the training sample was advertised; an image-text feature extraction module, used for extracting the picture feature vector of the commodity picture in the training sample by applying the image encoder in the selling point generation model, and extracting the text feature vector of the commodity description text in the training sample by applying the text encoder; an image-text feature fusion module, used for fusing the picture feature vector and the text feature vector by applying the fusion device in the selling point generation model to obtain an image-text fusion vector, and for decoding the image-text fusion vector by applying the decoder in the selling point generation model to obtain a predicted selling point marketing description based on a greedy search strategy; and an iterative training module, used for determining a loss value of the predicted selling point marketing description by adopting the supervision label of the training sample, updating the weights of the selling point generation model when the loss value does not reach a preset threshold, and continuing to call other training samples for iterative training until the selling point generation model converges.
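
Greedy search decoding and the loss value can be illustrated with the sketch below. The application does not name the loss function; the average negative log-likelihood (cross-entropy) of the supervision label is assumed here. `toy_next_probs` is a hypothetical stand-in for the decoder of the selling point generation model, and the weight update itself is omitted.

```python
import math
from typing import Callable, Dict, List

NextProbs = Callable[[List[str]], Dict[str, float]]


def greedy_decode(next_probs: NextProbs, max_len: int = 8,
                  eos: str = "<eos>") -> List[str]:
    """At each step take the single most probable next token."""
    tokens: List[str] = []
    for _ in range(max_len):
        probs = next_probs(tokens)
        token = max(probs, key=probs.get)
        if token == eos:
            break
        tokens.append(token)
    return tokens


def sequence_loss(next_probs: NextProbs, label_tokens: List[str],
                  eos: str = "<eos>") -> float:
    """Average negative log-likelihood of the label tokens (assumed loss)."""
    targets = list(label_tokens) + [eos]
    total = 0.0
    for i, target in enumerate(targets):
        # probability the model assigns to the correct next token
        p = next_probs(targets[:i]).get(target, 1e-12)
        total -= math.log(p)
    return total / len(targets)


def toy_next_probs(prefix: List[str]) -> Dict[str, float]:
    # Hypothetical decoder stand-in with fixed per-position distributions.
    table = {
        0: {"soft": 0.7, "warm": 0.2, "<eos>": 0.1},
        1: {"lining": 0.6, "warm": 0.3, "<eos>": 0.1},
    }
    return table.get(len(prefix), {"<eos>": 1.0})
```

In a training loop, the loss value would be compared with the preset threshold, and the model weights updated by an optimizer whenever the threshold has not been reached.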

In a further embodiment, the apparatus further includes the following modules, which operate before the sample acquisition module: the data acquisition module 1100, configured to acquire delivery effect data of a plurality of advertisement commodities, screen out the target advertisement commodities whose delivery effect data meets a preset condition, and acquire the advertisement documents of the target advertisement commodities; a description screening module, used for determining whether each sentence in the advertisement document of a target advertisement commodity belongs to a marketing selling point by adopting a text classification model, and screening out the sentences belonging to a marketing selling point as selling point marketing descriptions; and a sample construction and labeling module, used for forming an image-text data pair from the commodity description text and the commodity picture of the target advertisement commodity to serve as a training sample, and labeling the selling point marketing description in the advertisement document of the target advertisement commodity as the supervision label of the training sample.

In order to solve the above technical problems, an embodiment of the application also provides a computer device, whose internal structure is shown schematically in fig. 9. The computer device includes a processor, a computer readable storage medium, a memory, and a network interface connected by a system bus. The computer readable storage medium of the computer device stores an operating system, a database and computer readable instructions; the database can store a control information sequence, and the computer readable instructions, when executed by the processor, can cause the processor to implement an advertisement document generation method. The processor of the computer device is used to provide computing and control capabilities, supporting the operation of the entire computer device. The memory of the computer device may have stored therein computer readable instructions that, when executed by the processor, cause the processor to perform the advertisement document generation method of the present application. The network interface of the computer device is used for connecting to and communicating with a terminal. It will be appreciated by persons skilled in the art that the structure shown in fig. 9 is merely a block diagram of the part of the structure relevant to the present solution and does not limit the computer device to which the present solution is applied; a particular computer device may include more or fewer components than shown, may combine some of the components, or may have a different arrangement of components.

The processor in this embodiment is configured to execute the specific functions of each module and its sub-modules in fig. 8, and the memory stores the program codes and various data required for executing the above modules or sub-modules. The network interface is used for data transmission with the user terminal or the server. The memory in this embodiment stores the program codes and data required for executing all modules and sub-modules in the advertisement document generation device of the present application, and the server can call these program codes and data to execute the functions of all the sub-modules.

The present application also provides a storage medium storing computer readable instructions that, when executed by one or more processors, cause the one or more processors to perform the steps of the advertisement document generation method of any of the embodiments of the present application.

Those skilled in the art will appreciate that all or part of the processes implementing the methods of the above embodiments of the present application may be implemented by a computer program for instructing relevant hardware, where the computer program may be stored on a computer readable storage medium, where the program, when executed, may include processes implementing the embodiments of the methods described above. The storage medium may be a computer readable storage medium such as a magnetic disk, an optical disk, a Read-Only Memory (ROM), or a random access Memory (Random Access Memory, RAM).

In summary, the application can accurately describe the selling points of the commodities in the advertisement document, and is expected to achieve the expected advertising effect.

Those of skill in the art will appreciate that the various operations, methods, steps in the flow, acts, schemes, and alternatives discussed in the present application may be alternated, altered, combined, or eliminated. Further, other steps, means, or steps in a process having various operations, methods, or procedures discussed herein may be alternated, altered, rearranged, disassembled, combined, or eliminated. Further, steps, measures, schemes in the prior art with various operations, methods, flows disclosed in the present application may also be alternated, altered, rearranged, decomposed, combined, or deleted.

The foregoing is only a partial embodiment of the present application, and it should be noted that it will be apparent to those skilled in the art that modifications and adaptations can be made without departing from the principles of the present application, and such modifications and adaptations are intended to be comprehended within the scope of the present application.

Claims

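1. An advertisement document generation method, characterized by comprising the following steps:

acquiring a commodity description text and a commodity picture of a commodity to be advertised;

generating a plurality of corresponding selling point marketing descriptions according to the commodity description text and the commodity picture by adopting a selling point generation model, and constructing a selling point marketing description set;

screening a plurality of target selling point marketing descriptions meeting a preset condition from the selling point marketing description set, wherein a selling point marketing description in the set meets the preset condition, and thus constitutes a target selling point marketing description, when its degree of difference from the other selling point marketing descriptions in the set and/or its degree of correlation with the commodity description text is large;

and constructing a prompt text from the plurality of target selling point marketing descriptions, and calling a large language model to generate a plurality of corresponding advertisement documents according to the prompt text.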

2. The advertisement document generation method according to claim 1, wherein generating a plurality of corresponding selling point marketing descriptions according to the commodity description text and the commodity picture by adopting a selling point generation model comprises the following steps:

forming an image-text data pair from the commodity description text and the commodity picture to serve as the input of the selling point generation model;

extracting a picture feature vector of the commodity picture in the image-text data pair by applying an image encoder in the selling point generation model, and extracting a text feature vector of the commodity description text in the image-text data pair by applying a text encoder;

fusing the picture feature vector and the text feature vector by applying a fusion device in the selling point generation model to obtain an image-text fusion vector;

and decoding the image-text fusion vector by applying a decoder in the selling point generation model, and obtaining a plurality of selling point marketing descriptions based on a diversified generation strategy.

3. The advertisement document generation method according to claim 1, wherein screening out a plurality of target selling point marketing descriptions meeting a preset condition from the selling point marketing description set comprises the following steps:

extracting the text feature vector corresponding to each selling point marketing description in the selling point marketing description set by adopting a text feature extraction model, and extracting the text feature vector of the commodity description text;

determining, for each selling point marketing description, its degree of correlation with the other selling point marketing descriptions according to the corresponding text feature vectors, and its degree of correlation with the commodity description text according to the text feature vector of that description and the text feature vector of the commodity description text;

and determining a plurality of target selling point marketing descriptions by adopting a maximal marginal relevance (MMR) algorithm according to the degree of correlation between each selling point marketing description and the commodity description text.

4. The advertisement document generation method according to claim 1, wherein after acquiring the commodity description text and the commodity picture of the commodity to be advertised, the method further comprises the following steps:

acquiring the commodity titles of all commodities corresponding to each commodity category in a commodity database, and segmenting each commodity title to obtain the word segmentation sequence corresponding to each commodity title;

determining, for each commodity category, the segmented words whose inverse document frequency (IDF) meets a preset condition in the word segmentation sequence corresponding to each commodity title as selling point keywords, determining the segmented words whose term frequency over the word segmentation sequences of all commodity titles meets a preset condition as selling point keywords, and associating the selling point keywords with the corresponding category to construct a selling point keyword set;

and recalling a plurality of corresponding selling point keywords from the selling point keyword set according to the commodity category of the commodity to be advertised, and determining, among the recalled selling point keywords, those matched with the commodity description text and/or the commodity picture of the commodity to be advertised as the expanded commodity description text of the commodity to be advertised.

5. The advertisement document generation method according to claim 4, wherein determining a selling point keyword matching a commodity description text and a commodity picture of a commodity to be advertised out of a plurality of recalled selling point keywords, comprises the steps of:

extracting a picture feature vector of the commodity picture of the commodity to be advertised by using an image encoder, extracting a text feature vector of the commodity description text of the commodity to be advertised by using a text encoder, and fusing the picture feature vector and the text feature vector to obtain an image-text fusion vector;

extracting the text feature vector corresponding to each recalled selling point keyword by using the text encoder;

and screening out the selling point keywords with the matching degree meeting the preset conditions according to the matching degree between the text feature vector corresponding to each recalled selling point keyword and the image-text fusion vector.

6. The advertisement document generation method according to claim 1, wherein before acquiring the commodity description text and the commodity picture of the commodity to be advertised, the method further comprises the following steps:

acquiring a single training sample and its supervision label from a prepared training set, wherein the training sample is an image-text data pair formed by the commodity description text and the commodity picture of an advertised commodity, and the supervision label is the selling point marketing description in the advertisement document used when the commodity of the training sample was advertised;

extracting a picture feature vector of the commodity picture in the training sample by applying an image encoder in the selling point generation model, and extracting a text feature vector of the commodity description text in the training sample by applying a text encoder;

fusing the picture feature vector and the text feature vector by applying a fusion device in the selling point generation model to obtain an image-text fusion vector;

decoding the image-text fusion vector by applying a decoder in the selling point generation model, and obtaining a predicted selling point marketing description based on a greedy search strategy;

and determining a loss value of the predicted selling point marketing description by adopting the supervision label of the training sample, updating the weights of the selling point generation model when the loss value does not reach a preset threshold, and continuing to call other training samples for iterative training until the selling point generation model converges.

7. The advertisement document generation method according to claim 6, wherein before acquiring a single training sample and its supervision label from the prepared training set, the method further comprises the following steps:

acquiring delivery effect data of a plurality of advertisement commodities, screening out the target advertisement commodities whose delivery effect data meets a preset condition, and acquiring the advertisement documents of the target advertisement commodities;

determining whether each sentence in the advertisement document of a target advertisement commodity belongs to a marketing selling point by adopting a text classification model, and screening out the sentences belonging to a marketing selling point as selling point marketing descriptions;

and forming an image-text data pair from the commodity description text and the commodity picture of the target advertisement commodity to serve as a training sample, and labeling the selling point marketing description in the advertisement document of the target advertisement commodity as the supervision label of the training sample.

8. An advertising document generation device, comprising:

the data acquisition module is used for acquiring commodity description texts and commodity pictures of the commodities to be advertised;

the description set construction module is used for generating a plurality of corresponding selling point marketing descriptions according to the commodity description text and the commodity picture by adopting a selling point generation model, and constructing a selling point marketing description set;

the target screening module is used for screening a plurality of target selling point marketing descriptions meeting a preset condition from the selling point marketing description set, wherein a selling point marketing description in the set meets the preset condition, and thus constitutes a target selling point marketing description, when its degree of difference from the other selling point marketing descriptions in the set and/or its degree of correlation with the commodity description text is large;

and the document generation module is used for constructing a prompt text from the plurality of target selling point marketing descriptions, and calling a large language model to generate a plurality of corresponding advertisement documents according to the prompt text.

9. A computer device comprising a central processor and a memory, characterized in that the central processor is arranged to invoke a computer program stored in the memory for performing the steps of the method according to any of claims 1 to 7.

10. A computer-readable storage medium, characterized in that it stores in the form of computer-readable instructions a computer program implemented according to the method of any one of claims 1 to 7, which, when invoked by a computer, performs the steps comprised by the corresponding method.