CN116719924A - Advertisement keyword generation method, device, equipment and medium thereof - Google Patents

Publication number
CN116719924A
Authority
CN
China
Prior art keywords
advertisement
keywords
candidate
core product
keyword
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310779038.2A
Other languages
Chinese (zh)
Inventor
罗东锋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Shangyun Network Technology Co ltd
Original Assignee
Guangzhou Shangyun Network Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Shangyun Network Technology Co ltd
Priority to CN202310779038.2A
Publication of CN116719924A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/374 Thesaurus
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0241 Advertisements
    • G06Q30/0242 Determining effectiveness of advertisements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0241 Advertisements
    • G06Q30/0277 Online advertisement
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The application relates to an advertisement keyword generation method, and a corresponding device, equipment, and medium, in the technical field of electronic commerce. The method comprises the following steps: acquiring text information of a commodity to be advertised, and recalling, from a preset core product word library, a core product word set matched with the text information; obtaining derivative keywords related to the core product words in the core product word set, and adding the derivative keywords to the core product word set as expanded core product words; forming candidate advertisement keywords from core product words in the core product word set and intention keywords in a preset search intention word library, and aggregating the candidate advertisement keywords into a candidate keyword set, wherein the intention keywords are feature description information that includes the role of supplying the commodity to be advertised; and determining a recommendation score for each candidate advertisement keyword in the candidate keyword set, and screening out the advertisement keywords whose recommendation scores meet a preset condition. In this way, advertisement keywords that achieve the expected advertising effect can be generated.

Description

Advertisement keyword generation method, device, equipment and medium thereof
Technical Field
The present application relates to the field of electronic commerce technologies, and in particular, to a method for generating advertisement keywords, and a corresponding apparatus, computer device, and computer readable storage medium thereof.
Background
B2B cross-border e-commerce refers to business-to-business e-commerce transactions conducted across borders worldwide. In marketing and promotion, supplier enterprises in B2B cross-border e-commerce promote commodity transactions by exposing the commodities they supply to purchasing enterprises through advertisement delivery.
In the prior art, the advertisement keywords used for advertising are manually selected from the title of the commodity to be advertised. Such keywords merely describe the commodity and cannot accurately match the search intention of purchasing enterprises in B2B cross-border e-commerce. As a result, searches by purchasing enterprises do not hit the advertisement keywords, the advertisements delivered by suppliers are not exposed, and the expected effect of advertisement delivery cannot be achieved.
In view of these shortcomings of the conventional technology, the inventor, after long-term research in the related field, proposes a new approach to solving this problem in the field of electronic commerce.
Disclosure of Invention
It is a primary object of the present application to solve at least one of the above problems and provide an advertisement keyword generating method and corresponding apparatus, computer device, and computer-readable storage medium.
To achieve these objects, the application adopts the following technical solutions:
In one aspect, the application provides an advertisement keyword generation method adapted to one of the objects of the application, comprising the following steps:
acquiring text information of a commodity to be advertised, and recalling, from a preset core product word library, a core product word set matched with the text information, wherein the text information comprises description information of the commodity;
obtaining derivative keywords related to the core product words in the core product word set, and adding the derivative keywords to the core product word set as expanded core product words;
forming candidate advertisement keywords from core product words in the core product word set and intention keywords in a preset search intention word library, and aggregating the candidate advertisement keywords into a candidate keyword set, wherein the intention keywords are feature description information that includes the role of supplying the commodity to be advertised;
and determining a recommendation score for each candidate advertisement keyword in the candidate keyword set, and screening out the advertisement keywords whose recommendation scores meet a preset condition.
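The four steps above can be sketched end to end as follows. This is a minimal illustration under stated assumptions, not the application's implementation: the function names, the sample word libraries, the substring-based recall, and the toy scoring function are all hypothetical and chosen only to show how the sets flow from one step to the next.

```python
from itertools import product


def recall_core_words(text, core_lexicon):
    """Step 1 (simplified): recall lexicon entries that occur literally in the text."""
    lowered = text.lower()
    return {word for word in core_lexicon if word in lowered}


def expand_core_words(core_words, derived_index):
    """Step 2: add the derivative keywords known for each recalled core product word."""
    expanded = set(core_words)
    for word in core_words:
        expanded.update(derived_index.get(word, ()))
    return expanded


def build_candidates(core_words, intent_words):
    """Step 3: pair every core product word with every intention keyword."""
    return {f"{core} {intent}" for core, intent in product(core_words, intent_words)}


def screen(candidates, score_fn, min_score):
    """Step 4: keep candidates whose recommendation score meets the preset condition."""
    return sorted(k for k in candidates if score_fn(k) >= min_score)


# Hypothetical sample data, for illustration only.
core_lexicon = {"led strip", "power bank"}
derived_index = {"led strip": ["led tape light"]}
intent_words = {"manufacturer", "wholesale"}

core = recall_core_words("Waterproof LED Strip 5m RGB", core_lexicon)
core = expand_core_words(core, derived_index)
candidates = build_candidates(core, intent_words)
keywords = screen(candidates, score_fn=lambda k: 1.0 if "led" in k else 0.0, min_score=0.5)
print(keywords)
```

In this sketch the intention keywords are supplier-role terms, which is one reading of "feature description information that includes the role of supplying the commodity"; a real system would replace the toy scoring function with the recommendation score described later.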
In a further embodiment, after obtaining derivative keywords related to the core product words in the core product word set and adding the derivative keywords to the core product word set as expanded core product words, the method comprises the following steps:
using a named entity recognition model to identify, among the expanded core product words in the core product word set, those that are brand words as bid brand words, and filtering the bid brand words out of the core product word set;
and matching the core product words in the core product word set against irrelevant words in a preset irrelevant word library, and filtering out of the core product word set the core product words that match the irrelevant words.
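The two filtering steps can be sketched as below. This is an illustrative sketch only: `toy_is_brand` is a hypothetical stand-in for the named entity recognition model (a real system would call a trained model), `KNOWN_BRANDS` and the sample words are invented, and for brevity the sketch applies the brand check to every word rather than only to the expanded ones.

```python
def filter_core_words(core_words, is_brand, irrelevant_lexicon):
    """Drop words flagged as brand words, then drop words found in the irrelevant word library."""
    irrelevant = {word.lower() for word in irrelevant_lexicon}
    kept = []
    for word in core_words:
        if is_brand(word):
            continue  # brand word recognized by the (stand-in) NER model
        if word.lower() in irrelevant:
            continue  # exact match against the irrelevant word library
        kept.append(word)
    return kept


# Hypothetical stand-in for a named entity recognition model.
KNOWN_BRANDS = {"acme"}


def toy_is_brand(word):
    return any(token in KNOWN_BRANDS for token in word.lower().split())


filtered = filter_core_words(
    ["led strip", "Acme led strip", "free shipping"],
    toy_is_brand,
    {"Free Shipping"},
)
print(filtered)
```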
In a further embodiment, determining a recommendation score corresponding to each candidate advertisement keyword in the candidate keyword set includes the following steps:
obtaining a search volume score and a competition score corresponding to each candidate advertisement keyword in the candidate keyword set;
concatenating all core product words in the core product word set to form a core text, and determining a relevance score between the core text and the core product word in each candidate advertisement keyword in the candidate keyword set;
and determining the recommendation score corresponding to each candidate advertisement keyword in the candidate keyword set according to the search volume score, the competition score, and the relevance score.
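The text does not state here how the three scores are combined. One possible combination, shown purely as an assumption, is a weighted sum in which higher competition lowers the recommendation score; the weights, the normalization to [0, 1], and all names below are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CandidateScores:
    keyword: str
    search_volume: float  # assumed normalized to [0, 1]; higher means searched more often
    competition: float    # assumed normalized to [0, 1]; higher means more competitive
    relevance: float      # assumed normalized to [0, 1]


def recommendation_score(s, w_volume=0.4, w_competition=0.2, w_relevance=0.4):
    """Weighted sum (an assumption, not the source's formula); competition counts against a keyword."""
    return (
        w_volume * s.search_volume
        + w_competition * (1.0 - s.competition)
        + w_relevance * s.relevance
    )


def screen_keywords(scored, threshold):
    """Return keywords whose score meets the threshold, best first."""
    ranked = sorted(scored, key=recommendation_score, reverse=True)
    return [s.keyword for s in ranked if recommendation_score(s) >= threshold]


scored = [
    CandidateScores("led strip manufacturer", 0.8, 0.5, 0.9),
    CandidateScores("led tape light wholesale", 0.2, 0.9, 0.3),
]
selected = screen_keywords(scored, threshold=0.5)
print(selected)
```

Here the "preset condition" is modeled as a minimum score; a top-N cutoff would be an equally plausible reading.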
In a further embodiment, after summarizing candidate advertisement keywords to form a candidate keyword set, the method comprises the following steps:
obtaining derivative keywords related to the candidate advertisement keywords according to the candidate advertisement keywords in the candidate keyword set, and adding the derivative keywords serving as expanded candidate advertisement keywords into the candidate keyword set;
identifying candidate advertisement keywords which are expanded in the candidate keyword set and belong to brand words by adopting a named entity identification model as bid brand words, and filtering the bid brand words in the candidate keyword set;
and matching the candidate advertisement keywords in the candidate keyword set with irrelevant words in a preset irrelevant word library, and filtering candidate advertisement keywords matched with the irrelevant words in the candidate keyword set.
In a further embodiment, after screening out advertisement keywords whose recommendation scores meet a preset condition, the method includes the following steps:
clustering the advertisement keywords by adopting a clustering algorithm to determine the class clusters to which the advertisement keywords belong;
and collecting the advertisement keywords belonging to the same class cluster to form keyword subsets, and aggregating the keyword subsets to form a keyword recommendation set.
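The clustering algorithm is not named at this point in the text. As a stand-in, the sketch below uses a simple greedy, single-pass grouping by token-level Jaccard similarity; the similarity measure, the threshold, and the sample keywords are assumptions made only to illustrate how clusters become keyword subsets of a recommendation set.

```python
def jaccard(a, b):
    """Token-level Jaccard similarity between two keyword strings."""
    tokens_a, tokens_b = set(a.split()), set(b.split())
    union = tokens_a | tokens_b
    return len(tokens_a & tokens_b) / len(union) if union else 0.0


def cluster_keywords(keywords, min_similarity=0.3):
    """Greedy grouping: join the first cluster whose seed keyword is similar enough."""
    clusters = []
    for keyword in keywords:
        for cluster in clusters:
            if jaccard(keyword, cluster[0]) >= min_similarity:
                cluster.append(keyword)
                break
        else:
            clusters.append([keyword])  # no similar cluster found: start a new one
    return clusters


# Each inner list is one keyword subset; the outer list is the recommendation set.
recommendation_set = cluster_keywords([
    "led strip manufacturer",
    "led strip wholesale",
    "power bank supplier",
    "power bank factory",
])
print(recommendation_set)
```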
In a further embodiment, after screening out advertisement keywords whose recommendation scores meet a preset condition, the method includes the following steps:
generating advertisement texts according to the advertisement keywords by adopting a preset text generation model;
and scoring the advertisement text to obtain the quality score of the advertisement text.
In a further embodiment, before generating the advertisement text according to the advertisement keywords by using the preset text generation model, the method includes the following steps:
acquiring a single training sample and its supervision label from a prepared training set, wherein the training sample is an advertisement keyword corresponding to an advertised commodity, and the supervision label is the advertisement text used when the commodity of the training sample was advertised;
inputting the training sample into the text generation model, extracting the corresponding deep semantic information, and generating a predicted advertisement text based on the deep semantic information;
and determining a loss value of the predicted advertisement text by using the supervision label of the training sample, updating the weights of the text generation model when the loss value does not reach a preset threshold, and continuing to call other training samples for iterative training until the text generation model converges.
In another aspect, the application provides an advertisement keyword generating device adapted to one of the objects of the application, comprising a word set recall module, a word set expansion module, a candidate construction module, and a keyword screening module. The word set recall module is used for acquiring text information of a commodity to be advertised and recalling, from a preset core product word library, a core product word set matched with the text information, wherein the text information comprises description information of the commodity. The word set expansion module is used for obtaining derivative keywords related to the core product words in the core product word set, and adding the derivative keywords to the core product word set as expanded core product words. The candidate construction module is used for forming candidate advertisement keywords from core product words in the core product word set and intention keywords in a preset search intention word library, and aggregating the candidate advertisement keywords into a candidate keyword set, wherein the intention keywords are feature description information that includes the role of supplying the commodity to be advertised. The keyword screening module is used for determining a recommendation score for each candidate advertisement keyword in the candidate keyword set and screening out the advertisement keywords whose recommendation scores meet a preset condition.
In a further embodiment, the word set expansion module includes: a first bid brand word filtering sub-module, used for identifying, with a named entity recognition model, the expanded core product words in the core product word set that are brand words as bid brand words, and filtering the bid brand words out of the core product word set; and a first irrelevant word filtering sub-module, used for matching the core product words in the core product word set against irrelevant words in a preset irrelevant word library and filtering out the core product words that match the irrelevant words.
In a further embodiment, the keyword screening module includes: a score acquisition sub-module, used for acquiring the search volume score and the competition score corresponding to each candidate advertisement keyword in the candidate keyword set; a relevance score determining sub-module, used for concatenating all core product words in the core product word set to form a core text, and determining the relevance score between the core text and the core product word in each candidate advertisement keyword in the candidate keyword set; and a score determining sub-module, used for determining the recommendation score corresponding to each candidate advertisement keyword in the candidate keyword set according to the search volume score, the competition score, and the relevance score.
In a further embodiment, the device further includes, following the candidate construction module: a word set expansion sub-module, used for obtaining derivative keywords related to the candidate advertisement keywords in the candidate keyword set, and adding the derivative keywords to the candidate keyword set as expanded candidate advertisement keywords; a first bid brand word filtering sub-module, used for identifying, with a named entity recognition model, the expanded candidate advertisement keywords in the candidate keyword set that are brand words as bid brand words, and filtering the bid brand words out of the candidate keyword set; and a second irrelevant word filtering sub-module, used for matching the candidate advertisement keywords in the candidate keyword set against irrelevant words in a preset irrelevant word library and filtering out the candidate advertisement keywords that match the irrelevant words.
In a further embodiment, the keyword screening module further includes: the clustering sub-module is used for clustering the advertisement keywords by adopting a clustering algorithm to determine the class clusters to which the advertisement keywords belong; and the collection construction sub-module is used for collecting advertisement keywords belonging to the same class of clusters to form keyword subsets, and summarizing all the keyword subsets to construct a keyword recommendation collection.
In a further embodiment, the keyword screening module further includes: the text generation sub-module is used for generating advertisement text according to the advertisement keywords by adopting a preset text generation model; and the text scoring module is used for scoring the advertisement text and obtaining the quality score of the advertisement text.
In a further embodiment, the device further includes, preceding the text generation sub-module: a sample acquisition sub-module, used for acquiring a single training sample and its supervision label from a prepared training set, wherein the training sample is an advertisement keyword corresponding to an advertised commodity, and the supervision label is the advertisement text used when the commodity of the training sample was advertised; a prediction generation sub-module, used for inputting the training sample into the text generation model, extracting the corresponding deep semantic information, and generating a predicted advertisement text based on the deep semantic information; and an iterative training sub-module, used for determining the loss value of the predicted advertisement text by using the supervision label of the training sample, updating the weights of the text generation model when the loss value does not reach a preset threshold, and continuing to call other training samples for iterative training until the text generation model converges.
In yet another aspect, a computer device adapted to one of the objects of the present application comprises a central processor and a memory, the central processor being configured to invoke and run a computer program stored in the memory to perform the steps of the advertisement keyword generation method of the present application.
In yet another aspect, a computer-readable storage medium adapted to another object of the present application stores, in the form of computer-readable instructions, a computer program implementing the advertisement keyword generation method; when invoked and run by a computer, the computer program performs the steps included in the method.
The technical scheme of the application has various advantages, including but not limited to the following aspects:
According to the application, a core product word set matched with the text information of the commodity to be advertised is recalled from a preset core product word library; related derivative keywords are obtained according to the core product words in the set and added to the set as expanded core product words; candidate advertisement keywords are formed from the core product words in the set and the intention keywords in a preset search intention word library, and are aggregated into a candidate keyword set, the intention keywords being feature description information that includes the role of supplying the commodity to be advertised; and a recommendation score is determined for each candidate advertisement keyword in the candidate keyword set, and the advertisement keywords whose recommendation scores meet a preset condition are screened out. Because the advertisement keywords contain intention keywords, they can be hit by the searches of purchasing enterprises, so that the advertisements delivered by supplier enterprises are exposed and the expected effect of advertisement delivery is achieved.
Drawings
The foregoing and/or additional aspects and advantages of the application will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings, in which:
FIG. 1 is a flow chart of an exemplary embodiment of an advertising keyword generation method of the present application;
FIG. 2 is a flow chart of filtering irrelevant words and bid brand words in the core product word set in an embodiment of the application;
FIG. 3 is a flow chart of determining the recommendation score corresponding to each candidate advertisement keyword in the candidate keyword set in an embodiment of the application;
FIG. 4 is a flow chart of expanding the candidate keyword set and filtering irrelevant words and bid brand words in the candidate keyword set in an embodiment of the application;
FIG. 5 is a schematic flow chart of aggregating similar advertisement keywords to construct a keyword recommendation set in an embodiment of the present application;
FIG. 6 is a schematic flow chart of generating advertisement text according to advertisement keywords and scoring the advertisement text in an embodiment of the present application;
FIG. 7 is a flow chart of training a text generation model in an embodiment of the application;
FIG. 8 is a schematic block diagram of an advertising keyword generating apparatus of the present application;
FIG. 9 is a schematic structural diagram of a computer device used in the present application.
Detailed Description
Embodiments of the present application are described in detail below, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to like or similar elements or elements having like or similar functions throughout. The embodiments described below by referring to the drawings are illustrative only and are not to be construed as limiting the application.
As used herein, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless expressly stated otherwise, as understood by those skilled in the art. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element, or intervening elements may also be present. Further, "connected" or "coupled" as used herein may include wirelessly connected or wirelessly coupled. The term "and/or" as used herein includes any and all combinations of one or more of the associated listed items.
It will be understood by those skilled in the art that all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs unless defined otherwise. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the prior art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
As used herein, "client" and "terminal device" are understood by those skilled in the art to include both devices that have only a wireless signal receiver without transmitting capability and devices that have receiving and transmitting hardware capable of two-way communication over a two-way communication link. Such a device may include: a cellular or other communication device, such as a personal computer or tablet, with a single-line display, with a multi-line display, or without a multi-line display; a PCS (Personal Communications Service) device that may combine voice, data processing, facsimile, and/or data communication capabilities; a PDA (Personal Digital Assistant) that may include a radio frequency receiver, pager, internet/intranet access, web browser, notepad, calendar, and/or GPS (Global Positioning System) receiver; and a conventional laptop and/or palmtop computer or other appliance that has and/or includes a radio frequency receiver. As used herein, a "client" or "terminal device" may be portable, transportable, installed in a vehicle (aeronautical, maritime, and/or land-based), or adapted and/or configured to operate locally and/or in a distributed fashion at any other location on earth and/or in space. As used herein, a "client" or "terminal device" may also be a communication terminal, an internet terminal, or a music/video playing terminal, for example a PDA, a MID (Mobile Internet Device), and/or a mobile phone with a music/video playing function, or may be a device such as a smart TV or a set-top box.
The hardware referred to in this application, such as servers, clients, and service nodes, is essentially an electronic device with the capabilities of a personal computer: a hardware device having the necessary components described by the von Neumann principle, such as a central processing unit (including an arithmetic unit and a controller), a memory, an input device, and an output device. A computer program is stored in the memory; the central processing unit loads the program from the memory, executes its instructions, and interacts with the input and output devices to perform specific functions.
It should be noted that the concept of a "server" in the present application applies equally to server clusters. According to network deployment principles understood by those skilled in the art, the servers are logically partitioned; physically, they may be separate from each other but callable through interfaces, or integrated into one physical computer or group of computers. Those skilled in the art will appreciate this variation, which should not be construed as limiting the network deployment of the present application.
Unless expressly specified otherwise, one or more technical features of the present application may be deployed on a server, with the client accessing them by remotely invoking an online service interface provided by the server, or may be deployed and run directly on the client.
Unless expressly specified otherwise, the neural network models cited or possibly cited in the application may be deployed on a remote server and invoked remotely from a client, or may be deployed on a client with sufficient device capability and invoked directly. In some embodiments, when a neural network model runs on the client, its capability can be obtained through transfer learning, so as to reduce the demand on the client's hardware resources and avoid occupying them excessively.
Unless expressly specified otherwise, the various data involved in the present application may be stored remotely on a server or locally on a terminal device, as long as the data can be invoked by the technical solution of the present application.
Those skilled in the art will appreciate that, although the various methods of the present application are described based on the same concept and therefore share common features, they may be performed independently of one another unless specifically indicated otherwise. Similarly, the embodiments disclosed herein are all presented based on the same inventive concept; therefore, concepts expressed identically, and concepts whose expression differs only because it was appropriately varied for convenience, should be interpreted as equivalents.
Unless the embodiments disclosed herein are expressly indicated to be mutually exclusive, their technical features may be cross-combined to flexibly construct new embodiments, provided that the combination does not depart from the inventive spirit of the present application and can meet the needs of the art or remedy deficiencies in the prior art. Those skilled in the art will be aware of this variation.
The advertisement keyword generation method of the present application may be programmed as a computer program product and deployed on a client or a server for execution. For example, in the exemplary application scenario of the present application, it may be deployed on a server of an e-commerce platform, so that the method can be executed by accessing the interface exposed once the computer program product is running and interacting with its process through a graphical user interface.
Referring to fig. 1, the advertisement keyword generating method of the present application, in an exemplary embodiment thereof, includes the following steps:
Step S1100, acquiring text information of a commodity to be advertised, and recalling, from a preset core product word library, a core product word set matched with the text information, wherein the text information comprises description information of the commodity;
The text information of the commodity to be advertised is obtained from a commodity database according to the commodity's unique identifier. The text information may be any one or more of the commodity title, commodity detail text, commodity price, product parameters, category labels, and the like. The unique identifier distinguishes and represents different commodities and can be set flexibly, for example as a commodity ID.
The text information of the commodity to be advertised is segmented into words to obtain a commodity word sequence. Each commodity word in the sequence is matched against the core product words in the core product word library; the matched core product words are recalled from the library, and all recalled core product words form the core product word set. The matching may be exact matching and/or semantic matching. Exact matching means that the commodity word is literally identical to the core product word; it can be implemented with any exact string-matching algorithm, such as BF (brute force), KMP (Knuth-Morris-Pratt), BM (Boyer-Moore), Sunday, or RK (Rabin-Karp). Semantic matching means that the commodity word and the core product word are equivalent at the semantic level: a text feature extraction model determines the vectorized semantic features of the commodity word and of the core product word, a vector distance algorithm computes the distance between the two feature vectors as their similarity, and the commodity word is confirmed to match the core product word when the similarity is greater than a preset threshold and not to match otherwise.
The text feature extraction model may be any of BERT, RNN, BiLSTM, BiGRU, RoBERTa, ALBERT, ERNIE, BERT-WWM, and the like; because the training of these models is known in the art, it is not described in detail here. The vector distance algorithm may be any of the cosine similarity algorithm, the Euclidean distance algorithm, the Pearson correlation coefficient algorithm, the Jaccard coefficient algorithm, and the like. The preset threshold may be set as desired by one skilled in the art.
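The recall logic described above (exact match, or similarity above a preset threshold) can be sketched as follows. This is a minimal sketch under stated assumptions: `embed` is a hypothetical character-bigram stand-in for a real text feature extraction model such as BERT, so it captures surface similarity rather than true semantics; the threshold value and sample words are likewise illustrative.

```python
import math
from collections import Counter


def embed(text):
    """Hypothetical stand-in for a text feature extraction model: character-bigram counts."""
    lowered = text.lower()
    return Counter(lowered[i:i + 2] for i in range(len(lowered) - 1))


def cosine_similarity(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[key] * v[key] for key in u.keys() & v.keys())
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


def recall(commodity_words, core_lexicon, threshold=0.8):
    """Recall core product words that match a commodity word exactly or by similarity."""
    recalled = set()
    for word in commodity_words:
        for core in core_lexicon:
            exact = word == core
            similar = cosine_similarity(embed(word), embed(core)) > threshold
            if exact or similar:
                recalled.add(core)
    return recalled


core_set = recall(["led strip", "led strips", "waterproof"], {"led strip", "power bank"})
print(core_set)
```

Swapping `embed` for a trained encoder and `cosine_similarity` for another distance measure listed above would leave the control flow unchanged.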
The core product word library is constructed in advance. In one embodiment, target commodity search results are determined from the commodity search results that a B2B cross-border e-commerce platform returns in response to each commodity search request, a target commodity search result being one that contains at least one commodity that triggered a user behavior. The text information of the commodities that triggered user behaviors in each target commodity search result is obtained, together with the search text corresponding to that target commodity search result. The text information corresponding to all target commodity search results is aggregated and segmented to obtain a commodity word sequence, and the search texts corresponding to all target commodity search results are aggregated and segmented to obtain a search word sequence. The word frequency of each commodity word in the commodity word sequence and of each search word in the search word sequence is counted. Commodity words whose word frequency exceeds a preset threshold and search words whose word frequency exceeds the preset threshold are selected as core product words, and these core product words are aggregated to form the core product word library. The user behavior includes any one or more of clicking, purchasing, adding to a shopping cart, sharing, liking, favoriting, and the like. The search text is the text entered by a user of the B2B cross-border e-commerce platform when searching for commodities, and the role of that user is generally that of an enterprise acting as a buyer. The preset threshold may be set flexibly by those skilled in the art based on prior knowledge or experimental data.
In this embodiment, target commodity search results that meet the expectations of the users searching for commodities are used to determine both the search words commonly used when searching for commodities and the commodity words likely to be used in such searches. Both are taken as core product words when constructing the core product word library, so that the core product words in the library have a high search hit rate.
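The frequency-based construction described above can be sketched as follows. This is a simplified illustration under stated assumptions: whitespace splitting stands in for a real word segmenter, the function name and the sample texts are invented, and a single threshold is applied to both word sequences.

```python
from collections import Counter

def build_core_product_library(product_texts, search_texts, min_freq, tokenize=str.split):
    """Collect words whose frequency exceeds min_freq in either word sequence."""
    product_counts = Counter(
        tok for text in product_texts for tok in tokenize(text.lower())
    )
    search_counts = Counter(
        tok for text in search_texts for tok in tokenize(text.lower())
    )
    core = {w for w, c in product_counts.items() if c > min_freq}
    core |= {w for w, c in search_counts.items() if c > min_freq}
    return core

# Made-up texts from target search results (results that triggered a user behavior).
product_texts = ["electric bike 48v", "electric bike frame", "electric scooter"]
search_texts = ["bike supplier", "bike factory", "ebike"]

library = build_core_product_library(product_texts, search_texts, min_freq=1)
print(sorted(library))
```

Here only "electric" and "bike" occur more than once, so they alone enter the library.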
Step S1200, obtaining derivative keywords related to the core product words according to the core product words in the core product word set, and adding the derivative keywords to the core product word set as expanded core product words;
Each core product word in the core product word set is submitted to a Google AdWords Keyword Planner interface, with parameters including the search time range, the language, and the device, which may be set as needed by those skilled in the art; as an exemplary example, the search time range is the most recent year, the language is English, and the device is all devices. After receiving the parameters and the core product words, the interface, based on Google search data, determines for each core product word a plurality of strongly correlated search keywords according to the correlation between that core product word and the search keywords in Google's search term library, and sorts those search keywords in descending order of correlation. The interface is controlled to return the top N search keywords for each core product word; these N search keywords are the derivative keywords of that core product word and are added to the core product word set as expanded core product words. N may be set as needed by those skilled in the art.
Step S1300, forming candidate advertisement keywords from the core product words in the core product word set and the intention keywords in a preset search intention word library, and aggregating the candidate advertisement keywords to form a candidate keyword set, wherein the intention keywords are feature description information of the role that supplies the commodity to be advertised;
The search intention word library is constructed in advance. In one embodiment, the search texts corresponding to all commodity search results that a B2B cross-border e-commerce platform returns in response to commodity search requests are determined and segmented to obtain a search word sequence, and the word frequency of each search word in the sequence is counted. To filter out search words that contain typos, have missing characters, or are extremely rare, a data cleaning operation is performed with a preset threshold: search words whose word frequency is below the preset threshold are removed from the sequence. The remaining search words are then clustered to determine the class cluster to which each search word belongs, and the search word closest to the center point of each class cluster is taken as the cluster representative. A text classification model is used to determine whether each cluster representative expresses the search intention of a user looking for an enterprise in the supplier role. For each cluster representative determined to express that search intention, all search words in its class cluster are taken as intention keywords. The search text is the text entered by a user of the B2B cross-border e-commerce platform when searching for commodities, and the role of that user is generally that of an enterprise acting as a buyer.
The preset threshold should not be too large and may be set flexibly by those skilled in the art according to prior knowledge or experimental data; for example, the preset threshold is 5. As an exemplary example to facilitate understanding, the intention keywords in the search intention word library are manufacture, wholesale, supplier, factory, and the like. The text classification model includes a text feature extraction layer and a classifier. The text feature extraction layer is adapted to extract the semantics of the input text as a vector representation and may be selected from a variety of known models including, but not limited to, BERT, RNN, BiLSTM, BiGRU, RoBERTa, ALBERT, ERNIE, BERT-WWM, and the like. The classifier is suited to a binary classification task and may be an MLP (a feed-forward neural network) or an FC (fully connected) layer.
In one embodiment, a plurality of search keywords is collected, the search keywords being the search words obtained by segmenting the search texts that users enter when searching for commodities. Each search keyword is used as a training sample, and its supervision label is annotated according to whether the search keyword expresses the search intention of a user looking for an enterprise in the supplier role. A single training sample is input into the text classification model, which predicts whether the sample expresses that search intention and outputs a classification result of yes or no. A preset cross-entropy loss function is then invoked to calculate the cross-entropy loss value of the classification result based on the supervision label of the training sample. When the cross-entropy loss value reaches a preset threshold, which may be set flexibly by those skilled in the art according to prior knowledge or experimental experience, the text classification model has been trained to a converged state and training can be terminated. When the cross-entropy loss value does not reach the preset threshold, the model has not converged; the model is updated by gradient according to the loss value, the weight parameters of each part of the model are corrected through back propagation to bring the model closer to convergence, and further training samples are invoked to train the model iteratively until it reaches a converged state.
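For a yes/no classification result, the cross-entropy loss and the stopping test can be written out as below. This is a minimal sketch, not the application's training code: the function names are hypothetical, "reaches the preset threshold" is interpreted here as the loss falling to or below the threshold, and the gradient update itself is left to whatever framework trains the model.

```python
import math

def binary_cross_entropy(prob_yes, label, eps=1e-12):
    """Cross-entropy for one sample; label is 1 (supplier intent) or 0."""
    p = min(max(prob_yes, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))

def has_converged(loss, loss_threshold):
    """Assumed stopping rule: training stops once the loss is at or below the threshold."""
    return loss <= loss_threshold

# A confident correct prediction yields a smaller loss than an uncertain one.
print(binary_cross_entropy(0.9, 1), binary_cross_entropy(0.6, 1))
```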
Each core product word in the core product word set is spliced with at least one intention keyword in the preset search intention word library to form candidate advertisement keywords, and all candidate advertisement keywords are aggregated to form the candidate keyword set.
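The splicing step amounts to pairing core product words with intention keywords. The sketch below pairs every core product word with every intention keyword and places the intention keyword last, separated by a space; both choices, like the function name, are assumptions for illustration, since the text only requires at least one intention keyword per core product word.

```python
from itertools import product

def build_candidates(core_words, intent_keywords):
    """Splice each core product word with each intention keyword."""
    return [f"{core} {intent}" for core, intent in product(core_words, intent_keywords)]

candidates = build_candidates(["electric bike", "bike helmet"], ["supplier", "factory"])
print(candidates)
```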
Step S1400, determining a recommendation score for each candidate advertisement keyword in the candidate keyword set, and screening out the advertisement keywords whose recommendation scores meet a preset condition.
In one embodiment, for each expanded core product word in the core product word set, the relevance rank returned by the keyword planner tool is obtained and a corresponding relevance score is determined from that rank. Those skilled in the art may implement the determination of the relevance score flexibly; an exemplary formula is as follows:
wherein: relateScore is the relevance score of an expanded core product word, N is the number of expanded core product words, i is the relevance rank of the expanded core product word, and max is the full relevance score, which may be set as needed by those skilled in the art. Furthermore, the relevance score of each non-expanded core product word in the core product word set is the full relevance score.
Because the intention keywords in the candidate advertisement keywords all express the same intention, that is, the intention keyword parts of all candidate advertisement keywords are semantically related, the relevance score of a candidate advertisement keyword is equivalent to the relevance score of the core product word it contains. The relevance score of each candidate advertisement keyword is therefore determined from the relevance score of the corresponding core product word in the core product word set.
The keyword planner tool is used to determine the search amount score and the competition score of each candidate keyword in the candidate keyword set. The search amount score represents how many times the candidate keyword is searched, and the competition score represents how strongly the candidate keyword is competed for. The higher the competition score, the more intense the competition, the higher the risk, and the more funds must be spent to compete. Accordingly, the recommendation score is determined from the search amount score, the competition score, and the relevance score of each candidate keyword; an exemplary formula is as follows:
TotalScore=X*SearchScore-Y*CompeteScore+Z*RelateScore
wherein: TotalScore is the recommendation score of the candidate keyword, SearchScore is its search amount score, CompeteScore is its competition score, RelateScore is its relevance score, and X, Y, and Z are the weights of the search amount score, the competition score, and the relevance score, respectively.
In one embodiment, the candidate advertisement keywords whose recommendation scores are greater than a preset threshold are selected as advertisement keywords, where the preset threshold may be set as needed by those skilled in the art. In another embodiment, all candidate advertisement keywords in the candidate keyword set are sorted in descending order of recommendation score, and the top N candidate advertisement keywords are screened out as advertisement keywords, where N may be set as needed by those skilled in the art.
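The exemplary formula and the two screening embodiments can be put together in a short sketch. The weights, the score triples, and the function names below are invented for illustration; in practice the search amount and competition scores would come from the keyword planner tool.

```python
def total_score(search_score, compete_score, relate_score, x=1.0, y=1.0, z=1.0):
    """TotalScore = X*SearchScore - Y*CompeteScore + Z*RelateScore."""
    return x * search_score - y * compete_score + z * relate_score

def select_by_threshold(scores, threshold):
    """Keep keywords whose recommendation score is greater than the threshold."""
    return [kw for kw, s in scores.items() if s > threshold]

def select_top_n(scores, n):
    """Keep the n keywords with the highest recommendation scores."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [kw for kw, _ in ranked[:n]]

# Made-up (SearchScore, CompeteScore, RelateScore) triples.
raw = {
    "electric bike supplier": (80, 30, 90),
    "electric bike factory": (60, 10, 100),
    "electric bike wholesale": (20, 80, 40),
}
scores = {
    kw: total_score(s, c, r, x=0.5, y=0.25, z=0.25)
    for kw, (s, c, r) in raw.items()
}
print(scores)
print(select_by_threshold(scores, 50))
print(select_top_n(scores, 1))
```

Because the competition score enters with a negative sign, the heavily contested "electric bike wholesale" ends up with the lowest recommendation score in this toy data.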
As can be appreciated from the exemplary embodiments of the present application, the technical solution of the present application has various advantages, including but not limited to the following aspects:
According to the application, a core product word set matching the text information of the commodity to be advertised is recalled from a preset core product word library; related derivative keywords are obtained according to the core product words in the set and added to it as expanded core product words; candidate advertisement keywords are formed from the core product words in the set and the intention keywords in a preset search intention word library and aggregated into a candidate keyword set, the intention keywords being feature description information of the role that supplies the commodity to be advertised; a recommendation score is determined for each candidate advertisement keyword in the candidate keyword set; and the advertisement keywords whose recommendation scores meet a preset condition are screened out. Because the advertisement keywords contain intention keywords, it can be ensured that they are hit by the searches of purchasing enterprises, so that the advertisements placed by supplier enterprises are exposed and the expected advertising effect is achieved.
Referring to fig. 2, in a further embodiment, step S1200, obtaining derivative keywords related to the core product words according to the core product words in the core product word set, and adding the derivative keywords as extended core product words to the core product word set, includes the following steps:
step S1201, using a named entity recognition model to recognize the expanded core product words in the core product word set that are brand words, treating them as bid brand words, and filtering the bid brand words out of the core product word set;
The named entity recognition model is suited to named entity recognition tasks; specific model choices include RoBERTa+CRF, BiLSTM+CRF, IDCNN+CRF, BERT+BiLSTM+CRF, FLAT, and the like. The training process of these models is known in the art and is not described in detail. Those skilled in the art may select a model as needed; once the named entity recognition model has been trained to convergence, it acquires the ability to identify whether a core product word is a brand word.
It can be understood that expanded core product words that are brand words may appear in the core product word set. These brand words are very likely the brand words of competing products; using them in advertisement delivery can have negative effects and prevent the expected delivery effect from being achieved. The named entity recognition model is therefore used to predict the category of each expanded core product word, where the category is obtained by applying one of the BIO, BIOES, or BMES labeling schemes to the entity type. The entity type classifies core product words into types, including non-brand words and brand words. Whether each expanded core product word in the core product word set is a brand word is determined according to the entity type; the core product words that are brand words are regarded as bid brand words and are deleted from the core product word set, which filters out the bid brand words.
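How BIO tags translate into a brand-word filter can be sketched as follows. The tag name `BRAND`, the function names, and the brand "acme" are hypothetical, and `toy_tagger` is a lookup table standing in for a trained named entity recognition model; it is not one of the models listed above.

```python
def extract_entities(tokens, tags, entity_type="BRAND"):
    """Collect token spans labeled B-<type> followed by I-<type> under the BIO scheme."""
    entities, current = [], []
    for token, tag in zip(tokens, tags):
        if tag == f"B-{entity_type}":
            if current:
                entities.append(" ".join(current))
            current = [token]
        elif tag == f"I-{entity_type}" and current:
            current.append(token)
        else:
            if current:
                entities.append(" ".join(current))
            current = []
    if current:
        entities.append(" ".join(current))
    return entities

def drop_brand_keywords(keywords, tagger):
    """Remove keywords in which the tagger finds at least one brand entity."""
    kept = []
    for kw in keywords:
        tokens = kw.split()
        if not extract_entities(tokens, tagger(tokens)):
            kept.append(kw)
    return kept

# Stand-in tagger: a lookup against a made-up brand list, NOT a trained model.
FAKE_BRANDS = {"acme"}

def toy_tagger(tokens):
    return ["B-BRAND" if t.lower() in FAKE_BRANDS else "O" for t in tokens]

print(drop_brand_keywords(["acme electric bike", "electric bike"], toy_tagger))
```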
Step S1202, matching the core product words in the core product word set against the irrelevant words in a preset irrelevant word library, and filtering out the core product words in the core product word set that match the irrelevant words.
Irrelevant words that are unrelated to the intention of purchasing commodities may be collected manually and aggregated to form the irrelevant word library. As an exemplary example to facilitate understanding, the irrelevant word is "near me": combined with the commodity phrase "electric bike repair", it forms "electric bike repair near me", but "near me" clearly indicates a search for a nearby service provider, with no intention to purchase a commodity.
It can be appreciated that core product words that are irrelevant words may appear in the core product word set; using them in advertisement delivery can have negative effects and prevent the expected delivery effect from being achieved. Each core product word is therefore matched against the irrelevant words by exact matching and/or semantic matching. Exact matching means that the irrelevant word is identical to the core product word, and may be implemented with any exact string-matching algorithm such as BF, KMP, BM, Sunday, or RK. Semantic matching means that the irrelevant word and the core product word have the same meaning at the semantic level: a text feature extraction model produces a vectorized semantic feature for each of the two words, a vector distance algorithm computes the distance between the two feature vectors as their similarity, and the two words are confirmed as matched when the similarity is greater than a preset threshold and as unmatched otherwise. The text feature extraction model may be any of BERT, RNN, BiLSTM, BiGRU, RoBERTa, ALBERT, ERNIE, BERT-WWM, and the like; the training process of these models is known in the art and is not described in detail. The vector distance algorithm may be any of the cosine similarity algorithm, the Euclidean distance algorithm, the Pearson correlation coefficient algorithm, the Jaccard coefficient algorithm, and the like. The preset threshold may be set as needed by those skilled in the art.
In this embodiment, the named entity recognition model and the irrelevant word library are used to identify the core product words in the core product word set that are bid brand words or that match irrelevant words, and these are filtered out. This cleans the data and lays a foundation for advertisement delivery to achieve the expected effect.
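A minimal sketch of the irrelevant-word filter follows. It uses token-level phrase containment rather than strict equality, which is an assumption chosen so that "near me" also removes "electric bike repair near me"; the function names are hypothetical, and semantic matching is omitted.

```python
def contains_phrase(keyword, phrase):
    """True if the phrase's tokens appear contiguously inside the keyword's tokens."""
    kw_tokens = keyword.lower().split()
    ph_tokens = phrase.lower().split()
    n = len(ph_tokens)
    if n == 0:
        return False
    return any(kw_tokens[i:i + n] == ph_tokens for i in range(len(kw_tokens) - n + 1))

def filter_irrelevant(keywords, irrelevant_words):
    """Drop every keyword that contains any irrelevant word or phrase."""
    return [
        kw for kw in keywords
        if not any(contains_phrase(kw, phrase) for phrase in irrelevant_words)
    ]

kept = filter_irrelevant(
    ["electric bike", "electric bike repair near me", "bike helmet"],
    ["near me"],
)
print(kept)
```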
Referring to fig. 3, in a further embodiment, step S1400 of determining a recommendation score corresponding to each candidate advertisement keyword in the candidate keyword set includes the following steps:
step S1410, obtaining a search amount score and a competition score corresponding to each candidate advertisement keyword in the candidate keyword set;
The keyword planner tool is used to determine the search amount score and the competition score of each candidate keyword in the candidate keyword set. The search amount score represents how many times the candidate keyword is searched, and the competition score represents how strongly the candidate keyword is competed for. The higher the competition score, the more intense the competition, the higher the risk, and the more funds must be spent to compete.
Step S1420, splicing all core product words in the core product word set to form a core text, and determining a corresponding relevance score between the core text and the core product word in each candidate advertisement keyword in the candidate keyword set;
Because the intention keywords in the candidate advertisement keywords all express the same intention, that is, the intention keyword parts of all candidate advertisement keywords are semantically related, the relevance score of a candidate advertisement keyword is equivalent to the relevance score of the core product word it contains. Accordingly, a text feature extraction model is used to extract deep semantic information from the core text, which is vectorized into a core text vector, and to extract deep semantic information from the core product word in each candidate advertisement keyword in the candidate keyword set, which is vectorized into a keyword vector. A vector distance algorithm then determines the vector distance between the core text vector and the keyword vector of each core product word, and this distance is taken as the relevance score, which ensures that the relevance between the core text and the core product word in each candidate keyword is measured accurately and reliably. The text feature extraction model may be any of BERT, RNN, BiLSTM, BiGRU, RoBERTa, ALBERT, ERNIE, BERT-WWM, and the like; the training process of these models is known in the art and is not described in detail. The vector distance algorithm may be any of the cosine similarity algorithm, the Euclidean distance algorithm, the Pearson correlation coefficient algorithm, the Jaccard coefficient algorithm, and the like. The preset threshold may be set as needed by those skilled in the art.
Step S1430, determining recommendation scores corresponding to each candidate advertisement keyword in the candidate keyword set according to the search amount scores, the competition scores and the correlation scores.
A weight is preset for each of the search amount score, the competition score, and the relevance score according to how much reference value each score has for the effect of advertisement delivery, and the recommendation score is calculated from the three scores and their weights. An exemplary formula is as follows:
TotalScore=X*SearchScore-Y*CompeteScore+Z*RelateScore
wherein: TotalScore is the recommendation score of the candidate keyword, SearchScore is its search amount score, CompeteScore is its competition score, RelateScore is its relevance score, and X, Y, and Z are the weights of the search amount score, the competition score, and the relevance score, respectively.
In this embodiment, the search amount score, the competition score, and the relevance score of each candidate advertisement keyword in the candidate keyword set are determined and combined into a recommendation score, which ensures that the recommendation score accurately and reliably reflects the effect that advertisement delivery with the candidate advertisement keyword can produce.
Referring to fig. 4, in a further embodiment, after step S1300 of summarizing candidate advertisement keywords to form a candidate keyword set, the method includes the following steps:
step S1301, obtaining derivative keywords related to the candidate advertisement keywords according to the candidate advertisement keywords in the candidate keyword set, and adding the derivative keywords serving as expanded candidate advertisement keywords in the candidate keyword set;
Each candidate advertisement keyword in the candidate keyword set is submitted to a Google AdWords Keyword Planner interface, with parameters including the search time range, the language, and the device, which may be set as needed; as an exemplary example, the search time range is the most recent year, the language is English, and the device is all devices. After receiving the parameters and the candidate advertisement keywords, the interface, based on Google search data, determines for each candidate advertisement keyword a plurality of strongly correlated search keywords according to the correlation between that candidate advertisement keyword and the search keywords in Google's search term library, and sorts those search keywords in descending order of correlation. The interface is controlled to return the top N search keywords for each candidate advertisement keyword; these N search keywords are the derivative keywords of that candidate advertisement keyword.
Step S1302, using a named entity recognition model to recognize the expanded candidate advertisement keywords in the candidate keyword set that are brand words, treating them as bid brand words, and filtering the bid brand words out of the candidate keyword set;
The named entity recognition model is suited to named entity recognition tasks; specific model choices include RoBERTa+CRF, BiLSTM+CRF, IDCNN+CRF, BERT+BiLSTM+CRF, FLAT, and the like. The training process of these models is known in the art and is not described in detail. Those skilled in the art may select a model as needed; once the named entity recognition model has been trained to convergence, it acquires the ability to identify whether a candidate advertisement keyword is a brand word.
It can be understood that expanded candidate advertisement keywords that are brand words may appear in the candidate keyword set. These brand words are very likely the brand words of competing products; using them in advertisement delivery can have negative effects and prevent the expected delivery effect from being achieved. The named entity recognition model is therefore used to predict the category of each expanded candidate advertisement keyword, where the category is obtained by applying one of the BIO, BIOES, or BMES labeling schemes to the entity type. The entity type classifies candidate advertisement keywords into types, including non-brand words and brand words. Whether each expanded candidate advertisement keyword in the candidate keyword set is a brand word is determined according to the entity type; the candidate advertisement keywords that are brand words are regarded as bid brand words and are deleted from the candidate keyword set, which filters out the bid brand words.
Step S1303, matching the candidate advertisement keywords in the candidate keyword set with irrelevant words in a preset irrelevant word library, and filtering candidate advertisement keywords matched with the irrelevant words in the candidate keyword set.
Irrelevant words that are unrelated to the intention of purchasing commodities may be collected manually and aggregated to form the irrelevant word library. As an exemplary example to facilitate understanding, the irrelevant word is "near me": combined with the commodity phrase "electric bike repair", it forms "electric bike repair near me", but "near me" clearly indicates a search for a nearby service provider, with no intention to purchase a commodity.
It can be appreciated that candidate advertisement keywords that are irrelevant words may appear in the candidate keyword set; using them in advertisement delivery can have negative effects and prevent the expected delivery effect from being achieved. Each candidate advertisement keyword is therefore matched against the irrelevant words by exact matching and/or semantic matching. Exact matching means that the irrelevant word is identical to the candidate advertisement keyword, and may be implemented with any exact string-matching algorithm such as BF, KMP, BM, Sunday, or RK. Semantic matching means that the irrelevant word and the candidate advertisement keyword have the same meaning at the semantic level: a text feature extraction model produces a vectorized semantic feature for each of the two, a vector distance algorithm computes the distance between the two feature vectors as their similarity, and the two are confirmed as matched when the similarity is greater than a preset threshold and as unmatched otherwise. The text feature extraction model may be any of BERT, RNN, BiLSTM, BiGRU, RoBERTa, ALBERT, ERNIE, BERT-WWM, and the like; the training process of these models is known in the art and is not described in detail. The vector distance algorithm may be any of the cosine similarity algorithm, the Euclidean distance algorithm, the Pearson correlation coefficient algorithm, the Jaccard coefficient algorithm, and the like. The preset threshold may be set as needed by those skilled in the art.
In this embodiment, after the candidate keyword set has been expanded, the named entity recognition model and the irrelevant word library are used to identify the candidate advertisement keywords in the set that are bid brand words or that match irrelevant words, and these are filtered out. This cleans the data and ensures that advertisement delivery using the candidate advertisement keywords in the candidate keyword set can achieve the expected effect.
Referring to fig. 5, in a further embodiment, after screening out the advertisement keywords with recommendation scores satisfying the preset condition in step S1400, the method includes the following steps:
step S1401, clustering the advertisement keywords by adopting a clustering algorithm to determine the class clusters to which the advertisement keywords belong;
The clustering algorithm may be the K-means algorithm, the GMM (Gaussian mixture model) clustering algorithm, the DBSCAN algorithm, the mean shift algorithm, a spectral clustering algorithm, or the like, and may be implemented as needed by those skilled in the art.
In one embodiment, the advertisement keywords are clustered with the K-Means clustering algorithm. The advertisement keywords serve as the input data set of K-Means, and the center points of K class clusters are determined at random (they need not be advertisement keywords in the data set). Each advertisement keyword in the data set is then assigned to a class cluster: taking a single advertisement keyword as an example, the distance between the advertisement keyword and each center point is calculated, and the advertisement keyword is assigned to the class cluster of the closest center point. After every advertisement keyword has been assigned to a class cluster, the center point of each class cluster is updated to the mean of all advertisement keywords in that class cluster. This process is repeated until every advertisement keyword in the data set is closest to the center point of its own class cluster, at which point clustering is complete and the class cluster to which each advertisement keyword belongs is determined. Further, the value of K may be determined by the elbow method, which makes the number of class clusters more reliable and ensures the accuracy of the clustering.
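The assignment and update loop can be sketched in plain Python as below. The sketch assumes each advertisement keyword has already been converted to a numeric vector by some embedding model, and it initializes the centers by sampling K points from the data set, a common simplification of the random initialization described above. The function name, the seed, and the sample points are all illustrative.

```python
import random

def kmeans(points, k, max_iter=100, seed=0):
    """Plain K-means; returns (cluster index per point, final centers)."""
    rng = random.Random(seed)
    # Simplification: initial centers are k distinct points sampled from the data.
    centers = [list(p) for p in rng.sample(points, k)]
    assignments = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center (squared Euclidean).
        new_assignments = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        if new_assignments == assignments:
            break  # no point changed cluster, so clustering is complete
        assignments = new_assignments
        # Update step: each center becomes the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centers[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return assignments, centers

# Made-up 2-D keyword vectors forming two well-separated groups.
points = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
assignments, centers = kmeans(points, k=2)
print(assignments, centers)
```

Grouping the keywords by their cluster index then gives the keyword subsets that the following step aggregates into a keyword recommendation set.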
Step S1402, aggregating advertisement keywords belonging to the same cluster to form a keyword subset, and aggregating all the keyword subsets to form a keyword recommendation set.
It is easy to understand that advertisement keywords belonging to the same class cluster are similar, so they are gathered together to form a keyword subset, and the keyword subsets corresponding to all class clusters are summarized to form the keyword recommendation set. When advertisement keywords need to be selected from the keyword recommendation set, a keyword subset can be selected first, and advertisement keywords can then be selected from within that subset.
In this embodiment, advertisement keywords in the same class cluster are gathered to form keyword subsets, and all subsets are summarized to form a keyword recommendation set, so that advertisement keywords can be consulted and selected conveniently and quickly using the keyword recommendation set.
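Step S1402 amounts to grouping keywords by cluster label. A minimal sketch follows, assuming the labels come from whichever clustering algorithm was used in step S1401; the function name and the dictionary layout of the recommendation set are illustrative choices.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def build_recommendation_set(
    keywords: Sequence[str], labels: Sequence[int]
) -> Dict[int, List[str]]:
    """Map each class cluster label to its keyword subset."""
    subsets: Dict[int, List[str]] = defaultdict(list)
    for keyword, label in zip(keywords, labels):
        subsets[label].append(keyword)
    return dict(subsets)


keywords = [
    "electric bike manufacturer",
    "electric bike supplier",
    "folding bike wholesale",
]
recommendation_set = build_recommendation_set(keywords, [0, 0, 1])
print(recommendation_set)
```

A user interface built on this structure could first present the subsets and then the keywords inside the chosen subset.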
Referring to fig. 6, in a further embodiment, after screening out the advertisement keywords with recommendation scores satisfying the preset condition in step S1400, the method includes the following steps:
Step S2410, generating an advertisement text according to the advertisement keywords by adopting a preset text generation model;
The text generation model is trained in advance until convergence, so that it learns the capability of generating corresponding advertisement text from advertisement keywords. The text generation model can be selected from a GPT series model, a BERT model, an Encoder-Decoder model, a Transformer model, and the like, and can be chosen by one skilled in the art as needed.
In one embodiment, a GPT-3.5 model is adopted as the text generation model, and the advertisement keyword is used as the input of the model. The advertisement keyword is input to the encoding end of the model and encoded by a stack of multi-head self-attention layers and fully connected layers. Specifically, multi-head attention is computed when the advertisement keyword passes through a multi-head attention layer, so that different dimensions of the advertisement keyword are weighted by self-attention to obtain the corresponding weighted vector representation; after the fully connected layer, the encoded vector representation of the advertisement keyword is obtained. The encoded representation is then input to the decoding end of the model and decoded. Specifically, the generation probability of each word is calculated from the words already generated, the current word position, and the encoded representation; the word with the highest generation probability is selected; and the words generated at each step are spliced in sequence to obtain the advertisement text.
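The decoding rule described above (pick the highest-probability word at each step and splice the words in order) is greedy decoding. The sketch below illustrates only that selection rule; the lookup table is a hypothetical stand-in for the model, and the start and end markers are assumptions.

```python
from typing import Callable, Dict, List

END = "</s>"

# Hypothetical next-word probabilities keyed by the last generated word.
TOY_TABLE: Dict[str, Dict[str, float]] = {
    "<s>": {"Ride": 0.7, "Buy": 0.3},
    "Ride": {"electric": 0.6, "now": 0.4},
    "electric": {"bikes": 0.9, END: 0.1},
    "bikes": {END: 0.8, "today": 0.2},
}


def toy_model(generated: List[str]) -> Dict[str, float]:
    """Stand-in for the decoder: a distribution over the next word."""
    last = generated[-1] if generated else "<s>"
    return TOY_TABLE[last]


def greedy_decode(
    next_word_probs: Callable[[List[str]], Dict[str, float]],
    max_len: int = 20,
) -> str:
    generated: List[str] = []
    for _ in range(max_len):
        distribution = next_word_probs(generated)
        word = max(distribution, key=distribution.get)  # highest probability
        if word == END:
            break
        generated.append(word)
    return " ".join(generated)  # splice the words in generation order


ad_text = greedy_decode(toy_model)
print(ad_text)
```

A real model would condition on the encoded keyword as well as on the generated prefix; the table here ignores both for brevity.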
Step S2420, scoring the advertisement text to obtain the quality score of the advertisement text.
The advertisement text is scored using a language model as the scorer. The language model is a large language model suited to the NLP field that has been trained to convergence in advance on a very large corpus, and that has the capability of generating human language as well as accurate text semantic understanding and logical reasoning capabilities. The language model may be selected from a GPT series model, a BERT model, an Encoder-Decoder model, a Transformer model, and the like, and can be chosen by one skilled in the art as needed.
A guiding prompt text is composed from the advertisement text and a task description in chain-of-thought form. The task description instructs the language model to think step by step and to score the advertisement text on different marketing quality dimensions. A person skilled in the art can flexibly set the task description according to this disclosure, and the quality dimensions can be any several items such as semantic richness, attractiveness, and fluency of expression. For ease of understanding, an example follows:
A guiding prompt text is composed from three advertisement texts and a task description: "Q: Think step by step. In terms of the semantic richness, attractiveness, and fluency of expression relevant to marketing, which of the following advertisement texts generated from the advertisement keywords is most attractive to customers? Semantic richness, attractiveness, and fluency of expression each have a full score of 10 points. Score each accordingly, briefly explain the reason, and output the scores in JSON format.
Advertisement keywords: "electric bike manufacturer"
Advertisement text 1: "Embrace the revolution of electric bike transportation with our trusted manufacturer!"
Advertisement text 2: "Discover the thrill of eco-friendly rides from the leading electric bike manufacturer!"
A: Advertisement text 1 contains the keyword "electric bike manufacturer" and uses metaphor to convey trustworthiness, but the language is relatively flat."
The task description is everything in the guiding prompt text other than the content of advertisement texts 1 and 2. It can be understood that advertisement text 1 together with answer A serves as an example that guides the model, and that advertisement text 2 is the text for which the model must, following the example, produce scores on the different quality dimensions.
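Assembling such a guiding prompt can be sketched as simple string construction. The function name, parameter names, and exact wording of the template below are assumptions made for illustration; only the overall shape (task description, keyword, numbered advertisement texts, one worked answer) follows the example in the text.

```python
from typing import Sequence


def build_guiding_prompt(
    keyword: str,
    ad_texts: Sequence[str],
    example_answer: str,
    dimensions: Sequence[str],
    full_score: int = 10,
) -> str:
    """Compose a chain-of-thought style scoring prompt (illustrative template)."""
    dims = ", ".join(dimensions)
    lines = [
        f"Q: Think step by step. In terms of {dims}, which of the following "
        "advertisement texts generated from the advertisement keywords is most "
        "attractive to customers? "
        f"Each dimension has a full score of {full_score} points. "
        "Score each text, briefly explain the reason, and output JSON.",
        f'Advertisement keywords: "{keyword}"',
    ]
    for index, text in enumerate(ad_texts, start=1):
        lines.append(f'Advertisement text {index}: "{text}"')
    # The worked answer for the first text acts as the guiding example.
    lines.append(f"A: {example_answer}")
    return "\n".join(lines)


prompt = build_guiding_prompt(
    keyword="electric bike manufacturer",
    ad_texts=[
        "Embrace the revolution of electric bike transportation with our trusted manufacturer!",
        "Discover the thrill of eco-friendly rides from the leading electric bike manufacturer!",
    ],
    example_answer="Advertisement text 1 contains the keyword and conveys trust, "
    "but the language is relatively flat.",
    dimensions=["semantic richness", "attractiveness", "fluency of expression"],
)
print(prompt)
```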
In one embodiment, the language model adopts a GPT-3 model, and the guiding prompt text is used as the input of the model. The guiding prompt text is first segmented to obtain the corresponding token sequence, and the token sequence is input to the encoding end of the model, where each token is encoded by a stack of multi-head self-attention layers and fully connected layers. Specifically, multi-head attention is computed for each token as it passes through a multi-head attention layer, so that different dimensions of the token are weighted by self-attention to obtain the corresponding weighted vector representation; after the fully connected layer, the encoded vector representation of each token is obtained, from which features representing the evaluation of the different marketing quality dimensions are extracted. Further, the encoded representation of each token in the token sequence is input to the decoding end of the model and decoded. Specifically, during decoding, the generation probability of each word is calculated from the words already generated, the current word position, and the encoded representations of the tokens; the word with the highest generation probability is selected; and the words generated at each step are spliced in sequence to obtain the score corresponding to each quality dimension. For ease of understanding, an example follows:
{"advertisement text 1": {"semantic richness score": 5, "attraction score": 6, "expression fluency score": 9, "reason": "The advertisement text contains the keyword 'electric bike manufacturer' and uses metaphor to convey the meanings of 'revolution' and 'trust', giving a positive, uplifting impression. However, the language is relatively flat, and its attractiveness could be further improved."},
"advertisement text 2": {"semantic richness score": 7, "attraction score": 8, "expression fluency score": 8, "reason": "This advertisement text also contains the keyword 'electric bike manufacturer' and emphasizes the advantage of being eco-friendly. The word 'thrill' is used to describe the excitement of riding, which gives an exciting feel. The overall expression is smooth and attractive, but could be further improved to increase its appeal."}}
Further, after the scorer produces the scores of the advertisement text on the different quality dimensions, the scores are summed to obtain the corresponding quality score.
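The summation step can be sketched as parsing the scorer's JSON output and adding up the per-dimension scores. The key names mirror the example output; the convention that every key ending in "score" is a dimension score is an assumption of this sketch.

```python
import json
from typing import Dict


def quality_scores(scorer_output: str) -> Dict[str, int]:
    """Sum the per-dimension scores of each advertisement text."""
    parsed = json.loads(scorer_output)
    totals = {}
    for text_name, fields in parsed.items():
        totals[text_name] = sum(
            value
            for key, value in fields.items()
            if key.endswith("score") and isinstance(value, (int, float))
        )
    return totals


scorer_output = json.dumps(
    {
        "advertisement text 1": {
            "semantic richness score": 5,
            "attraction score": 6,
            "expression fluency score": 9,
            "reason": "Relatively flat language.",
        },
        "advertisement text 2": {
            "semantic richness score": 7,
            "attraction score": 8,
            "expression fluency score": 8,
            "reason": "Smooth and attractive.",
        },
    }
)
totals = quality_scores(scorer_output)
print(totals)
```

With these example values, advertisement text 1 totals 20 and advertisement text 2 totals 23. A production version would also need to handle scorer output that is not valid JSON.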
In this embodiment, after the advertisement text of the advertisement keyword is generated through the text generation model, the corresponding quality score is obtained by scoring the advertisement text, so that the corresponding advertisement text and the quality score thereof can be provided based on the advertisement keyword, the cost, time and energy required for editing the advertisement text are reduced, and the user experience is improved.
Referring to fig. 7, in a further embodiment, before generating the advertisement text according to the advertisement keywords by adopting the preset text generation model in step S2410, the method includes the following steps:
Step S2400, acquiring a single training sample and its supervision label from a prepared training set, wherein the training sample is an advertisement keyword corresponding to an advertised commodity, and the supervision label is an advertisement text used when the commodity of the training sample was advertised;
The click rate and conversion rate corresponding to the advertisement texts of advertised commodities are obtained, and the advertisement texts whose click rate and conversion rate each reach the corresponding preset threshold are screened out. The preset thresholds for click rate and conversion rate are used to single out the advertisement texts whose delivery produced the expected effect, and can be set by a person skilled in the art according to this disclosure; for example, they may be 0.8 and 0.7 respectively. Further, the advertisement keyword corresponding to the advertised commodity is determined according to steps S1100 to S1400 and used as a training sample, and the advertisement text is labeled as the supervision label of that training sample. In this way, a plurality of training samples are prepared and mapped to their supervision labels to form the training set.
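The screening of historical advertisement texts into a training set can be sketched as a threshold filter. The record type and field names are hypothetical; the default thresholds of 0.8 and 0.7 are the values mentioned in the text, and "reaching" a threshold is read here as greater than or equal to it.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class AdRecord:
    keyword: str            # advertisement keyword (training sample)
    ad_text: str            # advertisement text used in delivery (label)
    click_rate: float
    conversion_rate: float


def build_training_set(
    records: Sequence[AdRecord],
    min_click_rate: float = 0.8,
    min_conversion_rate: float = 0.7,
) -> List[Tuple[str, str]]:
    """Keep (keyword, ad_text) pairs whose metrics reach both thresholds."""
    return [
        (record.keyword, record.ad_text)
        for record in records
        if record.click_rate >= min_click_rate
        and record.conversion_rate >= min_conversion_rate
    ]


records = [
    AdRecord("electric bike manufacturer", "Ride the future today!", 0.85, 0.75),
    AdRecord("electric bike supplier", "Bikes for sale.", 0.90, 0.40),
    AdRecord("folding bike wholesale", "Fold, carry, go.", 0.50, 0.90),
]
training_set = build_training_set(records)
print(training_set)
```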
Step S2401, inputting the training sample into a text generation model, extracting corresponding deep semantic information, and generating a predicted advertisement text based on the deep semantic information;
The text generation model can be selected from a GPT series model, a BERT model, an Encoder-Decoder model, a Transformer model, and the like, and can be chosen by one skilled in the art as needed.
In one embodiment, a GPT-3.5 model is adopted as the text generation model, and the training sample is used as the input of the model. The training sample is input to the encoding end of the model and encoded by a stack of multi-head self-attention layers and fully connected layers. Specifically, multi-head attention is computed when the training sample passes through a multi-head attention layer, so that different dimensions of the training sample are weighted by self-attention to obtain the corresponding weighted vector representation; after the fully connected layer, the encoded vector representation of the training sample, that is, the vectorized representation of the deep semantic information, is obtained. The encoded representation is then input to the decoding end of the model and decoded. Specifically, during decoding, the generation probability of each word is calculated from the words already generated, the current word position, and the encoded representation; the word with the highest generation probability is selected; and the words generated at each step are spliced in sequence to obtain the predicted advertisement text.
Step S2402, a supervision label of the training sample is adopted to determine a loss value of the predicted advertisement text, and when the loss value does not reach a preset threshold value, weight updating is carried out on the text generation model, and other training samples are continuously called to carry out iterative training until the text generation model converges.
A preset cross entropy loss function is invoked; it can be flexibly set by a person skilled in the art according to prior knowledge or experimental experience. Based on the supervision label of the training sample, the cross entropy loss value of the predicted advertisement text is calculated. When the cross entropy loss value reaches a preset threshold, the text generation model has been trained to a convergence state, and training can be terminated. When the cross entropy loss value does not reach the preset threshold, the text generation model has not converged: the model is updated by gradient descent according to the cross entropy loss value, with the weight parameters of each layer corrected through back propagation so that the model moves closer to convergence, and other training samples are then called to continue iterative training of the text generation model until it reaches a convergence state.
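The loss computation and the convergence check can be sketched as below. The sketch computes the average token-level cross entropy between the predicted distributions and the words of the supervision label, and it reads "reaches the preset threshold" as "is at or below the threshold", which is an interpretation. The gradient update itself is omitted because it depends on the training framework; all names are illustrative.

```python
import math
from typing import Dict, Sequence


def cross_entropy(
    predicted: Sequence[Dict[str, float]],
    target_words: Sequence[str],
    eps: float = 1e-12,
) -> float:
    """Average negative log-probability assigned to the label words."""
    total = 0.0
    for distribution, word in zip(predicted, target_words):
        probability = max(distribution.get(word, 0.0), eps)  # avoid log(0)
        total -= math.log(probability)
    return total / len(target_words)


def has_converged(loss: float, threshold: float) -> bool:
    # Assumed reading: training stops once the loss is at or below threshold.
    return loss <= threshold


# Toy per-position distributions produced by a hypothetical model.
predicted = [
    {"ride": 0.5, "buy": 0.5},
    {"now": 0.25, "today": 0.75},
]
label_words = ["ride", "now"]

loss = cross_entropy(predicted, label_words)
print(loss, has_converged(loss, threshold=0.1))
```

In a real training loop, a loss above the threshold would trigger back propagation and a weight update before the next training sample is drawn.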
In the embodiment, a training process of the text generation model is disclosed, and after training is completed, the capability of generating the corresponding advertisement text according to the advertisement keywords is learned, so that the generated advertisement text can be ensured to meet the expectations when the advertisement is put.
Referring to fig. 8, an advertisement keyword generating apparatus provided in accordance with one of the purposes of the present application is a functional implementation of the advertisement keyword generation method of the present application. The apparatus includes a word set recall module 1100, a word set expansion module 1200, a candidate construction module 1300, and a keyword screening module 1400. The word set recall module 1100 is configured to obtain text information of an advertisement commodity to be placed and to recall, from a preset core product word library, a core product word set matching the text information, where the text information includes description information of the commodity. The word set expansion module 1200 is configured to obtain derivative keywords related to the core product words in the core product word set, and to add the derivative keywords to the core product word set as expanded core product words. The candidate construction module 1300 is configured to form candidate advertisement keywords from the core product words in the core product word set and the intention keywords in a preset search intention word library, and to summarize the candidate advertisement keywords to form a candidate keyword set, where the intention keywords are feature description information including roles that supply the advertisement commodity to be placed. The keyword screening module 1400 is configured to determine a recommendation score for each candidate advertisement keyword in the candidate keyword set, and to screen out the advertisement keywords whose recommendation scores meet a preset condition.
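The work of the candidate construction module can be sketched as pairing every core product word with every intention keyword. Joining the two with a space, core product word first, is an assumption chosen to reproduce keywords such as "electric bike manufacturer"; the function name is illustrative.

```python
from itertools import product
from typing import List, Sequence


def build_candidates(
    core_product_words: Sequence[str], intention_keywords: Sequence[str]
) -> List[str]:
    """Combine each core product word with each intention keyword."""
    seen = set()
    candidates = []
    for core, intention in product(core_product_words, intention_keywords):
        keyword = f"{core} {intention}"
        if keyword not in seen:  # keep the candidate set free of duplicates
            seen.add(keyword)
            candidates.append(keyword)
    return candidates


candidate_set = build_candidates(
    ["electric bike", "e-bike"], ["manufacturer", "supplier"]
)
print(candidate_set)
```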
In a further embodiment, the word set expansion module 1200 includes: a first bid brand word filtering sub-module, configured to use a named entity recognition model to identify the expanded core product words in the core product word set that are brand words as bid brand words, and to filter the bid brand words out of the core product word set; and a first irrelevant word filtering sub-module, configured to match the core product words in the core product word set against irrelevant words in a preset irrelevant word library, and to filter out of the core product word set the core product words that match irrelevant words.
In a further embodiment, the keyword screening module 1400 includes: a score acquisition sub-module, configured to obtain the search volume score and competition score corresponding to each candidate advertisement keyword in the candidate keyword set; a relevance score determination sub-module, configured to splice all core product words in the core product word set to form a core text, and to determine the relevance score between the core text and the core product word in each candidate advertisement keyword in the candidate keyword set; and a score determination sub-module, configured to determine the recommendation score of each candidate advertisement keyword in the candidate keyword set according to the search volume score, the competition score, and the relevance score.
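How the three scores are combined is not specified in this passage, so the sketch below uses a weighted sum purely as a placeholder. The weights, the assumption that all three scores are already normalized so that higher is better, and the threshold form of the "preset condition" are all hypothetical.

```python
from typing import Dict, List, Tuple

Scores = Tuple[float, float, float]  # (search volume, competition, relevance)


def recommendation_score(
    search_volume: float,
    competition: float,
    relevance: float,
    weights: Scores = (0.4, 0.2, 0.4),  # hypothetical weights
) -> float:
    w_search, w_competition, w_relevance = weights
    return (
        w_search * search_volume
        + w_competition * competition
        + w_relevance * relevance
    )


def screen_keywords(scored: Dict[str, Scores], min_score: float) -> List[str]:
    """Return keywords whose recommendation score reaches min_score, best first."""
    ranked = sorted(
        ((recommendation_score(*scores), keyword) for keyword, scores in scored.items()),
        reverse=True,
    )
    return [keyword for score, keyword in ranked if score >= min_score]


scored = {
    "electric bike manufacturer": (0.9, 0.6, 0.95),
    "electric bike toy": (0.3, 0.8, 0.2),
}
selected = screen_keywords(scored, min_score=0.5)
print(selected)
```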
In a further embodiment, the candidate construction module 1300 further includes: a word set expansion sub-module, configured to obtain derivative keywords related to the candidate advertisement keywords in the candidate keyword set, and to add the derivative keywords to the candidate keyword set as expanded candidate advertisement keywords; a second bid brand word filtering sub-module, configured to use a named entity recognition model to identify the expanded candidate advertisement keywords in the candidate keyword set that are brand words as bid brand words, and to filter the bid brand words out of the candidate keyword set; and a second irrelevant word filtering sub-module, configured to match the candidate advertisement keywords in the candidate keyword set against irrelevant words in a preset irrelevant word library, and to filter out of the candidate keyword set the candidate advertisement keywords that match irrelevant words.
In a further embodiment, the keyword screening module 1400 further includes: the clustering sub-module is used for clustering the advertisement keywords by adopting a clustering algorithm to determine the class clusters to which the advertisement keywords belong; and the collection construction sub-module is used for collecting advertisement keywords belonging to the same class of clusters to form keyword subsets, and summarizing all the keyword subsets to construct a keyword recommendation collection.
In a further embodiment, the keyword screening module 1400 further includes: a text generation sub-module, configured to generate advertisement text according to the advertisement keywords by adopting a preset text generation model; and a text scoring sub-module, configured to score the advertisement text and obtain the quality score of the advertisement text.
In a further embodiment, the apparatus further includes, operating before the text generation sub-module: a sample acquisition sub-module, configured to acquire a single training sample and its supervision label from a prepared training set, where the training sample is an advertisement keyword corresponding to an advertised commodity, and the supervision label is an advertisement text used when the commodity of the training sample was advertised; a prediction generation sub-module, configured to input the training sample into the text generation model, extract the corresponding deep semantic information, and generate a predicted advertisement text based on the deep semantic information; and an iterative training sub-module, configured to determine the loss value of the predicted advertisement text using the supervision label of the training sample, to update the weights of the text generation model when the loss value does not reach a preset threshold, and to continue calling other training samples for iterative training until the text generation model converges.
To solve the above technical problems, an embodiment of the application also provides a computer device, whose internal structure is shown schematically in fig. 9. The computer device includes a processor, a computer readable storage medium, a memory, and a network interface connected by a system bus. The computer readable storage medium of the computer device stores an operating system, a database, and computer readable instructions; the database can store a control information sequence, and the computer readable instructions, when executed by the processor, cause the processor to implement an advertisement keyword generation method. The processor of the computer device provides computing and control capabilities and supports the operation of the entire computer device. The memory of the computer device may store computer readable instructions that, when executed by the processor, cause the processor to perform the advertisement keyword generation method of the present application. The network interface of the computer device is used to connect to and communicate with a terminal. It will be appreciated by persons skilled in the art that the structure shown in fig. 9 is merely a block diagram of the part of the structure relevant to the present application and does not limit the computer device to which the present application is applied; a particular computer device may include more or fewer components than shown, may combine some of the components, or may have a different arrangement of components.
The processor in this embodiment is configured to execute the specific functions of each module and its sub-modules in fig. 8, and the memory stores the program code and data required for executing those modules or sub-modules. The network interface is used for data transmission with the user terminal or the server. The memory in this embodiment stores the program code and data required for executing all modules and sub-modules of the advertisement keyword generation apparatus of the present application, and the server can call this program code and data to execute the functions of all sub-modules.
The present application also provides a storage medium storing computer readable instructions that, when executed by one or more processors, cause the one or more processors to perform the steps of the advertisement keyword generation method of any of the embodiments of the present application.
Those skilled in the art will appreciate that all or part of the processes implementing the methods of the above embodiments of the present application may be implemented by a computer program for instructing relevant hardware, where the computer program may be stored on a computer readable storage medium, where the program, when executed, may include processes implementing the embodiments of the methods described above. The storage medium may be a computer readable storage medium such as a magnetic disk, an optical disk, a Read-Only Memory (ROM), or a random access Memory (Random Access Memory, RAM).
In summary, the application can ensure the search hit rate of the advertisement keywords and achieve the expected effect of advertisement delivery.
Those of skill in the art will appreciate that the various operations, methods, and steps, measures, and schemes in the flows discussed in the present application may be alternated, altered, combined, or deleted. Further, other steps, measures, and schemes in the various operations, methods, and flows discussed in the present application may also be alternated, altered, rearranged, decomposed, combined, or deleted. Further, steps, measures, and schemes in the prior art that involve the various operations, methods, and flows disclosed in the present application may also be alternated, altered, rearranged, decomposed, combined, or deleted.
The foregoing is only a partial embodiment of the present application, and it should be noted that it will be apparent to those skilled in the art that modifications and adaptations can be made without departing from the principles of the present application, and such modifications and adaptations are intended to be comprehended within the scope of the present application.

Claims (10)

1. The advertisement keyword generation method is characterized by comprising the following steps:
acquiring text information of an advertisement commodity to be put in, recalling a core product word set matched with the text information in a preset core product word stock, wherein the text information comprises description information of the commodity;
obtaining derivative keywords related to the core product words according to the core product words in the core product word set, and adding the derivative keywords serving as expanded core product words into the core product word set;
forming candidate advertisement keywords by using core product words in the core product word set and intention keywords in a preset search intention word library, and summarizing the candidate advertisement keywords to form a candidate keyword set, wherein the intention keywords are feature description information comprising roles for supplying the advertisement commodities to be put in;
and determining recommendation scores corresponding to each candidate advertisement keyword in the candidate keyword set, and screening advertisement keywords with recommendation scores meeting preset conditions.
2. The advertising keyword generation method of claim 1, wherein derived keywords related to the core product words are obtained from the core product words in the core product word set, and are added to the core product word set as expanded core product words, comprising the steps of:
identifying the core product words which are expanded in the core product word set and belong to the brand words by adopting a named entity identification model as the bid brand words, and filtering the bid brand words in the core product word set;
and matching the core product words in the core product word set with irrelevant words in a preset irrelevant word library, and filtering core product words matched with the irrelevant words in the core product word set.
3. The advertising keyword generation method of claim 1, wherein determining a recommendation score corresponding to each candidate advertising keyword in the candidate keyword set comprises the steps of:
obtaining a search quantity score and a competition score corresponding to each candidate advertisement keyword in the candidate keyword set;
splicing all core product words in the core product word set to form a core text, and determining a corresponding relevance score between the core text and the core product word in each candidate advertisement keyword in the candidate keyword set;
and determining the recommendation scores corresponding to each candidate advertisement keyword in the candidate keyword set according to the search quantity scores, the competition scores and the relevance scores.
4. The advertising keyword generation method of claim 1, wherein after summarizing candidate advertising keywords to form a candidate keyword set, comprising the steps of:
obtaining derivative keywords related to the candidate advertisement keywords according to the candidate advertisement keywords in the candidate keyword set, and adding the derivative keywords serving as expanded candidate advertisement keywords into the candidate keyword set;
identifying candidate advertisement keywords which are expanded in the candidate keyword set and belong to brand words by adopting a named entity identification model as bid brand words, and filtering the bid brand words in the candidate keyword set;
and matching the candidate advertisement keywords in the candidate keyword set with irrelevant words in a preset irrelevant word library, and filtering candidate advertisement keywords matched with the irrelevant words in the candidate keyword set.
5. The advertisement keyword generation method according to claim 1, wherein after screening out advertisement keywords whose recommendation scores satisfy a preset condition, comprising the steps of:
clustering the advertisement keywords by adopting a clustering algorithm to determine the class clusters to which the advertisement keywords belong;
the advertisement keywords belonging to the same class cluster are collected to form keyword subsets, and the keyword subsets are summarized to form a keyword recommendation set.
6. The advertisement keyword generation method according to claim 1, wherein after screening out advertisement keywords whose recommendation scores satisfy a preset condition, comprising the steps of:
generating advertisement texts according to the advertisement keywords by adopting a preset text generation model;
and scoring the advertisement text to obtain the quality score of the advertisement text.
7. The advertising keyword generation method of claim 6, wherein before generating advertising marketing text from the advertising keywords using a preset text generation model, comprising the steps of:
acquiring a single training sample and a supervision tag thereof from a prepared training set, wherein the training sample is an advertisement keyword corresponding to the commodity of the advertisement, and the supervision tag is an advertisement text used when the commodity of the training sample is advertised;
inputting the training sample into a text generation model, extracting corresponding deep semantic information, and generating a predictive advertisement text based on the deep semantic information;
and determining a loss value of the predicted advertisement text by adopting a supervision tag of the training sample, updating the weight of the text generation model when the loss value does not reach a preset threshold, and continuously calling other training samples to perform iterative training until the text generation model converges.
8. An advertisement keyword generation apparatus, comprising:
the word set recall module is used for obtaining text information of the commodity to be advertised, recalling a core product word set matched with the text information in a preset core product word library, and the text information comprises description information of the commodity;
the word set expansion module is used for acquiring derivative keywords related to the core product words according to the core product words in the core product word set, and adding the derivative keywords serving as expanded core product words into the core product word set;
the candidate construction module is used for forming candidate advertisement keywords by using core product words in the core product word set and intention keywords in a preset search intention word library, summarizing the candidate advertisement keywords to form a candidate keyword set, wherein the intention keywords are feature description information comprising roles for supplying the advertisement commodities to be put in;
and the keyword screening module is used for determining recommendation scores corresponding to each candidate advertisement keyword in the candidate keyword set and screening advertisement keywords with recommendation scores meeting preset conditions.
9. A computer device comprising a central processor and a memory, characterized in that the central processor is arranged to invoke a computer program stored in the memory for performing the steps of the method according to any of claims 1 to 7.
10. A computer-readable storage medium, characterized in that it stores in the form of computer-readable instructions a computer program implemented according to the method of any one of claims 1 to 7, which, when invoked by a computer, performs the steps comprised by the corresponding method.
CN202310779038.2A 2023-06-28 2023-06-28 Advertisement keyword generation method, device, equipment and medium thereof Pending CN116719924A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310779038.2A CN116719924A (en) 2023-06-28 2023-06-28 Advertisement keyword generation method, device, equipment and medium thereof

Publications (1)

Publication Number Publication Date
CN116719924A (en) 2023-09-08

Family

ID=87867780

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310779038.2A Pending CN116719924A (en) 2023-06-28 2023-06-28 Advertisement keyword generation method, device, equipment and medium thereof

Country Status (1)

Country Link
CN (1) CN116719924A (en)

Similar Documents

Publication Publication Date Title
US20200302340A1 (en) Systems and methods for learning user representations for open vocabulary data sets
CN109189904A (en) Individuation search method and system
US20230102337A1 (en) Method and apparatus for training recommendation model, computer device, and storage medium
US20130054552A1 (en) Automated search for detecting patterns and sequences in data using a spatial and temporal memory system
US20130054495A1 (en) Encoding of data for processing in a spatial and temporal memory system
CN109471978B (en) Electronic resource recommendation method and device
CN116521906B (en) Meta description generation method, device, equipment and medium thereof
US11354349B1 (en) Identifying content related to a visual search query
CN109359247A (en) Content delivery method and storage medium, computer equipment
CN114663197A (en) Commodity recommendation method and device, equipment, medium and product thereof
CN116976920A (en) Commodity shopping guide method and device, equipment and medium thereof
CN113468414A (en) Commodity searching method and device, computer equipment and storage medium
US20210374276A1 (en) Smart document migration and entity detection
US20220343365A1 (en) Determining a target group based on product-specific affinity attributes and corresponding weights
CN114862480A (en) Advertisement putting orientation method and its device, equipment, medium and product
CN116823410B (en) Data processing method, object processing method, recommending method and computing device
CN116823321B (en) Method and system for analyzing economic management data of electric business
CN116823404A (en) Commodity combination recommendation method, device, equipment and medium thereof
CN116796027A (en) Commodity picture label generation method and device, equipment, medium and product thereof
CN115293818A (en) Advertisement putting and selecting method and device, equipment and medium thereof
Jbene et al. An LSTM-based intent detector for conversational recommender systems
WO2023278030A1 (en) Query-based product representations
CN116719924A (en) Advertisement keyword generation method, device, equipment and medium thereof
CN113971599A (en) Advertisement putting and selecting method and device, equipment, medium and product thereof
Al-Baity et al. Towards effective service discovery using feature selection and supervised learning algorithms

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination