WO2022111291A1 - Recommendation information evaluation method, apparatus and device, and computer readable storage medium - Google Patents

Info

Publication number
WO2022111291A1
WO2022111291A1 PCT/CN2021/130006 CN2021130006W WO2022111291A1 WO 2022111291 A1 WO2022111291 A1 WO 2022111291A1 CN 2021130006 W CN2021130006 W CN 2021130006W WO 2022111291 A1 WO2022111291 A1 WO 2022111291A1
Authority
WO
WIPO (PCT)
Prior art keywords
recommendation information
sample
information
evaluation
network model
Prior art date
Application number
PCT/CN2021/130006
Other languages
French (fr)
Chinese (zh)
Inventor
肖小范
陈龙
李宥壑
Original Assignee
北京沃东天骏信息技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京沃东天骏信息技术有限公司
Publication of WO2022111291A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0241 Advertisements
    • G06Q30/0242 Determining effectiveness of advertisements
    • G06Q30/0245 Surveys
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/31 Indexing; Data structures therefor; Storage structures
    • G06F16/316 Indexing structures
    • G06F16/322 Trees
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/216 Parsing using statistical methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking

Definitions

  • The present application relates to the technical field of computer applications, and relates to, but is not limited to, a method, apparatus, device, and computer-readable storage medium for evaluating recommendation information.
  • Advertising content is an important element of advertising; it affects both the conversion rate of products and the spread of brands.
  • As the carrier of advertising content, advertising copy is therefore evidently important.
  • A comprehensive online shopping mall that sells tens of thousands of brands and tens of millions of products needs to place millions of advertisements every day. It is therefore very important to evaluate these advertisements accurately and objectively, and to determine whether a given advertisement copy is low-quality and needs to be filtered out, so as to reduce unnecessary advertising expenses and the operating cost of the enterprise.
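  • Embodiments of the present application provide a method, apparatus, device, and computer-readable storage medium for evaluating recommendation information.
  • An embodiment of the present application provides a method for evaluating recommendation information, and the method includes:
  • obtaining recommendation information of an object to be recommended and object information of the object to be recommended from a recommendation information delivery platform;
  • inputting the recommendation information and the object information into a trained copywriting scoring model for evaluation, to obtain the scoring results of the recommendation information in each dimension, the dimensions including topicality, compliance, attractiveness, and fluency;
  • determining the evaluation result of the recommendation information based on the scoring results of the recommendation information in each dimension.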
  • An embodiment of the present application provides an apparatus for evaluating recommendation information, and the apparatus includes:
  • a first obtaining module configured to obtain recommendation information of the object to be recommended and object information of the object to be recommended from the recommendation information delivery platform;
  • an evaluation module configured to input the recommendation information and the object information into the trained copywriting scoring model for evaluation, and obtain the scoring results of the recommendation information in each dimension, the dimensions including topicality, compliance, attractiveness, and fluency;
  • a determination module configured to determine an evaluation result of the recommendation information based on the scoring results of the recommendation information in each dimension.
  • An embodiment of the present application provides a device for evaluating recommendation information, including:
  • a memory configured to store a computer program executable on a processor.
  • Embodiments of the present application provide a computer-readable storage medium storing computer-executable instructions, where the computer-executable instructions are configured to execute the steps of the foregoing method for evaluating recommendation information.
  • Embodiments of the present application provide a method, apparatus, device, and computer-readable storage medium for evaluating recommendation information, wherein the method includes: acquiring recommendation information of an object to be recommended and object information of the object to be recommended from a recommendation information delivery platform; inputting the recommendation information and the object information into the trained copywriting scoring model for evaluation, and obtaining the scoring results of the recommendation information in each dimension, the dimensions including topicality, compliance, attractiveness, and fluency; and determining the evaluation result of the recommendation information based on the scoring results of the recommendation information in each dimension.
  • In this way, an objective and multi-dimensional quantitative evaluation of the recommendation information can be realized, which can improve evaluation efficiency and evaluation accuracy, shorten evaluation time, reduce evaluation cost, and reduce evaluation risk.
  • FIG. 1 is a schematic flowchart of a realization of a method for evaluating recommendation information provided by an embodiment of the present application
  • FIG. 2 is a schematic diagram of a dictionary tree constructed by an evaluation method for recommendation information provided by an embodiment of the present application
  • FIG. 3 is a schematic diagram of a dictionary tree-based search model constructed by the method for evaluating recommendation information provided by an embodiment of the present application;
  • FIG. 4 is a schematic flowchart of another implementation of the method for evaluating recommendation information provided by the embodiment of the present application.
  • FIG. 5 is a schematic diagram of the implementation principle of the evaluation method for advertising creative copy provided by the embodiment of the present application.
  • FIG. 6 is a schematic diagram of training of an advertisement copy theme scoring model provided by an embodiment of the present application.
  • FIG. 7 is a schematic diagram of training of an advertisement copy theme compliance degree model provided by an embodiment of the present application.
  • FIG. 8 is a schematic diagram of training of an advertising copy topic attractiveness model provided by an embodiment of the present application.
  • FIG. 9 is a schematic diagram of training of an advertising copy subject naturalness model provided by an embodiment of the present application.
  • FIG. 10 is a schematic diagram of the scores of three pieces of copy in each dimension under the ICAN model.
  • FIG. 11 is a radar chart of the scores of the three pieces of copy in each dimension under the ICAN model.
  • FIG. 12 is a schematic diagram of the composition and structure of a device for evaluating recommendation information provided by an embodiment of the present application.
  • FIG. 13 is a schematic diagram of the composition and structure of an evaluation device for recommendation information provided by an embodiment of the present application.
  • The terms "first/second/third" are only used to distinguish similar objects and do not represent a specific ordering of objects. It is understood that, where permitted, the specific order or sequence of "first/second/third" may be interchanged, so that the embodiments of the application described herein can be practiced in sequences other than those illustrated or described herein.
  • Method 1: Evaluator-based evaluation. This method mainly relies on the evaluators' professional knowledge, personal experience, and so on to evaluate recommendation information, for example by selecting experts who can represent consumers' attitudes to evaluate ad copy.
  • Method 2: Questionnaire-based evaluation. A questionnaire is designed according to the content of the object to be recommended, and suitable interviewees are screened according to the audience attributes of the object to be recommended. The interviewees evaluate the form, style, appeal points, comprehensibility, and so on of the recommendation information, and the recommendation information that is likely to achieve the desired effect is selected for actual delivery.
  • Method 3: Evaluation based on the actual delivery effect. This method requires the recommendation information to be actually delivered. The recommendation information is evaluated based on monitored indicators such as clicks, impressions, and costs, and is revised iteratively to optimize it and improve the recommendation effect.
  • FIG. 1 is a schematic flowchart of an implementation of a method for evaluating recommendation information provided by an embodiment of the present application. As shown in FIG. 1 , the method includes the following steps:
  • Step S101: Obtain recommendation information of the object to be recommended and object information of the object to be recommended from the recommendation information delivery platform.
  • The embodiment of the present application is described by taking advertisement copy as the recommendation information and an advertisement delivery platform as the recommendation information delivery platform.
  • The steps of the method provided in the embodiment of the present application may be implemented by an advertising copy evaluation device.
  • The advertising copy evaluation device establishes a connection with the advertisement delivery platform. Before the advertisement delivery platform places an advertisement, to ensure the quality of the advertisement to be placed, the evaluation device needs to evaluate the advertisement copy to be placed, so that whether the advertisement copy is placed normally is determined according to the evaluation result.
  • The advertising copy evaluation device first obtains the advertisement copy of the advertisement to be placed and the object information corresponding to the advertisement copy.
  • The advertisement copy and the object information corresponding to the advertisement copy may be information of the same object or information of different objects.
  • When the ad copy and the object information correspond to the same object, the object described by the ad copy and the object described by the object information are the same object, that is, the ad copy matches the object information. When they correspond to different objects, the object described by the ad copy and the object described by the object information are different objects, that is, the ad copy does not match the object information.
  • For example, if the advertisement copy is "a teapot that Dad will like" and the object information is "a transparent glass teapot for chrysanthemum tea", the object described by both is "teapot", and the advertisement copy and its object information are information of the same object. If the advertisement copy is "Teapot that Dad will like" and the object information is "Premium tea, a special product of Yunnan before the Ming Dynasty", the object described in the advertisement copy is "Teapot" and the object described in the object information is "Tea"; in this case, the advertisement copy and its object information are information of different objects.
  • Step S102: Input the recommendation information and the object information into the trained copywriting scoring model for evaluation, and obtain the scoring results of the recommendation information in each dimension.
  • The dimensions include topicality, compliance, attractiveness, and fluency.
  • The embodiment of the present application proposes a copywriting scoring model, ICAN.
  • The copywriting scoring model ICAN considers at least the dimensions of topicality (I, Integrated), compliance (C, Compliance), attractiveness (A, Appeal), and naturalness (N, Natural; also called fluency). The proposed copywriting scoring model is trained in advance to obtain the trained copywriting scoring model ICAN. The information to be evaluated is then input into the trained ICAN model to obtain scoring results in the dimensions of topicality I, compliance C, attractiveness A, and fluency N.
  • The advertising copy evaluation device performs a multi-dimensional quantitative evaluation of the advertisement copy to be placed, which ensures the objectivity of the scoring result, and considering the scoring results of multiple dimensions improves evaluation accuracy. Compared with other evaluation methods, it can improve evaluation efficiency, shorten evaluation time, reduce evaluation cost, and reduce evaluation risk.
  • Step S103 Determine an evaluation result of the recommendation information based on the scoring results of the recommendation information in each dimension.
  • The scoring result of the recommendation information is determined according to the scoring results of the individual dimensions, and it is judged whether the scoring result of the recommendation information is greater than a first preset threshold. When the scoring result is greater than the first preset threshold, the evaluation result of the recommendation information is determined to be a pass; when the scoring result is less than or equal to the first preset threshold, the evaluation result is determined to be a fail.
  • One way to determine the scoring result of the recommendation information from the scoring results of the individual dimensions is to use a radar chart: the area of the quadrilateral formed in the radar chart by the scoring results of the four dimensions is determined as the scoring result of the recommendation information.
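  • The radar-chart rule and the first-threshold check can be sketched as follows. This is a minimal illustration, assuming the four axes are equally spaced and the scores are plotted in a fixed order; the function names and the example threshold are hypothetical and do not come from the application.

```python
import math
from typing import Sequence


def radar_area(scores: Sequence[float]) -> float:
    """Area of the polygon whose vertices lie on equally spaced radar axes.

    Assumption: axis i sits at angle 2*pi*i/n and the vertex on it is at
    distance scores[i] from the center.
    """
    n = len(scores)
    if n < 3:
        raise ValueError("a radar polygon needs at least three axes")
    angle = 2 * math.pi / n
    # The polygon splits into n triangles, each spanned by two adjacent axes.
    return 0.5 * math.sin(angle) * sum(
        scores[i] * scores[(i + 1) % n] for i in range(n)
    )


def passes_overall(scores: Sequence[float], first_threshold: float) -> bool:
    """Pass only when the overall score is strictly greater than the threshold."""
    return radar_area(scores) > first_threshold


# Hypothetical scores in the order: topicality, compliance, attractiveness, fluency.
ican_scores = [0.8, 1.0, 0.6, 0.9]
overall = radar_area(ican_scores)
```

  • With four axes the area reduces to half the product of the two diagonals, so the result depends on the order in which the dimensions are placed around the chart.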
  • The scoring result determined by this implementation reflects the recommendation information as a whole across multiple dimensions.
  • Determining the scoring result on this overall basis facilitates subsequent overall adjustment and optimization of the recommendation information, or direct elimination of recommendation information with a lower overall score.
  • The variance of the topicality scoring result, the compliance scoring result, the attractiveness scoring result, and the fluency scoring result is calculated, and it is determined whether the variance is less than a second preset threshold. When the variance is less than the second preset threshold, it is further judged whether at least one of the four scoring results is greater than a third preset threshold. When at least one of the topicality, compliance, attractiveness, and fluency scoring results is greater than the third preset threshold, the evaluation result of the recommendation information is determined to be a pass. When the variance is greater than or equal to the second preset threshold, or when the four scoring results are all less than or equal to the third preset threshold, the evaluation result of the recommendation information is determined to be a fail.
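  • The variance-based rule above can be sketched as follows. The use of the population variance, the function name, and the threshold values in the example are assumptions made for illustration; the application does not state which variance estimator is used.

```python
from statistics import pvariance
from typing import Sequence


def passes_by_variance(
    scores: Sequence[float],
    second_threshold: float,
    third_threshold: float,
) -> bool:
    """Apply the two-stage rule: balanced scores first, then one strong score."""
    # Stage 1: the scores must be balanced (variance below the second threshold).
    if pvariance(scores) >= second_threshold:
        return False
    # Stage 2: at least one dimension must exceed the third threshold.
    return any(score > third_threshold for score in scores)


# Hypothetical scores: topicality, compliance, attractiveness, fluency.
balanced_and_strong = passes_by_variance([0.8, 0.9, 0.85, 0.9], 0.01, 0.85)
unbalanced = passes_by_variance([0.1, 0.9, 0.2, 0.95], 0.01, 0.85)
balanced_but_weak = passes_by_variance([0.2, 0.2, 0.2, 0.2], 0.01, 0.5)
```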
  • In the embodiment of the present application, the recommendation information of the object to be recommended and the object information of the object to be recommended are obtained from the recommendation information delivery platform; the recommendation information and the object information are input into the trained copywriting scoring model for evaluation, and the scoring results of the recommendation information in each dimension are obtained, the dimensions including topicality, compliance, attractiveness, and fluency; and the evaluation result of the recommendation information is determined based on the scoring results of the recommendation information in each dimension.
  • In this way, an objective and multi-dimensional quantitative evaluation of the recommendation information can be realized, which can improve evaluation efficiency and evaluation accuracy, shorten evaluation time, reduce evaluation cost, and reduce evaluation risk.
  • the evaluation method for recommendation information further includes the following steps:
  • Step S11: Obtain the thematic sample set, the sensitive word set, the attractiveness sample set, and the fluency sample set.
  • The thematic sample set, the attractiveness sample set, and the fluency sample set may be the same sample set, in which case the sample recommendation information and sample object information included in each sample set are the same.
  • The difference lies in which model each sample set is used to train.
  • The input information of the different models is different.
  • The recommendation information takes ad copy as an example.
  • Advertising copy should convey the advertising content in a healthy and positive form, and guide consumers to establish correct values. Based on this, sensitive words that do not comply with laws and regulations are collected into a sensitive word set. When the ad copy is judged to contain a sensitive word, the ad copy is determined to be non-compliant.
  • Step S12: Input the thematic sample set, the attractiveness sample set, and the fluency sample set into the preset thematic network model, the preset attractiveness network model, and the preset fluency network model, respectively, to obtain the trained thematic network model, the trained attractiveness network model, and the trained fluency network model.
  • The trained thematic network model is configured to determine the topicality score of the recommendation information of the object to be recommended with respect to the object information of the object to be recommended.
  • First, the topic of the recommendation information is determined as the first topic and the topic of the object information is determined as the second topic. Then the matching probability between the first topic and the second topic is calculated and taken as the topicality score of the recommendation information. For example, when the matching probability between the topic of the ad copy and the topic of the object information is 1, the topicality score of the ad copy is 1; when the matching probability is 0.1, the topicality score of the ad copy is 0.1.
  • The trained attractiveness network model is configured to determine the attractiveness score of the recommendation information.
  • Information entropy is a quantitative index for measuring the amount of information.
  • The information entropy of the recommendation information may be used to determine its attractiveness. For example, when recommending product advertisements, a feature information set is preset first, which includes feature information such as product categories, discounts, and attribute words. The advertisement copy is then input into the trained attractiveness network model to obtain the probability distribution of the advertisement copy over the features. The information entropy is calculated from this probability distribution and determined as the attractiveness score of the advertisement copy.
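  • The entropy step can be sketched as follows. This is a minimal illustration that assumes the model's per-feature outputs are normalized into a probability distribution and that entropy is measured in bits; the function name, the normalization, and the example values are hypothetical and are not taken from the application.

```python
import math
from typing import Sequence


def information_entropy(feature_probs: Sequence[float]) -> float:
    """Shannon entropy H = -sum(p * log2(p)) of the normalized probabilities."""
    total = sum(feature_probs)
    if total <= 0:
        raise ValueError("at least one probability must be positive")
    # Assumption: raw per-feature outputs are rescaled so that they sum to 1.
    normalized = [p / total for p in feature_probs]
    # Terms with p == 0 contribute nothing and are skipped to avoid log2(0).
    return -sum(p * math.log2(p) for p in normalized if p > 0)


# Hypothetical model outputs for the features: category, discount, attribute word.
attractiveness_score = information_entropy([0.5, 0.25, 0.25])
```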
  • The trained fluency network model is configured to determine the naturalness (that is, fluency) score of the recommendation information.
  • Language model perplexity (PPL, Perplexity) is an indicator for measuring the performance of language models.
  • Perplexity may be used to quantify the fluency of the recommendation information. The lower the perplexity of the recommendation information, the more natural its semantics and the higher its fluency; conversely, the recommendation information may be semantically incomprehensible or contain typos.
  • A preset PPL calculation formula can be used to calculate the perplexity; then, based on the Chinese N-Gram language model, the calculated perplexities of the recommendation information are weighted and summed to obtain the fluency of the recommendation information.
  • Step S13: Construct a dictionary-tree-based search model according to the sensitive word set.
  • Step S13 can be implemented by the following steps:
  • Step S131: Construct a dictionary tree according to each sensitive word in the sensitive word set.
  • The tree constructed at this time is a dictionary tree, also known as a trie.
  • For example, the sensitive word set is {high h, high imitation, usury, simulation gun, real game}, and the constructed trie is shown in Figure 2.
  • Step S132: Add a query failure pointer to each node in the dictionary tree to obtain the dictionary-tree-based search model.
  • Although the trie can be used for multi-pattern matching, backtracking is required every time a match fails, which wastes time if the pattern string is very long.
  • To address this, the embodiment of the present application performs step S132 and introduces the multi-pattern matching algorithm AC automaton (Aho-Corasick automaton).
  • The AC automaton adds a query failure pointer, that is, the fail pointer, on the basis of the trie. If the current node fails to match, matching moves to the node pointed to by the fail pointer, so that matching can continue without backtracking.
  • The construction of the AC automaton can be achieved by the following pseudocode:
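  • The pseudocode referred to above is not reproduced in this text. As a stand-in, the following is a minimal Python sketch of the standard Aho-Corasick construction (trie plus fail pointers built breadth-first) and of a scan over ad copy. The class and function names are hypothetical, and treating any hit as non-compliant follows the rule stated for the sensitive word set; it is not the application's own code.

```python
from collections import deque
from typing import Dict, Iterable, List, Optional, Tuple


class Node:
    def __init__(self) -> None:
        self.children: Dict[str, "Node"] = {}
        self.fail: Optional["Node"] = None  # query failure (fail) pointer
        self.outputs: List[str] = []        # sensitive words ending at this node


def build_automaton(words: Iterable[str]) -> Node:
    root = Node()
    # Step S131: insert every sensitive word into the trie.
    for word in words:
        node = root
        for ch in word:
            node = node.children.setdefault(ch, Node())
        node.outputs.append(word)
    # Step S132: add fail pointers level by level (breadth-first).
    queue = deque()
    for child in root.children.values():
        child.fail = root
        queue.append(child)
    while queue:
        current = queue.popleft()
        for ch, child in current.children.items():
            fallback = current.fail
            while fallback is not None and ch not in fallback.children:
                fallback = fallback.fail
            child.fail = fallback.children[ch] if fallback is not None else root
            # Words that end at the fail target also end at this node.
            child.outputs += child.fail.outputs
            queue.append(child)
    return root


def find_sensitive_words(text: str, root: Node) -> List[Tuple[int, str]]:
    """Return (start_index, word) for every sensitive word found in text."""
    hits: List[Tuple[int, str]] = []
    node = root
    for i, ch in enumerate(text):
        # Follow fail pointers instead of backtracking in the text.
        while node is not root and ch not in node.children:
            node = node.fail
        node = node.children.get(ch, root)
        for word in node.outputs:
            hits.append((i - len(word) + 1, word))
    return hits


def is_compliant(text: str, root: Node) -> bool:
    """Copy is treated as compliant only when no sensitive word is found."""
    return not find_sensitive_words(text, root)
```

  • For instance, `is_compliant("cheap usury loans", build_automaton(["usury", "high imitation"]))` evaluates to False because the copy contains "usury". The text is scanned once from left to right, whatever the number of sensitive words.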
  • Step S14: Construct the trained copywriting scoring model based on the trained thematic network model, the trained attractiveness network model, the trained fluency network model, and the search model.
  • In this way, sample sets are obtained and the pre-proposed copywriting scoring model ICAN is trained to obtain the trained copywriting scoring model ICAN.
  • The trained copywriting scoring model ICAN can perform a multi-dimensional quantitative evaluation of the recommendation information in terms of topicality I, compliance C, attractiveness A, and naturalness N, which ensures the objectivity of the scoring results. Considering the scoring results of multiple dimensions improves evaluation accuracy. Compared with the evaluation methods for recommendation information in the related art, this improves evaluation efficiency, shortens evaluation time, and reduces evaluation cost and evaluation risk.
  • In step S12, "inputting the thematic sample set into a preset thematic network model to obtain a trained thematic network model" can be implemented as the following steps:
  • Step S121: Obtain sample object information and sample recommendation information of each sample object in the thematic sample set.
  • The sample object information is the description information of the sample object, and the sample recommendation information is the recommendation content for the sample object.
  • For example, if the promoted product is "teapot", the sample object information is "transparent glass teapot for chrysanthemum tea" and the sample recommendation information is "teapot that dad will like".
  • Step S122: Take the sample object information and the sample recommendation information of the same sample object as a sample pair, and obtain the labeling information of the sample pair. That is, "a transparent glass teapot for chrysanthemum tea" and "a teapot that dad will like" are used as a sample pair (also called a sample sentence pair).
  • The evaluation device for the recommendation information can obtain, from a storage device, the labeling information of the sample pair that was manually labeled in advance and saved.
  • The labeling information represents the probability that the sample object information in the sample pair matches the sample recommendation information.
  • For example, if the sample recommendation information and the sample object information in a sample pair both describe "teapot", the probability that the sample recommendation information matches the sample object information is 1; if the sample recommendation information describes "teapot" and the sample object information describes "mobile phone", the probability that the sample recommendation information matches the sample object information is 0.
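  • The labeled sample pairs described in steps S121 and S122 can be represented with a small data structure such as the one below. This is only an illustrative sketch; the class name, field names, and the mobile phone description used as a non-matching example are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class SamplePair:
    object_info: str      # description of the sample object
    recommendation: str   # recommendation content (for example, ad copy)
    label: float          # probability that the two texts match

    def __post_init__(self) -> None:
        if not 0.0 <= self.label <= 1.0:
            raise ValueError("label must be a probability in [0, 1]")


def build_sample_pairs(rows: Iterable[Tuple[str, str, float]]) -> List[SamplePair]:
    """Turn (object_info, recommendation, label) rows into validated pairs."""
    return [SamplePair(obj, rec, float(label)) for obj, rec, label in rows]


pairs = build_sample_pairs([
    # Same object ("teapot") on both sides: label 1.
    ("transparent glass teapot for chrysanthemum tea", "teapot that dad will like", 1),
    # Different objects ("mobile phone" vs. "teapot"): label 0.
    ("large-screen mobile phone", "teapot that dad will like", 0),
])
```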
  • Step S123: Input each sample pair corresponding to each sample object in the thematic sample set, together with the labeling information of each sample pair, into the preset thematic network model for training, and obtain the trained thematic network model.
  • The trained thematic network model is used to determine and output the labeling information of an input evaluation pair, so as to obtain the topicality scoring result of the recommendation information.
  • The sample object information and sample recommendation information in the thematic sample set can be input into the preset thematic network model as sample pairs, with the labeling information of the sample pairs used as the labeled data for transfer learning, and the model is trained to obtain the trained thematic network model.
  • The preset thematic network model can be the natural language processing model BERT (Bidirectional Encoder Representations from Transformers).
  • After training, a trained thematic network model based on BERT is obtained, which is denoted as model F.
  • The recommendation information of the object to be recommended and the object information of the object to be recommended constitute an evaluation pair, which is input into the trained thematic network model to generate the topicality score of the recommendation information.
  • For example, if the description of the product promoted by the advertisement copy Di is Ai (that is, the object information is Ai), the pair is input into F, F outputs the probability ri that the two are related, and ri can be used as the topicality score of the advertisement copy Di.
  • In step S12, "inputting the attractiveness sample set into the preset attractiveness network model to obtain a trained attractiveness network model" can be implemented as the following steps:
  • Step S124: Obtain sample recommendation information of each sample object in the attractiveness sample set.
  • The quantitative index for measuring the amount of information is called "information entropy".
  • When users receive new information, their cognitive information entropy increases. For example, a user who did not previously know that the teapot was on sale learns of the promotion on seeing the advertisement "Teapot that Dad will like, 25% off over 100". By contrast, the advertisement "a good thing that the old people like, I won't tell others" provides users with very little information. Users always want to see informative advertisements, which for advertisement copy means conveying a clear concept. Based on this, the embodiments of the present application use the information entropy of the recommendation information to evaluate its attractiveness.
  • Step S125: Perform information extraction on the sample recommendation information of each sample object to obtain a feature information set of each sample object.
  • The feature information set includes at least one of the name, category, discount, and attribute words of the sample object. For example, the following feature information is set: category, discount, and attribute words.
  • Category is the category information of the product, with features such as "mobile phone" and "fresh food"; discount is the promotional information of the product, with features such as "full discount", "gift", and "discount"; attribute words are features such as "red", "log", "import", "celebrity", and "summer".
  • Step S126: Input the feature information set of each sample object into the preset attractiveness network model to obtain a trained attractiveness network model.
  • The preset attractiveness network model may be a Magpie model, which is used to predict the probability that a given piece of recommendation information belongs to each feature.
  • In step S12, "inputting the fluency sample set into the preset fluency network model to obtain a trained fluency network model" can be implemented as the following steps:
  • Step S127: Obtain sample recommendation information of each sample object in the fluency sample set.
  • Step S128: Perform word segmentation on the sample recommendation information of each sample object to obtain the segmented words of each piece of sample recommendation information.
  • Perplexity may be used to quantify the fluency of the recommendation information. The lower the perplexity of the recommendation information, the more natural its semantics and the higher its fluency; otherwise, the recommendation information is semantically incomprehensible.
  • Step S129: Input the segmented words of each piece of sample recommendation information into the preset fluency network model to obtain a trained fluency network model.
  • The preset fluency network model can be the Chinese N-Gram language model, which calculates the perplexity from the segmented words of each piece of sample recommendation information and then obtains the fluency of the recommendation information by weighted summation.
  • The perplexity ppl(s) of the recommendation information can be calculated by the following formula (1):
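  • Formula (1) is not reproduced in this text. For orientation, the textbook definition of perplexity for a segmented sentence s = w_1 w_2 ... w_n is sketched below; the second equality holds under the N-Gram approximation. That the application uses exactly this form, and the symbols w_i and n, are assumptions.

```latex
\mathrm{ppl}(s) = P(w_1 w_2 \cdots w_n)^{-\frac{1}{n}}
               = \left( \prod_{i=1}^{n} \frac{1}{P(w_i \mid w_{i-N+1}, \ldots, w_{i-1})} \right)^{\frac{1}{n}}
```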
  • The fluency of the recommendation information is then obtained by weighted summation.
  • Take N values of 2, 3, and 4 as examples.
  • The fluency f(s) of the recommendation information is calculated as shown in the following formula (2):
  • Here, the weight with subscript i is the weight value corresponding to the perplexity when N takes different values. At this point, the fluency of the recommendation information of the object to be recommended is obtained.
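  • The perplexity and weighted-summation steps can be sketched with a toy N-Gram model as follows. Formula (2) is not reproduced in this text, so the sketch only computes a weighted sum of the perplexities for N = 2, 3, and 4; how that sum maps to the final fluency f(s) is left open. The add-one smoothing, the class and function names, the toy corpus, and the weights are all assumptions made for illustration.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence, Tuple


def ngrams(tokens: Sequence[str], n: int) -> List[Tuple[str, ...]]:
    """Pad the sentence and slide a window of n tokens over it."""
    padded = ["<s>"] * (n - 1) + list(tokens) + ["</s>"]
    return [tuple(padded[i:i + n]) for i in range(len(padded) - n + 1)]


class NGramModel:
    def __init__(self, corpus: Sequence[Sequence[str]], n: int) -> None:
        self.n = n
        self.ngram_counts: Counter = Counter()
        self.context_counts: Counter = Counter()
        vocab = {"</s>"}
        for sentence in corpus:
            vocab.update(sentence)
            for gram in ngrams(sentence, n):
                self.ngram_counts[gram] += 1
                self.context_counts[gram[:-1]] += 1
        self.vocab_size = len(vocab) + 1  # one extra slot for unseen words

    def perplexity(self, tokens: Sequence[str]) -> float:
        grams = ngrams(tokens, self.n)
        log_prob = 0.0
        for gram in grams:
            # Add-one smoothing keeps unseen n-grams from having zero probability.
            prob = (self.ngram_counts[gram] + 1) / (
                self.context_counts[gram[:-1]] + self.vocab_size
            )
            log_prob += math.log(prob)
        return math.exp(-log_prob / len(grams))


def weighted_perplexity(
    tokens: Sequence[str],
    models: Dict[int, NGramModel],
    weights: Dict[int, float],
) -> float:
    """Weighted sum of the perplexities obtained for each value of N."""
    return sum(weights[n] * models[n].perplexity(tokens) for n in models)


# Toy segmented corpus and hypothetical weights for N = 2, 3, 4.
corpus = [
    ["dad", "will", "like", "this", "teapot"],
    ["mom", "will", "like", "this", "teapot"],
]
models = {n: NGramModel(corpus, n) for n in (2, 3, 4)}
weights = {2: 0.5, 3: 0.3, 4: 0.2}
natural = weighted_perplexity(["dad", "will", "like", "this", "teapot"], models, weights)
scrambled = weighted_perplexity(["teapot", "this", "like", "will", "dad"], models, weights)
```

  • On this toy corpus the naturally ordered sentence receives a lower weighted perplexity than the scrambled one, which matches the statement that lower perplexity corresponds to higher fluency.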
  • In step S102, "inputting the recommendation information and the object information into the trained copywriting scoring model for evaluation, and obtaining the scoring results of the recommendation information in each dimension" can be implemented as the following steps:
  • Step S1021: Input the recommendation information and the object information as an evaluation pair into the trained thematic network model, and obtain the topicality scoring result of the recommendation information.
  • Step S1022: Input the recommendation information into the search model, and obtain the compliance scoring result of the recommendation information.
  • Step S1023: Input the recommendation information into the trained attractiveness network model, and obtain the attractiveness scoring result of the recommendation information.
  • Step S1024: Input the recommendation information into the trained fluency network model, and obtain the fluency scoring result of the recommendation information.
  • In this way, the recommendation information and the object information are input into the trained thematic network model as an evaluation pair to obtain the topicality score of the recommendation information, and the recommendation information is input separately into the search model, the trained attractiveness network model, and the trained fluency network model to obtain its compliance score, attractiveness score, and naturalness score, thereby obtaining the scoring results of each dimension.
  • Step S103 determines the evaluation result of the recommendation information based on the scoring results of the recommendation information in each dimension.
  • In one implementation, the evaluation result of the recommendation information is determined based on the overall score across the multiple dimensions.
  • In this case, the above step S103 "determining the evaluation result of the recommendation information based on the scoring results of the recommendation information in each dimension" can be implemented as the following steps:
  • Step S103a1: Determine the overall scoring result of the recommendation information according to the thematic scoring result, the compliance scoring result, the attractiveness scoring result and the smoothness scoring result.
  • Step S103a2: Judge whether the scoring result of the recommendation information is greater than a first preset threshold.
  • When the scoring result of the recommendation information is greater than the first preset threshold, it indicates that the recommendation information meets the delivery requirements, and the process goes to step S103a3.
  • When the scoring result of the recommendation information is less than or equal to the first preset threshold, the possible causes are that the theme of the recommendation information does not match the theme of the object information, that the recommendation information contains sensitive words, that it carries too little information, or that its sentences are not smooth or contain typos and other defects; the process then goes to step S103a4.
  • Step S103a3 it is determined that the evaluation result of the recommended information is an evaluation pass.
  • Step S103a4 it is determined that the evaluation result of the recommended information is that the evaluation fails.
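  • Steps S103a1 to S103a4 can be sketched as follows. The text does not say how the four scoring results are combined into one scoring result, so this sketch assumes a (by default unweighted) average of scores normalized to the range 0 to 1; the function names and the threshold value are hypothetical.

```python
def overall_score(scores, weights=None):
    """Combine per-dimension scores into one score (weighted mean, an assumption)."""
    if weights is None:
        weights = {name: 1.0 for name in scores}
    total_weight = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total_weight


def evaluate_by_overall_score(scores, first_threshold):
    """Return True (evaluation pass) only if the overall score exceeds the threshold."""
    return overall_score(scores) > first_threshold


good_copy = {"thematic": 0.9, "compliance": 1.0, "attractiveness": 0.7, "smoothness": 0.8}
bad_copy = {"thematic": 0.9, "compliance": 0.0, "attractiveness": 0.7, "smoothness": 0.8}

passed_good = evaluate_by_overall_score(good_copy, first_threshold=0.7)
passed_bad = evaluate_by_overall_score(bad_copy, first_threshold=0.7)
```

  • With these assumed values, the first copy passes and the second, which contains a sensitive word (compliance score 0), fails.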
  • In another implementation, the evaluation result of the recommendation information is determined based on the stability of the scores across the multiple dimensions.
  • In this case, the above step S103 "determining the evaluation result of the recommendation information based on the scoring results of the recommendation information in each dimension" can be implemented as the following steps:
  • Step S103b1: Calculate the variance of the thematic scoring result, the compliance scoring result, the attractiveness scoring result and the smoothness scoring result.
  • Step S103b2: Judge whether the variance is smaller than a second preset threshold.
  • When the variance is less than the second preset threshold, it indicates that the scoring results of the recommendation information are relatively even across the dimensions, and the process goes to step S103b3. When the variance is greater than or equal to the second preset threshold, it indicates that the scoring result of a certain dimension is too low or too high; in this case, it is determined that the recommendation information does not meet the delivery requirements, and the process goes to step S103b5.
  • Step S103b3: Determine whether at least one of the thematic scoring result, the compliance scoring result, the attractiveness scoring result and the smoothness scoring result is greater than a third preset threshold.
  • When at least one of the thematic scoring result, the compliance scoring result, the attractiveness scoring result and the smoothness scoring result is greater than the third preset threshold, the process goes to step S103b4. When none of them is greater than the third preset threshold, that is, when all four scoring results are less than or equal to the third preset threshold, it indicates that although the scores of the recommendation information are even across the dimensions, every score is low; the recommendation information is then considered not to meet the delivery requirements, and the process goes to step S103b5.
  • Step S103b4 it is determined that the evaluation result of the recommended information is the evaluation pass.
  • Step S103b5 it is determined that the evaluation result of the recommended information is that the evaluation fails.
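  • Steps S103b1 to S103b5 can be sketched as follows. The text does not specify whether the population or the sample variance is meant; this sketch assumes the population variance from Python's `statistics` module, and the function name and threshold values are hypothetical.

```python
from statistics import pvariance


def evaluate_by_stability(scores, second_threshold, third_threshold):
    """Pass only if the scores are balanced and at least one of them is high."""
    values = list(scores.values())
    # Steps S103b1/S103b2: a large variance means some dimension is an outlier.
    if pvariance(values) >= second_threshold:
        return False
    # Step S103b3: balanced scores still fail if every one of them is low.
    return any(value > third_threshold for value in values)


balanced_high = {"thematic": 0.8, "compliance": 1.0, "attractiveness": 0.9, "smoothness": 0.85}
one_outlier = {"thematic": 0.9, "compliance": 0.0, "attractiveness": 0.9, "smoothness": 0.9}
balanced_low = {"thematic": 0.3, "compliance": 0.3, "attractiveness": 0.35, "smoothness": 0.3}

results = [
    evaluate_by_stability(s, second_threshold=0.02, third_threshold=0.7)
    for s in (balanced_high, one_outlier, balanced_low)
]
```

  • Under these assumed thresholds, only the balanced, high-scoring copy passes; the copy with one outlying dimension and the uniformly low-scoring copy both fail.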
  • After step S103a4 or step S103b5, when it is determined that the evaluation result of the recommendation information is an evaluation failure, the method may further include:
  • Step S104: Adjust the recommendation information based on at least one of the thematic scoring result, the compliance scoring result, the attractiveness scoring result and the smoothness scoring result.
  • In this way, the recommendation information that fails the evaluation is adjusted and optimized so that the adjusted recommendation information passes the evaluation.
  • The method may further include:
  • Step S105: Send the evaluation result to the recommendation information delivery platform, so that the recommendation information delivery platform delivers the recommendation information whose evaluation result is an evaluation pass.
  • When the evaluation result is an evaluation pass, the evaluation device for recommendation information notifies the recommendation information delivery platform that the recommendation information of the object to be recommended can be delivered directly.
  • When the recommendation information has been adjusted, the adjusted recommendation information also needs to be sent to the recommendation information delivery platform so that the platform can deliver it.
  • By sending the evaluation result to the recommendation information delivery platform, the evaluation device for recommendation information may also cause the platform to send prompt information, so that the user recommending the object to be recommended knows which recommendation information cannot be delivered normally.
  • FIG. 4 is a schematic flowchart of another implementation of the method for evaluating recommendation information provided by the embodiment of the present application. As shown in FIG. 4 , the method includes the following steps:
  • Step S401: Obtain the thematic sample set, the sensitive word set, the attractiveness sample set and the smoothness sample set.
  • Step S402 Obtain sample object information and sample recommendation information of each sample object in the thematic sample set.
  • step S403 the sample object information and the sample recommendation information of the same sample object are regarded as a set of sample pairs, and the labeling information of the sample pairs is obtained.
  • the annotation information represents the probability that the sample object information in the sample pair matches the sample recommendation information.
  • Step S404 input each sample pair corresponding to each sample object in the thematic sample set and the labeling information of each sample pair into a preset thematic network model for training and learning, and obtain a trained thematic network model.
  • Step S405 construct a dictionary tree according to each sensitive word in the sensitive word set.
  • Step S406 adding a query failure pointer to each node in the dictionary tree to obtain a lookup model based on the dictionary tree.
  • Step S407: Obtain the sample recommendation information of each sample object in the attractiveness sample set.
  • Step S408 Perform information extraction on the sample recommendation information of each sample object to obtain a feature information set of each sample object.
  • the feature information set includes at least one of the name, category, discount, and attribute word of the sample object.
  • Step S409 input the feature information set of each sample object into the preset attractiveness network model to obtain a trained attractiveness network model.
  • Step S410: Obtain the sample recommendation information of each sample object in the smoothness sample set.
  • Step S411 Perform word segmentation processing on the sample recommendation information of each sample object to obtain word segmentation of each sample recommendation information.
  • Step S412 inputting the word segmentation of each sample recommendation information into a preset smoothness network model to obtain a trained smoothness network model.
  • step S413 a trained copywriting scoring model is constructed based on the trained thematic network model, the trained attractiveness network model, the trained smoothness network model and the search model.
  • steps S401 to S413 may also be performed after step S414.
  • Step S414 Obtain recommendation information of the object to be recommended and object information of the object to be recommended from the recommendation information delivery platform.
  • Step S415 input the recommendation information and the object information as a set of evaluation pairs into the trained thematic network model, and obtain the thematic scoring result of the recommendation information.
  • step S416 the recommendation information is input into the search model, and the compliance score result of the recommendation information is obtained.
  • step S417 the recommendation information is input into the trained attractiveness network model, and the attractiveness score result of the recommendation information is obtained.
  • Step S418: Input the recommendation information into the trained smoothness network model to obtain the smoothness scoring result of the recommendation information.
  • Step S419: Determine the overall scoring result of the recommendation information according to the thematic scoring result, the compliance scoring result, the attractiveness scoring result and the smoothness scoring result.
  • Step S420: Judge whether the scoring result of the recommendation information is greater than a first preset threshold.
  • When the scoring result of the recommendation information is greater than the first preset threshold, it indicates that the recommendation information meets the delivery requirements, and the process goes to step S421; when the scoring result of the recommendation information is less than or equal to the first preset threshold, it indicates that the recommendation information does not meet the delivery requirements, and the process goes to step S422.
  • In some embodiments, steps S419 to S420 may be replaced by steps S419' to S421'. Step S419': Calculate the variance of the thematic scoring result, the compliance scoring result, the attractiveness scoring result and the smoothness scoring result. Step S420': Judge whether the variance is less than the second preset threshold.
  • Step S421': Judge whether at least one of the thematic scoring result, the compliance scoring result, the attractiveness scoring result and the smoothness scoring result is greater than the third preset threshold.
  • When at least one of the thematic scoring result, the compliance scoring result, the attractiveness scoring result and the smoothness scoring result is greater than the third preset threshold, it indicates that the recommendation information meets the delivery requirements, and the process goes to step S421; when the thematic scoring result, the compliance scoring result, the attractiveness scoring result and the smoothness scoring result are all less than or equal to the third preset threshold, it indicates that the recommendation information does not meet the delivery requirements, and the process goes to step S422.
  • Step S421 it is determined that the evaluation result of the recommended information is an evaluation pass.
  • Step S422 it is determined that the evaluation result of the recommended information is that the evaluation fails.
  • Step S423: Adjust the recommendation information based on at least one of the thematic scoring result, the compliance scoring result, the attractiveness scoring result and the smoothness scoring result. That is, the theme, sensitive words, amount of information or sentences of the recommendation information are adjusted so that the adjusted recommendation information passes the evaluation and thus satisfies the delivery condition; the process then proceeds to step S424.
  • Step S424: Send the evaluation result to the recommendation information delivery platform, and the recommendation information delivery platform delivers the recommendation information whose evaluation result is an evaluation pass.
  • In the method provided by the embodiment of the present application, the recommendation information of the object to be recommended and the object information of the object to be recommended are obtained from the recommendation information delivery platform; the recommendation information and the object information are input into the trained copywriting scoring model for evaluation to obtain the scoring results of the recommendation information in each dimension, the dimensions including thematic relevance, compliance, attractiveness and smoothness; and the evaluation result of the recommendation information is determined based on the scoring results of the recommendation information in each dimension.
  • an objective and multi-dimensional quantitative evaluation of the recommended information can be realized, which can improve evaluation efficiency and evaluation accuracy, shorten evaluation time, reduce evaluation cost, and reduce evaluation risk.
  • The embodiment of this application proposes a quantitative evaluation model for the creative copywriting of e-commerce advertisements, ICAN, whose name stands for I (Integrated), C (Compliance), A (Attractiveness) and N (Natural).
  • FIG. 5 is a schematic diagram of the implementation principle of the evaluation method for advertising creative copy provided by the embodiment of the present application.
  • The ICAN advertisement creative evaluation model proposed in the embodiment of the present application is mainly composed of four scoring sub-models: a copywriting compliance scoring model, a copywriting attractiveness scoring model, a copywriting theme scoring model and a copywriting naturalness scoring model. Each part is described in detail below.
  • The theme score of an advertising copy assesses whether the copy is consistent with the advertised product. For example, an advertisement reads "a teapot that Dad will like", but after clicking into it, the user sees products such as electronic products and tea, which not only fails to promote the product but also harms the user experience.
  • FIG. 6 is a schematic diagram of training of an advertising copy theme scoring model provided by an embodiment of the present application. As shown in FIG. 6 , the training process of the copy theme scoring model is as follows:
  • From the existing products and their advertising copy, manually annotate some positive samples and negative samples to form a sample set S.
  • A positive sample here means that the advertising copy is semantically related to the description of the product it promotes, while a negative sample means that the two are semantically irrelevant.
  • For all existing advertising copy and the descriptions of the promoted products, form sentence pairs and input them into model F to generate the theme score of each advertising copy.
  • The Aho-Corasick algorithm can be used to score the compliance of the copy. Since advertising creative copy containing sensitive words must be strictly detected and blocked, the score of each copy is either 1 or 0. A score of 1 means that no sensitive word violating laws and regulations is found in the copy, that is, the copy is compliant; a score of 0 means that the copy contains sensitive words, so the copy is not compliant.
  • Aho-Corasick is a classic multi-pattern string matching algorithm, which is widely used in matching scenarios with large text strings and many target strings, so it is suitable for compliance checking of creative copywriting.
  • Constructing a sensitive word automaton to detect sensitive words in a copy includes the following three steps: constructing a Trie (prefix tree) of the sensitive words, adding query mismatch (failure) pointers to the Trie to construct an AC automaton, and performing pattern matching to return the matched sensitive words.
  • The algorithm steps for constructing the Trie are as follows: 1) First, obtain all the text data and divide it into lines. 2) Read in each line of data, compare the current character with the child nodes of the current node, and look for the matching node. 3) If the corresponding child node is found, take the child node as the current node, remove this character from the data, and continue with step 2). 4) If the corresponding child node is not found, insert a new node under the current node, use the new node as the current node, and continue with step 2). 5) The operation terminates when all characters in the data have been removed and the comparison is completed.
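  • The three steps above (building the Trie, adding failure pointers, and matching) can be sketched as follows. This is a compact textbook version of the Aho-Corasick automaton, not the code of the embodiment; the class name `SensitiveWordAutomaton`, the function `compliance_score` and the example words are hypothetical.

```python
from collections import deque


class SensitiveWordAutomaton:
    """Trie with failure pointers (Aho-Corasick) for multi-pattern matching."""

    def __init__(self, words):
        self.children = [{}]   # children[node] maps a character to a child node
        self.fail = [0]        # failure pointer of each node (root is node 0)
        self.output = [[]]     # sensitive words that end at each node
        for word in words:
            self._insert(word)
        self._build_failure_pointers()

    def _insert(self, word):
        node = 0
        for ch in word:
            nxt = self.children[node].get(ch)
            if nxt is None:  # no matching child: insert a new node
                nxt = len(self.children)
                self.children.append({})
                self.fail.append(0)
                self.output.append([])
                self.children[node][ch] = nxt
            node = nxt
        self.output[node].append(word)

    def _build_failure_pointers(self):
        # Breadth-first traversal so that shallower nodes are finished first.
        queue = deque(self.children[0].values())
        while queue:
            node = queue.popleft()
            for ch, child in self.children[node].items():
                f = self.fail[node]
                while f and ch not in self.children[f]:
                    f = self.fail[f]
                self.fail[child] = self.children[f].get(ch, 0)
                # Inherit the words recognized via the failure pointer.
                self.output[child] = self.output[child] + self.output[self.fail[child]]
                queue.append(child)

    def find(self, text):
        """Return every sensitive word occurring in text, in order of detection."""
        node, hits = 0, []
        for ch in text:
            while node and ch not in self.children[node]:
                node = self.fail[node]
            node = self.children[node].get(ch, 0)
            hits.extend(self.output[node])
        return hits


def compliance_score(text, automaton):
    """1 if no sensitive word is found in the copy, otherwise 0."""
    return 0 if automaton.find(text) else 1


automaton = SensitiveWordAutomaton(["he", "she", "hers"])
hits = automaton.find("ushers")
```

  • Because the failure pointers let the scan continue without restarting, the text is traversed once regardless of how many sensitive words the dictionary holds, which is why the approach suits large sensitive word sets.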
  • Figure 7 is a schematic diagram of the advertising copy compliance model provided by the embodiment of the application; it shows the entire process of scoring the compliance of two advertisement creatives.
  • Advertising copy affects users’ minds by conveying information to users. Obviously, copy with a larger amount of information is more attractive to them. For example, "a good thing that old people like” is far less informative than “a teapot that dad will like”.
  • The quantitative index used to measure the amount of information is called "information entropy". After the user sees the copy, he receives new information, which reduces the uncertainty of his cognition (that is, the original unclear cognition becomes clear). For example, the user did not know that the teapot was on sale, and when he saw the advertisement "Teapot that Dad will like, 25% off over 100", he learned that the teapot was on sale.
  • By contrast, the advertisement "a good thing that old people like, I won't tell others" provides very little information to users. Users always want to see informative advertisements, which for copywriting means that the copy expresses clear concepts. Therefore, in the embodiments of the present application, the information entropy of the copy is used to evaluate its attractiveness. For example, the following "concepts" are set: product words, which are the categories of products, such as "mobile phone" and "fresh food"; benefit points, which are the promotions of products, such as "full discount", "gift" and "discount"; and attribute words, such as "red", "log", "import", "celebrity" and "summer". If the concepts of a copy D_i are vague, its information entropy E_i will be relatively larger. When users see such copy, they are likely to be clueless and ignore it. On the contrary, if the concepts of D_i are clear, E_i will be relatively small, and users are more likely to be attracted by such copy.
  • FIG. 8 is a schematic diagram of training the attractiveness model of advertising copy provided by the embodiment of the present application.
  • The training steps of the copywriting attractiveness model are as follows: 1) Extract all products from the product library of the e-commerce website, including their names, categories/brands, attribute words and related promotions. 2) Use the names as text and the categories/brands, attribute words, promotion words, etc. as labels to generate a training text set. 3) Using the above training text set, train a multi-label classifier, denoted model M; for example, the Magpie model can be used as M. Finally, model M can be used to predict the probability that a copy belongs to each concept. In this way, when there is a new copy D_i, it is input into the model to obtain the probability distribution P_i, and the information entropy E_i calculated from P_i is the attractiveness score of the copy.
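  • The last step, turning the probability distribution P_i predicted by model M into the information entropy E_i, can be sketched as follows. The formula is not reproduced in this text, so the sketch assumes the standard Shannon entropy in bits and assumes that the per-concept probabilities are first normalized to sum to 1; the function name and the example probabilities are hypothetical, and no real classifier is called.

```python
import math


def information_entropy(probabilities):
    """Shannon entropy (in bits) of per-concept probabilities, after normalization."""
    total = sum(probabilities)
    entropy = 0.0
    for p in probabilities:
        if p > 0:  # a zero probability contributes nothing to the entropy
            q = p / total
            entropy -= q * math.log2(q)
    return entropy


# Hypothetical outputs of model M over four concepts for two copies.
clear_copy = [0.85, 0.05, 0.05, 0.05]   # one concept dominates
vague_copy = [0.25, 0.25, 0.25, 0.25]   # no concept stands out

e_clear = information_entropy(clear_copy)
e_vague = information_entropy(vague_copy)
```

  • A copy whose probability mass concentrates on one concept yields a small E_i, while a copy spread evenly over all concepts yields the maximum entropy, consistent with the statement that clear concepts give a relatively small E_i.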
  • Here, N takes the values 2, 3 and 4.
  • The perplexity of the copy is calculated for each value of N, and the smoothness (naturalness) f of the copy is obtained by weighted summation, as shown in formula (4):
  • where λ_i is the weight corresponding to the perplexity for each value of N.
  • FIG. 9 is a schematic diagram of the training of the theme naturalness model of the advertising copy provided by the embodiment of the present application.
  • The text to be detected is input into the trained copy naturalness scoring model shown on the right side of FIG. 9 to obtain the smoothness score of the text, which is then normalized to obtain the final smoothness score.
  • FIG. 10 is a schematic diagram of the scores of the three copywriting in each dimension under the ICAN model, and for the dimension with a low score, a corresponding description is given. For example, sensitive words are given for non-compliant copywriting, incomprehensible fragments are given for inconsistent copywriting, and mismatched category names are given for creative category mismatches.
  • the scores of each dimension of each copywriting idea under the ICAN model can be compared using the radar chart in FIG. 11 .
  • copywriting with low scores can be improved or eliminated.
  • four dimensions of subjectivity, compliance, attractiveness and naturalness are introduced to quantify and evaluate the copywriting quality of e-commerce advertisements.
  • With the ICAN model, advertising copy whose score in a single dimension is obviously too low, or whose overall score across the multiple dimensions is not high, can be filtered out, which facilitates subsequent tuning or direct elimination of such copy.
  • Each module included in the apparatus and each unit included in each module can be implemented by a processor in a computer device; of course, they can also be implemented by a logic circuit. In implementation, the processor can be a central processing unit (CPU, Central Processing Unit), a microprocessor (MPU, Microprocessor Unit), a digital signal processor (DSP, Digital Signal Processor) or a field programmable gate array (FPGA, Field Programmable Gate Array), etc.
  • FIG. 12 is a schematic structural diagram of the composition of the apparatus for evaluating recommended information provided by an embodiment of the present application.
  • The apparatus 120 for evaluating recommendation information includes: an acquisition module 121, configured to acquire the recommendation information of the object to be recommended and the object information of the object to be recommended from the recommendation information delivery platform; an evaluation module 122, configured to input the recommendation information and the object information into the trained copywriting scoring model for evaluation to obtain the scoring results of the recommendation information in each dimension, the dimensions including thematic relevance, compliance, attractiveness and smoothness; and a determining module 123, configured to determine the evaluation result of the recommendation information based on the scoring results of the recommendation information in each dimension.
  • The evaluation device 120 for recommendation information may further include: a second acquisition module, configured to acquire a thematic sample set, a sensitive word set, an attractiveness sample set and a smoothness sample set; a training module, configured to input the thematic sample set, the attractiveness sample set and the smoothness sample set into the preset thematic network model, the preset attractiveness network model and the preset smoothness network model respectively, to obtain the trained thematic network model, the trained attractiveness network model and the trained smoothness network model; a construction module, configured to construct a dictionary-tree-based search model according to the sensitive word set; and a construction module, configured to construct a trained copywriting scoring model based on the trained thematic network model, the trained attractiveness network model, the trained smoothness network model and the search model.
  • The training module is further configured to: obtain the sample object information and sample recommendation information of each sample object in the thematic sample set; take the sample object information and the sample recommendation information of the same sample object as a set of sample pairs, and obtain the labeling information of the sample pairs, the labeling information representing the probability that the sample object information in a sample pair matches the sample recommendation information; and input each sample pair corresponding to each sample object in the thematic sample set, together with the labeling information of each sample pair, into a preset thematic network model for training and learning to obtain a trained thematic network model.
  • the construction module is further configured to: construct a dictionary tree according to each sensitive word in the sensitive word set; add a query failure pointer to each node in the dictionary tree to obtain a lookup model based on the dictionary tree.
  • the training module is further configured to: obtain sample recommendation information of each sample object in the attractive sample set; perform information extraction on the sample recommendation information of each sample object to obtain the characteristics of each sample object information set, the feature information set includes at least one of the name, category, discount and attribute word of the sample object; input the feature information set of each sample object into the preset attractiveness network model to obtain the trained attraction force network model.
  • The training module is further configured to: obtain the sample recommendation information of each sample object in the smoothness sample set; perform word segmentation on the sample recommendation information of each sample object to obtain the word segmentation of each sample recommendation information; and input the word segmentation of each sample recommendation information into the preset smoothness network model to obtain a trained smoothness network model.
  • the evaluation module is further configured to: input the recommendation information and the object information as a set of evaluation pairs to the trained thematic network model, and obtain the thematic scoring result of the recommendation information ; Input the recommended information into the search model to obtain the compliance score result of the recommended information; Input the recommended information into the trained attractiveness network model to obtain the attractiveness score result of the recommended information ; Input the recommended information into the trained fluent network model, and obtain the fluent score result of the recommended information.
  • The determining module is further configured to: determine the scoring result of the recommendation information according to the thematic scoring result, the compliance scoring result, the attractiveness scoring result and the smoothness scoring result; when the scoring result of the recommendation information is greater than the first preset threshold, determine that the evaluation result of the recommendation information is an evaluation pass; and when the scoring result of the recommendation information is less than or equal to the first preset threshold, determine that the evaluation result of the recommendation information is an evaluation failure.
  • The determining module is further configured to: calculate the variance of the thematic scoring result, the compliance scoring result, the attractiveness scoring result and the smoothness scoring result; when the variance is less than the second preset threshold and at least one of the thematic scoring result, the compliance scoring result, the attractiveness scoring result and the smoothness scoring result is greater than the third preset threshold, determine that the evaluation result of the recommendation information is an evaluation pass; and when the variance is greater than or equal to the second preset threshold, or the thematic scoring result, the compliance scoring result, the attractiveness scoring result and the smoothness scoring result are all less than or equal to the third preset threshold, determine that the evaluation result of the recommendation information is an evaluation failure.
  • The evaluation device 120 for recommendation information may further include: an adjustment module, configured to, when the evaluation result is an evaluation failure, adjust the recommendation information based on at least one of the thematic scoring result, the compliance scoring result, the attractiveness scoring result and the smoothness scoring result.
  • The evaluation device 120 for recommendation information may further include: a sending module, configured to send the evaluation result to the recommendation information delivery platform, so that the recommendation information delivery platform delivers the recommendation information whose evaluation result is an evaluation pass.
  • If the above evaluation method is implemented in the form of a software function module and sold or used as an independent product, it may also be stored in a computer-readable storage medium.
  • The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the methods described in the various embodiments of the present application.
  • the aforementioned storage medium includes: U disk, mobile hard disk, Read Only Memory (ROM, Read Only Memory), magnetic disk or optical disk and other media that can store program codes.
  • the embodiments of the present application are not limited to any specific combination of hardware and software.
  • the embodiments of the present application provide a computer-readable storage medium on which a computer program is stored, and when the computer program is executed by a processor, implements the steps in the method for evaluating recommendation information provided in the foregoing embodiments.
  • FIG. 13 is a schematic diagram of the composition and structure of the device for evaluation of recommended information provided by the embodiment of the present application.
  • From the exemplary structure of the evaluation device 130 for recommendation information shown in FIG. 13, other exemplary structures of the evaluation device 130 can be foreseen, so the structure described here should not be regarded as limiting; for example, some components described below may be omitted, or components not described below may be added to suit the special needs of certain applications.
  • the evaluation device 130 for recommendation information shown in FIG. 13 includes: a processor 131 , at least one communication bus 132 , a user interface 133 , at least one external communication interface 134 and a memory 135 .
  • the communication bus 132 is configured to realize the connection communication between these components.
  • the user interface 133 may include a display screen, and the external communication interface 134 may include a standard wired interface and a wireless interface.
  • the processor 131 is configured to execute the program of the method for evaluating the recommendation information stored in the memory, so as to implement the steps in the method for evaluating the recommendation information provided by the above embodiments.
  • the disclosed apparatus and method may be implemented in other manners.
  • the device embodiments described above are only illustrative.
  • the division of the units is only a logical function division. In actual implementation, there may be other division methods.
  • multiple units or components may be combined, or may be integrated into another system, or some features may be ignored or not implemented.
  • the mutual coupling, direct coupling, or communication connection between the components shown or discussed may be through some interfaces, and the indirect coupling or communication connection of devices or units may be electrical, mechanical, or in other forms.
  • the unit described above as a separate component may or may not be physically separated, and the component displayed as a unit may or may not be a physical unit; it may be located in one place or distributed to multiple network units; Some or all of the units may be selected according to actual needs to achieve the purpose of the solution in this embodiment.
  • each functional unit in each embodiment of the present application may all be integrated into one processing unit, or each unit may be used separately as a unit, or two or more units may be integrated into one unit; the above integrated unit can be implemented either in the form of hardware or in the form of hardware plus software functional units.
  • the aforementioned program can be stored in a computer-readable storage medium, and when the program is executed, it performs the steps of the above method embodiments; the aforementioned storage medium includes various media that can store program code, such as a removable storage device, a ROM, a magnetic disk, or an optical disk.
  • if the above-mentioned integrated units of the present application are implemented in the form of software function modules and sold or used as independent products, they may also be stored in a computer-readable storage medium.
  • the computer software product is stored in a storage medium and includes several instructions that cause a device to execute all or part of the methods described in the various embodiments of the present application.
  • the aforementioned storage medium includes various media that can store program codes, such as a removable storage device, a ROM, a magnetic disk, or an optical disk.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Accounting & Taxation (AREA)
  • Strategic Management (AREA)
  • Development Economics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Finance (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Entrepreneurship & Innovation (AREA)
  • General Business, Economics & Management (AREA)
  • Game Theory and Decision Science (AREA)
  • Marketing (AREA)
  • Economics (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

Embodiments of the present application provide a recommendation information evaluation method, apparatus and device, and a computer readable storage medium. The method comprises: obtaining, from a recommendation information delivery platform, recommendation information for an object to be recommended and object information of the object; inputting the recommendation information and the object information into a trained document scoring model for evaluation to obtain a scoring result of the recommendation information in each dimension, the dimensions comprising a theme, a compliance degree, an attraction, and smoothness; and determining an evaluation result of the recommendation information on the basis of the scoring result of the recommendation information in each dimension.

Description

Evaluation method, apparatus, device and computer-readable storage medium for recommendation information
CROSS-REFERENCE TO RELATED APPLICATIONS
This application is based on, and claims priority to, Chinese patent application No. 202011362739.9, filed on November 27, 2020 and entitled "Evaluation method, apparatus, device and computer-readable storage medium for recommendation information", the entire contents of which are incorporated herein by reference.
TECHNICAL FIELD
The present application relates to the technical field of computer applications, and relates to, but is not limited to, a method, apparatus, device, and computer-readable storage medium for evaluating recommendation information.
BACKGROUND
Advertising content is an important element of an advertisement and bears on product conversion rates, brand communication, and so on. As the carrier of advertising content, advertising copy is correspondingly important. A comprehensive online shopping mall sells more than tens of thousands of brands and tens of millions of products, and needs to place millions of advertisements every day. Evaluating the millions of pieces of advertising copy in such a system accurately and objectively is essential for determining whether a piece of copy is low quality and therefore needs to be filtered out, which in turn reduces unnecessary advertising spend and lowers the operating costs of the enterprise.
SUMMARY OF THE INVENTION
In view of this, embodiments of the present application provide a method, apparatus, device, and computer-readable storage medium for evaluating recommendation information.
The technical solutions of the embodiments of the present application are implemented as follows:
An embodiment of the present application provides a method for evaluating recommendation information, the method including:
obtaining, from a recommendation information delivery platform, recommendation information of an object to be recommended and object information of the object to be recommended;
inputting the recommendation information and the object information into a trained copy scoring model for evaluation, to obtain scoring results of the recommendation information in each dimension, the dimensions including topicality, compliance, attractiveness, and fluency;
determining an evaluation result of the recommendation information based on the scoring results of the recommendation information in each dimension.
An embodiment of the present application provides an apparatus for evaluating recommendation information, the apparatus including:
a first obtaining module, configured to obtain, from a recommendation information delivery platform, recommendation information of an object to be recommended and object information of the object to be recommended;
an evaluation module, configured to input the recommendation information and the object information into a trained copy scoring model for evaluation, to obtain scoring results of the recommendation information in each dimension, the dimensions including topicality, compliance, attractiveness, and fluency;
a determination module, configured to determine an evaluation result of the recommendation information based on the scoring results of the recommendation information in each dimension.
An embodiment of the present application provides a device for evaluating recommendation information, including:
a processor; and
a memory, configured to store a computer program executable on the processor;
wherein, when the computer program is executed by the processor, the steps of the above method for evaluating recommendation information are implemented.
An embodiment of the present application provides a computer-readable storage medium storing computer-executable instructions, the computer-executable instructions being configured to execute the steps of the above method for evaluating recommendation information.
Embodiments of the present application provide a method, apparatus, device, and computer-readable storage medium for evaluating recommendation information, wherein the method includes: obtaining, from a recommendation information delivery platform, recommendation information of an object to be recommended and object information of the object to be recommended; inputting the recommendation information and the object information into a trained copy scoring model for evaluation, to obtain scoring results of the recommendation information in each dimension, the dimensions including topicality, compliance, attractiveness, and fluency; and determining an evaluation result of the recommendation information based on the scoring results of the recommendation information in each dimension. In this way, the recommendation information can be evaluated objectively and quantitatively across multiple dimensions, which can improve evaluation efficiency and accuracy, shorten evaluation time, reduce evaluation cost, and lower evaluation risk.
DESCRIPTION OF THE DRAWINGS
In the drawings, which are not necessarily drawn to scale, like reference numerals may describe like parts in different views. The drawings generally illustrate, by way of example and not limitation, the various embodiments discussed herein.
FIG. 1 is a schematic flowchart of one implementation of the method for evaluating recommendation information provided by an embodiment of the present application;
FIG. 2 is a schematic diagram of a dictionary tree constructed by the method for evaluating recommendation information provided by an embodiment of the present application;
FIG. 3 is a schematic diagram of a dictionary-tree-based lookup model constructed by the method for evaluating recommendation information provided by an embodiment of the present application;
FIG. 4 is a schematic flowchart of another implementation of the method for evaluating recommendation information provided by an embodiment of the present application;
FIG. 5 is a schematic diagram of the implementation principle of the method for evaluating advertising creative copy provided by an embodiment of the present application;
FIG. 6 is a schematic diagram of the training of the advertising copy topicality scoring model provided by an embodiment of the present application;
FIG. 7 is a schematic diagram of the training of the advertising copy topic compliance model provided by an embodiment of the present application;
FIG. 8 is a schematic diagram of the training of the advertising copy topic attractiveness model provided by an embodiment of the present application;
FIG. 9 is a schematic diagram of the training of the advertising copy topic naturalness model provided by an embodiment of the present application;
FIG. 10 is a schematic diagram of the scores of three pieces of copy in each dimension under the ICAN model;
FIG. 11 is a radar chart of the scores of three pieces of copy in each dimension under the ICAN model;
FIG. 12 is a schematic diagram of the composition and structure of the apparatus for evaluating recommendation information provided by an embodiment of the present application;
FIG. 13 is a schematic diagram of the composition and structure of the device for evaluating recommendation information provided by an embodiment of the present application.
DETAILED DESCRIPTION
In order to make the purpose, technical solutions, and advantages of the present application clearer, the present application is described in further detail below with reference to the accompanying drawings. The described embodiments should not be regarded as limiting the present application, and all other embodiments obtained by a person of ordinary skill in the art without creative effort fall within the scope of protection of the present application.
In the following description, reference is made to "some embodiments", which describe a subset of all possible embodiments. It is understood that "some embodiments" can be the same subset or different subsets of all possible embodiments, and can be combined with each other where no conflict arises.
In the following description, the terms "first\second\third" are only used to distinguish similar objects and do not represent a specific ordering of the objects. It is understood that, where permitted, "first\second\third" may be interchanged in a specific order or sequence, so that the embodiments of the present application described herein can be practiced in an order other than that illustrated or described herein.
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by a person skilled in the technical field to which this application belongs. The terms used herein are only for the purpose of describing the embodiments of the present application and are not intended to limit the present application.
In order to better understand the embodiments of the present application, methods for evaluating recommendation information in the related art are described first.
Methods for evaluating recommendation information in the related art are as follows. Method 1: evaluation by reviewers. This method relies mainly on the reviewers' professional background, personal experience, and so on to evaluate the recommendation information; for example, experts who can represent consumer attitudes are selected to evaluate advertising copy. Method 2: evaluation by questionnaire. A questionnaire is designed around the content of the object to be recommended, suitable respondents are screened according to the audience attributes of the object to be recommended, and the respondents assess the form, style, appeal points, comprehensibility, and so on of the recommendation information, so that recommendation information likely to perform well is selected for actual delivery. Method 3: evaluation by actual delivery results. This method requires the recommendation information to actually be delivered; it is then evaluated on monitored indicators such as clicks, impressions, and cost, and is iteratively revised to optimize the recommendation information and improve the recommendation effect.
An embodiment of the present application provides a method for evaluating recommendation information. The method provided by the embodiments of the present application may be implemented by a computer program which, when executed, completes each step of the method. In some embodiments, the computer program may be executed by a processor in a device for evaluating recommendation information. FIG. 1 is a schematic flowchart of one implementation of the method for evaluating recommendation information provided by an embodiment of the present application. As shown in FIG. 1, the method includes the following steps:
Step S101: obtain, from the recommendation information delivery platform, recommendation information of the object to be recommended and object information of the object to be recommended.
The embodiments of the present application are described taking advertising copy as an example of the recommendation information and an advertisement delivery platform as an example of the recommendation information delivery platform; the steps of the method can then be implemented by an advertising copy evaluation device. The advertising copy evaluation device is connected to the advertisement delivery platform. Before the advertisement delivery platform delivers an advertisement, in order to ensure the quality of the advertisement to be delivered, the advertising copy evaluation device needs to evaluate it, so that whether its advertising copy is delivered normally can be decided from the evaluation result. The advertising copy evaluation device first obtains the advertising copy of the advertisement to be delivered and the object information corresponding to the advertising copy. It should be noted that the advertising copy and the corresponding object information may be information about the same object or about different objects. When they correspond to the same object, the object described by the advertising copy and the object described by the object information are the same, that is, the advertising copy matches the object information; when they correspond to different objects, the object described by the advertising copy differs from the object described by the object information, that is, the advertising copy does not match the object information. For example, if the advertising copy is "A teapot Dad will like" and the object information is "Transparent glass teapot for brewing chrysanthemum tea", both describe a teapot, so the advertising copy and its corresponding object information are information about the same object. If the advertising copy is "A teapot Dad will like" and the object information is "Premium pre-Qingming tea, Yunnan specialty", the advertising copy describes a teapot while the object information describes tea, so the advertising copy and its corresponding object information are information about different objects.
Step S102: input the recommendation information and the object information into the trained copy scoring model for evaluation, to obtain scoring results of the recommendation information in each dimension.
Here, the dimensions include topicality, compliance, attractiveness, and fluency. The embodiments of the present application propose a copy scoring model, ICAN, which considers at least the following dimensions: topicality (I, Integrated), compliance (C, Compliance), attractiveness (A, Appeal), and naturalness (N, Natural), the last also called fluency. The proposed copy scoring model is trained in advance to obtain the trained copy scoring model ICAN. The information to be evaluated is input into the trained copy scoring model ICAN to obtain scoring results in the topicality (I), compliance (C), attractiveness (A), and fluency (N) dimensions. In the embodiments of the present application, the advertising copy evaluation device evaluates the advertising copy of the advertisement to be delivered quantitatively across multiple dimensions. This keeps the scoring results objective, and taking the scores of multiple dimensions into account can improve evaluation accuracy. Compared with the evaluation methods for advertising copy in the related art, it can improve evaluation efficiency, shorten evaluation time, reduce evaluation cost, and lower evaluation risk.
Step S103: determine an evaluation result of the recommendation information based on the scoring results of the recommendation information in each dimension.
In one implementation, after the scoring results in each dimension are obtained, a scoring result of the recommendation information (an overall score) is determined from them, and it is judged whether this overall score is greater than a first preset threshold. When the overall score is greater than the first preset threshold, the evaluation result of the recommendation information is determined to be a pass; when the overall score is less than or equal to the first preset threshold, the evaluation result is determined to be a fail. One way to determine the overall score from the per-dimension scoring results is to use a radar chart: the area of the quadrilateral formed by the scoring results of the four dimensions in the radar chart is taken as the overall score. An overall score determined in this way reflects the dimensions as a whole, which makes it convenient to subsequently adjust and optimize the recommendation information as a whole, or to directly discard recommendation information with a low overall score.
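The radar-chart rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the four axes are assumed to be equally spaced (90 degrees apart), the scores are assumed to lie in [0, 1], and the names `radar_area` and `evaluate_by_area` as well as the example threshold are hypothetical, not taken from the application.

```python
import math


def radar_area(scores):
    """Area of the polygon obtained by plotting the scores on equally
    spaced radar-chart axes (assumption: equal angular spacing)."""
    n = len(scores)
    angle = 2 * math.pi / n
    # The polygon splits into n triangles, one per pair of neighbouring axes;
    # each triangle has area 0.5 * a * b * sin(angle).
    return 0.5 * math.sin(angle) * sum(
        scores[i] * scores[(i + 1) % n] for i in range(n)
    )


def evaluate_by_area(scores, first_threshold):
    """Pass only if the area score is strictly greater than the threshold."""
    return radar_area(scores) > first_threshold


# Hypothetical scores in the order I, C, A, N.
example = [0.9, 1.0, 0.6, 0.8]
print(radar_area(example), evaluate_by_area(example, first_threshold=1.0))
```

With four axes the area reduces to 0.5 * (I + A) * (C + N), so the result depends on the order in which the dimensions are placed around the chart; the text does not state which order is used.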
In another implementation, after the scoring results in each dimension are obtained, the variance of the topicality, compliance, attractiveness, and fluency (naturalness) scoring results is calculated, and it is judged whether the variance is less than a second preset threshold. When the variance is less than the second preset threshold, it is further judged whether at least one of the topicality, compliance, attractiveness, and fluency scoring results is greater than a third preset threshold. When at least one of them is greater than the third preset threshold, the evaluation result of the recommendation information is determined to be a pass. When the variance is greater than or equal to the second preset threshold, or the topicality, compliance, attractiveness, and fluency scoring results are all less than or equal to the third preset threshold, the evaluation result is determined to be a fail. This implementation filters out recommendation information whose score in some dimension is too low or too high, which makes it convenient to subsequently adjust and optimize the recommendation information in that dimension, or to directly discard recommendation information whose scores differ greatly across dimensions.
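As a minimal sketch of the variance-based rule just described, the function below applies the two checks in order. The text does not say whether population or sample variance is meant; population variance (`statistics.pvariance`) is assumed here, and the function name and the threshold values in the example are hypothetical.

```python
from statistics import pvariance


def evaluate_by_variance(scores, second_threshold, third_threshold):
    """Pass when the scores are balanced (variance below the second
    threshold) and at least one score exceeds the third threshold."""
    # Assumption: population variance over the four dimension scores.
    if pvariance(scores) >= second_threshold:
        return False  # scores differ too much across dimensions
    return any(score > third_threshold for score in scores)


# Hypothetical scores in the order I, C, A, N.
print(evaluate_by_variance([0.80, 0.82, 0.79, 0.81], 0.01, 0.8))
print(evaluate_by_variance([0.9, 0.1, 0.9, 0.1], 0.01, 0.8))
```

The first call has balanced scores with one value above the third threshold; the second has a large spread, so the variance check rejects it before the per-dimension check is reached.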
In the method for evaluating recommendation information provided by the embodiments of the present application, recommendation information of the object to be recommended and object information of the object to be recommended are obtained from the recommendation information delivery platform; the recommendation information and the object information are input into the trained copy scoring model for evaluation, to obtain scoring results of the recommendation information in each dimension, the dimensions including topicality, compliance, attractiveness, and fluency; and the evaluation result of the recommendation information is determined based on the scoring results in each dimension. In this way, the recommendation information can be evaluated objectively and quantitatively across multiple dimensions, which can improve evaluation efficiency and accuracy, shorten evaluation time, reduce evaluation cost, and lower evaluation risk.
In some embodiments, before step S102 of the embodiment shown in FIG. 1, the method for evaluating recommendation information further includes the following steps:
Step S11: obtain a topicality sample set, a sensitive word set, an attractiveness sample set, and a fluency sample set.
It should be noted that the topicality sample set, the attractiveness sample set, and the fluency sample set may be the same sample set: the sample recommendation information and sample object information they contain are identical, and the difference lies in the input information used when training the different models. Take advertising copy as an example of recommendation information. As the carrier of advertising content, advertising copy should convey that content in a healthy and positive form and guide consumers toward sound values. On this basis, sensitive words that do not comply with laws and regulations are collected into a sensitive word set. When advertising copy is judged to contain a sensitive word, the advertising copy is determined to be non-compliant.
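The sensitive-word check can be illustrated with a small dictionary tree (trie), the data structure named in the captions of FIG. 2 and FIG. 3. The sketch below is an assumption about how such a lookup might be written, not the construction used in the application; the class name, the `"$end"` marker, and the example words are all hypothetical.

```python
class SensitiveWordTrie:
    """Character-level trie holding a set of sensitive words."""

    END = "$end"  # multi-character key, so it cannot clash with a single character

    def __init__(self, words):
        self.root = {}
        for word in words:
            if not word:
                continue  # ignore empty entries
            node = self.root
            for ch in word:
                node = node.setdefault(ch, {})
            node[self.END] = True  # marks the end of a complete word

    def find_first(self, text):
        """Return the first sensitive word found in text, or None."""
        for start in range(len(text)):
            node = self.root
            for pos in range(start, len(text)):
                node = node.get(text[pos])
                if node is None:
                    break  # no word continues with this character
                if self.END in node:
                    return text[start:pos + 1]
        return None

    def is_compliant(self, text):
        """Copy is treated as compliant when it contains no sensitive word."""
        return self.find_first(text) is None


# Hypothetical sensitive words and copy.
trie = SensitiveWordTrie(["best ever", "100% cure"])
print(trie.find_first("the best ever teapot"))
print(trie.is_compliant("a teapot Dad will like"))
```

Walking the trie from every start position avoids comparing the copy against each word in the set separately, which matters when the word set is large.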
Step S12: input the topicality sample set, the attractiveness sample set, and the fluency sample set into a preset topicality network model, a preset attractiveness network model, and a preset fluency network model, respectively, to obtain a trained topicality network model, a trained attractiveness network model, and a trained fluency network model.
The thematic sample set is input into the preset thematic network model to obtain the trained thematic network model. The trained thematic network model is configured to determine a thematic score from the recommendation information of the object to be recommended and the object information of the object to be recommended. In an actual implementation, the topic of the recommendation information is first determined as a first topic and the topic of the object information as a second topic; the probability that the first topic matches the second topic is then calculated, and this matching probability is taken as the thematic score of the recommendation information. For example, when the probability that the topic of an advertisement copy matches the topic of the object information is 1, the thematic score of the advertisement copy is 1; when the matching probability is 0.1, the thematic score is 0.1.
The attractiveness sample set is input into the preset attractiveness network model to obtain the trained attractiveness network model, which is configured to determine the attractiveness score of the recommendation information. The more information the recommendation information conveys, the more it attracts users, that is, the greater its attractiveness. In information theory, the quantitative measure of the amount of information is called "information entropy", and in the embodiments of the present application the information entropy of the recommendation information may be used to determine its attractiveness. For example, when recommending product advertisements, a feature information set is first preset, which includes feature information such as the category, discounts, and attribute words of the product; the advertisement copy is then input into the trained attractiveness network model to obtain the probability distribution of the advertisement copy, the information entropy is calculated from the probability distribution, and the information entropy is taken as the attractiveness score of the advertisement copy.
The fluency sample set is input into the preset fluency network model to obtain the trained fluency network model, which is configured to determine the naturalness (that is, fluency) score of the recommendation information. Language model perplexity (PPL) is a metric for measuring the performance of a language model. In the embodiments of the present application, perplexity may be used to quantify the fluency of the recommendation information: the lower the perplexity, the more fluent and natural the semantics of the recommendation information, and the higher its fluency; conversely, a high perplexity indicates that the recommendation information is semantically disfluent or contains typos. In an actual implementation, a preset PPL formula may be used to calculate the perplexity, and then, based on the Chinese N-Gram language model, the calculated perplexities of the recommendation information are combined by weighted summation to obtain the fluency of the recommendation information.
Step S13, construct a trie-based search model according to the sensitive word set.
In some embodiments, step S13 may be implemented by the following steps:
Step S131, construct a dictionary tree from the sensitive words in the sensitive word set. In an actual implementation, text data consisting of all the sensitive words is first obtained, with each sensitive word placed on its own line. The sensitive word on the current line is read and, starting from the root node, its current character is compared with the child nodes of the current node to find a matching child node. If the search succeeds, the matching child node becomes the current node, and processing continues with the next character of the sensitive word; if the search fails, a new child node is created and inserted under the current node, the new child node becomes the current node, and processing continues with the next character. When all characters of the sensitive word on the current line have been processed, the sensitive word on the next line is read and the same operations are repeated, until the last line has been read and its last character has been processed. The tree constructed in this way is the dictionary tree, also called a trie. For example, for the sensitive word set {高h, 高仿, 高利贷, 仿真枪, 真人游戏} (roughly "high h", "high imitation", "usury", "imitation gun", "live-action game"), the constructed trie is shown in FIG. 2.
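As a minimal sketch of step S131, assuming a dictionary-based node layout, the trie construction could be written in Python as follows. The names `TrieNode`, `build_trie`, and `is_end` are illustrative and do not come from the application.

```python
class TrieNode:
    """One trie node: children keyed by character, plus an end-of-word flag."""

    def __init__(self):
        self.children = {}
        self.is_end = False


def build_trie(words):
    """Insert each sensitive word character by character, as in step S131."""
    root = TrieNode()
    for word in words:
        curr = root  # every word starts again from the root
        for ch in word:
            # Reuse the matching child if it exists; otherwise create it.
            if ch not in curr.children:
                curr.children[ch] = TrieNode()
            curr = curr.children[ch]
        curr.is_end = True  # the last character closes a complete word
    return root


# The sensitive word set from the example above.
sensitive_words = ["高h", "高仿", "高利贷", "仿真枪", "真人游戏"]
trie = build_trie(sensitive_words)
print(sorted(trie.children))
print(sorted(trie.children["高"].children))
```

The three words beginning with 高 share a single first-level node, which is the prefix sharing that the trie is meant to provide.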
Step S132, add a query-failure pointer to each node of the trie to obtain the trie-based search model. Although a trie can be used for multi-pattern matching, every failed match requires backtracking, which wastes a great deal of time when the pattern strings are long. For this reason, after step S131 the embodiment of the present application proceeds to step S132 and introduces the multi-pattern matching algorithm known as the AC automaton (Aho-Corasick automaton). The AC automaton adds a query-failure pointer, the fail pointer, to each node of the trie. If matching fails at the current node, the pointer moves to the node that the fail pointer points to, so matching can continue without backtracking. In an actual implementation, the AC automaton may be constructed according to the following pseudocode:
1) Point the fail pointer of every child node of the root node to the root node, then enqueue the child nodes of the root node in order; // here, fail is the query-failure pointer.
2) If the queue is not empty:
2.1) Dequeue a node and denote it curr; set failTo = curr.fail; // here, failTo denotes the node that the fail pointer of curr points to.
2.2) a. For each child curr.child[i], determine whether failTo.child[i] exists, that is, whether failTo has a child for the same character as curr.child[i] (this does not hold when failTo is null);
If so: curr.child[i].fail = failTo.child[i];
If not:
Determine whether failTo == null;
If so: curr.child[i].fail = root;
If not: set failTo = failTo.fail and repeat step 2.2);
b. Enqueue curr.child[i] and return to step 2);
3) If the queue is empty: end.
Continuing the above example, query-failure pointers are added to the trie shown in FIG. 2, giving the trie-based search model shown in FIG. 3. An advertisement copy is input into the search model shown in FIG. 3 to obtain the compliance scoring result of the advertisement copy, which may be implemented according to the following pseudocode:
1) Point the current-node pointer to the root node of the AC automaton, that is, curr = root;
2) Read the (next) character from the text string of the advertisement copy;
3) Search the child nodes of the current node for a node matching that character. If the search succeeds: determine whether the current node, or the node that its fail pointer points to, marks the end of a string; if so, record the start index in the text string in the result set kept for the corresponding string (start index = current index - string length + 1). Point curr to the matching child node and continue with step 2). If the search fails: go to step 4);
4) If fail == null (which means that no target string is a prefix of the input string, equivalent to restarting the state machine), set curr = root and continue with step 2); otherwise, point the current-node pointer to the fail node and continue with step 3).
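The construction and lookup pseudocode above can be sketched together in Python. This is an illustrative sketch under stated assumptions, not the applicant's implementation: the names `Node`, `build_automaton`, and `find_sensitive_words` are hypothetical, each node stores the words that end at it (including those inherited through its fail pointer), and the mapping from the hits to a numeric compliance score is left out because the text does not specify it.

```python
from collections import deque


class Node:
    def __init__(self):
        self.children = {}  # character -> child Node
        self.fail = None    # failure pointer (stays None only for the root)
        self.outputs = []   # sensitive words that end at this node


def build_automaton(words):
    root = Node()
    # Build the trie first.
    for word in words:
        curr = root
        for ch in word:
            curr = curr.children.setdefault(ch, Node())
        curr.outputs.append(word)
    # Breadth-first pass: children of the root fail back to the root.
    queue = deque()
    for child in root.children.values():
        child.fail = root
        queue.append(child)
    while queue:
        curr = queue.popleft()
        for ch, child in curr.children.items():
            fail_to = curr.fail
            # Follow fail pointers until some node has a child for `ch`.
            while fail_to is not None and ch not in fail_to.children:
                fail_to = fail_to.fail
            child.fail = fail_to.children[ch] if fail_to is not None else root
            # Inherit words that end at the fail target (suffix matches).
            child.outputs = child.outputs + child.fail.outputs
            queue.append(child)
    return root


def find_sensitive_words(root, text):
    """Return (start_index, word) for every sensitive word found in text."""
    hits = []
    curr = root
    for index, ch in enumerate(text):
        # On a mismatch, fall back along fail pointers instead of backtracking.
        while curr is not root and ch not in curr.children:
            curr = curr.fail
        curr = curr.children.get(ch, root)
        for word in curr.outputs:
            hits.append((index - len(word) + 1, word))
    return hits


automaton = build_automaton(["高h", "高仿", "高利贷", "仿真枪", "真人游戏"])
print(find_sensitive_words(automaton, "出售高仿真枪"))
```

For the made-up input "出售高仿真枪", the scan reports 高仿 starting at index 2 and 仿真枪 starting at index 3 in a single left-to-right pass; the second hit is reached through the fail pointer of the 高仿 node rather than by rescanning the text.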
Step S14, construct the trained copywriting scoring model based on the trained thematic network model, the trained attractiveness network model, the trained fluency network model, and the search model. In the embodiment of the present application, sample sets are obtained and the previously proposed copywriting scoring model ICAN is trained to obtain the trained copywriting scoring model ICAN. The trained ICAN model can quantitatively evaluate recommendation information along multiple dimensions, namely thematic relevance (I), compliance (C), attractiveness (A), and naturalness (N), which ensures the objectivity of the scoring results; taking the scoring results of multiple dimensions into account improves evaluation accuracy. Compared with the methods for evaluating recommendation information in the related art, this improves evaluation efficiency, shortens evaluation time, and reduces evaluation cost and evaluation risk.
In some embodiments, "inputting the thematic sample set into the preset thematic network model to obtain the trained thematic network model" in step S12 above may be implemented as the following steps:
Step S121, obtain the sample object information and the sample recommendation information of each sample object in the thematic sample set. The sample object information is the description of the sample object, and the sample recommendation information is the recommendation content for the sample object. For example, if the promoted product is a "teapot", the sample object information may be "transparent glass teapot for brewing chrysanthemum tea" and the sample recommendation information may be "a teapot that Dad will like".
Step S122, take the sample object information and the sample recommendation information of the same sample object as a sample pair, and obtain the annotation information of the sample pair. That is, "transparent glass teapot for brewing chrysanthemum tea" and "a teapot that Dad will like" form one sample pair (also called a sample sentence pair). Before training, the sample pairs in the sample set are annotated manually in advance to obtain the annotation information of each sample pair. During training, the evaluation device for the recommendation information can obtain the manually annotated and saved annotation information of the sample pairs from a storage device. Here, the annotation information represents the probability that the sample object information in a sample pair matches the sample recommendation information. For example, if the sample recommendation information in a sample pair describes a "teapot" and the sample object information also describes a "teapot", the matching probability is 1; if the sample recommendation information describes a "teapot" but the sample object information describes a "mobile phone", the matching probability is 0.
Step S123, input the sample pairs corresponding to the sample objects in the thematic sample set, together with the annotation information of each sample pair, into the preset thematic network model for training, to obtain the trained thematic network model. Here, the trained thematic network model is used to determine and output, for an input evaluation pair, the annotation information of that evaluation pair, which gives the thematic scoring result of the recommendation information. In implementation, the sample object information and sample recommendation information in the thematic sample set are input into the preset thematic network model as sample pairs, and the annotation information of the sample pairs is used as the labeled data for transfer-learning training, yielding the trained thematic network model. Here, the preset thematic network model may be the natural language processing model BERT (Bidirectional Encoder Representations from Transformers). After the thematic sample set has been trained on for K rounds (for example, K = 10), a trained BERT-based thematic network model is obtained, denoted model F. The recommendation information of an object to be recommended and the object information of that object are combined into an evaluation pair and input into the trained thematic network model to generate the thematic score of the recommendation information. For example, suppose the product promoted by advertisement copy D_i is described by A_i (that is, the object information is A_i); after the sentence pair (D_i, A_i) is input into model F, F outputs the probability r_i that the two are related, and r_i can be used as the thematic score of advertisement copy D_i.
In some embodiments, "inputting the attractiveness sample set into the preset attractiveness network model to obtain the trained attractiveness network model" in step S12 above may be implemented as the following steps:
Step S124, obtain the sample recommendation information of each sample object in the attractiveness sample set. In information theory, the quantitative measure of the amount of information is called "information entropy". After seeing recommendation information, a user has received new information, which increases the information entropy of the user's knowledge. For example, a user who did not previously know that a teapot was discounted learns this from the advertisement "A teapot that Dad will like, 25% off on orders over 100". By contrast, the advertisement "A good thing that every elderly person likes; I don't tell just anyone" gives the user very little information. Users always want to see informative advertisements; for advertisement copy, this means conveying a clear concept. For this reason, the embodiments of the present application use the information entropy of the recommendation information to evaluate its attractiveness.
Step S125, perform information extraction on the sample recommendation information of each sample object to obtain the feature information set of each sample object. The feature information set includes at least one of the name, category, discount, and attribute words of the sample object. For example, the following feature information may be set: category, discount, and attribute words. The category is the type of the product, for example "mobile phone" or "fresh food"; the discount is the promotion information of the product, for example "spend-and-save", "free gift", or "discount"; the attribute words are features such as "red", "natural wood", "imported", "socialite", or "summer". Assuming N pieces of feature information (C_1, C_2, C_3, …, C_N), the probability that advertisement copy D_i belongs to each feature is a vector
[Figure PCTCN2021130006-appb-000001: equation image not reproduced in this text; the sentence that begins in it continues below]
... will be ignored. Conversely, if the concept of D_i is clear, E_i is relatively small, and users who see such advertisement copy are more likely to be attracted.
Step S126, input the feature information set of each sample object into the preset attractiveness network model to obtain the trained attractiveness network model. The preset attractiveness network model may be a Magpie model, which is used to predict the probability that a piece of recommendation information belongs to each feature.
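The entropy calculation on the per-feature probabilities can be illustrated with a short Python sketch. The exact formula of the application is contained in an equation image that is not reproduced in this text, so the sketch assumes the standard Shannon entropy and assumes that the per-feature probabilities are normalized to sum to 1 before the entropy is taken; the function name `information_entropy` and the probability values are hypothetical.

```python
import math


def information_entropy(probabilities):
    """Shannon entropy (in bits) of a feature probability vector.

    The vector is normalized first (an assumption of this sketch), and
    zero-probability terms are skipped because they contribute nothing.
    """
    total = sum(probabilities)
    if total <= 0:
        raise ValueError("probabilities must contain a positive value")
    normalized = [p / total for p in probabilities]
    return -sum(p * math.log2(p) for p in normalized if p > 0)


# Hypothetical model outputs over three features: category, discount, attribute word.
clear_copy = [0.90, 0.05, 0.05]     # the copy clearly belongs to one feature
vague_copy = [1 / 3, 1 / 3, 1 / 3]  # the copy is equally (un)related to all features

print(information_entropy(clear_copy))
print(information_entropy(vague_copy))
```

Under these assumptions the copy with a clear concept yields the smaller entropy, which matches the remark in step S125 that E_i is relatively small when the concept of D_i is clear.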
Good recommendation information must be fluent and natural to read, as well as concise and clear. Therefore, when quantitatively evaluating recommendation information, the embodiments of the present application introduce fluency as one dimension of the evaluation. In some embodiments, "inputting the fluency sample set into the preset fluency network model to obtain the trained fluency network model" in step S12 above may be implemented as the following steps:
Step S127, obtain the sample recommendation information of each sample object in the fluency sample set.
Step S128, perform word segmentation on the sample recommendation information of each sample object to obtain the segmented words of each piece of sample recommendation information. In the embodiments of the present application, perplexity may be used to quantify the fluency of the recommendation information: the lower the perplexity, the more fluent and natural the semantics of the recommendation information, and the higher its fluency; conversely, a high perplexity indicates that the recommendation information is semantically disfluent. The segmented words of each piece of sample recommendation information are obtained; for example, word segmentation of a sample advertisement copy s gives s = (w_1, w_2, …, w_n), where w_i denotes the i-th segmented word in s and n is the number of segmented words.
Step S129, input the segmented words of each piece of sample recommendation information into the preset fluency network model to obtain the trained fluency network model. Here, the preset fluency network model may be the Chinese N-Gram language model: the perplexity is calculated from the segmented words of each piece of sample recommendation information, and the fluency of the recommendation information is then obtained by weighted summation. After word segmentation is performed on the recommendation information of the object to be recommended, the perplexity ppl(s) of the recommendation information may be calculated using the following formula (1):
[Figure PCTCN2021130006-appb-000002: image of formula (1), not reproduced in this text]
The fluency of the recommendation information is obtained by weighted summation. When the N-Gram language model is used in the embodiments of the present application, N takes the values 2, 3, and 4 as an example. The fluency f(s) of the recommendation information, computed over the different N-Gram models, is given by the following formula (2):
[Figure PCTCN2021130006-appb-000003: image of formula (2), not reproduced in this text]
Here, α_i is the weight corresponding to the perplexity for each value of N. This yields the fluency of the recommendation information of the object to be recommended.
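Because formulas (1) and (2) survive only as image references, the following Python sketch substitutes the standard definition of perplexity (the exponential of the mean negative log-probability per segmented word) and a plain weighted sum over the N-Gram orders. It is an assumption-laden illustration rather than the formulas of the application: the function names, the per-word probabilities, and the weights are all hypothetical.

```python
import math


def perplexity(token_probs):
    """Standard perplexity: exp of the mean negative log-probability per token."""
    n = len(token_probs)
    return math.exp(-sum(math.log(p) for p in token_probs) / n)


def fluency(ppl_by_order, weights):
    """Weighted sum of perplexities over N-Gram orders (for example N = 2, 3, 4)."""
    return sum(weights[order] * ppl for order, ppl in ppl_by_order.items())


# Hypothetical conditional probabilities P(w_i | context) for a 4-word copy,
# as they might be produced by 2-gram, 3-gram, and 4-gram models.
probs_by_order = {
    2: [0.20, 0.10, 0.25, 0.40],
    3: [0.20, 0.15, 0.30, 0.50],
    4: [0.20, 0.15, 0.35, 0.50],
}
weights = {2: 0.2, 3: 0.3, 4: 0.5}  # illustrative alpha_i values, not from the source

ppl_by_order = {order: perplexity(p) for order, p in probs_by_order.items()}
print(fluency(ppl_by_order, weights))
```

With this plain weighted sum, a smaller value corresponds to a more fluent copy; whether the application rescales or inverts the sum cannot be determined from the text alone.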
In some embodiments, step S102 above, "inputting the recommendation information and the object information into the trained copywriting scoring model for evaluation to obtain the scoring results of the recommendation information in each dimension", may be implemented as the following steps:
Step S1021, input the recommendation information and the object information as an evaluation pair into the trained thematic network model to obtain the thematic scoring result of the recommendation information.
Step S1022, input the recommendation information into the search model to obtain the compliance scoring result of the recommendation information.
Step S1023, input the recommendation information into the trained attractiveness network model to obtain the attractiveness scoring result of the recommendation information.
Step S1024, input the recommendation information into the trained fluency network model to obtain the fluency scoring result of the recommendation information.
After the recommendation information and the object information of the object to be recommended are obtained in step S101, the recommendation information and the object information are input as an evaluation pair into the trained thematic network model to obtain the thematic score of the recommendation information. The recommendation information is also input into the search model, the trained attractiveness network model, and the trained fluency network model to obtain its compliance score, attractiveness score, and naturalness score, respectively, which together give the scoring results in each dimension.
When implemented, step S103 above, "determining the evaluation result of the recommendation information based on the scoring results of the recommendation information in each dimension", includes at least the following two implementations:
In the first implementation, the evaluation result of the recommendation information is determined based on the dimensions taken as a whole. In this case, step S103 above may be implemented as the following steps:
Step S103a1, determine the overall scoring result of the recommendation information according to the thematic scoring result, the compliance scoring result, the attractiveness scoring result, and the fluency scoring result.
Step S103a2, determine whether the overall scoring result of the recommendation information is greater than a first preset threshold. When it is greater than the first preset threshold, the recommendation information meets the delivery requirements, and the process proceeds to step S103a3. When it is less than or equal to the first preset threshold, the topic of the recommendation information may not match the topic of the object information, the recommendation information may contain sensitive words, it may convey too little information, or its sentences may be disfluent or contain typos; in this case the recommendation information is determined not to meet the delivery requirements, and the process proceeds to step S103a4.
Step S103a3, determine that the evaluation result of the recommendation information is a pass.
Step S103a4, determine that the evaluation result of the recommendation information is a fail.
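Steps S103a1 to S103a4 can be sketched as below. The text does not state how the four scoring results are combined into one overall score, so the weighted mean used here is purely an assumption for illustration, as are the function name `evaluate_overall`, the premise that every score lies in [0, 1] with higher meaning better, and the sample values.

```python
def evaluate_overall(scores, weights, first_threshold):
    """Combine per-dimension scores into one score and compare it with the threshold.

    The weighted mean is an assumed combination rule, not taken from the source.
    Returns the overall score and whether the evaluation passes.
    """
    total_weight = sum(weights[name] for name in scores)
    overall = sum(scores[name] * weights[name] for name in scores) / total_weight
    # S103a2: strictly greater than the first preset threshold means a pass.
    return overall, overall > first_threshold


# Hypothetical normalized scores for one advertisement copy.
scores = {"thematic": 0.9, "compliance": 1.0, "attractiveness": 0.6, "fluency": 0.7}
weights = {"thematic": 1.0, "compliance": 1.0, "attractiveness": 1.0, "fluency": 1.0}

overall, passed = evaluate_overall(scores, weights, first_threshold=0.75)
print(overall, passed)
```

A score exactly equal to the threshold is treated as a fail, following the "less than or equal to" branch of step S103a2.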
In the second implementation, the evaluation result of the recommendation information is determined based on the stability across the dimensions. In this case, step S103 above, "determining the evaluation result of the recommendation information based on the scoring results of the recommendation information in each dimension", may be implemented as the following steps:
Step S103b1, calculate the variance of the thematic scoring result, the compliance scoring result, the attractiveness scoring result, and the naturalness scoring result.
Step S103b2, determine whether the variance is less than a second preset threshold. When the variance is less than the second preset threshold, the scoring results of the recommendation information are relatively even across the dimensions, and the process proceeds to step S103b3. When the variance is greater than or equal to the second preset threshold, the scoring result in some dimension is too low or too high; in this case the recommendation information is determined not to meet the delivery requirements, and the process proceeds to step S103b5.
Step S103b3, determine whether at least one of the thematic scoring result, the compliance scoring result, the attractiveness scoring result, and the fluency scoring result is greater than a third preset threshold. If at least one scoring result is greater than the third preset threshold, the recommendation information meets the delivery requirements, and the process proceeds to step S103b4. If none is, that is, all four scoring results are less than or equal to the third preset threshold, then the scores are even across the dimensions but every one of them is low; in this case the recommendation information is considered not to meet the delivery requirements, and the process proceeds to step S103b5.
Step S103b4, determine that the evaluation result of the recommendation information is a pass.
Step S103b5, determine that the evaluation result of the recommendation information is a fail.
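The second implementation (steps S103b1 to S103b5) maps directly onto a few lines of Python. The sketch assumes the population variance, since the text does not say which variance is meant, and assumes that all four scores share a common scale; the function name `evaluate_stability` and the threshold values are illustrative.

```python
from statistics import pvariance


def evaluate_stability(scores, second_threshold, third_threshold):
    """Return True (pass) or False (fail) following steps S103b1 to S103b5."""
    values = list(scores.values())
    # S103b1/S103b2: the spread across dimensions must stay below the second threshold.
    if pvariance(values) >= second_threshold:
        return False
    # S103b3: at least one dimension must exceed the third threshold.
    return any(value > third_threshold for value in values)


# Hypothetical score sets: even and high, uneven, even but low.
balanced = {"thematic": 0.80, "compliance": 0.85, "attractiveness": 0.90, "fluency": 0.80}
uneven = {"thematic": 1.00, "compliance": 0.10, "attractiveness": 0.90, "fluency": 0.90}
uniformly_low = {"thematic": 0.20, "compliance": 0.20, "attractiveness": 0.20, "fluency": 0.20}

for name, case in [("balanced", balanced), ("uneven", uneven), ("low", uniformly_low)]:
    print(name, evaluate_stability(case, second_threshold=0.01, third_threshold=0.7))
```

The third case shows why step S103b3 is needed: scores that are perfectly even but all low pass the variance check and are rejected only by the third threshold.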
In some embodiments, when the evaluation result of the recommendation information is determined to be a fail in step S103a4 or step S103b5, the method may further include:
Step S104, adjust the recommendation information based on at least one of the thematic scoring result, the compliance scoring result, the attractiveness scoring result, and the fluency scoring result. Recommendation information that fails the evaluation is adjusted and optimized so that the evaluation result of the adjusted recommendation information is a pass.
In some embodiments, the method may further include:
Step S105, send the evaluation result to the recommendation information delivery platform, so that the platform delivers the recommendation information whose evaluation result is a pass. The evaluation device for the recommendation information notifies the delivery platform which objects' recommendation information can be delivered directly. For recommendation information that has been adjusted, the adjusted recommendation information also needs to be sent to the delivery platform so that the platform can deliver it.
In some embodiments, when the evaluation device for the recommendation information sends the evaluation result to the recommendation information delivery platform, the platform may also send prompt information, so that the users who are recommending the objects to be recommended learn which recommendation information cannot be delivered normally.
An embodiment of the present application further provides a method for evaluating recommendation information. FIG. 4 is a schematic flowchart of another implementation of the method for evaluating recommendation information provided by an embodiment of the present application. As shown in FIG. 4, the method includes the following steps:
Step S401, obtain the thematic sample set, the sensitive word set, the attractiveness sample set, and the fluency sample set.
Step S402, obtain the sample object information and the sample recommendation information of each sample object in the thematic sample set.
Step S403, take the sample object information and the sample recommendation information of the same sample object as a sample pair, and obtain the annotation information of the sample pair. Here, the annotation information represents the probability that the sample object information in the sample pair matches the sample recommendation information.
Step S404, input the sample pairs corresponding to the sample objects in the thematic sample set, together with the annotation information of each sample pair, into the preset thematic network model for training, to obtain the trained thematic network model.
Step S405, construct a trie from the sensitive words in the sensitive word set.
Step S406, add a query-failure pointer to each node of the trie to obtain the trie-based search model.
Step S407, obtain the sample recommendation information of each sample object in the attractiveness sample set.
Step S408, perform information extraction on the sample recommendation information of each sample object to obtain the feature information set of each sample object.
Here, the feature information set includes at least one of the name, category, discount, and attribute words of the sample object.
Step S409, input the feature information set of each sample object into the preset attractiveness network model to obtain the trained attractiveness network model.
Step S410, obtain the sample recommendation information of each sample object in the fluency sample set.
Step S411, perform word segmentation on the sample recommendation information of each sample object to obtain the segmented words of each piece of sample recommendation information.
Step S412, input the segmented words of each piece of sample recommendation information into the preset fluency network model to obtain the trained fluency network model.
Step S413, construct the trained copywriting scoring model based on the trained thematic network model, the trained attractiveness network model, the trained fluency network model, and the search model.
In some embodiments, steps S401 to S413 may also be performed after step S414.
Step S414: obtain the recommendation information of the object to be recommended, and the object information of that object, from the recommendation information delivery platform.
Step S415: input the recommendation information and the object information, as one evaluation pair, into the trained thematic relevance network model, to obtain the thematic relevance scoring result of the recommendation information.
Step S416: input the recommendation information into the lookup model, to obtain the compliance scoring result of the recommendation information.
Step S417: input the recommendation information into the trained attractiveness network model, to obtain the attractiveness scoring result of the recommendation information.
Step S418: input the recommendation information into the trained fluency network model, to obtain the fluency scoring result of the recommendation information.
Step S419: determine the scoring result of the recommendation information from the thematic relevance scoring result, the compliance scoring result, the attractiveness scoring result, and the fluency scoring result.
Step S420: determine whether the scoring result of the recommendation information is greater than a first preset threshold.
When the scoring result is greater than the first preset threshold, the recommendation information meets the delivery requirements, and the process proceeds to step S421. When the scoring result is less than or equal to the first preset threshold, the recommendation information does not meet the delivery requirements, and the process proceeds to step S422.
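The text does not state how step S419 combines the four per-dimension results into a single scoring result. The following is a minimal sketch that assumes a weighted average; the function names, the default equal weights, and the threshold values are all hypothetical and are not taken from the application.

```python
def overall_score(scores, weights=None):
    """Combine per-dimension scores into one value (assumed: weighted average)."""
    if weights is None:
        # Assumption: every dimension counts equally unless weights are given.
        weights = {name: 1.0 for name in scores}
    total_weight = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total_weight


def evaluate(scores, first_threshold):
    """Steps S420 to S422: pass only if the overall score exceeds the threshold."""
    return "pass" if overall_score(scores) > first_threshold else "fail"


# Hypothetical scores for one piece of recommendation information.
example_scores = {
    "thematic_relevance": 0.8,
    "compliance": 1.0,
    "attractiveness": 0.7,
    "fluency": 0.9,
}
result = evaluate(example_scores, first_threshold=0.6)
```

A different aggregation, such as a product or a minimum, would make a single failing dimension (for example a compliance score of 0) decisive; the weighted average is used here only for illustration.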
In some embodiments, steps S419 to S420 may be replaced by steps S419' to S421':
Step S419': calculate the variance of the thematic relevance scoring result, the compliance scoring result, the attractiveness scoring result, and the naturalness (fluency) scoring result.
Step S420': determine whether the variance is less than a second preset threshold. When the variance is less than the second preset threshold, the scoring results of the recommendation information are relatively even across dimensions, and the process proceeds to step S421'. When the variance is greater than or equal to the second preset threshold, the scoring result in some dimension is too low or too high; the recommendation information is then determined not to meet the delivery requirements, and the process proceeds to step S422.
Step S421': determine whether at least one of the thematic relevance scoring result, the compliance scoring result, the attractiveness scoring result, and the fluency scoring result is greater than a third preset threshold. When at least one of them is greater than the third preset threshold, the recommendation information meets the delivery requirements, and the process proceeds to step S421. When all of them are less than or equal to the third preset threshold, the recommendation information does not meet the delivery requirements, and the process proceeds to step S422.
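The variance-based alternative (steps S419' to S421') can be sketched as follows. This is an illustration only: the function name and threshold values are hypothetical, and the use of the population variance (rather than the sample variance) is an assumption, since the text does not say which is meant.

```python
from statistics import pvariance


def passes_variance_check(scores, second_threshold, third_threshold):
    """Return True if the scores are even enough and at least one is high enough."""
    values = list(scores.values())
    # Step S419': variance of the four per-dimension scores (assumed: population variance).
    variance = pvariance(values)
    # Step S420': an uneven score profile fails immediately.
    if variance >= second_threshold:
        return False
    # Step S421': at least one dimension must exceed the third threshold.
    return any(value > third_threshold for value in values)


balanced = {"thematic_relevance": 0.8, "compliance": 1.0,
            "attractiveness": 0.7, "fluency": 0.9}
lopsided = {"thematic_relevance": 0.9, "compliance": 0.0,
            "attractiveness": 0.8, "fluency": 0.9}
```

With a second threshold of 0.05 and a third threshold of 0.75, `balanced` passes (variance 0.0125) while `lopsided` fails (variance 0.1425), because its compliance score is far below the other three.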
Step S421: determine that the evaluation result of the recommendation information is "evaluation passed".
The process then proceeds to step S424, where the recommendation information that passed the evaluation is delivered.
Step S422: determine that the evaluation result of the recommendation information is "evaluation failed".
Step S423: adjust the recommendation information based on at least one of the thematic relevance scoring result, the compliance scoring result, the attractiveness scoring result, and the naturalness (fluency) scoring result. The theme, sensitive words, amount of information, or sentences of the recommendation information are adjusted so that the adjusted recommendation information passes the evaluation and therefore meets the delivery conditions; the process then proceeds to step S424.
Step S424: send the evaluation result to the recommendation information delivery platform, which delivers the recommendation information whose evaluation result is "evaluation passed".
In the method for evaluating recommendation information provided by the embodiments of the present application, the recommendation information of the object to be recommended and the object information of that object are obtained from the recommendation information delivery platform; the recommendation information and the object information are input into the trained copy scoring model for evaluation, to obtain the scoring result of the recommendation information in each dimension, the dimensions being thematic relevance, compliance, attractiveness, and fluency; and the evaluation result of the recommendation information is determined from its scoring results in each dimension. In this way, the recommendation information is evaluated objectively and quantitatively along multiple dimensions, which can improve evaluation efficiency and accuracy, shorten evaluation time, and reduce evaluation cost and risk.
An exemplary application of the embodiments of the present application in a practical scenario is described below.
An embodiment of the present application proposes a quantitative evaluation model for the creative copy of e-commerce advertisements: ICAN.
ICAN jointly considers four dimensions of advertising creative copy, namely compliance (C, Compliance), attractiveness (A, Appeal), thematic relevance (I, Integrated), and naturalness (N, Natural), and scores the copy quantitatively along all of them.
FIG. 5 is a schematic diagram of the principle of the method for evaluating advertising creative copy provided by an embodiment of the present application. As shown in FIG. 5, the ICAN advertising creative evaluation model proposed in this embodiment mainly consists of four scoring sub-models: a copy compliance scoring model, a copy attractiveness scoring model, a copy thematic relevance scoring model, and a copy naturalness scoring model. Each part is described in detail below.
Part 1, copy thematic relevance scoring model:
The thematic relevance of advertising copy measures whether the copy is consistent with the product it advertises. For example, if an advertisement reads "a teapot that Dad will like" but the user, after clicking through, sees products such as electronics or tea leaves, the advertisement not only fails to promote the product but also harms the user experience.
This embodiment uses the sentence-pair matching task of the BERT model to score thematic relevance. FIG. 6 is a schematic diagram of training the thematic relevance scoring model for advertising copy provided by an embodiment of the present application. As shown in FIG. 6, the training process is as follows:
1) From existing products and their advertising copy, manually label some positive and negative samples to form a sample set S. A positive sample is one in which the advertising copy is semantically related to the description of the product it promotes; a negative sample is one in which the two are semantically unrelated.
2) Use the product descriptions and the advertising creative copy in S as the input of the BERT sentence-pair matching task, use whether the two are related as the label, and perform transfer learning.
3) After training on S for K epochs (K = 10 in this example), a BERT-based classifier is obtained, denoted F.
4) Pair every existing piece of advertising copy with the description of the product it promotes, and input each sentence pair into model F to generate the thematic relevance score of each piece of copy.
Suppose the description of the product promoted by advertising copy D_i is A_i (that is, the object information is A_i). After the sentence pair (D_i, A_i) is input into model F, F outputs the probability r_i that the two are related, and r_i is used as the thematic relevance score of copy D_i.
Part 2, copy compliance scoring model:
As the carrier of advertising content, an advertising creative should convey that content in a healthy form and guide consumers toward sound values. Compliance is therefore an important and indispensable dimension when evaluating advertising creative copy. The embodiments of the present application may use the Aho-Corasick algorithm to score the compliance of copy. Because creative copy containing sensitive words must be strictly screened out, each piece of copy scores either 1 or 0: 1 means that no sensitive word violating laws and regulations was found, so the copy is compliant; 0 means that the copy contains a sensitive word, so it is not compliant. Aho-Corasick is a classic multi-pattern string matching algorithm that is widely used when the text is long and there are many target strings, which makes it suitable for compliance checking of advertising creative copy. Going from building the sensitive word automaton to detecting sensitive words in copy involves three steps: constructing a trie (prefix tree) of the sensitive words, adding failure (mismatch) pointers to obtain the AC automaton, and running pattern matching to return the matched sensitive words.
(1) The trie is constructed as follows:
1) Obtain all the text data and split it into lines.
2) Read in a line of data, compare the current character with the child nodes of the current node, and look for a matching child.
3) If a matching child is found, make it the current node, remove this character from the data, and continue with step 2).
4) If no matching child is found, create a new node, insert it as a child of the current node, make the new node the current node, and continue with step 2).
5) The operation terminates when all characters of the data have been removed and compared.
(2) The AC automaton is constructed as follows:
1) Point the fail pointer of every child of the root node to the root node, then enqueue all children of the root node in order.
2) If the queue is not empty:
2.1) Dequeue a node and denote it curr; let failTo denote the node that curr's fail pointer points to, that is, failTo = curr.fail.
2.2) a. Check whether curr.child[i] == failTo.child[i] holds, that is, whether failTo has a child for the same character as curr.child[i]. If it does, set curr.child[i].fail = failTo.child[i]. If it does not, check whether failTo == null: if so, set curr.child[i].fail = root; otherwise set failTo = failTo.fail and repeat 2.2). b. Enqueue curr.child[i] and perform step 2) again.
3) If the queue is empty, end.
(3) Pattern matching with the AC automaton runs as follows:
1) Point the current-node pointer at the root node of the AC automaton, that is, curr = root.
2) Read the (next) character from the text string.
3) Among the children of the current node, look for one that matches the character. If one is found: check whether the current node, or the node that its fail pointer points to, marks the end of a string; if so, record the start index within the text string in the result set of the corresponding string (start index = current index - string length + 1). Then point curr to the matching child and continue with step 2). If none is found, go to step 4).
4) If fail == null (no target string has a prefix that continues the input here, which is equivalent to restarting the state machine), set curr = root and go to step 2); otherwise, point the current-node pointer to the fail node and go to step 3).
Suppose the existing sensitive word set is black_words = {高h, 高仿 (high-quality imitation), 高利贷 (usury), 仿真枪 (replica gun), 真人游戏 (live-action game)}, and the two creatives are creative1 = "还在买高仿?戳这里,大牌包包只要一块钱!" ("Still buying knock-offs? Click here, designer bags for just one yuan!") and creative2 = "优质仿真盆景,量大从优。抗紫外线,抗风压。" ("High-quality artificial bonsai, discounts for large orders. UV-resistant, wind-resistant."). FIG. 7 is a schematic diagram of training the compliance model for advertising copy provided by an embodiment of the present application; it shows the whole process of scoring the compliance of these two creatives.
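The three stages described above (trie, failure pointers, matching) can be sketched in Python as follows. This is an illustrative implementation of the standard Aho-Corasick algorithm applied to the example above, not the applicant's code; the class and function names are hypothetical. It stores at each node the words that end there, including those reachable through failure pointers, which is a common simplification of the "check the fail node" step.

```python
from collections import deque


class Node:
    def __init__(self):
        self.children = {}  # character -> child Node
        self.fail = None    # failure pointer (None only for the root)
        self.words = []     # sensitive words ending at this node or along its fail chain


def build_automaton(words):
    root = Node()
    # Stage 1: insert every sensitive word into the trie.
    for word in words:
        node = root
        for ch in word:
            node = node.children.setdefault(ch, Node())
        node.words.append(word)
    # Stage 2: add failure pointers in breadth-first order.
    queue = deque()
    for child in root.children.values():
        child.fail = root
        queue.append(child)
    while queue:
        curr = queue.popleft()
        for ch, child in curr.children.items():
            fail_to = curr.fail
            # Follow fail pointers until a node with a matching child is found.
            while fail_to is not None and ch not in fail_to.children:
                fail_to = fail_to.fail
            child.fail = fail_to.children[ch] if fail_to is not None else root
            child.words.extend(child.fail.words)
            queue.append(child)
    return root


def find_sensitive_words(root, text):
    """Stage 3: return (start_index, word) for every sensitive word found in text."""
    hits = []
    curr = root
    for index, ch in enumerate(text):
        while curr is not root and ch not in curr.children:
            curr = curr.fail
        curr = curr.children.get(ch, root)
        for word in curr.words:
            hits.append((index - len(word) + 1, word))
    return hits


def compliance_score(root, text):
    """1 if no sensitive word is found, 0 otherwise."""
    return 0 if find_sensitive_words(root, text) else 1


black_words = ["高h", "高仿", "高利贷", "仿真枪", "真人游戏"]
automaton = build_automaton(black_words)
creative1 = "还在买高仿?戳这里,大牌包包只要一块钱!"
creative2 = "优质仿真盆景,量大从优。抗紫外线,抗风压。"
```

Here `compliance_score(automaton, creative1)` is 0 because "高仿" occurs at index 3, while `compliance_score(automaton, creative2)` is 1: "仿真" is only a prefix of the sensitive word "仿真枪", so it does not count as a match.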
Part 3, copy attractiveness scoring model:
Advertising copy influences users by conveying information to them, and copy that conveys more information is clearly more attractive. For example, "a good thing that every senior likes" carries far less information than "a teapot that Dad will like". In information theory, the quantitative measure of the amount of information is called "information entropy". After seeing the copy, the user has received new information, which increases the information entropy of the user's cognition (that is, a previously unclear understanding becomes clear). For example, a user who did not know that the teapot was on sale learns this from the advertisement "a teapot that Dad will like, 25% off orders over 100". By contrast, the advertisement "a good thing that every senior likes, I won't tell just anyone" gives the user very little information. Users always want to see informative advertisements, which in terms of copy means copy with clear concepts. The embodiments of the present application therefore use the information entropy of the copy to evaluate its attractiveness. For example, the following "concepts" may be defined: product words, that is, the category of the product, such as "mobile phone" or "fresh food"; benefit points, that is, promotions of the product, such as "spend-and-save", "free gift", or "discount"; and attribute words, such as "red", "solid wood", "imported", "socialite", or "summer". Suppose
Figure PCTCN2021130006-appb-000004
In that case E_i is relatively larger: a user who sees such copy will probably not understand what it is about and will ignore it. Conversely, if the concepts in D_i are clear, E_i is relatively smaller, and the user is more likely to be attracted by the copy.
FIG. 8 is a schematic diagram of training the attractiveness model for advertising copy provided by an embodiment of the present application. As shown in FIG. 8, the attractiveness model is trained as follows:
1) Extract all products from the product library of the e-commerce website, including their names, categories/brands, attribute words, and related promotions.
2) Use the name as the text, and the category/brand, attribute words, promotion words, and so on as labels, to generate a training text set.
3) Use this training text set to train a multi-label classifier, denoted model M; for example, the Magpie model may be used as M.
Finally, model M can be used to predict the probability that a piece of copy belongs to each concept. When there is a new piece of copy D_i, it is input into the model to obtain the probability distribution P_i, and the information entropy E_i computed from P_i is the attractiveness score of the copy.
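The formula for E_i is contained in an image that is not reproduced in this text. As a sketch, assuming the standard Shannon entropy E = -Σ p_j log2(p_j) over the concept probabilities, the calculation could look like the following. The function name is hypothetical, and the normalization step is an assumption: a multi-label classifier outputs one probability per label, and these need not sum to 1.

```python
import math


def information_entropy(probabilities):
    """Shannon entropy (in bits) of concept probabilities, normalized to sum to 1."""
    total = sum(probabilities)
    if total <= 0:
        raise ValueError("at least one probability must be positive")
    entropy = 0.0
    for p in probabilities:
        q = p / total  # assumption: renormalize per-label probabilities
        if q > 0:      # a zero-probability concept contributes nothing
            entropy -= q * math.log2(q)
    return entropy


# Hypothetical model outputs over four concepts.
clear_copy = [0.97, 0.01, 0.01, 0.01]    # one dominant concept
vague_copy = [0.25, 0.25, 0.25, 0.25]    # no concept stands out
```

Under this definition the vague distribution has the larger entropy (2 bits for four equally likely concepts) and the sharply peaked one has the smaller, which matches the statement that E_i is smaller when the concepts in D_i are clear.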
Part 4, copy naturalness scoring model:
Good copy must be fluent and natural, so that it reads smoothly and is concise and clear to its audience. Therefore, when quantitatively evaluating advertising creative copy, the embodiments of the present application introduce a fluency and naturalness score (corresponding to the fluency mentioned above) as one dimension of the evaluation. Perplexity (ppl) is used to quantify how fluent and natural the copy is. The lower the perplexity of the copy, the more fluent and natural it is; conversely, a high perplexity indicates that the copy is not fluent. For a sentence s = (w_1, w_2, ..., w_n), where w_i denotes the i-th word in s and n is the number of words, the perplexity is calculated as shown in formula (3):
Figure PCTCN2021130006-appb-000005
An N-gram language model may be used in the embodiments of the present application, with N taking the values 2, 3, and 4. The perplexities computed by the different N-gram models are combined by weighted summation to obtain the fluency and naturalness f of the copy, as shown in formula (4):
Figure PCTCN2021130006-appb-000006
Here, α_i is the weight of the perplexity for each value of N.
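Formulas (3) and (4) appear only as images in this text. The sketch below assumes the standard definition of perplexity, the inverse geometric mean of the per-word probabilities assigned by the language model, and a plain weighted sum over N = 2, 3, 4 as described above. The function names, the per-word probabilities, and the weights are hypothetical.

```python
import math


def perplexity(word_probabilities):
    """Perplexity of a sentence from the conditional probability of each word.

    Computed in log space: exp(-(1/n) * sum(log p_i)).
    """
    n = len(word_probabilities)
    log_sum = sum(math.log(p) for p in word_probabilities)
    return math.exp(-log_sum / n)


def fluency(perplexities, weights):
    """Weighted sum of the perplexities from the different N-gram models."""
    return sum(weights[n] * perplexities[n] for n in perplexities)


# Hypothetical perplexities from 2-gram, 3-gram and 4-gram models,
# with hypothetical weights alpha for each N.
ppl_by_n = {2: 2.0, 3: 4.0, 4: 8.0}
alpha = {2: 0.5, 3: 0.25, 4: 0.25}
f = fluency(ppl_by_n, alpha)
```

For instance, a sentence whose every word has probability 0.5 has a perplexity of 2, and a sentence the model predicts with certainty has a perplexity of 1. Working in log space avoids underflow when many small probabilities are multiplied.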
FIG. 9 is a schematic diagram of training the naturalness model for advertising copy provided by an embodiment of the present application. The copy to be checked is input into the trained copy naturalness scoring model shown on the right of FIG. 9 to obtain its fluency value, which is then normalized to give the fluency score.
Part 5, quantitative scoring of advertising copy:
With the four sub-models above, the present application obtains a score for copy D_i in each dimension: the thematic relevance score (denoted r), the compliance score (denoted c), the attractiveness score (denoted a), and the fluency score (denoted f). These four scores give an intuitive picture of the quality of each piece of copy. FIG. 10 is a schematic diagram of the scores of three pieces of copy in each dimension under the ICAN model, with an explanation given for each low-scoring dimension: for non-compliant copy the sensitive words are given, for copy that is not fluent the offending fragments are given, and for creatives with a category mismatch the mismatched category name is given. In the embodiments of the present application, the scores of each creative in each dimension under the ICAN model can be compared using the radar chart in FIG. 11. In subsequent decisions such as creative screening, low-scoring copy can be improved or removed. The embodiments of the present application quantitatively evaluate the copy quality of e-commerce advertisements along four dimensions: thematic relevance, compliance, attractiveness, and naturalness. The ICAN model can filter out advertising copy that scores clearly too low in a single dimension, or that scores poorly overall across several dimensions, so that this copy can later be tuned or removed outright. Introducing information entropy makes it possible to quantify the attractiveness of e-commerce advertising copy; introducing BERT and sentence-pair matching makes it possible to quantify its thematic relevance; and introducing the four dimensions of thematic relevance, compliance, attractiveness, and naturalness makes a comprehensive quantitative evaluation of e-commerce advertising copy possible.
Based on the foregoing embodiments, an embodiment of the present application provides an apparatus for evaluating recommendation information. The modules of the apparatus, and the units within each module, may be implemented by a processor in a computer device, or alternatively by logic circuits. In an implementation, the processor may be a central processing unit (CPU), a microprocessor unit (MPU), a digital signal processor (DSP), a field programmable gate array (FPGA), or the like.
An embodiment of the present application further provides an apparatus for evaluating recommendation information. FIG. 12 is a schematic structural diagram of the apparatus. As shown in FIG. 12, the apparatus 120 for evaluating recommendation information includes: a first acquisition module 121, configured to obtain the recommendation information of an object to be recommended and the object information of that object from the recommendation information delivery platform; an evaluation module 122, configured to input the recommendation information and the object information into the trained copy scoring model for evaluation, to obtain the scoring result of the recommendation information in each dimension, the dimensions including thematic relevance, compliance, attractiveness, and fluency; and a determination module 123, configured to determine the evaluation result of the recommendation information based on its scoring results in each dimension.
In some embodiments, the apparatus 120 may further include: a second acquisition module, configured to obtain a thematic relevance sample set, a sensitive word set, an attractiveness sample set, and a fluency sample set; a training module, configured to input the thematic relevance sample set, the attractiveness sample set, and the fluency sample set into a preset thematic relevance network model, a preset attractiveness network model, and a preset fluency network model respectively, to obtain a trained thematic relevance network model, a trained attractiveness network model, and a trained fluency network model; a construction module, configured to construct a trie-based lookup model from the sensitive word set; and a building module, configured to build the trained copy scoring model from the trained thematic relevance network model, the trained attractiveness network model, the trained fluency network model, and the lookup model.
在一些实施例中,所述训练模块,还配置为:获取所述主题性样本集中各样本对象的样本对象信息和样本推荐信息;将同一个样本对象的样本对象信息和样本推荐信息作为一组样本对,获取所述样本对的标注信息,所述标注信息表征所述样本对中的样本对象信息与样本推荐信息相匹配的概率;将所述主题性样本集中各样本对象对应的各样本对和各样本对的标注信息输入至预设主题性网络模型进行训练学习,得到训练好的主题性网络模型。In some embodiments, the training module is further configured to: obtain sample object information and sample recommendation information of each sample object in the thematic sample set; take the sample object information and sample recommendation information of the same sample object as a group sample pair, obtain the label information of the sample pair, the label information represents the probability that the sample object information in the sample pair matches the sample recommendation information; each sample pair corresponding to each sample object in the thematic sample set is The annotation information of each sample pair is input into a preset thematic network model for training and learning, and a trained thematic network model is obtained.
在一些实施例中,所述构造模块,还配置为:根据所述敏感词集中各敏感词,构造字典树;对所述字典树中各节点添加查询失败指针,得到基于字典树的查找模型。In some embodiments, the construction module is further configured to: construct a dictionary tree according to each sensitive word in the sensitive word set; add a query failure pointer to each node in the dictionary tree to obtain a lookup model based on the dictionary tree.
在一些实施例中,所述训练模块,还配置为:获取所述吸引力样本集中各样本对象的样本推荐信息;对所述各样本对象的样本推荐信息进行信息提取,得到各样本对象的特征信息集,所述特征信息集包括所述样本对象的名称、品类、优惠和属性词中至少一个;将所述各样本对象的特征信息集输入至预设吸引力网络模型,得到训练好的吸引力网络模型。In some embodiments, the training module is further configured to: obtain sample recommendation information of each sample object in the attractive sample set; perform information extraction on the sample recommendation information of each sample object to obtain the characteristics of each sample object information set, the feature information set includes at least one of the name, category, discount and attribute word of the sample object; input the feature information set of each sample object into the preset attractiveness network model to obtain the trained attraction force network model.
In some embodiments, the training module is further configured to: obtain sample recommendation information for each sample object in the fluency sample set; perform word segmentation on the sample recommendation information of each sample object to obtain the segmented words of each piece of sample recommendation information; and input the segmented words of each piece of sample recommendation information into a preset fluency network model, to obtain a trained fluency network model.
In some embodiments, the evaluation module is further configured to: input the recommendation information and the object information, as one evaluation pair, into the trained topicality network model to obtain a topicality scoring result of the recommendation information; input the recommendation information into the lookup model to obtain a compliance scoring result of the recommendation information; input the recommendation information into the trained attractiveness network model to obtain an attractiveness scoring result of the recommendation information; and input the recommendation information into the trained fluency network model to obtain a fluency scoring result of the recommendation information.
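The four-way scoring described above can be sketched as a single function that calls each sub-model in turn. This is only an illustrative sketch: the function name `score_all_dimensions` is hypothetical, and each sub-model is assumed to be a plain callable returning a float, which the text does not specify.

```python
from typing import Callable, Dict


def score_all_dimensions(
    recommendation: str,
    object_info: str,
    topicality_model: Callable[[str, str], float],
    lookup_model: Callable[[str], float],
    attractiveness_model: Callable[[str], float],
    fluency_model: Callable[[str], float],
) -> Dict[str, float]:
    """Return one score per dimension for a piece of recommendation information."""
    return {
        # Only topicality looks at the (recommendation, object) evaluation pair.
        "topicality": topicality_model(recommendation, object_info),
        # The other three dimensions use the recommendation text alone.
        "compliance": lookup_model(recommendation),
        "attractiveness": attractiveness_model(recommendation),
        "fluency": fluency_model(recommendation),
    }
```

Keeping the sub-models behind plain callables means the trie-based lookup model and the three trained network models can be swapped independently.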
In some embodiments, the determination module is further configured to: determine a scoring result of the recommendation information according to the topicality scoring result, the compliance scoring result, the attractiveness scoring result, and the fluency scoring result; when the scoring result of the recommendation information is greater than a first preset threshold, determine that the evaluation result of the recommendation information is a pass; and when the scoring result of the recommendation information is less than or equal to the first preset threshold, determine that the evaluation result of the recommendation information is a fail.
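The first-threshold rule can be sketched as follows. The paragraph does not state how the four dimension scores are combined into one scoring result, so the weighted mean used here is an assumption, and the names `overall_score` and `evaluate_by_threshold` are hypothetical.

```python
def overall_score(scores, weights=None):
    """Combine per-dimension scores; a weighted mean is assumed here."""
    if weights is None:
        weights = {name: 1.0 for name in scores}
    total_weight = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total_weight


def evaluate_by_threshold(scores, first_threshold, weights=None):
    """Pass only when the combined score is strictly greater than the threshold."""
    return "pass" if overall_score(scores, weights) > first_threshold else "fail"


scores = {"topicality": 0.5, "compliance": 1.0, "attractiveness": 0.75, "fluency": 0.75}
print(evaluate_by_threshold(scores, first_threshold=0.7))
```

Note that a combined score exactly equal to the first preset threshold counts as a fail, matching the "less than or equal to" branch of the text.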
In some embodiments, the determination module is further configured to: calculate the variance of the topicality scoring result, the compliance scoring result, the attractiveness scoring result, and the fluency scoring result; when the variance is less than a second preset threshold and at least one of the four scoring results is greater than a third preset threshold, determine that the evaluation result of the recommendation information is a pass; and when the variance is greater than or equal to the second preset threshold, or the four scoring results are all less than or equal to the third preset threshold, determine that the evaluation result of the recommendation information is a fail.
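The variance-based rule can be sketched as below. The text does not say whether the population or the sample variance is meant; the population variance is assumed here, and the function name `evaluate_by_variance` is hypothetical.

```python
from statistics import pvariance


def evaluate_by_variance(scores, second_threshold, third_threshold):
    """Pass when the scores are consistent and at least one of them is high."""
    values = list(scores.values())
    variance = pvariance(values)  # population variance (an assumption)
    consistent = variance < second_threshold
    any_high = any(v > third_threshold for v in values)
    return "pass" if consistent and any_high else "fail"


scores = {"topicality": 0.8, "compliance": 0.8, "attractiveness": 0.8, "fluency": 0.8}
print(evaluate_by_variance(scores, second_threshold=0.05, third_threshold=0.6))
```

The two branches in the text are logical complements of each other (by De Morgan's laws), so a single condition is enough to cover both the pass case and the fail case.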
In some embodiments, the apparatus 120 for evaluating recommendation information may further include: an adjustment module, configured to, when the evaluation result is a fail, adjust the recommendation information based on at least one of the topicality scoring result, the compliance scoring result, the attractiveness scoring result, and the fluency scoring result; and a sending module, configured to send the evaluation result to the recommendation information delivery platform, so that the recommendation information delivery platform delivers the recommendation information whose evaluation result is a pass.
It should be noted that the above description of the embodiments of the apparatus for evaluating recommendation information is similar to the description of the method above, and has the same beneficial effects as the method embodiments. For technical details not disclosed in the apparatus embodiments of the present application, those skilled in the art should refer to the description of the method embodiments of the present application.
It should be noted that, in the embodiments of the present application, if the above method for evaluating advertising copy is implemented in the form of a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium. Based on this understanding, the technical solutions of the embodiments of the present application, in essence or in the part that contributes to the prior art, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a magnetic disk, or an optical disc. Thus, the embodiments of the present application are not limited to any specific combination of hardware and software.
Correspondingly, an embodiment of the present application provides a computer-readable storage medium on which a computer program is stored; when executed by a processor, the computer program implements the steps of the method for evaluating recommendation information provided in the foregoing embodiments.
An embodiment of the present application provides a device for evaluating recommendation information. FIG. 13 is a schematic structural diagram of the device for evaluating recommendation information provided by the embodiment of the present application. From the exemplary structure of the device 130 shown in FIG. 13, other exemplary structures of the device 130 can be foreseen, so the structure described here should not be regarded as limiting. For example, some of the components described below may be omitted, or components not described below may be added to meet the special requirements of certain applications.
The device 130 for evaluating recommendation information shown in FIG. 13 includes: a processor 131, at least one communication bus 132, a user interface 133, at least one external communication interface 134, and a memory 135. The communication bus 132 is configured to implement connection and communication between these components. The user interface 133 may include a display screen, and the external communication interface 134 may include a standard wired interface and a wireless interface. The processor 131 is configured to execute the program of the method for evaluating recommendation information stored in the memory, so as to implement the steps of the method for evaluating recommendation information provided in the foregoing embodiments.
The above description of the device and storage medium embodiments is similar to the description of the method embodiments above, and has beneficial effects similar to those of the method embodiments. For technical details not disclosed in the device and storage medium embodiments of the present application, please refer to the description of the method embodiments of the present application.
It should be understood that references throughout the specification to "one embodiment" or "an embodiment" mean that a particular feature, structure, or characteristic related to the embodiment is included in at least one embodiment of the present application. Therefore, occurrences of "in one embodiment" or "in an embodiment" in various places throughout the specification do not necessarily refer to the same embodiment. Furthermore, these particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. It should also be understood that, in the various embodiments of the present application, the sequence numbers of the above processes do not imply an order of execution; the execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present application. The sequence numbers of the above embodiments of the present application are for description only and do not indicate the relative merits of the embodiments.
It should be noted that, herein, the terms "comprise", "include", and any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or apparatus that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitation, an element qualified by the phrase "including a ..." does not preclude the presence of additional identical elements in the process, method, article, or apparatus that includes the element.
In the several embodiments provided in the present application, it should be understood that the disclosed device and method may be implemented in other ways. The device embodiments described above are merely illustrative. For example, the division into units is only a division by logical function; in actual implementation there may be other divisions: multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the coupling, direct coupling, or communication connection between the components shown or discussed may be indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
The units described above as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present application may all be integrated into one processing unit, or each unit may serve separately as one unit, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware, or in the form of hardware plus software functional units.
Those of ordinary skill in the art will understand that all or some of the steps of the above method embodiments may be completed by hardware under the control of program instructions. The program may be stored in a computer-readable storage medium and, when executed, performs the steps of the above method embodiments. The storage medium includes various media that can store program code, such as a removable storage device, a ROM, a magnetic disk, or an optical disc.
Alternatively, if the above integrated unit of the present application is implemented in the form of a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium. Based on this understanding, the technical solutions of the embodiments of the present application, in essence or in the part that contributes to the prior art, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a device to execute all or part of the methods described in the embodiments of the present application. The storage medium includes various media that can store program code, such as a removable storage device, a ROM, a magnetic disk, or an optical disc.
The above are merely implementations of the present application, but the protection scope of the present application is not limited thereto. Any change or substitution that a person skilled in the art could readily conceive of within the technical scope disclosed in the present application shall fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (24)

  1. A method for evaluating recommendation information, the method comprising:
    obtaining, from a recommendation information delivery platform, recommendation information of an object to be recommended and object information of the object to be recommended;
    inputting the recommendation information and the object information into a trained copy scoring model for evaluation, to obtain scoring results of the recommendation information in each dimension, the dimensions comprising topicality, compliance, attractiveness, and fluency;
    determining an evaluation result of the recommendation information based on the scoring results of the recommendation information in each dimension.
  2. The method according to claim 1, wherein the method further comprises:
    obtaining a topicality sample set, a sensitive word set, an attractiveness sample set, and a fluency sample set;
    inputting the topicality sample set, the attractiveness sample set, and the fluency sample set into a preset topicality network model, a preset attractiveness network model, and a preset fluency network model, respectively, to obtain a trained topicality network model, a trained attractiveness network model, and a trained fluency network model;
    constructing a trie-based lookup model according to the sensitive word set;
    constructing the trained copy scoring model based on the trained topicality network model, the trained attractiveness network model, the trained fluency network model, and the lookup model.
  3. The method according to claim 2, wherein inputting the topicality sample set into the preset topicality network model to obtain the trained topicality network model comprises:
    obtaining sample object information and sample recommendation information of each sample object in the topicality sample set;
    taking the sample object information and the sample recommendation information of the same sample object as one sample pair, and obtaining annotation information of the sample pair, the annotation information representing the probability that the sample object information in the sample pair matches the sample recommendation information;
    inputting the sample pairs corresponding to the sample objects in the topicality sample set, together with the annotation information of each sample pair, into the preset topicality network model for training, to obtain the trained topicality network model.
  4. The method according to claim 2, wherein constructing the trie-based lookup model according to the sensitive word set comprises:
    constructing a trie from the sensitive words in the sensitive word set;
    adding a failure pointer to each node in the trie, to obtain the trie-based lookup model.
  5. The method according to claim 2, wherein inputting the attractiveness sample set into the preset attractiveness network model to obtain the trained attractiveness network model comprises:
    obtaining sample recommendation information of each sample object in the attractiveness sample set;
    performing information extraction on the sample recommendation information of each sample object to obtain a feature information set of each sample object, the feature information set comprising at least one of the name, category, discount, and attribute words of the sample object;
    inputting the feature information set of each sample object into the preset attractiveness network model, to obtain the trained attractiveness network model.
  6. The method according to claim 2, wherein inputting the fluency sample set into the preset fluency network model to obtain the trained fluency network model comprises:
    obtaining sample recommendation information of each sample object in the fluency sample set;
    performing word segmentation on the sample recommendation information of each sample object, to obtain the segmented words of each piece of sample recommendation information;
    inputting the segmented words of each piece of sample recommendation information into the preset fluency network model, to obtain the trained fluency network model.
  7. The method according to claim 2, wherein inputting the recommendation information and the object information into the trained copy scoring model for evaluation, to obtain the scoring results of the recommendation information in each dimension, comprises:
    inputting the recommendation information and the object information, as one evaluation pair, into the trained topicality network model, to obtain a topicality scoring result of the recommendation information;
    inputting the recommendation information into the lookup model, to obtain a compliance scoring result of the recommendation information;
    inputting the recommendation information into the trained attractiveness network model, to obtain an attractiveness scoring result of the recommendation information;
    inputting the recommendation information into the trained fluency network model, to obtain a fluency scoring result of the recommendation information.
  8. The method according to claim 7, wherein determining the evaluation result of the recommendation information based on the scoring results of the recommendation information in each dimension comprises:
    determining a scoring result of the recommendation information according to the topicality scoring result, the compliance scoring result, the attractiveness scoring result, and the fluency scoring result;
    when the scoring result of the recommendation information is greater than a first preset threshold, determining that the evaluation result of the recommendation information is a pass;
    when the scoring result of the recommendation information is less than or equal to the first preset threshold, determining that the evaluation result of the recommendation information is a fail.
  9. The method according to claim 7, wherein determining the evaluation result of the recommendation information based on the scoring results in each dimension comprises:
    calculating the variance of the topicality scoring result, the compliance scoring result, the attractiveness scoring result, and the fluency scoring result;
    when the variance is less than a second preset threshold, and at least one of the topicality scoring result, the compliance scoring result, the attractiveness scoring result, and the fluency scoring result is greater than a third preset threshold, determining that the evaluation result of the recommendation information is a pass;
    when the variance is greater than or equal to the second preset threshold, or the topicality scoring result, the compliance scoring result, the attractiveness scoring result, and the fluency scoring result are all less than or equal to the third preset threshold, determining that the evaluation result of the recommendation information is a fail.
  10. The method according to claim 8 or 9, wherein the method further comprises:
    when the evaluation result is a fail, adjusting the recommendation information based on at least one of the topicality scoring result, the compliance scoring result, the attractiveness scoring result, and the fluency scoring result.
  11. The method according to any one of claims 1 to 9, wherein the method further comprises:
    sending the evaluation result to the recommendation information delivery platform, so that the recommendation information delivery platform delivers the recommendation information whose evaluation result is a pass.
  12. An apparatus for evaluating recommendation information, the apparatus comprising:
    a first obtaining module, configured to obtain, from a recommendation information delivery platform, recommendation information of an object to be recommended and object information of the object to be recommended;
    an evaluation module, configured to input the recommendation information and the object information into a trained copy scoring model for evaluation, to obtain scoring results of the recommendation information in each dimension, the dimensions comprising topicality, compliance, attractiveness, and fluency;
    a determination module, configured to determine an evaluation result of the recommendation information based on the scoring results of the recommendation information in each dimension.
  13. The apparatus according to claim 12, wherein the apparatus further comprises:
    a second obtaining module, configured to obtain a topicality sample set, a sensitive word set, an attractiveness sample set, and a fluency sample set;
    a training module, configured to input the topicality sample set, the attractiveness sample set, and the fluency sample set into a preset topicality network model, a preset attractiveness network model, and a preset fluency network model, respectively, to obtain a trained topicality network model, a trained attractiveness network model, and a trained fluency network model;
    a construction module, configured to construct a trie-based lookup model according to the sensitive word set;
    a building module, configured to construct the trained copy scoring model based on the trained topicality network model, the trained attractiveness network model, the trained fluency network model, and the lookup model.
  14. The apparatus according to claim 13, wherein the training module is further configured to:
    obtain sample object information and sample recommendation information of each sample object in the topicality sample set;
    take the sample object information and the sample recommendation information of the same sample object as one sample pair, and obtain annotation information of the sample pair, the annotation information representing the probability that the sample object information in the sample pair matches the sample recommendation information;
    input the sample pairs corresponding to the sample objects in the topicality sample set, together with the annotation information of each sample pair, into the preset topicality network model for training, to obtain the trained topicality network model.
  15. The apparatus according to claim 13, wherein the construction module is further configured to:
    construct a trie from the sensitive words in the sensitive word set;
    add a failure pointer to each node in the trie, to obtain the trie-based lookup model.
  16. The apparatus according to claim 13, wherein the training module is further configured to:
    obtain sample recommendation information of each sample object in the attractiveness sample set;
    perform information extraction on the sample recommendation information of each sample object to obtain a feature information set of each sample object, the feature information set comprising at least one of the name, category, discount, and attribute words of the sample object;
    input the feature information set of each sample object into the preset attractiveness network model, to obtain the trained attractiveness network model.
  17. The apparatus according to claim 13, wherein the training module is further configured to:
    obtain sample recommendation information of each sample object in the fluency sample set;
    perform word segmentation on the sample recommendation information of each sample object, to obtain the segmented words of each piece of sample recommendation information;
    input the segmented words of each piece of sample recommendation information into the preset fluency network model, to obtain the trained fluency network model.
  18. The apparatus according to claim 13, wherein the evaluation module is further configured to:
    input the recommendation information and the object information, as one evaluation pair, into the trained topicality network model, to obtain a topicality scoring result of the recommendation information;
    input the recommendation information into the lookup model, to obtain a compliance scoring result of the recommendation information;
    input the recommendation information into the trained attractiveness network model, to obtain an attractiveness scoring result of the recommendation information;
    input the recommendation information into the trained fluency network model, to obtain a fluency scoring result of the recommendation information.
  19. The apparatus according to claim 18, wherein the determining module is further configured to:
    determine a score result of the recommendation information according to the thematic score result, the compliance score result, the attractiveness score result and the fluency score result;
    when the score result of the recommendation information is greater than a first preset threshold, determine that the evaluation result of the recommendation information is an evaluation pass;
    when the score result of the recommendation information is less than or equal to the first preset threshold, determine that the evaluation result of the recommendation information is an evaluation failure.
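The decision rule of claim 19 can be sketched as follows. The claim does not say how the four score results are combined into one score result, so the weighted average below is an assumption made purely for illustration; `overall_score`, `evaluation_passes`, `DIMENSIONS`, and the `weights` parameter are hypothetical names.

```python
from typing import Mapping, Optional

DIMENSIONS = ("thematic", "compliance", "attractiveness", "fluency")


def overall_score(
    scores: Mapping[str, float],
    weights: Optional[Mapping[str, float]] = None,
) -> float:
    """Combine the four score results into one (assumed: weighted average)."""
    if weights is None:
        # Assumption: equal weights when none are supplied.
        weights = {name: 1.0 for name in DIMENSIONS}
    total_weight = sum(weights[name] for name in DIMENSIONS)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sum = sum(scores[name] * weights[name] for name in DIMENSIONS)
    return weighted_sum / total_weight


def evaluation_passes(
    scores: Mapping[str, float],
    first_threshold: float,
    weights: Optional[Mapping[str, float]] = None,
) -> bool:
    """Pass only when the combined score is strictly greater than the threshold."""
    return overall_score(scores, weights) > first_threshold
```

Note the strict inequality: a combined score exactly equal to the first preset threshold is an evaluation failure, matching the "less than or equal to" branch of the claim.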
  20. The apparatus according to claim 18, wherein the determining module is further configured to:
    calculate the variance of the thematic score result, the compliance score result, the attractiveness score result and the fluency score result;
    when the variance is less than a second preset threshold, and at least one of the thematic score result, the compliance score result, the attractiveness score result and the fluency score result is greater than a third preset threshold, determine that the evaluation result of the recommendation information is an evaluation pass;
    when the variance is greater than or equal to the second preset threshold, or the thematic score result, the compliance score result, the attractiveness score result and the fluency score result are all less than or equal to the third preset threshold, determine that the evaluation result of the recommendation information is an evaluation failure.
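The variance-based rule of claim 20 can be sketched as below. The claim does not state whether the population or the sample variance is meant; the sketch assumes the population variance (`statistics.pvariance`), and the function name `passes_by_variance` is hypothetical.

```python
from statistics import pvariance
from typing import Sequence


def passes_by_variance(
    scores: Sequence[float],
    second_threshold: float,
    third_threshold: float,
) -> bool:
    """Apply the pass/fail rule of claim 20 to the four score results.

    Pass: the scores are consistent (variance below the second threshold)
    and at least one score exceeds the third threshold.
    Fail: everything else, i.e. the variance is at or above the second
    threshold, or all scores are at or below the third threshold.
    """
    variance = pvariance(scores)  # assumption: population variance
    consistent = variance < second_threshold
    any_high = any(score > third_threshold for score in scores)
    return consistent and any_high
```

The two branches of the claim are logical complements of each other, so a single boolean expression covers both: the failure condition is exactly the negation of the pass condition.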
  21. The apparatus according to claim 19 or 20, wherein the apparatus further comprises:
    an adjustment module configured to, when the evaluation result is an evaluation failure, adjust the recommendation information based on at least one of the thematic score result, the compliance score result, the attractiveness score result and the fluency score result.
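Claim 21 leaves open how the score results guide the adjustment. One possible reading, sketched here under that assumption, is to pick the weakest dimensions as the targets for rewriting. The function `dimensions_to_adjust` and its `threshold` parameter are hypothetical and are not one of the preset thresholds named in the claims.

```python
from typing import Dict, List


def dimensions_to_adjust(scores: Dict[str, float], threshold: float) -> List[str]:
    """Return the dimensions scoring at or below `threshold`, weakest first.

    The caller would then revise the recommendation information with
    respect to those dimensions (for example, removing non-compliant
    wording when "compliance" is listed).
    """
    weak = [name for name, value in scores.items() if value <= threshold]
    return sorted(weak, key=lambda name: scores[name])
```

For example, with scores of 0.9 (thematic), 0.4 (compliance), 0.2 (attractiveness) and 0.7 (fluency) and a threshold of 0.5, the function lists attractiveness first and compliance second.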
  22. The apparatus according to any one of claims 12 to 20, wherein the apparatus further comprises:
    a sending module configured to send the evaluation result to the recommendation information delivery platform, so that the recommendation information delivery platform delivers the recommendation information whose evaluation result is an evaluation pass.
  23. A device for evaluating recommendation information, comprising:
    a processor; and
    a memory configured to store a computer program executable on the processor;
    wherein the computer program, when executed by the processor, implements the steps of the method according to any one of claims 1 to 11.
  24. A computer-readable storage medium storing computer-executable instructions, the computer-executable instructions being configured to perform the steps of the method according to any one of claims 1 to 11.
PCT/CN2021/130006 2020-11-27 2021-11-11 Recommendation information evaluation method, apparatus and device, and computer readable storage medium WO2022111291A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011362739.9 2020-11-27
CN202011362739.9A CN112435064A (en) 2020-11-27 2020-11-27 Method, device and equipment for evaluating recommendation information and computer readable storage medium

Publications (1)

Publication Number Publication Date
WO2022111291A1 (en) 2022-06-02

Family

ID=74697824

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/130006 WO2022111291A1 (en) 2020-11-27 2021-11-11 Recommendation information evaluation method, apparatus and device, and computer readable storage medium

Country Status (2)

Country Link
CN (1) CN112435064A (en)
WO (1) WO2022111291A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115048929A (en) * 2022-06-29 2022-09-13 中国银行股份有限公司 Sensitive text monitoring method and device
CN115099855A (en) * 2022-06-23 2022-09-23 广州华多网络科技有限公司 Method for preparing advertising pattern creation model and device, equipment, medium and product thereof
CN118626964A (en) * 2024-08-13 2024-09-10 卓望数码技术(深圳)有限公司 Security assessment method and device for generating artificial intelligence

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112435064A (en) * 2020-11-27 2021-03-02 北京沃东天骏信息技术有限公司 Method, device and equipment for evaluating recommendation information and computer readable storage medium
CN116303949B (en) * 2023-02-24 2024-03-19 科讯嘉联信息技术有限公司 Dialogue processing method, dialogue processing system, storage medium and terminal
CN116205605B (en) * 2023-03-08 2024-04-19 广东省技术经济研究发展中心 Intelligent evaluation method, system and medium for quality of science and technology project file
CN116644229B (en) * 2023-05-15 2024-01-26 国家计算机网络与信息安全管理中心 Recommendation information excessive entertaining prediction method, device and server

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU2003260913A1 (en) * 2002-10-09 2004-05-04 Koninklijke Philips Electronics N.V. Building up an interest profile on a media system with stored agents for media recommendation
CN102646101A (en) * 2011-02-22 2012-08-22 阿里巴巴集团控股有限公司 Method and device for recommending product presentation information
CN103606097A (en) * 2013-11-21 2014-02-26 复旦大学 Method and system based on credibility evaluation for product information recommendation
CN111046268A (en) * 2018-10-12 2020-04-21 北京搜狗科技发展有限公司 Information recommendation method and device and electronic equipment
CN112435064A (en) * 2020-11-27 2021-03-02 北京沃东天骏信息技术有限公司 Method, device and equipment for evaluating recommendation information and computer readable storage medium

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101144426B1 (en) * 2004-07-23 2012-06-22 엔에이치엔비즈니스플랫폼 주식회사 Method and system for impressing the knowledge advertising using the knowledge retrieval service
US7752200B2 (en) * 2004-08-09 2010-07-06 Amazon Technologies, Inc. Method and system for identifying keywords for use in placing keyword-targeted advertisements
US9177333B2 (en) * 2010-06-17 2015-11-03 Microsoft Technology Licensing, Llc Ad copy quality detection and scoring
CN111191445B (en) * 2018-11-15 2024-04-19 京东科技控股股份有限公司 Advertisement text classification method and device
CN110245350B (en) * 2019-05-29 2023-04-07 创新先进技术有限公司 Method, device and equipment for rewriting and updating file
CN110717029A (en) * 2019-10-15 2020-01-21 支付宝(杭州)信息技术有限公司 Information processing method and system
CN111368081A (en) * 2020-03-03 2020-07-03 支付宝(杭州)信息技术有限公司 Method and system for determining selected text content
CN111768251A (en) * 2020-09-03 2020-10-13 北京悠易网际科技发展有限公司 Advertisement putting method and device based on traffic information evaluation and electronic equipment

Also Published As

Publication number Publication date
CN112435064A (en) 2021-03-02

Similar Documents

Publication Publication Date Title
WO2022111291A1 (en) Recommendation information evaluation method, apparatus and device, and computer readable storage medium
US20210109961A1 (en) Method, apparatus, and computer program product for classification and tagging of textual data
TWI557664B (en) Product information publishing method and device
WO2018050022A1 (en) Application program recommendation method, and server
CN108628833B (en) Method and device for determining summary of original content and method and device for recommending original content
CA2720012C (en) Media object query submission and response
JP6261547B2 (en) Determination device, determination method, and determination program
WO2019041520A1 (en) Social data-based method of recommending financial product, electronic device and medium
JP2015515079A (en) Keyword recommendation
TW201423450A (en) Information pushing, search method and device based on keyword extraction of electronic information
AU2008315748A1 (en) Method and computer system for automatically answering natural language questions
US8793201B1 (en) System and method for seeding rule-based machine learning models
JP5968744B2 (en) SEARCH METHOD, DEVICE, AND COMPUTER-READABLE RECORDING MEDIUM USING CONCEPT KEYWORD EXTENDED DATA SET
CN108415961A (en) A kind of advertising pictures recommendation method and device
WO2018068648A1 (en) Information matching method and related device
US11487835B2 (en) Information processing system, information processing method, and program
CN111737961B (en) Method and device for generating story, computer equipment and medium
CN110046251A (en) Community content methods of risk assessment and device
CN105955957A (en) Determining method and device for aspect score in general comment of merchant
US20130080437A1 (en) System and method for providing statistics for user submissions
CN113392218A (en) Training method of text quality evaluation model and method for determining text quality
US20190019094A1 (en) Determining suitability for presentation as a testimonial about an entity
CN108810577B (en) User portrait construction method and device and electronic equipment
CN111160699A (en) Expert recommendation method and system
TW202418191A (en) Information processing device, information processing method, and computer program products

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21896792

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21896792

Country of ref document: EP

Kind code of ref document: A1